Bug 88739 - [gen4] GPU crash - segfault at error in chrome
Summary: [gen4] GPU crash - segfault at error in chrome
Status: RESOLVED WORKSFORME
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
Reported: 2015-01-23 09:49 UTC by Victor
Modified: 2015-07-14 17:54 UTC

Attachments
i915_error_state and dmesg (154.94 KB, text/plain)
2015-01-31 20:12 UTC, Victor

Description Victor 2015-01-23 09:49:32 UTC
Every time I open a page that contains video content in Google Chrome, the page stops responding. The cursor then freezes in place, while the mouse keeps working: it moves the focus point instead of the cursor. The focus point highlights the items it points to, and the mouse operates normally while the cursor stays behind.

If I log off, the login screen is corrupted: it is offset horizontally by 2-3 inches.

syslog:
Jan 23 12:18:30 sanmateo kernel: [  172.035296] Watchdog[2768]: segfault at 0 ip b611130f sp afb58cc0 error 6 in chrome[b20b2000+5689000]
Jan 23 12:18:43 sanmateo kernel: [  185.122790] Watchdog[3067]: segfault at 0 ip b61b130f sp afbf8cc0 error 6 in chrome[b2152000+5689000]
Jan 23 12:18:56 sanmateo kernel: [  197.528221] Watchdog[3075]: segfault at 0 ip b60fd30f sp afb44cc0 error 6 in chrome[b209e000+5689000]
Jan 23 12:19:00 sanmateo kernel: [  201.816014] [drm] stuck on render ring
Jan 23 12:19:00 sanmateo kernel: [  201.816022] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jan 23 12:19:00 sanmateo kernel: [  201.816024] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 23 12:19:00 sanmateo kernel: [  201.816026] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 23 12:19:00 sanmateo kernel: [  201.816029] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 23 12:19:00 sanmateo kernel: [  201.816031] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jan 23 12:19:00 sanmateo kernel: [  201.818106] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x4697000 ctx 0) at 0x4698a04
Jan 23 12:19:00 sanmateo kernel: [  201.984019] [drm] GMBUS [i915 gmbus vga] timed out, falling back to bit banging on pin 2
Jan 23 12:19:00 sanmateo kernel: [  202.368018] [drm:i915_reset] *ERROR* Failed to reset chip.

/sys/class/drm/card0/error: no error state collected
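
For triage, the three segfault lines in the syslog above can be decoded mechanically. The following is only a sketch: the regular expression and the helper names (`parse_segfault`, `decode_error_code`) are made up for illustration, and the bit meanings assume the x86 page-fault error code layout (bit 0: protection violation, bit 1: write access, bit 2: user mode). It computes the offset of the faulting instruction inside the chrome mapping so that crashes can be compared with each other.

```python
import re

# Matches the kernel's "segfault at ... ip ... sp ... error ... in obj[base+size]" line.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error_code(code):
    """Decode the low three bits of an x86 page-fault error code (assumed layout)."""
    return {
        "protection_violation": bool(code & 1),  # 0 means the page was not present
        "write": bool(code & 2),                 # 0 means a read access
        "user_mode": bool(code & 4),             # 0 means kernel mode
    }


def parse_segfault(line):
    """Return a dict describing one segfault log line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "object": m["obj"],
        "offset": ip - base,  # position of the faulting instruction in the mapping
        "error": decode_error_code(int(m["err"], 16)),
    }


if __name__ == "__main__":
    line = ("Jan 23 12:18:30 sanmateo kernel: [  172.035296] Watchdog[2768]: "
            "segfault at 0 ip b611130f sp afb58cc0 error 6 "
            "in chrome[b20b2000+5689000]")
    info = parse_segfault(line)
    print(hex(info["offset"]), info["error"])
```

Applied to the three lines in this report, each crash resolves to the same offset, 0x405f30f, and each is a user-mode write to address 0. That is consistent with the same NULL-pointer write in chrome being hit repeatedly, although the log alone does not show whether it is a cause or a consequence of the GPU hang.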
Comment 1 Victor 2015-01-23 09:54:44 UTC
(In reply to Victor from comment #0)
Comment 2 Victor 2015-01-23 09:59:36 UTC
(In reply to Victor from comment #0)
Linux Mint 17.1 Rebecca (32-bit, MATE), Core2Duo 6420 @ 2.13GHz, P5B-MX/WIFI-AP, 2 GiB RAM, 1 TB SATA ext4, 10 GiB root, 2 GiB swap, Logitech MX-400 mouse
Comment 3 Rodrigo Vivi 2015-01-24 00:51:35 UTC
Please collect and attach i915_error_state from debugfs.
Also please boot your kernel with drm.debug=0xe, grab and attach dmesg as well.
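
A minimal sketch of the collection step, under several assumptions: debugfs is mounted at /sys/kernel/debug, the GPU is DRM minor 0, and the script runs as root. The helper names (`save_error_state`, `save_dmesg`) and the output filenames are hypothetical. The string "no error state collected", seen in the description, is treated as meaning there is nothing to attach yet.

```python
import subprocess
from pathlib import Path

# Assumed location: debugfs mounted at /sys/kernel/debug, GPU is DRM minor 0.
DEBUGFS_ERROR_STATE = Path("/sys/kernel/debug/dri/0/i915_error_state")
EMPTY_MARKER = "no error state collected"


def save_error_state(src=DEBUGFS_ERROR_STATE, dest="i915_error_state.txt"):
    """Copy the error state to dest; return None if no hang has been recorded."""
    src, dest = Path(src), Path(dest)
    data = src.read_text(errors="replace")
    if data.strip().lower().startswith(EMPTY_MARKER):
        return None  # nothing captured; reproduce the hang first
    dest.write_text(data)
    return dest


def save_dmesg(dest="dmesg.txt"):
    """Save the kernel log; most useful after booting with drm.debug=0xe."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    dest = Path(dest)
    dest.write_text(result.stdout)
    return dest


if __name__ == "__main__":
    saved = save_error_state()
    print("error state:", saved if saved else "none collected yet")
    print("dmesg:", save_dmesg())
```

Run it right after reproducing the hang and before rebooting, since the kernel keeps the error state only in memory.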
Comment 4 Victor 2015-01-31 20:12:09 UTC
Created attachment 113000
i915_error_state and dmesg
Comment 5 Victor 2015-01-31 20:13:32 UTC
(In reply to Victor from comment #4)
> Created attachment 113000
> i915_error_state and dmesg

My system is Linux Mint 17.1 Rebecca (32-bit, MATE), Core2Duo 6420 @ 2.13GHz, P5B-MX/WIFI-AP, 2 GiB RAM, 1 TB SATA ext4, 10 GiB root, 2 GiB swap, Logitech MX-400 mouse
Comment 6 Matt Turner 2015-03-06 20:57:56 UTC
I suspect this may be another duplicate of the bug fixed (worked around) by this commit:

commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Jan 17 23:21:15 2015 -0800

    i965: Work around mysterious Gen4 GPU hangs with minimal state changes.
    
    Gen4 hardware appears to GPU hang frequently when using Chromium, and
    also when running 'glmark2 -b ideas'.  Most of the error states contain
    3DPRIMITIVE commands in quick succession, with very few state packets
    between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.
    
    I trimmed an apitrace of the glmark2 hang down to two draw calls with a
    glUniformMatrix4fv call between the two.  Either draw by itself works
    fine, but together, they hang the GPU.  Removing the glUniform call
    makes the hangs disappear.  In the hardware state, this translates to
    removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.
    
    Flushing before emitting CONSTANT_BUFFER packets also appears to make
    the hangs disappear.  I observed a slowdown in glxgears by doing it all
    the time, so I've chosen to only do it when BRW_NEW_BATCH and
    BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or
    already flushed the whole pipeline).
    
    I'd much rather understand the problem, but at this point, I don't see
    how we'd ever be able to track it down further.  We have no real tools,
    and the hardware people moved on years ago.  I've analyzed 20+ error
    states and read every scrap of documentation I could find.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Acked-by: Matt Turner <mattst88@gmail.com>
    Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>

It's in git, and backports are in Mesa 10.4.x for x > 3.

What version of Mesa do you have?
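
To answer the version question, the check can be sketched as below. The helper names are hypothetical, and the sample string only assumes the usual shape of the "OpenGL version string" line printed by glxinfo; the version in it is an invented example, not this reporter's. The function encodes only what this comment states (the backport is in 10.4.x for x > 3) and returns None for every other series rather than guessing.

```python
import re


def extract_mesa_version(text):
    """Pull 'X.Y.Z' out of a string such as 'OpenGL version string: 2.1 Mesa 10.4.2'."""
    m = re.search(r"Mesa (\d+\.\d+\.\d+)", text)
    return m.group(1) if m else None


def has_gen4_hang_workaround(version):
    """True/False for the 10.4 series; None when this thread does not say."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if m is None:
        return None
    major, minor, patch = (int(part) for part in m.groups())
    if (major, minor) == (10, 4):
        return patch > 3  # backports are in 10.4.x for x > 3
    return None  # other series: not covered by the statement above


if __name__ == "__main__":
    sample = "OpenGL version string: 2.1 Mesa 10.4.2"  # hypothetical example
    version = extract_mesa_version(sample)
    print(version, has_gen4_hang_workaround(version))
```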
Comment 7 Victor 2015-03-07 10:12:36 UTC
I installed the package libgl1-mesa-dri-dbg (sudo aptitude install libgl1-mesa-dri-dbg), as Matt Turner advised, and it greatly improved the situation: I can watch YouTube now. But the cure is not complete. A website with heavy Flash animation, http://pixelarity.com/aerial, still hangs my Google Chrome browser.
Comment 8 Victor 2015-07-14 17:20:23 UTC
I opened this bug a few months ago, when I was using Linux Mint 17.1. Linux Mint 17.2 has now been released. I installed it, and the bug is gone! Google Chrome works fine and plays all videos. So I figure the bug was rooted in the internals of Linux Mint 17.1. Thank you, everybody! Please close the bug.

