Bug 100245

Summary: GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck semaphore on render ring, action: continue
Product: DRI
Reporter: Aaron Lu <aaron.lu>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform:
i915 features:
Attachments:
/sys/class/drm/card0/error (flags: none)
/sys/class/drm/card0/error on 4.10.9-200.fc25.x86_64 (flags: none)

Description Aaron Lu 2017-03-17 08:40:10 UTC
Created attachment 130283 [details]
/sys/class/drm/card0/error

The screen froze for a little while, and afterwards I found the following messages in dmesg:

[  623.930398] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck semaphore on render ring, action: continue
[  623.930493] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  623.930496] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  623.930499] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  623.930502] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  623.930505] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Comment 1 Chris Wilson 2017-03-17 09:13:19 UTC

*** This bug has been marked as a duplicate of bug 54226 ***
Comment 2 Aaron Lu 2017-04-21 02:30:35 UTC
I am not sure whether this is the same bug. The machine has already hung twice today, and I saw the following messages in the serial log:

Fedora 25 (Workstation Edition)
Kernel 4.10.10-200.fc25.x86_64 on an x86_64 (ttyS0)

aaronlu login: [ 1674.330699] [drm] GPU HANG: ecode 6:0:0x85fffff8, in Xorg [926], reason: Hang on render ring, action: reset
[ 1674.340346] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1674.349471] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1674.358250] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1674.367804] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1674.376668] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1674.383004] drm/i915: Resetting chip after gpu hang

This differs from the previous hang in that the system cannot recover now and I have to hard reset. The problem seems to occur quite often, and easily, on the v4.10.9 and later kernels shipped with Fedora 25.

Shall I file a new bug for it?
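
To make the comparison between the two hang reports above concrete, here is a minimal Python sketch that splits a "GPU HANG" line into its fields. The regular expression and the function name `parse_gpu_hang` are illustrative assumptions based only on the log lines quoted in this report; they are not part of the i915 driver or any existing tool, and the ecode value is kept as an opaque string.

```python
import re

# Pattern for the "[drm] GPU HANG" line as it appears in the logs above.
# The "in <process> [<pid>]" part is optional because the first report lacks it.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,]+),"
    r"(?: in (?P<process>\S+) \[(?P<pid>\d+)\],)?"
    r" reason: (?P<reason>.*?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG line as a dict, or None if no match."""
    match = HANG_RE.search(line)
    return match.groupdict() if match else None


first = parse_gpu_hang(
    "[  623.930398] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: "
    "Kicking stuck semaphore on render ring, action: continue"
)
second = parse_gpu_hang(
    "[ 1674.330699] [drm] GPU HANG: ecode 6:0:0x85fffff8, in Xorg [926], "
    "reason: Hang on render ring, action: reset"
)

# Print the fields that differ between the original report and this one.
for key in ("ecode", "reason", "action"):
    print(f"{key}: {first[key]!r} vs {second[key]!r}")
```

For the two lines quoted in this report, the ecode, reason, and action fields all differ, which is why this may not be the same bug.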
Comment 3 Aaron Lu 2017-04-21 03:22:51 UTC
Created attachment 130956 [details]
/sys/class/drm/card0/error on 4.10.9-200.fc25.x86_64

I just hit the bug again, this time on v4.10.9, and it managed to recover (it killed my desktop session, but at least no hard reset was needed):

[ 2608.203791] [drm] GPU HANG: ecode 6:0:0x85fffff8, in Xorg [926], reason: Hang on render ring, action: reset
[ 2608.213449] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 2608.222579] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 2608.231352] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 2608.240900] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 2608.249759] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 2608.256075] drm/i915: Resetting chip after gpu hang
[ 2619.190601] drm/i915: Resetting chip after gpu hang

Since it recovered, I was able to copy the error file mentioned in the log, and I have attached it.
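
For anyone who needs to capture the same file after a recoverable hang, a minimal Python sketch of the copy step follows. The source path is the one printed in the dmesg output above. The function name `save_error_state` and the output file naming are assumptions made for illustration, and reading the file usually requires root.

```python
from datetime import datetime, timezone
from pathlib import Path

# Path reported by the kernel in the "GPU crash dump saved to" message.
ERROR_STATE = Path("/sys/class/drm/card0/error")


def save_error_state(src=ERROR_STATE, dest_dir="."):
    """Copy the GPU crash dump to a timestamped file and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest = Path(dest_dir) / f"i915-error-{stamp}.txt"  # hypothetical naming scheme
    # Read the whole dump and write it out as-is, so it can be attached unmodified.
    dest.write_bytes(Path(src).read_bytes())
    return dest


# Example use after a hang (run as root):
#   print(save_error_state(dest_dir="/tmp"))
```

Saving the dump under a timestamped name keeps captures from successive hangs apart, which helps when more than one needs to be attached.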
Comment 4 Aaron Lu 2017-10-27 06:35:35 UTC
This problem reappeared on 4.13.5-200.fc26.x86_64

[774249.632109] [drm] GPU HANG: ecode 6:0:0x85fffff8, in Xorg [696], reason: Hang on rcs0, action: reset
[774249.632110] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[774249.632111] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[774249.632111] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[774249.632111] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[774249.632112] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[774249.632172] drm/i915: Resetting chip after gpu hang

It recovered without killing my desktop session, please let me know if you need the error file.
Comment 5 Elizabeth 2017-10-30 23:16:37 UTC
(In reply to Aaron Lu from comment #4)
> This problem reappeared on 4.13.5-200.fc26.x86_64
> ...
> It recovered without killing my desktop session, please let me know if you
> need the error file.
Could you share it on bug 54226? Thank you.
