Bug 89524

Summary: GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
Product: DRI
Reporter: john mathews <jmathew6200>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major
Priority: medium
CC: intel-gfx-bugs, jefbed
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform: SNB
i915 features: GPU hang
Attachments:
Description         Flags
card0 error file    none
dmesg.txt           none
card0_error2        none
GPU crash dump      none

Description john mathews 2015-03-11 03:56:42 UTC
Created attachment 114211 [details]
card0 error file

Any time I play video, regardless of format, playback suddenly freezes for several seconds before continuing (usually resuming past the point where it stopped). This happens seemingly at random during the video. My test cases use mplayer2 to play assorted MKV files, but it also seems to happen in the web browser with Flash players and the like.

Below is a paste from dmesg:

[258873.064703] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
[258873.064706] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[258873.064707] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[258873.064707] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[258873.064708] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[258873.064709] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[258933.073438] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
[259283.124389] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
[259323.130181] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck wait on render ring, action: continue

I have also attached the referenced card0/error file.
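
The hang lines in the dmesg paste above share a fixed layout (timestamp, ecode, reason, action), so they can be pulled out of a longer log mechanically. The following is a minimal Python sketch, not part of the original report. The names `HANG_RE` and `parse_hangs` are hypothetical, and the pattern is inferred only from the lines quoted here, so other kernel versions may format the message differently.

```python
import re

# Pattern inferred from the hang lines quoted in this report (an assumption,
# not a documented kernel log format).
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_hangs(dmesg_text):
    """Return one dict per GPU HANG line found in dmesg_text."""
    hangs = []
    for line in dmesg_text.splitlines():
        m = HANG_RE.search(line)
        if m:
            hangs.append({
                "timestamp": float(m.group("ts")),
                "ecode": m.group("ecode"),
                "reason": m.group("reason"),
                "action": m.group("action"),
            })
    return hangs


# Three lines copied from the paste above; only two are hang lines.
SAMPLE = """\
[258873.064703] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
[258873.064709] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[258933.073438] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck wait on render ring, action: continue
"""

if __name__ == "__main__":
    for h in parse_hangs(SAMPLE):
        print(h["timestamp"], h["ecode"], h["reason"], h["action"])
```

Comparing the timestamps of consecutive entries gives the interval between hangs, which may help when correlating them with video playback.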
Comment 1 Chris Wilson 2015-03-11 08:57:17 UTC
Please install xf86-video-intel-2.99.917.
Comment 2 john mathews 2015-03-13 02:09:45 UTC
Current version installed is indeed: 2.99.917
Comment 3 john mathews 2015-03-13 02:11:40 UTC
If it helps, it's currently installed with the following USE flags (I run Gentoo):

dri
sna
udev

Disabled USE flags are:

debug
uxa
xvmc
Comment 4 Chris Wilson 2015-03-13 08:06:46 UTC
Then please attach the full error state.
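
For reference, one way to capture the error state without truncation is to copy the sysfs file named in the dmesg output in binary chunks until end of file. This is a minimal Python sketch under assumptions: the helper name `save_error_state` and the output file name are hypothetical, the source path is the one from the dmesg paste, and reading it may require root privileges.

```python
import gzip


def save_error_state(src="/sys/class/drm/card0/error", dst="card0_error.gz"):
    """Copy the whole error state into a gzip file; return the bytes read.

    src defaults to the path printed in the dmesg output; dst is a
    hypothetical output name chosen for this sketch.
    """
    total = 0
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        # Read until EOF rather than trusting a reported file size.
        while True:
            chunk = fin.read(1 << 16)
            if not chunk:
                break
            fout.write(chunk)
            total += len(chunk)
    return total


if __name__ == "__main__":
    print(save_error_state(), "bytes saved")
```

Compressing the copy keeps the attachment small. The returned byte count is the uncompressed size, which can be checked against the attachment as a sanity test for truncation.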
Comment 5 john mathews 2015-03-14 19:54:40 UTC
Created attachment 114315 [details]
dmesg.txt
Comment 6 john mathews 2015-03-14 19:55:10 UTC
Created attachment 114316 [details]
card0_error2
Comment 7 john mathews 2015-03-14 19:58:07 UTC
Could you explain how I can do that? I tried to interpret what you meant by "attach the full error state" and came up with the following translation:

-rebuild the intel drivers with the debug flag enabled
-cause the crash
-include, presumably more verbose, error output

So I did that and have included the dmesg output related to the crash and the card0_error2 file which is referenced in the dmesg output. 

If you need any additional information please let me know.
Comment 8 Chris Wilson 2015-03-14 21:09:51 UTC
The first error state was truncated, so I couldn't check the instructions that were suspended on the GPU. The second one confirms that you have the PMSI_CTL w/a, which is what I had hoped would fix the problem.

Can you please paste the output of lspci -vvvn -s 0:2?
Comment 9 john mathews 2015-03-19 02:35:43 UTC
Absolutely!

00:02.0 0300: 8086:0126 (rev 09) (prog-if 00 [VGA controller])
        Subsystem: 17aa:21ce
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 30
        Region 0: Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at 5000 [size=64]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee00018  Data: 0000
        Capabilities: [d0] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a4] PCI Advanced Features
                AFCap: TP+ FLR+
                AFCtrl: FLR-
                AFStatus: TP-
        Kernel driver in use: i915
        Kernel modules: i915
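
The first line of the `lspci -vvvn` output above carries the numeric identifiers that matter for triage: slot, class code, vendor ID, device ID, and revision. The following is a minimal Python sketch for extracting them. The names `LSPCI_RE` and `parse_lspci_header` are hypothetical, and the pattern is based only on the line quoted in this comment.

```python
import re

# Matches the numeric header line that lspci prints with -n, e.g.
# "00:02.0 0300: 8086:0126 (rev 09) ...". Pattern inferred from this report.
LSPCI_RE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (?P<cls>[0-9a-f]{4}): "
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
    r"(?: \(rev (?P<rev>[0-9a-f]{2})\))?"
)


def parse_lspci_header(line):
    """Return a dict of slot/cls/vendor/device/rev, or None if no match."""
    m = LSPCI_RE.match(line)
    return m.groupdict() if m else None


HEADER = "00:02.0 0300: 8086:0126 (rev 09) (prog-if 00 [VGA controller])"

if __name__ == "__main__":
    info = parse_lspci_header(HEADER)
    print(info["vendor"], info["device"], info["rev"])
```

Here the vendor ID `8086` is Intel's PCI vendor ID, and the `vendor:device` pair together with the revision identifies the exact GPU variant.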
Comment 10 john mathews 2015-04-18 13:35:24 UTC
Keep in mind this has been a problem since around 3.10.x. I had been reporting it on distro-specific forums before opening this ticket.
Comment 11 Danilo Pianini 2015-09-02 09:41:56 UTC
I have experienced this issue myself.

[184151.630197] [drm] GPU HANG: ecode 6:-1:0x00000000, reason: Kicking stuck semaphore on render ring, action: continue
[184151.630200] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[184151.630200] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[184151.630201] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[184151.630202] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[184151.630203] [drm] GPU crash dump saved to /sys/class/drm/card0/error

I will attach the crash dump shortly.
Comment 12 Danilo Pianini 2015-09-02 09:43:17 UTC
Created attachment 118049 [details]
GPU crash dump
Comment 13 Chris Wilson 2015-12-30 21:43:32 UTC

*** This bug has been marked as a duplicate of bug 54226 ***
