Bug 81369 - [SNB gt1 regression] GPU HANG: ecode 0:0x85fffff8 stuck after first render batch with Linux 3.15.5, works with 3.14.11
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: highest blocker
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-15 04:24 UTC by Omen Wild
Modified: 2017-07-24 22:53 UTC

See Also:
i915 platform:
i915 features:


Attachments
Crash dump (2.07 MB, text/plain)
2014-07-15 04:24 UTC, Omen Wild
Crash dump with 3.15.6 (2.07 MB, text/plain)
2014-07-22 03:30 UTC, Omen Wild

Description Omen Wild 2014-07-15 04:24:10 UTC
Created attachment 102812
Crash dump

Jul 14 21:04:59 balrog kernel: [  293.552750] [drm] stuck on render ring
Jul 14 21:04:59 balrog kernel: [  293.553208] [drm] GPU HANG: ecode 0:0x85fffff8, in Xorg [14516], reason: Ring hung, action: reset
Jul 14 21:04:59 balrog kernel: [  293.553210] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jul 14 21:04:59 balrog kernel: [  293.553211] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jul 14 21:04:59 balrog kernel: [  293.553212] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jul 14 21:04:59 balrog kernel: [  293.553213] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jul 14 21:04:59 balrog kernel: [  293.553214] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jul 14 21:05:01 balrog kernel: [  295.551577] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
Jul 14 21:05:05 balrog kernel: [  299.553243] [drm] stuck on blitter ring
Jul 14 21:05:05 balrog kernel: [  299.553717] [drm] GPU HANG: ecode 2:0xfffffffe, in Xorg [14516], reason: Ring hung, action: reset
Jul 14 21:05:05 balrog kernel: [  299.553760] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!
Jul 14 21:05:07 balrog kernel: [  301.552531] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
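
For comparing hangs across kernel versions, the "GPU HANG" lines above can be pulled apart mechanically. This is only a minimal sketch: the regular expression is derived from the two hang lines quoted in this report, other kernel versions may word the message differently, and the names `GpuHang`, `parse_gpu_hang` and `ring_id` are made up for illustration. Reading the number before the colon as a ring index is an assumption based on these logs (0 with "stuck on render ring", 2 with "stuck on blitter ring").

```python
import re
from typing import NamedTuple, Optional

# Pattern based only on the "[drm] GPU HANG" lines quoted in this report.
# Other kernel versions may format the message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ring_id>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\S+)"
)


class GpuHang(NamedTuple):
    ring_id: int   # assumption: the number before the colon identifies the ring
    ecode: int     # the hexadecimal error code after the colon
    comm: str      # name of the process the hang was attributed to
    pid: int
    reason: str
    action: str


def parse_gpu_hang(line: str) -> Optional[GpuHang]:
    """Return the fields of a GPU HANG log line, or None if the line does not match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return GpuHang(
        ring_id=int(m["ring_id"]),
        ecode=int(m["ecode"], 16),
        comm=m["comm"],
        pid=int(m["pid"]),
        reason=m["reason"],
        action=m["action"],
    )
```

Feeding each dmesg line through `parse_gpu_hang` and comparing the `(ring_id, ecode)` pairs is one way to check whether two kernels hit the same hang signature.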


To me, this looks pretty similar to bug 78533, except that the two patches do not apply cleanly to my 3.15.5 kernel:
1 > cat ../patches/Prevent\ negative\ relocation\ deltas\ from\ causing\ wraparound.patch | patch -l -s -p1 --dry-run                                                                            
3 out of 4 hunks FAILED

1 > cat ../patches/Offsect\ batch\ buffers\ to\ prevent\ delta\ wrapping.patch | patch -l -s -p1 --dry-run                                                                            
2 out of 2 hunks FAILED
Reversed (or previously applied) patch detected!  Assume -R? [n] 
Apply anyway? [n] 
5 out of 5 hunks ignored
Reversed (or previously applied) patch detected!  Assume -R? [n] 
Apply anyway? [n] 
2 out of 2 hunks ignored
5 out of 6 hunks FAILED
Reversed (or previously applied) patch detected!  Assume -R? [n] 
Apply anyway? [n] 
1 out of 1 hunk ignored
Comment 1 Chris Wilson 2014-07-15 05:53:02 UTC
Looks like bug 79996
Comment 2 Omen Wild 2014-07-15 17:38:36 UTC
I downloaded the "Clear all unwanted bits from GT_MODE" patch (from bug 79996) but that does not apply:

0 > cat ../patches/Clear\ all\ unwanted\ bits\ from\ GT_MODE.patch | patch -l -p1 --dry-run
checking file drivers/gpu/drm/i915/intel_pm.c
Hunk #1 FAILED at 5095.
Hunk #2 succeeded at 4751 (offset -363 lines).
1 out of 2 hunks FAILED

Then I looked at just the patch in comment #5:

0 > cat ../patches/1.patch | patch -l -p1 --dry-run 
checking file drivers/gpu/drm/i915/intel_pm.c
Hunk #1 succeeded at 4738 with fuzz 2 (offset -450 lines).

That offset of -450 lines worries me, so I did not apply it.

Is there an updated patch somewhere?
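
The manual check in this comment (reading the `patch --dry-run` output and worrying about failed hunks, fuzz, and large offsets) could be sketched as a small script. This is an illustration, not an established tool: the regular expression only covers the "Hunk #N ..." line shapes quoted above, and the names `Hunk`, `parse_hunks`, `needs_review` and the 100-line offset threshold are hypothetical choices.

```python
import re
from typing import List, NamedTuple

# Covers only the hunk line shapes quoted in this comment, e.g.
#   Hunk #1 FAILED at 5095.
#   Hunk #2 succeeded at 4751 (offset -363 lines).
#   Hunk #1 succeeded at 4738 with fuzz 2 (offset -450 lines).
HUNK_RE = re.compile(
    r"Hunk #(?P<num>\d+) (?P<status>FAILED|succeeded) at (?P<line>\d+)"
    r"(?: with fuzz (?P<fuzz>\d+))?"
    r"(?: \(offset (?P<offset>-?\d+) lines?\))?"
)


class Hunk(NamedTuple):
    num: int
    failed: bool
    line: int
    fuzz: int     # 0 when patch reported no fuzz
    offset: int   # 0 when patch reported no offset


def parse_hunks(output: str) -> List[Hunk]:
    """Extract per-hunk results from the text printed by `patch --dry-run`."""
    hunks = []
    for m in HUNK_RE.finditer(output):
        hunks.append(Hunk(
            num=int(m["num"]),
            failed=m["status"] == "FAILED",
            line=int(m["line"]),
            fuzz=int(m["fuzz"] or 0),
            offset=int(m["offset"] or 0),
        ))
    return hunks


def needs_review(hunks: List[Hunk], max_offset: int = 100) -> List[Hunk]:
    """Hunks that failed, needed fuzz, or moved further than max_offset lines.

    max_offset is an arbitrary illustrative threshold, not a patch(1) setting.
    """
    return [h for h in hunks
            if h.failed or h.fuzz > 0 or abs(h.offset) > max_offset]
```

With the output quoted above, every hunk would be flagged: one fails outright, and the others apply only with an offset of several hundred lines, which usually means the patch was written against a noticeably different version of the file.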
Comment 3 Omen Wild 2014-07-22 03:30:17 UTC
Created attachment 103250
Crash dump with 3.15.6
Comment 4 Omen Wild 2014-07-22 03:30:48 UTC
I tried upgrading to 3.15.6 but got the same crash.

[100118.550455] [drm] stuck on render ring
[100118.550909] [drm] GPU HANG: ecode 0:0x85fffff8, in Xorg [6630], reason: Ring hung, action: reset
[100118.550912] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[100118.550913] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[100118.550914] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[100118.550915] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[100118.550916] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[100120.549307] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[100124.546978] [drm] stuck on blitter ring
[100124.547452] [drm] GPU HANG: ecode 2:0xfffffffe, in Xorg [6630], reason: Ring hung, action: reset
[100124.547535] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!
[100126.546089] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
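
Since the log asks for the crash dump to be attached and sysfs contents do not survive a reboot, the dump can be copied to a regular file right after a hang. The following is a minimal sketch, assuming the path printed in the log (`/sys/class/drm/card0/error`) and that the caller is allowed to read it (typically root). The function name `save_error_state` is made up for illustration.

```python
from pathlib import Path

# Path taken from the "GPU crash dump saved to ..." log line above.
ERROR_STATE = Path("/sys/class/drm/card0/error")


def save_error_state(dest, src=ERROR_STATE, chunk_size=1 << 16) -> int:
    """Copy the error state file to dest and return the number of bytes written.

    The file is read in chunks until EOF rather than relying on its reported
    size, because sysfs files often do not report a meaningful size.
    """
    written = 0
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            written += len(chunk)
    return written
```

For example, `save_error_state("i915-error-3.15.6.txt")` would produce a file suitable for attaching to the bug.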
Comment 6 Gordon Jin 2014-09-15 06:52:53 UTC
Omen, could you try Chris's patch?
Comment 7 Rodrigo Vivi 2014-10-15 20:36:14 UTC
*** Bug 80913 has been marked as a duplicate of this bug. ***
Comment 8 Mika Kuoppala 2014-11-05 15:55:12 UTC
Omen, could you try out with:

https://bugs.freedesktop.org/attachment.cgi?id=108894
Comment 9 Daniel Vetter 2014-11-25 09:17:24 UTC
(In reply to Mika Kuoppala from comment #8)
> Omen, could you try out with:
> 
> https://bugs.freedesktop.org/attachment.cgi?id=108894

This patch has landed, so let's hope this is fixed. If the latest drm-intel-nightly is still affected, please reopen. (The patch is cc: stable, but it might take a while to reach the stable kernels.)

