Bug 80727

Summary: [ivb] GPU hang after hibernation, IPEHR: 0xf000027e
Product: DRI Reporter: Harald Judt <h.judt>
Component: DRM/Intel Assignee: Ville Syrjala <ville.syrjala>
Status: CLOSED INVALID QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: intel-gfx-bugs
Version: XOrg git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
i915 platform: i915 features:
Description Flags
intel-drm-card0-error.out.bz2 none

Description Harald Judt 2014-06-30 18:51:21 UTC
Created attachment 102028

[ 2148.620052] [drm] stuck on render ring
[ 2148.621068] [drm] GPU HANG: ecode 0:0x0fdffd81, in X [2006], reason: Ring hung, action: reset
[ 2148.621070] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 2148.621071] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 2148.621072] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 2148.621073] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 2148.621074] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 2150.618974] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[ 2154.616758] [drm] stuck on render ring
[ 2154.618186] [drm] GPU HANG: ecode 0:0x0fdffd81, in X [2006], reason: Ring hung, action: reset
[ 2154.618339] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!
[ 2156.615661] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
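
The log above points at /sys/class/drm/card0/error as the location of the GPU crash dump, and the attachment on this bug is a bzip2-compressed copy of it. As a minimal sketch of how such a copy could be made, the following Python helper streams the error state into a .bz2 file. The function name and the default output filename are illustrative assumptions, not part of any i915 tooling, and reading the sysfs file may need root privileges.

```python
import bz2
import shutil
from pathlib import Path


def save_gpu_error_state(source="/sys/class/drm/card0/error",
                         dest="intel-drm-card0-error.out.bz2"):
    """Copy the GPU error state to a bzip2-compressed file and return its path.

    `source` defaults to the path named in the kernel log; `dest` is a
    hypothetical output name chosen to match the attachment on this bug.
    """
    src = Path(source)
    dst = Path(dest)
    # Stream the content instead of trusting the reported file size,
    # which sysfs does not guarantee to be meaningful.
    with src.open("rb") as fin, bz2.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Calling save_gpu_error_state() with no arguments right after a hang, before rebooting, would produce a file suitable for attaching to a report, since the error state is not expected to survive a reboot.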

I am running a non-vanilla Linux kernel 3.15.2 with additional patches applied (tuxonice, uksm, bfs, bfq). The hang happened after resuming from hibernation (on several attempts). After restarting X, everything seems to work fine.

Other symptoms: Sometimes after resuming from hibernation there is a pink horizontal line pattern on parts of the screen (on half of the screen, or sometimes only over the xfce panel, etc.).

mesa-git (non-dri3, compiled on 2014-06-30 14:54:53)
Comment 1 Harald Judt 2014-06-30 21:22:49 UTC
Please ignore this bug, I believe it is invalid. The hangs and/or the strange flashing pink corruptions seem to be caused by the UKSM patch I applied. After removing this patch, I have tested again several times without being able to reproduce. I apologize for the extra trouble I have caused.
Comment 2 Harald Judt 2014-07-01 08:57:45 UTC
I stand corrected; it was not the UKSM patch. The issue still exists; maybe the UKSM patch only caused it to occur more frequently. Reopening.
Comment 3 Harald Judt 2014-07-02 22:02:12 UTC
Update: I've tried to find out where the trouble started. 3.13.11 works fine; the next version I tested was 3.14.9. I haven't seen any corruption there yet, but there is one major problem: most of the time, resuming from hibernation fails and ends in a black screen. What may be the same issue also happens with 3.15.2, although much more rarely; I believe it is still there. tuxonice has a nifty feature here that lets you try to resume from the same image again. After two or three tries, this usually works. So to sum up:

3.13.11 usually works fine
3.14.9  works somehow, but you'll get your share of black screens and several attempts to resume in a row to get your working session back
3.15.2  only sometimes has the same issue as 3.14.9, but introduces other issues like gpu hangs and corruptions and sometimes freezes on hibernation

Maybe I can bisect between 3.13 and 3.14, or between 3.14 and 3.15, but there have been quite a few stability issues in 3.14 that make this process tedious and difficult.
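
When comparing kernel versions like this, counting hang events in the kernel log can make the comparison less anecdotal. The following is a rough sketch that extracts "GPU HANG" entries from saved dmesg output. The regular expression is written only against the message format shown in the description of this bug, and the function name is made up for illustration; other kernel versions may word the message differently.

```python
import re

# Matches lines such as:
# [ 2148.621068] [drm] GPU HANG: ecode 0:0x0fdffd81, in X [2006], reason: Ring hung, action: reset
HANG_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+\[drm\]\s+GPU HANG:\s+ecode\s+(?P<ecode>\S+?),"
    r"\s+in\s+(?P<process>.+?)\s+\[(?P<pid>\d+)\],"
    r"\s+reason:\s+(?P<reason>.+?),\s+action:\s+(?P<action>\w+)"
)


def find_gpu_hangs(log_text):
    """Return one dict per GPU HANG line found in the given dmesg text."""
    hangs = []
    for match in HANG_RE.finditer(log_text):
        fields = match.groupdict()
        # Convert the numeric fields; everything else stays a string.
        fields["time"] = float(fields["time"])
        fields["pid"] = int(fields["pid"])
        hangs.append(fields)
    return hangs
```

Running find_gpu_hangs() on the dmesg output of each tested kernel after a resume gives a hang count and the error codes per version, which can be recorded alongside each test run.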
Comment 4 Harald Judt 2014-07-06 10:58:41 UTC
My problem was caused by a bad BIOS. Updating to the newest BIOS solved everything (see also bug #80773).
