Bug 102534 - GPU hangs after resuming from hibernation, causing GNOME to crash
Summary: GPU hangs after resuming from hibernation, causing GNOME to crash
Status: CLOSED DUPLICATE of bug 96526
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-04 07:29 UTC by Harish
Modified: 2017-09-07 20:16 UTC

See Also:
i915 platform:
i915 features:


Attachments
GPU crash dump (713.50 KB, text/plain)
2017-09-04 09:43 UTC, Harish

Description Harish 2017-09-04 07:29:14 UTC
GNOME crashes frequently after resuming from hibernation. Hibernation itself seems to succeed: according to dmesg, the hibernation image is created and restored successfully.

dmesg also gives the following messages after resume from hibernation:
> GPU HANG: ecode 8:0:0xcfdf999d, in gnome-shell [874], reason: Hang on render ring, action: reset
> [228.738386] drm/i915: Resetting chip after gpu hang

After this, gnome-shell appears to crash and end the session.
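
For anyone triaging similar reports, the hang line quoted above can be picked apart with a short script. This is only a sketch: the regular expression is written against the exact line in this report, and the names `HANG_RE` and `parse_gpu_hang` are made up for illustration. Other kernel versions may format the message differently.

```python
import re

# Pattern written against the "GPU HANG" line quoted in this report.
# Assumption: the layout "ecode ..., in <process> [<pid>], reason: ..., action: ..."
# holds; other kernels may print the line differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,]+), "
    r"in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return a dict of fields from a GPU HANG line, or None if it does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


if __name__ == "__main__":
    sample = (
        "GPU HANG: ecode 8:0:0xcfdf999d, in gnome-shell [874], "
        "reason: Hang on render ring, action: reset"
    )
    print(parse_gpu_hang(sample))
```

Feeding it the output of dmesg line by line would give the affected process and the reported reason for each hang, which makes it easier to compare against other reports.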
Comment 1 Harish 2017-09-04 07:30:40 UTC
Please also see https://bbs.archlinux.org/viewtopic.php?pid=1734193#p1734193
Comment 2 Harish 2017-09-04 07:34:02 UTC
Forgot to mention that this is currently occurring on kernel 4.9.47-1-lts. I am using GNOME Shell 3.24.3. My laptop is a ThinkPad X250 with an Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz and Intel HD 5500 graphics.
Comment 3 Chris Wilson 2017-09-04 09:21:54 UTC
[  192.832617] [drm] GPU HANG: ecode 8:0:0xcfdf999d, in gnome-shell [874], reason: Hang on render ring, action: reset
[  192.832618] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  192.832618] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  192.832619] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  192.832619] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  192.832620] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Comment 4 Harish 2017-09-04 09:43:43 UTC
Created attachment 133958
GPU crash dump

Apologies, I have attached it now.
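
For reference, a minimal sketch of how the dump could be copied out of sysfs before it is lost. The source path is the one printed in the dmesg output above; the function name `save_gpu_error` and the default destination file name are hypothetical. Reading the file will probably need root privileges, and the card number may differ on other machines.

```python
from pathlib import Path


def save_gpu_error(src="/sys/class/drm/card0/error", dest="gpu-crash-dump.txt"):
    """Copy the GPU error state to a regular file and return the number of bytes copied.

    The default src is the path printed by the kernel in this report;
    dest is an arbitrary (hypothetical) file name.
    """
    src_path = Path(src)
    if not src_path.exists():
        raise FileNotFoundError(f"{src} not found; check the card number")
    # Read the whole file instead of trusting its reported size,
    # since sysfs entries often report a size that does not match their content.
    data = src_path.read_bytes()
    Path(dest).write_bytes(data)
    return len(data)


if __name__ == "__main__":
    print(save_gpu_error(), "bytes saved")
```

The resulting text file is what gets attached to the bug report.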
Comment 5 Chris Wilson 2017-09-04 14:17:16 UTC
commit bafb2f7d4755bf1571bd5e9a03b97f3fc4fe69ae
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Sep 21 14:51:08 2016 +0100

    drm/i915/execlists: Reset RING registers upon resume
    
    There is a disparity in the context image saved to disk and our own
    bookkeeping - that is we presume the RING_HEAD and RING_TAIL match our
    stored ce->ring->tail value. However, as we emit WA_TAIL_DWORDS into the
    ring but may not tell the GPU about them, the GPU may be lagging behind
    our bookkeeping. Upon hibernation we do not save stolen pages, presuming
    that their contents are volatile. This means that although we start
    writing into the ring at tail, the GPU starts executing from its HEAD
    and there may be some garbage in between and so the GPU promptly hangs
    upon resume.
    
    Testcase: igt/gem_exec_suspend/basic-S4
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96526
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Link: http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-3-chris@chris-wilson.co.uk
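
The failure described in the commit message can be illustrated with a toy model. This is not the i915 code: `ToyRing`, its fields, and `simulate` are hypothetical names, and the ring is reduced to a list of slots. It only shows the idea that the GPU resumes from its own head while the driver writes at its own tail, and that resetting the registers on resume closes the gap.

```python
NOOP, CMD, GARBAGE = "noop", "cmd", "garbage"


class ToyRing:
    """Toy ring buffer; a simplification, not the real driver data structure."""

    def __init__(self, size=8):
        self.slots = [NOOP] * size
        self.sw_tail = 0  # driver bookkeeping: where the next command is written
        self.hw_head = 0  # where the GPU will start executing
        self.hw_tail = 0  # last tail value the GPU was told about

    def emit(self, count, tell_gpu=True):
        """Write commands at sw_tail; optionally leave the GPU's tail behind."""
        for _ in range(count):
            self.slots[self.sw_tail] = CMD
            self.sw_tail = (self.sw_tail + 1) % len(self.slots)
        if tell_gpu:
            self.hw_tail = self.sw_tail

    def gpu_run(self):
        """Execute from hw_head to hw_tail; return False if garbage is hit (a hang)."""
        while self.hw_head != self.hw_tail:
            if self.slots[self.hw_head] == GARBAGE:
                return False
            self.hw_head = (self.hw_head + 1) % len(self.slots)
        return True

    def hibernate_and_restore(self):
        """Ring contents are not saved across hibernation, so they come back as garbage."""
        self.slots = [GARBAGE] * len(self.slots)

    def reset_ring_registers(self):
        """The idea of the fix: make the GPU's head and tail match the driver's tail."""
        self.hw_head = self.hw_tail = self.sw_tail


def simulate(reset_on_resume):
    """Return True if the GPU survives the first submission after resume."""
    ring = ToyRing()
    ring.emit(2)                   # normal work, GPU is told about it
    ring.gpu_run()
    ring.emit(2, tell_gpu=False)   # trailing dwords the GPU is never told about
    ring.hibernate_and_restore()
    if reset_on_resume:
        ring.reset_ring_registers()
    ring.emit(1)                   # first submission after resume
    return ring.gpu_run()


if __name__ == "__main__":
    print("without reset:", simulate(False))
    print("with reset:", simulate(True))
```

Without the reset, the model's GPU starts at a head that lags the driver's tail and walks into slots that were never restored. With the reset, execution starts exactly where the driver writes its next command.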

*** This bug has been marked as a duplicate of bug 96526 ***
Comment 6 Harish 2017-09-04 14:34:16 UTC
Thanks for resolving this bug. How do I get this fix on my computer? Is the patch applied to the kernel? If so, please can you tell us which kernel has the correct patch for this? I've looked at all the related/duplicate bug reports and found conflicting information regarding where to find the correct patch. Apologies if this sounds like a noob question.
Comment 7 Harish 2017-09-05 11:17:28 UTC
Just an update. I am using Arch Linux with kernel 4.12.8 now. It appears that enabling early KMS, so that the intel_agp and i915 modules are loaded early, solves the blank-screen problem here.
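
The comment does not say how early KMS was enabled. On Arch Linux this is usually done by listing the modules on the MODULES line of /etc/mkinitcpio.conf and regenerating the initramfs, so the sketch below assumes that setup. The helper names `early_kms_modules` and `has_early_kms` are hypothetical; the script only checks a config file's text and changes nothing.

```python
import re


def early_kms_modules(conf_text):
    """Return the module names on the first uncommented MODULES= line.

    Assumption: the file uses either the MODULES=(a b) or the older
    MODULES="a b" form of mkinitcpio.conf.
    """
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            continue
        match = re.match(r'MODULES=[("]?([^)"]*)[)"]?', line)
        if match:
            return match.group(1).split()
    return []


def has_early_kms(conf_text, required=("intel_agp", "i915")):
    """True if every required module is listed for early loading."""
    modules = early_kms_modules(conf_text)
    return all(name in modules for name in required)


if __name__ == "__main__":
    with open("/etc/mkinitcpio.conf") as conf:
        print(has_early_kms(conf.read()))
```

If the check reports that the modules are missing, they would need to be added to the MODULES line and the initramfs regenerated before the next boot.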

