Bug 101884

Summary: GPU HANG in kscreenlocker_g
Product: DRI
Reporter: solitone <solitone>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments: GPU crash dump (flags: none)

Description solitone 2017-07-23 07:14:45 UTC
Created attachment 132846: GPU crash dump

This happens right after resuming from hibernation: after some minutes the graphical login screen appears and the mouse pointer is visible and moves, but the login screen does not respond to any input. It is still possible to switch to a virtual console and log in in text mode.

Details:

* Laptop: Apple MacBookPro 12,1
* Kernel: 4.9.30
* Distribution: Debian 9 (Stretch)

~$ lspci  -v -s  $(lspci | grep ' VGA ' | cut -d" " -f 1)
00:02.0 VGA compatible controller: Intel Corporation Iris Graphics 6100 (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Apple Inc. Iris Graphics 6100
        Flags: bus master, fast devsel, latency 0, IRQ 53
        Memory at c0000000 (64-bit, non-prefetchable) [size=16M]
        Memory at b0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 3000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915


Kernel logs:

Jul 23 08:00:10 alan kernel: [drm] GPU HANG: ecode 8:0:0x980e800f, in kscreenlocker_g [27962], reason: Hang on render ring, action: reset
Jul 23 08:00:10 alan kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jul 23 08:00:10 alan kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jul 23 08:00:10 alan kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jul 23 08:00:10 alan kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jul 23 08:00:10 alan kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jul 23 08:00:10 alan kernel: drm/i915: Resetting chip after gpu hang
Jul 23 08:00:21 alan kernel: drm/i915: Resetting chip after gpu hang
Jul 23 08:00:32 alan kernel: drm/i915: Resetting chip after gpu hang
Jul 23 08:00:43 alan kernel: drm/i915: Resetting chip after gpu hang
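
The log above says the crash dump is saved to /sys/class/drm/card0/error. A minimal sketch of copying it to a regular file so it can be attached to a report; the function name `save_gpu_dump` and the default output file name are hypothetical, and reading the sysfs file may require root:

```shell
#!/bin/sh
# Copy the i915 error state to a regular file.
# $1: source path (defaults to the path printed in the kernel log)
# $2: destination file (hypothetical default name with a timestamp)
save_gpu_dump() {
    src="${1:-/sys/class/drm/card0/error}"
    dest="${2:-gpu-crash-dump-$(date +%Y%m%d-%H%M%S).txt}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src (try running as root)" >&2
        return 1
    fi
    # Stream the contents with cat and print the destination on success.
    cat "$src" > "$dest" && echo "$dest"
}
```

Calling `save_gpu_dump` with no arguments uses the default path and writes the dump into the current directory.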

Comment 1 Chris Wilson 2017-07-23 11:21:32 UTC
commit bafb2f7d4755bf1571bd5e9a03b97f3fc4fe69ae [v4.10]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Sep 21 14:51:08 2016 +0100

    drm/i915/execlists: Reset RING registers upon resume
    
    There is a disparity in the context image saved to disk and our own
    bookkeeping - that is we presume the RING_HEAD and RING_TAIL match our
    stored ce->ring->tail value. However, as we emit WA_TAIL_DWORDS into the
    ring but may not tell the GPU about them, the GPU may be lagging behind
    our bookkeeping. Upon hibernation we do not save stolen pages, presuming
    that their contents are volatile. This means that although we start
    writing into the ring at tail, the GPU starts executing from its HEAD
    and there may be some garbage in between and so the GPU promptly hangs
    upon resume.
    
    Testcase: igt/gem_exec_suspend/basic-S4
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96526
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Link: http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-3-chris@chris-wilson.co.uk

*** This bug has been marked as a duplicate of bug 96526 ***

Comment 2 solitone 2017-07-25 06:06:46 UTC
Thanks, Chris. May I ask which kernel version includes that patch?
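
The commit quoted in comment 1 carries the annotation `[v4.10]`. One way to check where a commit first landed, assuming a local clone of the mainline Linux tree with its release tags fetched, is `git describe --contains`. A minimal sketch; the function name `first_tag_containing` and the clone path are hypothetical:

```shell
#!/bin/sh
# Print the nearest tag that contains a given commit.
# $1: commit id
# $2: path to a git clone (assumption: a mainline Linux tree; defaults to ".")
first_tag_containing() {
    git -C "${2:-.}" describe --contains "$1"
}
```

For example, `first_tag_containing bafb2f7d4755bf1571bd5e9a03b97f3fc4fe69ae ~/src/linux` prints a tag name followed by a relative offset such as `~N`; the tag part names the earliest tagged release that includes the commit.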
