Bug 99993

Summary: [hsw] GPU Hang on freeze and restore -- context restore
Product: DRI
Reporter: scompo
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: medium
CC: intel-gfx-bugs, jens-bugs.freedesktop.org
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform: HSW
i915 features: GPU hang, power/suspend-resume
Attachments:
/sys/class/drm/card0/error dump (flags: none)

Description scompo 2017-02-27 22:19:00 UTC
Created attachment 129970 [details]
/sys/class/drm/card0/error dump

Bug description: 

GPU Hang on freeze and restore.

echo 'freeze' > /sys/power/state

gives me this error in dmesg:

[  621.253580] [drm] GPU HANG: ecode 7:0:0x87c3ffff, reason: Hang on render ring, action: reset
[  621.253583] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  621.253585] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  621.253586] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  621.253587] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  621.253589] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  621.253666] drm/i915: Resetting chip after gpu hang

I've attached the crash dump to this bug as requested.

System environment:

This is my notebook's specifications page:

http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c04394980

-- kernel: 4.9.11-200.fc25.x86_64
-- Linux distribution: Fedora 25 (also reproduced on the latest Ubuntu release)

Steps to reproduce:

1- echo 'freeze' > /sys/power/state
2- wake the system up
3- the system freezes for a while, and dmesg shows the error above (see the sketch below)
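
If it helps to script the reproduction, below is a minimal C sketch of the same steps (run as root). It writes "freeze" to /sys/power/state, just like the echo above, and after resume copies /sys/class/drm/card0/error to a local file so the dump can be attached. The output name "gpu-error.dump" is only a placeholder.

    /* Reproduction helper sketch: enter suspend-to-idle ("freeze") and,
     * after resume, keep a copy of the GPU crash dump for attaching.
     * Run as root; "gpu-error.dump" is just a placeholder output name. */
    #include <stdio.h>
    #include <stdlib.h>

    static void copy_file(const char *src, const char *dst)
    {
        FILE *in = fopen(src, "r");
        FILE *out = fopen(dst, "w");
        int c;

        if (!in || !out) {
            perror("fopen");
            exit(EXIT_FAILURE);
        }
        while ((c = fgetc(in)) != EOF)
            fputc(c, out);
        fclose(in);
        fclose(out);
    }

    int main(void)
    {
        FILE *state = fopen("/sys/power/state", "w");

        if (!state) {
            perror("/sys/power/state");
            return EXIT_FAILURE;
        }

        /* Step 1: request freeze; the write is flushed by fclose() and
         * completes only once the system has been woken up again (step 2). */
        fputs("freeze\n", state);
        fclose(state);

        /* Step 3: if a hang was recorded, the crash dump is exposed in
         * /sys/class/drm/card0/error; save a copy for the bug report. */
        copy_file("/sys/class/drm/card0/error", "gpu-error.dump");
        return EXIT_SUCCESS;
    }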

Additional info:

The system also hangs on restore from suspend-to-ram and suspend-to-disk.
Comment 1 scompo 2017-03-01 17:58:11 UTC
In case it helps, the problem was also present on the following kernels:

- 4.9.8-201.fc25.x86_64
- 4.9.9-200.fc25.x86_64
Comment 2 Chris Wilson 2017-03-22 20:51:36 UTC
*** Bug 100280 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2017-03-23 10:00:40 UTC
commit 5d4bac5503fcc67dd7999571e243cee49371aef7
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Mar 22 20:59:30 2017 +0000

    drm/i915: Restore marking context objects as dirty on pinning
    
    Commit e8a9c58fcd9a ("drm/i915: Unify active context tracking between
    legacy/execlists/guc") converted the legacy intel_ringbuffer submission
    to the same context pinning mechanism as execlists - that is to pin the
    context until the subsequent request is retired. Previously it used the
    vma retirement of the context object to keep itself pinned until the
    next request (after i915_vma_move_to_active()). In the conversion, I
    missed that the vma retirement was also responsible for marking the
    object as dirty. Mark the context object as dirty when pinning
    (equivalent to execlists) which ensures that if the context is swapped
    out due to mempressure or suspend/hibernation, when it is loaded back in
    it does so with the previous state (and not all zero).
    
    Fixes: e8a9c58fcd9a ("drm/i915: Unify active context tracking between legacy/execlists/guc")
    Reported-by: Dennis Gilmore <dennis@ausil.us>
    Reported-by: Mathieu Marquer <mathieu.marquer@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99993
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100181
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.11-rc1
    Link: http://patchwork.freedesktop.org/patch/msgid/20170322205930.12762-1-chris@chris-wilson.co.uk
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
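
For readers not familiar with the driver, here is a small standalone C model of the dirty tracking the commit describes. All names (ctx_object, pin_context, evict_object, restore_object) are invented for illustration and are not the i915 code; the point is only that an object which is never marked dirty is not written back on eviction, so after mempressure or suspend/hibernation it comes back with stale (all-zero) contents instead of its previous state.

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    struct ctx_object {
        char live[16];    /* contents visible to the "hardware" while resident */
        char backing[16]; /* contents preserved across eviction (swap/shmem) */
        bool dirty;       /* must be set whenever 'live' may have been modified */
    };

    /* Pinning makes the object usable by the hardware, which may then write
     * to it at any time -- so this is where it has to be marked dirty. */
    static void pin_context(struct ctx_object *obj, bool mark_dirty)
    {
        if (mark_dirty)
            obj->dirty = true; /* the equivalent of the one-line fix */
    }

    /* Eviction (mempressure, suspend/hibernation): only dirty objects are
     * written back to backing storage before the live copy is dropped. */
    static void evict_object(struct ctx_object *obj)
    {
        if (obj->dirty) {
            memcpy(obj->backing, obj->live, sizeof(obj->backing));
            obj->dirty = false;
        }
        memset(obj->live, 0, sizeof(obj->live)); /* live copy is gone */
    }

    /* Restore swaps the backing copy back in. */
    static void restore_object(struct ctx_object *obj)
    {
        memcpy(obj->live, obj->backing, sizeof(obj->live));
    }

    int main(void)
    {
        struct ctx_object obj = { .dirty = false };

        pin_context(&obj, false);              /* buggy path: never marked dirty */
        strcpy(obj.live, "context state");     /* "hardware" updates the context */
        evict_object(&obj);                    /* suspend/hibernate */
        restore_object(&obj);                  /* resume */
        printf("without dirty mark: \"%s\"\n", obj.live); /* "" -- state lost */

        pin_context(&obj, true);               /* fixed path: marked dirty on pin */
        strcpy(obj.live, "context state");
        evict_object(&obj);
        restore_object(&obj);
        printf("with dirty mark:    \"%s\"\n", obj.live); /* state preserved */

        return 0;
    }

Built with any C compiler, the first line of output shows the context state being lost and the second shows it preserved, mirroring the behaviour before and after the one-line fix.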
