Bug 101780 - [SKL] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [10945], reason: Hang on render ring, action: reset
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 102433
Depends on:
Blocks:
 
Reported: 2017-07-13 12:46 UTC by Milan Bouchet-Valat
Modified: 2018-04-20 14:18 UTC
CC List: 2 users

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (31.83 KB, text/plain)
2017-07-13 12:46 UTC, Milan Bouchet-Valat
dmesg (87.36 KB, text/plain)
2017-07-13 12:47 UTC, Milan Bouchet-Valat

Description Milan Bouchet-Valat 2017-07-13 12:46:47 UTC
Created attachment 132661 [details]
/sys/class/drm/card0/error

I get this crash from time to time, without any clear way of reproducing it. I've seen it twice in a row today. I think it only happens when connecting an external monitor via a docking station (and turning off the built-in display).

My GPU is an Intel® HD Graphics 520 (Skylake GT2):
VGA compatible controller [0300]: Intel Corporation HD Graphics 520 [8086:1916] (rev 07)

This is on Fedora 25 with kernel 4.11.6-201 and xorg-x11-drv-intel 2.99.917-26.20160929.


juil. 13 14:33:47 mob01772 /usr/libexec/gdm-x-session[10943]: intel_do_flush_locked failed: Input/output error
[gdm-x-session modeset information]
juil. 13 14:33:47 mob01772 kernel: [drm] GuC firmware load skipped
juil. 13 14:33:47 mob01772 kernel: [drm] RC6 on
juil. 13 14:33:47 mob01772 kernel: drm/i915: Resetting chip after gpu hang
juil. 13 14:33:39 mob01772 kernel: [drm] GuC firmware load skipped
juil. 13 14:33:39 mob01772 kernel: [drm] RC6 on
juil. 13 14:33:39 mob01772 kernel: drm/i915: Resetting chip after gpu hang
juil. 13 14:33:31 mob01772 kernel: [drm] GuC firmware load skipped
juil. 13 14:33:31 mob01772 kernel: [drm] RC6 on
juil. 13 14:33:31 mob01772 kernel: drm/i915: Resetting chip after gpu hang
[gdm-x-session modeset information]
juil. 13 14:33:19 mob01772 kernel: [drm] GuC firmware load skipped
juil. 13 14:33:19 mob01772 kernel: [drm] RC6 on
juil. 13 14:33:19 mob01772 kernel: drm/i915: Resetting chip after gpu hang
juil. 13 14:33:11 mob01772 kernel: [drm] GuC firmware load skipped
juil. 13 14:33:11 mob01772 kernel: [drm] RC6 on
juil. 13 14:33:11 mob01772 kernel: drm/i915: Resetting chip after gpu hang
juil. 13 14:33:11 mob01772 kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
juil. 13 14:33:11 mob01772 kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
juil. 13 14:33:11 mob01772 kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
juil. 13 14:33:11 mob01772 kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
juil. 13 14:33:11 mob01772 kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
juil. 13 14:33:11 mob01772 kernel: [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [10945], reason: Hang on render ring, action: reset
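For triage, the fields of the "GPU HANG" line above can be pulled apart with a short script. This is only a sketch: the group names (gen, engine, ecode, comm, pid) are labels chosen here for illustration, and reading the first two ecode numbers as GPU generation and engine index is an assumption, not something stated in this report.

```python
import re

# Pattern for the i915 "GPU HANG" summary line as it appears in the log above.
# The group names are illustrative labels, not official field names.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.*?), action: (?P<action>\w+)"
)


def parse_hang_line(line):
    """Return a dict with the fields of a GPU HANG line, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; keep ecode as the hex string from the log.
    for key in ("gen", "engine", "pid"):
        fields[key] = int(fields[key])
    return fields


line = (
    "juil. 13 14:33:11 mob01772 kernel: [drm] GPU HANG: ecode 9:0:0x85dffffb, "
    "in Xorg [10945], reason: Hang on render ring, action: reset"
)
info = parse_hang_line(line)
print(info)
```

The same function can be run over a whole journal dump to list every hang with the process that triggered it.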
Comment 1 Milan Bouchet-Valat 2017-07-13 12:47:14 UTC
Created attachment 132662 [details]
dmesg
Comment 2 Elizabeth 2017-07-13 20:38:52 UTC
(In reply to Milan Bouchet-Valat from comment #0)
> Created attachment 132661 [details]
> /sys/class/drm/card0/error
> 
> I get this crash from time to time, without any clear way of reproducing it.
> I've seen it twice in a row today. I think it only happens when connecting
> an external monitor via a docking station (and turning off the built-in
> display).
Hello Milan, 
Since you mention the docking station: support for some MST device features was added in kernel 4.12. Could you please upgrade from 4.11 to 4.12 and try to reproduce the issue again:
https://www.kernel.org/
If the issue persists, please attach a dmesg captured with the parameter drm.debug=0xe set on the kernel command line in GRUB.
> juil. 13 14:33:47 mob01772 /usr/libexec/gdm-x-session[10943]:
> intel_do_flush_locked failed: Input/output error
> [gdm-x-session modeset information]
> juil. 13 14:33:47 mob01772 kernel: [drm] GuC firmware load skipped
> juil. 13 14:33:47 mob01772 kernel: [drm] RC6 on
> juil. 13 14:33:47 mob01772 kernel: drm/i915: Resetting chip after gpu hang
You could also try disabling RC6 and see whether it hangs again.
Once you provide more information, please change the tag from "NEEDINFO" to "REOPEN". Thank you.
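To make the drm.debug=0xe request above more concrete, here is a minimal sketch that decodes the bitmask and checks whether the parameter is present on a kernel command line. The bit-to-category table is written from memory of the kernel's DRM debug categories and should be treated as an assumption to verify against the running kernel; the function names are hypothetical.

```python
# Debug category bits of the drm.debug parameter.
# Assumption: values listed from memory of the kernel DRM headers; verify locally.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled in mask, lowest bit first."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


def drm_debug_from_cmdline(cmdline):
    """Return the drm.debug value found in a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("drm.debug="):
            # Base 0 lets int() accept both "0xe" and "14".
            return int(token.split("=", 1)[1], 0)
    return None


# Example command line (hypothetical apart from the drm.debug parameter).
example = "BOOT_IMAGE=/vmlinuz ro quiet drm.debug=0xe"
mask = drm_debug_from_cmdline(example)
print(hex(mask), decode_drm_debug(mask))
```

On a running system, passing the contents of /proc/cmdline to drm_debug_from_cmdline shows whether the parameter actually took effect after the reboot.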
Comment 3 Elizabeth 2017-09-27 16:16:13 UTC
Hello Milan, any update on this? Thank you.
Comment 4 Milan Bouchet-Valat 2017-09-27 18:06:53 UTC
Unfortunately I only see it quite rarely. I'll report if I see it again on 4.12.
Comment 5 Elizabeth 2017-11-13 18:20:03 UTC
*** Bug 102433 has been marked as a duplicate of this bug. ***
Comment 6 Elizabeth 2017-11-13 18:21:54 UTC
From bug 102433:

(In reply to wettererscheinung from comment #0)
> Created attachment 133815 [details]
> Output xrandr --verbose
> 
> Dear Developers,
> 
> Steps to reproduce:
> * Work for about one week without shutting off the notebook or logging out
> the user, in at least two different KDE activities at the same time (while
> three activities are running). Software usually running permanently (among
> others): Thunderbird, several instances of Okular, LibreOffice, Dolphin, Firefox
> * Work in LibreOffice with a long document
> * Overnight, close the notebook without shutting off or logging out (sleep
> mode)
> 
> result:
> * Suddenly (not after a fixed amount of time, but always after about one
> week) there is no reaction to keyboard input of any kind
> ** One time, after a delay of several seconds, the last entered words were
> auto-corrected very slowly, but no new keyboard input was possible.
> * The mouse pointer moves, but no window or button reacts
> * Then X shuts down and logs me out.
> * Previously running software is not shut down cleanly.
> * I have to log in again and hope that the last (auto)save was not too long
> ago.
> 
> How often does this occur?
> * I can't tell for sure, as I do not always run my notebook for more than
> one week without shutting off or logging out.
> * It seems to happen quite regularly, though (I can't give a fixed amount
> of time after (re-)starting the notebook; it's usually about one week).
> ...
> What have I tried?
> * I tried to purge xserver-xorg-video-intel. But it occurred again.
(In reply to Chris Wilson from comment #1)
> (In reply to wettererscheinung from comment #0)
> > What have I tried?
> > * I tried to purge xserver-xorg-video-intel. But it occurred again.
> 
> It's a gpu hang from using -modesetting...
Comment 7 Jani Saarinen 2018-03-29 07:11:03 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment on the bug so that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), could you do that as well to see whether the issue is still present there? If you cannot reproduce the issue there, please comment on the bug.
Comment 8 Jani Saarinen 2018-04-20 14:18:50 UTC
Closing; please re-open if the issue still occurs.

