Bug 99286

Summary: [SNB] GPU HANG: ecode 6:0:0x2a8d8d94, in kscreenlocker_g [20793], reason: Hang on render ring, action: reset
Product: DRI
Reporter: Alexandre <alexandre.nunes>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard:
i915 platform: SNB
i915 features: GPU hang
Attachments:
Description: dump from /sys/class/drm/card0/error; Flags: none
Description: dmesg; Flags: none
Description: Xorg log; Flags: none

Description Alexandre 2017-01-05 17:19:25 UTC
Created attachment 128776
dump from /sys/class/drm/card0/error

Attached dmesg, Xorg.log & /sys/class/drm/card0/error
Comment 1 Alexandre 2017-01-05 17:25:11 UTC
Created attachment 128777
dmesg
Comment 2 Alexandre 2017-01-05 17:25:57 UTC
Created attachment 128778
Xorg log
Comment 3 Chris Wilson 2017-01-05 21:46:04 UTC
This is strange. According to the error state, we retired requests before their seqnos were complete.
Comment 4 Alexandre 2017-01-06 19:03:41 UTC
I forgot to mention the bug manifests very frequently after suspend/resume. I do resume like 5 ~ 6 times a day and the frequency it happens is like once every two or three days. With 4.8.14 kernel & xf86-video-intel git @169c74fa6c2cd9c28dd7bfacd9639cd245b8c8a8 the behavior is that if I kill the offending kde screenlock program, the system comes back to work (although a bit slow and I get frequent over temperature notifications in spite of very low cpu usage, the cpu fan is also on in high-speed mode, I suspect gpu could be wasting power instead of cpu?).

On older kernels I had to either reboot or ctrl+alt+backspace to kill X to restore functionality - the low performance & over temp was present too.
Comment 5 Elizabeth 2017-06-29 17:30:23 UTC
(In reply to Alexandre from comment #4)
> I forgot to mention the bug manifests very frequently after suspend/resume.
> I do resume like 5 ~ 6 times a day and the frequency it happens is like once
> every two or three days. With 4.8.14 kernel & xf86-video-intel git
> @169c74fa6c2cd9c28dd7bfacd9639cd245b8c8a8 the behavior is that if I kill the
> offending kde screenlock program, the system comes back to work (although a
> bit slow and I get frequent over temperature notifications in spite of very
> low cpu usage, the cpu fan is also on in high-speed mode, I suspect gpu could
> be wasting power instead of cpu?).
> 
> On older kernels I had to either reboot or ctrl+alt+backspace to kill X to
> restore functionality - the low performance & over temp was present too.

Hello Alexandre,
Sorry for the delay. Are you still able to reproduce the problem on the latest kernel? Have you made any hardware or software configuration changes? Thank you.
Comment 6 Alexandre 2017-06-30 13:57:32 UTC
I had to disable DRI 3 and stick with DRI 2. This made the hang go away.

BTW, I installed Ubuntu 17.04 vanilla and it also has the same problem with every kernel I tested (from 4.9 to 4.11, both vanilla and Ubuntu), with both xserver-xorg-video-intel and the built-in modeset driver. I settled on xserver-xorg-video-intel with SNA and TearFree on and DRI 2, and didn't try modeset since then because "it works for me as is".

It's actually worse on Ubuntu because I can reproduce it on almost every resume; all I have to do is use DRI 3 and suspend. The system comes back from resume, but when I enter my password to log back in, it freezes. I can't log back in to see if there's a crash log or something.
Comment 7 Alexandre 2017-07-01 18:54:44 UTC
Well, spoke too soon. The same problem happened once. It seems to be way more likely w/ DRI 3, but still happens with DRI 2.
Comment 8 Elizabeth 2017-07-03 20:21:09 UTC
(In reply to Alexandre from comment #7)
> Well, spoke too soon. The same problem happened once. It seems to be way
> more likely w/ DRI 3, but still happens with DRI 2.

Thank you for your answer.
If possible, could you try to reproduce with kernel 4.12 (https://www.kernel.org/)?
Also, please attach new logs with your latest SW configuration, and a dmesg captured with the kernel parameter "drm.debug=0xe" set in GRUB. Thank you.
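
Before collecting the dmesg, it can help to confirm that the parameter actually reached the running kernel. The following is only a minimal sketch, assuming a Linux system where the boot command line is readable from /proc/cmdline; the function names are hypothetical and not part of any i915 or DRM tooling.

```python
import re
from pathlib import Path


def drm_debug_mask(cmdline):
    """Return the drm.debug value from a kernel command line as an int, or None if absent."""
    match = re.search(r"(?:^|\s)drm\.debug=(\S+)", cmdline)
    if match is None:
        return None
    # Base 0 lets int() accept both the hexadecimal form "0xe" and the decimal form "14".
    return int(match.group(1), 0)


def report(cmdline_path="/proc/cmdline"):
    """Print whether drm.debug is set on the command line of the running kernel."""
    # Assumption: cmdline_path points at the kernel's boot command line.
    mask = drm_debug_mask(Path(cmdline_path).read_text())
    if mask is None:
        print("drm.debug is not set on the kernel command line")
    else:
        print(f"drm.debug is set to {mask:#x}")
```

Calling report() after rebooting with the new GRUB entry shows whether the setting took effect; if it did not, the dmesg will lack the extra DRM debug messages requested here.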
Comment 9 Alexandre 2017-07-06 16:35:57 UTC
I've been using 4.12 for a few days. I haven't suspended many times yet, but since so far there's been no hang on resume, it's a clear improvement even if it eventually fails. The last time I had the system this stable was on 4.9/Debian.

I'll collect the info you requested in a few days or after a freeze - whichever happens first.
Comment 10 Ricardo 2017-07-11 13:47:35 UTC
Alexandre, several days have passed, so I'm going to close this bug as Fixed. If the hang reproduces, please open a new bug with logs included.
