Bug 92934 - [SKL-U] GPU HANG: ecode 9:0:0x85dfbfff, in chrome [1654]. PIPE_CONTROL with Depth Stall Enabled set
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 94808
Depends on:
Blocks:
 
Reported: 2015-11-13 10:38 UTC by Timur Alperovich
Modified: 2017-07-24 22:44 UTC
CC List: 3 users

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
dmesg output (78.63 KB, text/plain)
2015-11-13 10:38 UTC, Timur Alperovich
crash dump (74.04 KB, text/plain)
2015-11-13 10:39 UTC, Timur Alperovich
gpu-hang.2.gz (79.12 KB, application/gzip)
2015-11-17 22:35 UTC, Timur Alperovich
dmesg-output-hang.2.gz (16.38 KB, application/gzip)
2015-11-17 22:35 UTC, Timur Alperovich
crash.dump.3.gz (78.67 KB, application/gzip)
2015-11-20 01:24 UTC, Timur Alperovich
kern.log.3.gz (1.76 MB, application/gzip)
2015-11-20 01:24 UTC, Timur Alperovich

Description Timur Alperovich 2015-11-13 10:38:38 UTC
Created attachment 119623 [details]
dmesg output

Noticed a GPU hang when using the drm-nightly kernel build. The other components (X, etc.) are from the Debian testing distribution.

distribution: Debian (testing)
uname -m: x86_64
uname -r: drm-intel-nightly build; the last commit is: 4c2531304c0a
model: Dell XPS 9350 (i7-6500U)
no external display connected

Attached dmesg debug output and the crash dump.
Comment 1 Timur Alperovich 2015-11-13 10:39:13 UTC
Created attachment 119625 [details]
crash dump
Comment 2 Mika Kuoppala 2015-11-16 10:46:33 UTC
If this is easy to reproduce, could you check whether it also happens right after boot, without a suspend/resume cycle?
Comment 3 Timur Alperovich 2015-11-17 21:27:08 UTC
I've been trying to reproduce this issue, but so far have not been able to. I am now running a kernel with the patches suggested in https://bugs.freedesktop.org/show_bug.cgi?id=92935

I did experience one hang (sysrq interrupts did not function and I could not interact with the machine aside from shutting it down), but unfortunately nothing relevant seemed to be logged (I wasn't running with the debug output on).

I'll update the bug if I figure out a way to reproduce this.
Comment 4 Timur Alperovich 2015-11-17 22:34:29 UTC
Shortly after the last comment, I encountered the same issue. Unfortunately, I was not running with debug output enabled and still don't have a way to reproduce the problem consistently, but experienced the same hang. I attached the dump and the dmesg output. This time there was no suspend/resume cycle involved. If the debug output is valuable, I can try running with it turned on and hope to bump into the same crash eventually.
Comment 5 Timur Alperovich 2015-11-17 22:35:13 UTC
Created attachment 119750 [details]
gpu-hang.2.gz
Comment 6 Timur Alperovich 2015-11-17 22:35:37 UTC
Created attachment 119751 [details]
dmesg-output-hang.2.gz
Comment 7 Timur Alperovich 2015-11-17 22:37:54 UTC
Also, I'm not sure if this is helpful, but after the hang I was still able to use the X terminal. However, the terminals on all of the other ttys appeared hung and chrome was unusable (impossible to scroll or open new windows). I ended up rebooting at that point.
Comment 8 Timur Alperovich 2015-11-20 01:24:09 UTC
Created attachment 119960 [details]
crash.dump.3.gz
Comment 9 Timur Alperovich 2015-11-20 01:24:45 UTC
Created attachment 119961 [details]
kern.log.3.gz
Comment 10 Timur Alperovich 2015-11-20 01:25:45 UTC
Attached debugging output and another crash dump. Let me know if you need any more information. There were no suspend events in this case.
Comment 11 yann 2016-05-18 15:20:39 UTC
*** Bug 94808 has been marked as a duplicate of this bug. ***
Comment 12 yann 2016-09-20 15:36:39 UTC
(In reply to Timur Alperovich from comment #10)
> Attached debugging output and another crash dump. Let me know if you need
> any more information. There were no suspend events in this case.

We seem to have neglected the bug a bit, apologies.
Workarounds for SKL and other improvements have since been pushed to the kernel and Mesa; they should benefit your system and fix this issue, so please re-test with the latest kernel and Mesa to see whether the issue still occurs.
Comment 13 Gary Wang 2016-09-22 02:37:13 UTC
Hi yann,
May I know the details of any workarounds in Mesa or the i915 driver for SKL, for our backport to an earlier system? Thanks!
Comment 14 Timur Alperovich 2016-09-22 20:33:17 UTC
With the kernel built from the drm-intel repository, nightly branch (commit: 463d07a32d87742a73e1ed352a6d6daa3f29d0c2), I have not observed this issue. At least in my case it appears resolved.
Comment 15 yann 2016-09-30 16:48:46 UTC
(In reply to Gary Wang from comment #13)
> Hi yann,
> May I know the details of any workarounds in Mesa or the i915 driver for SKL,
> for our backport to an earlier system? Thanks!

Gary, unfortunately I don't have an exhaustive list of these commits, but our developers generally state the purpose of each patch or patchset clearly in its title and summary. I suggest searching the commit logs directly for keywords such as skl, skylake, and workaround (this list is not exhaustive).

- For the i915 driver, a good starting point is:
https://cgit.freedesktop.org/drm-intel/log/drivers/gpu/drm/i915?qt=grep&q=workaround
https://cgit.freedesktop.org/drm-intel/log/drivers/gpu/drm/i915?qt=grep&q=skylake

- For Mesa, a good starting point is:
https://cgit.freedesktop.org/mesa/mesa/log/?qt=grep&q=workaround
https://cgit.freedesktop.org/mesa/mesa/log/?qt=grep&q=skylake
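
The same keyword search can also be done offline. The following is only a minimal sketch, assuming you already have a list of commit subject lines (for example, from `git log --format=%s` in a local clone). The function name `matching_subjects` and the sample subjects are hypothetical and are not real commits.

```python
import re

# Keywords suggested in the comment above; the list is not exhaustive.
KEYWORDS = ("skl", "skylake", "workaround")


def matching_subjects(subjects, keywords=KEYWORDS):
    """Return the subjects containing a word that starts with any keyword.

    Matching is case-insensitive. Only a leading word boundary is required,
    so "workarounds" matches but "tasklet" (which contains "skl") does not.
    """
    alternatives = "|".join(re.escape(k) for k in keywords)
    pattern = re.compile(r"\b(?:" + alternatives + ")", re.IGNORECASE)
    return [s for s in subjects if pattern.search(s)]


# Hypothetical subject lines, for illustration only.
example = [
    "drm/i915/skl: add example workaround",
    "drm/i915: unrelated tasklet cleanup",
    "i965: example Skylake fix",
]

for subject in matching_subjects(example):
    print(subject)
```

A hit only means the subject mentions one of the keywords; each matching commit still has to be read to judge whether it is relevant to a backport.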
Comment 16 yann 2016-11-14 15:09:14 UTC
(In reply to Timur Alperovich from comment #14)
> With the kernel built from the drm-intel repository, nightly branch (commit:
> 463d07a32d87742a73e1ed352a6d6daa3f29d0c2), I have not observed this issue.
> At least in my case it appears resolved.

closing as fixed

