Bug 108013 - GPU hang with i915
Summary: GPU hang with i915
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard: Triaged
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-21 22:13 UTC by Michael Glasgow
Modified: 2019-09-25 19:14 UTC (History)

See Also:
i915 platform: BDW
i915 features: GPU hang


Attachments
dmesg output (989 bytes, text/plain)
2018-09-21 22:14 UTC, Michael Glasgow
lspci output (1.02 KB, text/plain)
2018-09-21 22:14 UTC, Michael Glasgow
snippet from messages log (1.93 KB, text/plain)
2018-09-21 22:15 UTC, Michael Glasgow
sudo cat /sys/class/drm/card0/error (370.39 KB, application/gzip)
2018-09-21 22:16 UTC, Michael Glasgow
complete list of installed RPMs (73.17 KB, text/plain)
2018-09-21 22:20 UTC, Michael Glasgow
boot dmesg (926.81 KB, text/plain)
2018-10-03 17:07 UTC, Derek Buckley
xorg log (246.60 KB, text/x-log)
2018-10-05 21:33 UTC, Michael Glasgow

Description Michael Glasgow 2018-09-21 22:13:00 UTC
Just following instructions from dmesg output to file a bug here.  Let me know if I can provide any other info.
Comment 1 Michael Glasgow 2018-09-21 22:14:20 UTC
Created attachment 141677 [details]
dmesg output
Comment 2 Michael Glasgow 2018-09-21 22:14:45 UTC
Created attachment 141678 [details]
lspci output
Comment 3 Michael Glasgow 2018-09-21 22:15:10 UTC
Created attachment 141679 [details]
snippet from messages log
Comment 4 Michael Glasgow 2018-09-21 22:16:31 UTC
Created attachment 141680 [details]
sudo cat /sys/class/drm/card0/error
Comment 5 Michael Glasgow 2018-09-21 22:20:29 UTC
Created attachment 141681 [details]
complete list of installed RPMs

Distro is Oracle Linux 7 with kernel-uek 4.1.12-124.19.4.el7uek.x86_64
Comment 6 Lakshmi 2018-09-24 07:07:10 UTC
Michael, can you try to verify this with the latest drm-tip (https://cgit.freedesktop.org/drm-tip) and the kernel parameters drm.debug=0x1e log_buf_len=4M?
If the problem persists, please attach the full dmesg from boot.

How often do you see this issue?

How much impact does this issue have on you? Is there any particular pattern that triggers the hang? This is important for us to prioritize bugs.
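
For anyone collecting the requested log, here is a minimal sketch of how to confirm that the running kernel was actually booted with both parameters before capturing dmesg. It assumes a Linux system where the boot command line is readable from /proc/cmdline; the helper names missing_params and read_cmdline are hypothetical and not part of any i915 or drm-tip tooling.

```python
from pathlib import Path

# The two parameters requested in this bug report.
REQUESTED = ("drm.debug=0x1e", "log_buf_len=4M")


def missing_params(cmdline, requested=REQUESTED):
    """Return the requested parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [param for param in requested if param not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    return Path(path).read_text()
```

Calling missing_params(read_cmdline()) after a reboot returns an empty list when both parameters took effect. The check is an exact string match, so a differently spelled but equivalent value would be reported as missing.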
Comment 7 Derek Buckley 2018-09-25 17:26:31 UTC
I also have this issue with my i915. I can't use my ThinkPad P50 with the dock: it hangs shortly after I connect external monitors.
Comment 8 Lakshmi 2018-09-28 06:46:06 UTC
(In reply to Derek Buckley from comment #7)
> I also have this issue with my i915. I can't use my Thinkpad P50 with the
> dock or it hangs shortly after connecting to external monitors.

Derek, have you tried to verify the issue with the latest drm-tip?
Comment 9 Derek Buckley 2018-09-28 18:47:19 UTC
(In reply to Lakshmi from comment #8)
> (In reply to Derek Buckley from comment #7)
> > I also have this issue with my i915. I can't use my Thinkpad P50 with the
> > dock or it hangs shortly after connecting to external monitors.
> 
> Derek, Have you tried to verify the issue with latest drm-tip?

With the newest drm-tip, it still hangs and crashes.
Comment 10 Lakshmi 2018-09-30 19:21:42 UTC
> With the newest drm-tip it still hangs and crashes

Can you send the dmesg from boot with the kernel parameters drm.debug=0x1e log_buf_len=4M?
Comment 11 Derek Buckley 2018-10-03 17:07:31 UTC
Created attachment 141862 [details]
boot dmesg

The dmesg from boot with the requested kernel parameters.
Comment 12 Lakshmi 2018-10-04 08:02:11 UTC
I assume this is a Mesa bug, so I am changing the product and assignee.
Comment 13 Lionel Landwerlin 2018-10-04 09:23:52 UTC
What's the X driver that you're using?
Comment 14 Lionel Landwerlin 2018-10-04 09:26:33 UTC
(In reply to Lionel Landwerlin from comment #13)
> What's the X driver that you're using?

The installed packages seem to imply there is no modesetting X driver, so this is likely the intel ddx (not using mesa).
Comment 15 Derek Buckley 2018-10-04 12:44:24 UTC
(In reply to Lionel Landwerlin from comment #13)
> What's the X driver that you're using?

I use Fedora, which defaults to Wayland, so I haven't been using X.
Comment 16 Mark Janes 2018-10-04 17:44:52 UTC
To properly investigate, we need:

 - verification that it reproduces with up-to-date mesa and kernel.  Try to
   reproduce with debian testing, arch, fedora, ubuntu, etc.

 - reproduction steps.

If you can't give us reproducible steps on a modern distribution, then you should take the issue up with Oracle's support engineers.
Comment 17 Derek Buckley 2018-10-04 20:07:02 UTC
(In reply to Mark Janes from comment #16)
> To properly investigate, we need:
> 
>  - verification that it reproduces with up-to-date mesa and kernel.  Try to
>    reproduce with debian testing, arch, fedora, ubuntu, etc.
> 
>  - reproduction steps.
> 
> If you can't give us reproducible steps on a modern distribution, then you
> should take the issue up with Oracle's support engineers.

I am on the Fedora 29 Beta with kernel 4.18.11 and Mesa 18.2.1. When I dock my ThinkPad P50 (Intel i7-6700HQ), the display crashes with `kernel: [drm] GPU HANG: ecode 9:0:0x87f99ff9, in gnome-shell [2436], reason: hang on rcs0, action: reset`. When I am not docked, I still get multiple short freezes with the error `kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang`.
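
To compare hang reports like the one quoted above across several logs, a small parser can help. This is a rough sketch that only assumes the line layout seen in this report (ecode, process name, pid in brackets, reason, action); other kernel versions may format the message differently, and the function name parse_gpu_hang is hypothetical. It does not interpret the ecode value.

```python
import re

# Pattern modelled on the single "GPU HANG" line quoted in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), "
    r"in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), "
    r"action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG log line as a dict, or None if the line does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])  # pid is the only numeric field
    return fields
```

Feeding it each line of a saved dmesg or journal export and keeping the results that are not None gives a list of hangs that can be grouped by process or reason. Lines such as the "Resetting rcs0 after gpu hang" message do not match and are skipped.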
Comment 18 Chris Wilson 2018-10-05 12:29:05 UTC
(In reply to Lionel Landwerlin from comment #14)
> (In reply to Lionel Landwerlin from comment #13)
> > What's the X driver that you're using?
> 
> The installed packages seem to imply there is no modesetting X driver, so
> this is likely the intel ddx (not using mesa).

The error state showed it hanging inside a mesa batch...
Comment 19 Lionel Landwerlin 2018-10-05 12:51:12 UTC
(In reply to Chris Wilson from comment #18)
> (In reply to Lionel Landwerlin from comment #14)
> > (In reply to Lionel Landwerlin from comment #13)
> > > What's the X driver that you're using?
> > 
> > The installed packages seem to imply there is no modesetting X driver, so
> > this is likely the intel ddx (not using mesa).
> 
> The error state showed it hanging inside a mesa batch...

Am I missing something?
The error state attached here seems to come from kernel 4.1.12-124.19.4.el7uek.x86_64 and to be hanging on an X batch.
Comment 20 Michael Glasgow 2018-10-05 21:24:45 UTC
(In reply to Mark Janes from comment #16)
> If you can't give us reproducible steps on a modern distribution, then you
> should take the issue up with Oracle's support engineers.

It's highly unlikely I'll be able to reproduce it, since it has only happened once and I couldn't tell what triggered it.  Interesting that Derek sees a similar hang though, since I too am on a ThinkPad with a docking station (model T450 in my case).

If anyone is interested, Oracle's kernel-uek source can be easily obtained:

http://yum.oracle.com/repo/OracleLinux/OL7/UEKR4/archive/x86_64/getPackageSource/kernel-uek-4.1.12-124.19.4.el7uek.src.rpm

I don't expect anyone here to support Oracle's distro, though.  I realize this report may be of limited utility for upstream given the age of the bits.  I only filed the bug here on the off chance it might be a useful data point.  If not, feel free to close it.  Hopefully if it's still a problem in later code, someone else will file a similar bug and then maybe some of this data might be useful to someone.

Lakshmi:  Impact is severe.  X session completely crashed, so I lost some work.
Comment 21 Michael Glasgow 2018-10-05 21:33:13 UTC
Created attachment 141916 [details]
xorg log

Xorg log, since Lionel asked about the X driver in use.
Comment 22 GitLab Migration User 2019-09-25 19:14:07 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1760.

