Bug 105195 - [skl] GPU HANG: ecode 9:0:0x85dffffb, in Xorg
Summary: [skl] GPU HANG: ecode 9:0:0x85dffffb, in Xorg
Status: RESOLVED DUPLICATE of bug 104411
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-21 15:35 UTC by rs
Modified: 2018-03-01 20:12 UTC

See Also:
i915 platform:
i915 features:


Attachments
card1 dump (27.23 KB, application/x-bzip)
2018-02-21 15:35 UTC, rs

Description rs 2018-02-21 15:35:35 UTC
Created attachment 137503
card1 dump

Update "i915 platform" information field as well as "i915 feature" if you can.
The following information about your system:

    -- system architecture: ("uname -m")

x86_64

    -- kernel version: ("uname -r").

4.15.3-300.fc27.x86_64


    -- Linux distribution:

Fedora

    -- Machine or mother board model:

Lenovo P50

00:02.0 VGA compatible controller: Intel Corporation HD Graphics P530 (rev 06) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 222e
        Flags: bus master, fast devsel, latency 0, IRQ 134
        Memory at d2000000 (64-bit, non-prefetchable) [size=16M]
        Memory at 90000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 5000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915

01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M2000M] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 222e
        Flags: fast devsel, IRQ 133
        Memory at d3000000 (32-bit, non-prefetchable) [size=16M]
        Memory at c0000000 (64-bit, prefetchable) [size=256M]
        Memory at d0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at 4000 [size=128]
        Expansion ROM at d4000000 [disabled] [size=512K]
        Capabilities: <access denied>
        Kernel driver in use: nouveau
        Kernel modules: nouveau

    -- Display connector: (such as HDMI, DP, eDP, ...)

Internal laptop display
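
For anyone filling in the same template, here is a minimal sketch of collecting the two `uname` fields from Python instead of the shell. The function name `collect_system_info` and the dictionary keys are made up for this example; only the standard `platform` module is used.

```python
import platform


def collect_system_info():
    """Return the uname-style fields the bug template asks for."""
    return {
        # Corresponds to `uname -m`, e.g. "x86_64" on the reporter's machine
        "architecture": platform.machine(),
        # Corresponds to `uname -r`, e.g. "4.15.3-300.fc27.x86_64"
        "kernel": platform.release(),
    }
```

Printing the returned dictionary gives values that can be pasted straight into the template. The distribution, machine model and display connector still have to be filled in by hand.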


/var/log/messages snippet from first crash:

Feb 21 07:45:18 titan plasmashell[27211]: QXcbClipboard::setMimeData: Cannot set X11 selection owner
Feb 21 07:45:18 titan plasmashell[27211]: QXcbClipboard::setMimeData: Cannot set X11 selection owner
Feb 21 07:45:19 titan plasmashell[27211]: QXcbClipboard::setMimeData: Cannot set X11 selection owner
Feb 21 07:45:30 titan kernel: [drm] GPU HANG: ecode 9:0:0x85dffffb, in Xorg [26722], reason: Hang on rcs0, action: reset
Feb 21 07:45:30 titan kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Feb 21 07:45:30 titan kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Feb 21 07:45:30 titan kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Feb 21 07:45:30 titan kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Feb 21 07:45:30 titan kernel: [drm] GPU crash dump saved to /sys/class/drm/card1/error
Feb 21 07:45:30 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 07:45:38 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 07:45:46 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 07:45:54 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 07:46:02 titan kernel: i915 0000:00:02.0: Resetting rcs0 after gpu hang
Feb 21 07:46:03 titan konsole[29061]: The X11 connection broke: I/O error (code 1)
Feb 21 07:46:03 titan at-spi-bus-launcher[27258]: XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Feb 21 07:46:03 titan at-spi-bus-launcher[27258]:      after 6070 requests (6070 known processed) with 0 events remaining.
Feb 21 07:46:03 titan konsole[28698]: The X11 connection broke: I/O error (code 1)
Feb 21 07:46:03 titan konsole[29189]: The X11 connection broke: I/O error (code 1)
Feb 21 07:46:03 titan konsole[27354]: The X11 connection broke: I/O error (code 1)
Feb 21 07:46:03 titan kaccess[27161]: The X11 connection broke (error 1). Did the X11 server die?
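
When comparing hang reports across machines, it can help to pull the fields out of the "GPU HANG" line programmatically. The following is only a sketch modelled on the single kernel line quoted above. The pattern, the function name `parse_gpu_hang` and the group names are assumptions made for this example, not an official i915 log format. The meaning of the first two ecode numbers is also not confirmed here, although the leading 9 is consistent with Skylake being a Gen9 part.

```python
import re

# Pattern derived from the one log line in this report; other kernel
# versions may format the message differently (assumption).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<first>\d+):(?P<second>\d+):(?P<code>0x[0-9a-fA-F]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])  # the PID is the only field converted to int
    return info
```

Matching on the `code` value (here `0x85dffffb`) together with the process name is one way to group reports that look alike, but identical codes alone do not prove two hangs share a root cause.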
Comment 1 Elizabeth 2018-02-22 17:08:17 UTC
Hello, is this issue reproducible? Which Mesa version are you using? If it is the distribution default, could you try Mesa 18.0.0.rc4? Thank you.
Comment 2 rs 2018-02-23 15:46:35 UTC
Yes, it is 100% reproducible for me. See details in my Red Hat bug: https://bugzilla.redhat.com/show_bug.cgi?id=1547612

A few others have reported seeing it too, and one report says that "Updating to mesa 17.3.5-1.fc27 did not fix the problem, but reverting to mesa 17.2.4-3.fc27 did".

Do you have a handy link for building Mesa? I can try it in a VM.
Comment 3 rs 2018-02-23 15:58:04 UTC
I just realized that a VM wouldn't be using the i915 driver, so that won't work. Unfortunately this is my work laptop, so I can't test on it. I will ask whether someone on the Fedora bug can try, though.
Comment 5 rs 2018-02-25 16:25:50 UTC
From the Red Hat bug report, another user says:

"I grabbed mesa-18.0.0-0.1.rc4.fc28 from koji and rebuilt it for F27 in mock. It
does appear to have fixed the lockups for me."
Comment 6 Elizabeth 2018-02-26 16:16:59 UTC
I'll leave this here to try to reproduce later when I get a SKL machine:

https://bugzilla.redhat.com/show_bug.cgi?id=1547612#c1
> Robert Story 2018-02-21 11:33:08 EST
> so I spent some more time trying to reproduce this, and came up with the 
> sequence of events that causes the gpu hang every time:
>
> 1) start emacs
> 2) open a file with at least 4 'pages' of data, where a 'page' is the number 
> of lines that your current emacs window displays. For testing I created a 
> file of 500 80 character lines and tested with emacs window sizes of 33 and 96.
> 3) press page down once
> 4) press ctrl-space to start selection
> 5) press page down twice to select 2 'pages'
> 6) press ctrl-w to 'cut' selection
> 7) press up arrow to scroll up one line

Can someone bisect from 17.2.4-3 to 17.3.5-1?
(In reply to rs from comment #2)
> Yes, it is 100% reproducible for me. See details in my Red Hat bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1547612
> 
> A few others have reported seeing it too, and one report that "Updating to
> mesa 17.3.5-1.fc27 did not fix the problem, but reverting to mesa
> 17.2.4-3.fc27 did".
> 
> Do you have a handy link for building mesa? I can try it in a VM..
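
For reference, the bisect requested in this comment is a binary search between a known-good and a known-bad build. The sketch below shows only the search logic. `bisect_first_bad`, `candidates` and `is_bad` are hypothetical names for this example, and in practice each `is_bad` check would mean building that Mesa revision and running the emacs steps quoted above. It assumes every candidate after the first bad one is also bad.

```python
def bisect_first_bad(candidates, is_bad):
    """Return the first bad entry in `candidates`.

    `candidates` is ordered oldest to newest, its first entry is known
    good and its last entry is known bad. `is_bad` is a caller-supplied
    check that returns True when the hang reproduces.
    """
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(candidates[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return candidates[bad]
```

Each round halves the remaining range, so the number of builds to test grows only with the logarithm of the number of candidates. The real tool for this on the Mesa source tree is `git bisect`, which applies the same idea to commits.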
Comment 7 Elizabeth 2018-03-01 16:21:17 UTC
Hi again, could someone test the new 17.3.6 release? The similar bug 105293 was fixed by it. Thank you.
Comment 8 Clark Williams 2018-03-01 19:37:46 UTC
17.3.6-1.fc27 from updates-testing seems to have fixed this for me.

I was seeing X session crashes on F27 + XFCE + emacs, on a Lenovo T460p with i915 graphics. Same GPU hang message.

Will keep testing and report back if I hit it in some corner case. 

Thank you!
Comment 9 Mark Janes 2018-03-01 20:12:53 UTC

*** This bug has been marked as a duplicate of bug 104411 ***

