Bug 97284 - [SKL] GPU HANG: ecode 9:0:0x84dffff8, in X [1079], reason: Engine(s) hung, action: reset
Summary: [SKL] GPU HANG: ecode 9:0:0x84dffff8, in X [1079], reason: Engine(s) hung, ac...
Status: RESOLVED INVALID
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Thomas Schneider
 
Reported: 2016-08-10 20:57 UTC by Thomas Schneider
Modified: 2017-02-10 22:32 UTC

i915 platform: SKL
i915 features: GPU hang


Attachments
crash dump (378.76 KB, text/plain)
2016-08-10 20:57 UTC, Thomas Schneider

Description Thomas Schneider 2016-08-10 20:57:40 UTC
Created attachment 125688
crash dump

i915 crashes and/or hangs:
Aug 10 22:05:29 coruscant kernel: [drm] stuck on render ring
Aug 10 22:05:29 coruscant kernel: [drm] GPU HANG: ecode 9:0:0x84dffff8, in X [1079], reason: Engine(s) hung, action: reset
Aug 10 22:05:29 coruscant kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Aug 10 22:05:29 coruscant kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Aug 10 22:05:29 coruscant kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Aug 10 22:05:29 coruscant kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Aug 10 22:05:29 coruscant kernel: [drm] GPU crash dump saved to /sys/class/drm/card1/error
Aug 10 22:05:29 coruscant kernel: drm/i915: Resetting chip after gpu hang
Aug 10 22:05:31 coruscant kernel: [drm] RC6 on
Aug 10 22:05:39 coruscant kernel: [drm] stuck on render ring
Aug 10 22:05:39 coruscant kernel: [drm] GPU HANG: ecode 9:0:0x84dffff8, in X [1079], reason: Engine(s) hung, action: reset
Aug 10 22:05:39 coruscant kernel: [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
Aug 10 22:05:39 coruscant kernel: drm/i915: Resetting chip after gpu hang
Aug 10 22:05:40 coruscant kernel: [drm] RC6 on

This is Linux 4.7.0-gentoo, though such hangs have been happening with earlier kernels as well. The system noticeably hangs for a few seconds, and after that Chrome and other programs lag considerably on every action.
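For triaging logs like the one above, the hang lines can be picked apart mechanically. The following is a minimal sketch, not part of the original report: the regular expression is written against the exact line format shown in this log, and the field names (`gen`, `ring`, `code`) are assumptions about what the three ecode components mean (9 would match Skylake being a Gen9 part).

```python
import re
from typing import NamedTuple, Optional


class GpuHang(NamedTuple):
    gen: int        # assumed: GPU generation (first ecode component)
    ring: int       # assumed: ring/engine id (second ecode component)
    code: str       # hex error code as printed, e.g. "0x84dffff8"
    comm: str       # name of the process blamed for the hang
    pid: int
    reason: str
    action: str


# Pattern modelled on the "[drm] GPU HANG: ..." lines quoted in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<ring>\d+):(?P<code>0x[0-9a-fA-F]+), "
    r"in (?P<comm>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_hang(line: str) -> Optional[GpuHang]:
    """Return the parsed hang record, or None if the line is not a hang line."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return GpuHang(
        gen=int(m.group("gen")),
        ring=int(m.group("ring")),
        code=m.group("code"),
        comm=m.group("comm"),
        pid=int(m.group("pid")),
        reason=m.group("reason"),
        action=m.group("action"),
    )


if __name__ == "__main__":
    sample = (
        "Aug 10 22:05:29 coruscant kernel: [drm] GPU HANG: ecode 9:0:0x84dffff8, "
        "in X [1079], reason: Engine(s) hung, action: reset"
    )
    print(parse_hang(sample))
```

Feeding it the output of the system journal line by line and grouping on `(comm, code)` would show whether repeated hangs share the same error code, as the two in this log do.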
Comment 1 yann 2016-08-30 12:00:58 UTC
Assigning to Mesa product.

From this error dump, the hang is happening in the render ring batch, with the active head
at 0xfe3de9ec and 0x7b000005 (3DPRIMITIVE) as IPEHR.

Batch extract (around 0xfe3de9ec):

0xfe3de9cc:      0x78090005: 3DSTATE_VERTEX_ELEMENTS
0xfe3de9d0:      0x02000000:    buffer 0: invalid, type 0x0000, src offset 0x0000 bytes
0xfe3de9d4:      0x22220000:    (0.0, 0.0, 0.0, 0.0), dst offset 0x00 bytes
0xfe3de9d8:      0x02f60000:    buffer 0: invalid, type 0x00f6, src offset 0x0000 bytes
0xfe3de9dc:      0x11230000:    (X, Y, 0.0, 1.0), dst offset 0x00 bytes
0xfe3de9e0:      0x02f60004:    buffer 0: invalid, type 0x00f6, src offset 0x0004 bytes
0xfe3de9e4:      0x11230000:    (X, Y, 0.0, 1.0), dst offset 0x00 bytes
Bad length 7 in (null), expected 6-6
0xfe3de9e8:      0x7b000005: 3DPRIMITIVE: fail sequential
0xfe3de9ec:      0x00000000:    vertex count
0xfe3de9f0:      0x00000003:    start vertex
0xfe3de9f4:      0x000000c0:    instance count
0xfe3de9f8:      0x00000001:    start instance
0xfe3de9fc:      0x00000000:    index bias
0xfe3dea00:      0x00000000: MI_NOOP
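
The header dwords in the extract can also be split into fields by hand. The sketch below assumes the usual Intel 3D command header layout (command type in bits 31:29, subtype in 28:27, opcode in 26:24, sub-opcode in 23:16, and a DWord length field in bits 7:0 that is biased by 2); that layout is an assumption brought in for illustration and is not stated in this thread. The names `CommandHeader` and `decode_header` are hypothetical.

```python
from typing import NamedTuple


class CommandHeader(NamedTuple):
    command_type: int   # bits 31:29 (assumed layout)
    subtype: int        # bits 28:27
    opcode: int         # bits 26:24
    sub_opcode: int     # bits 23:16
    total_dwords: int   # length field (bits 7:0) plus the assumed bias of 2


def decode_header(dword: int) -> CommandHeader:
    """Split a 32-bit command header into its assumed bit fields."""
    return CommandHeader(
        command_type=(dword >> 29) & 0x7,
        subtype=(dword >> 27) & 0x3,
        opcode=(dword >> 24) & 0x7,
        sub_opcode=(dword >> 16) & 0xFF,
        total_dwords=(dword & 0xFF) + 2,
    )


if __name__ == "__main__":
    # The two headers that appear in the batch extract.
    for raw in (0x78090005, 0x7B000005):
        print(f"{raw:#010x}", decode_header(raw))
```

Under that assumption both headers declare 7 dwords in total, which matches the "Bad length 7" message printed before the 3DPRIMITIVE line. If the decoder expected only 6, the field labels it printed after the header may be shifted by one dword; that is an interpretation and is not confirmed anywhere in this report.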
Comment 2 yann 2016-11-04 15:00:44 UTC
Please test a newer version of Mesa (12 or 13) and mark the bug as REOPENED
if you can reproduce the hang, or RESOLVED/* if you cannot.
Comment 3 Thomas Schneider 2016-11-05 09:19:09 UTC
Same bug just reappeared.
Mesa:      Installed versions:  13.0.0_rc1^d(13:40:13 2016-10-22)(classic dri3 egl gallium gbm gles2 llvm nptl wayland -bindist -d3d9 -debug -gles1 -opencl -openmax -osmesa -pax_kernel -pic -selinux -vaapi -valgrind -vdpau -xa -xvmc ABI_MIPS="-n32 -n64 -o32" ABI_PPC="-32 -64" ABI_S390="-32 -64" ABI_X86="32 64 -x32" KERNEL="-FreeBSD" VIDEO_CARDS="intel nouveau -freedreno -i915 -i965 -ilo -r100 -r200 -r300 -r600 -radeon -radeonsi -vc4 -vmware")

Nov 05 10:10:29 coruscant kernel: [drm] stuck on render ring
Nov 05 10:10:29 coruscant kernel: [drm] GPU HANG: ecode 9:0:0x84dffff8, in X [1194], reason: Engine(s) hung, action: reset
Nov 05 10:10:29 coruscant kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Nov 05 10:10:29 coruscant kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Nov 05 10:10:29 coruscant kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Nov 05 10:10:29 coruscant kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Nov 05 10:10:29 coruscant kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Nov 05 10:10:29 coruscant kernel: drm/i915: Resetting chip after gpu hang
Nov 05 10:10:29 coruscant kernel: nouveau 0000:02:00.0: DRM: resuming kernel object tree...
Nov 05 10:10:29 coruscant kernel: nouveau 0000:02:00.0: priv: HUB0: 6013d4 00005700 (1f408200)
Nov 05 10:10:29 coruscant kernel: nouveau 0000:02:00.0: priv: HUB0: 10ecc0 ffffffff (1f40822c)
Nov 05 10:10:29 coruscant kernel: nouveau 0000:02:00.0: DRM: resuming client object trees...
Nov 05 10:10:31 coruscant kernel: [drm] RC6 on
Nov 05 10:10:34 coruscant kernel: ACPI Warning: \_SB.PCI0.PEG2.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160422/nsarguments-95)
Nov 05 10:10:34 coruscant kernel: ACPI: \_SB_.PCI0.PEG2.PEGP: failed to evaluate _DSM
Nov 05 10:10:34 coruscant kernel: ACPI Warning: \_SB.PCI0.PEG2.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160422/nsarguments-95)
Nov 05 10:10:34 coruscant kernel: nouveau 0000:02:00.0: DRM: evicting buffers...
Nov 05 10:10:34 coruscant kernel: nouveau 0000:02:00.0: DRM: waiting for kernel channels to go idle...
Nov 05 10:10:34 coruscant kernel: nouveau 0000:02:00.0: DRM: suspending client object trees...
Nov 05 10:10:34 coruscant kernel: nouveau 0000:02:00.0: DRM: suspending kernel object tree...

I can’t reproduce it on demand; it just happens sometimes.
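
Because the hang is intermittent and the two logs name different nodes (card1 in August, card0 in November), it may help to grab every available dump right after a hang rather than hard-coding one path. This is a rough sketch under stated assumptions: it relies only on the /sys/class/drm/cardN/error location printed by the kernel above, the function name `collect_error_dumps` and the output file naming are made up for illustration, and reading these files typically requires root.

```python
from pathlib import Path
from typing import List


def collect_error_dumps(dest: Path,
                        drm_root: Path = Path("/sys/class/drm")) -> List[Path]:
    """Copy every non-empty cardN/error file under drm_root into dest."""
    dest.mkdir(parents=True, exist_ok=True)
    saved: List[Path] = []
    # Connector directories such as card0-eDP-1 also match the glob prefix,
    # but they have no "error" entry, so they are skipped naturally.
    for error_file in sorted(drm_root.glob("card[0-9]*/error")):
        try:
            data = error_file.read_bytes()
        except OSError:
            continue  # unreadable (permissions) or not provided by this driver
        if not data.strip():
            continue  # nothing recorded for this card
        target = dest / f"{error_file.parent.name}-error.txt"
        target.write_bytes(data)
        saved.append(target)
    return saved


if __name__ == "__main__":
    for path in collect_error_dumps(Path("gpu-dumps")):
        print("saved", path)
```

The copies can then be attached to the bug as text files, as the kernel message requests.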
Comment 4 Mark Janes 2016-12-07 18:00:57 UTC
Thomas, please verify that you can reproduce this using the modesetting DDX, since there are known issues with xf86-video-intel. A configuration like this selects it:

/etc/X11/xorg.conf.d/20-modesetting.conf

Section "Device"
    Identifier  "Intel Graphics"
    Driver      "modesetting"
    Option      "AccelMethod"    "glamor"
    Option      "DRI"            "3"
EndSection
Comment 5 Annie 2017-02-10 22:32:45 UTC
Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

