Bug 98999 - [SKL] GPU HANG: ecode 9:0:0x84dffff8, in Xorg [538], reason: Hang on render ring, action: reset
Summary: [SKL] GPU HANG: ecode 9:0:0x84dffff8, in Xorg [538], reason: Hang on render r...
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 100188 100738 100746 100772 100852 100870 100888 101606 101836 103478 103523 104018 108485
Depends on:
Blocks:
 
Reported: 2016-12-05 14:52 UTC by svenne
Modified: 2018-10-22 07:51 UTC
CC List: 15 users

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
crashlog from /sys/class/drm/card0/error (124.67 KB, text/plain)
2016-12-05 14:52 UTC, svenne

Description svenne 2016-12-05 14:52:07 UTC
Created attachment 128347
crashlog from /sys/class/drm/card0/error

Intel Corporation HD Graphics 520
Lenovo X1 yoga

Crash log attached as per dmesg request.
Comment 1 Chris Wilson 2016-12-05 15:06:24 UTC
A pattern is forming. 2 3DSTATE_VERTEX_ELEMENTS with no intervening 3DPRIMITIVE -> hang?
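The suspected pattern can be checked mechanically against a decoded batch. The following is only a minimal sketch, assuming the batch has already been reduced to a list of command names in submission order. `find_unflushed_vertex_elements` is a hypothetical helper written for illustration, not part of any i915 or intel-gpu-tools code.

```python
def find_unflushed_vertex_elements(commands):
    """Return the indices of 3DSTATE_VERTEX_ELEMENTS commands that follow an
    earlier 3DSTATE_VERTEX_ELEMENTS with no 3DPRIMITIVE in between."""
    suspicious = []
    pending = False  # True after a VERTEX_ELEMENTS that no primitive has consumed yet
    for index, name in enumerate(commands):
        if name == "3DSTATE_VERTEX_ELEMENTS":
            if pending:
                suspicious.append(index)
            pending = True
        elif name == "3DPRIMITIVE":
            pending = False
    return suspicious
```

An empty result means every 3DSTATE_VERTEX_ELEMENTS in the list was followed by a 3DPRIMITIVE before the next one appeared. A non-empty result only marks candidates for the pattern; it does not prove that they caused a hang.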
Comment 2 yann 2016-12-06 09:05:11 UTC
The kernel and Mesa receive constant improvements that may benefit your system, so please re-test with the latest kernel and Mesa (12 or 13). Mark the bug as REOPENED if you can reproduce the hang (and attach a fresh GPU error dump and kernel log), or as RESOLVED/* if you cannot.
 
In parallel, assigning to Mesa product.

Kernel: 4.8.11-1-ARCH
Platform: SKL (pci id: 0x1916, pci revision: 0x07, pci subsystem: 17aa:2238)
Mesa: [Please confirm your mesa version]

From this error dump, the hang is happening in the render ring batch with the active head at 0xfe3de74c and 0x7b000005 (3DPRIMITIVE) as IPEHR.

Batch extract (around 0xfe3de74c):

0xfe3de72c:      0x78090005: 3DSTATE_VERTEX_ELEMENTS
0xfe3de730:      0x02000000:    buffer 0: invalid, type 0x0000, src offset 0x0000 bytes
0xfe3de734:      0x22220000:    (0.0, 0.0, 0.0, 0.0), dst offset 0x00 bytes
0xfe3de738:      0x02f60000:    buffer 0: invalid, type 0x00f6, src offset 0x0000 bytes
0xfe3de73c:      0x11230000:    (X, Y, 0.0, 1.0), dst offset 0x00 bytes
0xfe3de740:      0x02f60004:    buffer 0: invalid, type 0x00f6, src offset 0x0004 bytes
0xfe3de744:      0x11230000:    (X, Y, 0.0, 1.0), dst offset 0x00 bytes
Bad length 7 in (null), expected 6-6
0xfe3de748:      0x7b000005: 3DPRIMITIVE: fail sequential
0xfe3de74c:      0x00000000:    vertex count
0xfe3de750:      0x00000003:    start vertex
0xfe3de754:      0x000000aa:    instance count
0xfe3de758:      0x00000001:    start instance
0xfe3de75c:      0x00000000:    index bias
0xfe3de760:      0x00000000: MI_NOOP
0xfe3de764:      0x05000000: MI_BATCH_BUFFER_END
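The command boundaries in the extract above can be cross-checked from the header dwords. This is a rough sketch, assuming that for these 3D pipeline commands the low 8 bits of the header hold the total length in dwords minus 2. The function names are hypothetical, and the rule is not meant to cover MI_* commands.

```python
def command_length_dwords(header):
    """Total command length in dwords, assuming bits 7:0 of the header
    encode (total length - 2) as for the 3D commands in this dump."""
    return (header & 0xFF) + 2


def next_command_address(address, header):
    """Address of the command following the one whose header is at `address`."""
    return address + 4 * command_length_dwords(header)


# 3DSTATE_VERTEX_ELEMENTS header at 0xfe3de72c
ve_next = next_command_address(0xFE3DE72C, 0x78090005)
# 3DPRIMITIVE header at 0xfe3de748
prim_next = next_command_address(0xFE3DE748, 0x7B000005)
print(hex(ve_next), hex(prim_next))
```

Under that assumption both commands are 7 dwords long: 3DSTATE_VERTEX_ELEMENTS ends where the 3DPRIMITIVE header begins (0xfe3de748), and 3DPRIMITIVE would extend up to 0xfe3de764. This would be consistent with the decoder's "Bad length 7" message, and would suggest that the dword at 0xfe3de760 belongs to 3DPRIMITIVE rather than being a separate MI_NOOP.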
Comment 3 Mark Janes 2016-12-06 18:35:43 UTC
svenne, can you please attach the contents of your xorg config files?
Comment 4 Matt Turner 2016-12-06 19:27:39 UTC
I think we just want /var/log/Xorg.0.log really
Comment 5 Mark Janes 2016-12-07 17:12:24 UTC
According to Mesa engineers, Mesa only emits 3DSTATE_VERTEX_ELEMENTS on demand, right before 3DPRIMITIVE.

Chris has changed SNA to emit a dummy primitive between VertexElements in commit 4acd4a7d3d2f41227022fa7581cfb85a0b124eae.

Yann, please do not assign any more xorg gpu hang bugs to Mesa unless they reproduce with modesetting.

Svenne, switching to modesetting will probably resolve your issue. For example:
https://bbs.archlinux.org/viewtopic.php?id=211792

You could also update xf86-video-intel to anything after Chris's fix.
Comment 6 Mark Janes 2017-01-12 17:30:12 UTC
*** Bug 99325 has been marked as a duplicate of this bug. ***
Comment 7 Chris Wilson 2017-03-13 20:47:17 UTC
*** Bug 100188 has been marked as a duplicate of this bug. ***
Comment 8 Chris Wilson 2017-03-13 20:48:14 UTC
According to the results elsewhere, the dummy vertex flush appears to work.

commit 4acd4a7d3d2f41227022fa7581cfb85a0b124eae
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Dec 5 15:13:24 2016 +0000

    sna/gen9: Emit a dummy primitive between VertexElements
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=98999
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
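
The idea named in the commit subject can be illustrated with a toy model. This is not the SNA code, which is written in C inside xf86-video-intel and may work quite differently. It is only a sketch, assuming the emitter tracks whether a 3DSTATE_VERTEX_ELEMENTS is still waiting for a primitive. `BatchSketch` and its methods are hypothetical names.

```python
class BatchSketch:
    """Toy command emitter that never lets two 3DSTATE_VERTEX_ELEMENTS
    follow each other without a 3DPRIMITIVE in between."""

    def __init__(self):
        self.commands = []
        self._vertex_elements_pending = False

    def emit_vertex_elements(self):
        # If the previous vertex-element state was never drawn with,
        # insert a dummy primitive first (the workaround being modelled).
        if self._vertex_elements_pending:
            self.commands.append("3DPRIMITIVE (dummy)")
        self.commands.append("3DSTATE_VERTEX_ELEMENTS")
        self._vertex_elements_pending = True

    def emit_primitive(self):
        self.commands.append("3DPRIMITIVE")
        self._vertex_elements_pending = False


batch = BatchSketch()
batch.emit_vertex_elements()
batch.emit_vertex_elements()  # triggers the dummy primitive
batch.emit_primitive()
print(batch.commands)
```

In this model the dummy primitive is emitted lazily, only when a second vertex-element state would otherwise follow the first directly, so the common path of state followed by a draw is unchanged.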
Comment 9 Chris Wilson 2017-04-20 14:44:03 UTC
*** Bug 100738 has been marked as a duplicate of this bug. ***
Comment 10 Chris Wilson 2017-04-21 07:08:14 UTC
*** Bug 100746 has been marked as a duplicate of this bug. ***
Comment 11 Chris Wilson 2017-04-24 12:04:03 UTC
*** Bug 100772 has been marked as a duplicate of this bug. ***
Comment 12 Chris Wilson 2017-04-27 13:38:36 UTC
*** Bug 100852 has been marked as a duplicate of this bug. ***
Comment 13 Chris Wilson 2017-04-28 14:32:42 UTC
*** Bug 100870 has been marked as a duplicate of this bug. ***
Comment 14 Chris Wilson 2017-05-02 09:48:37 UTC
*** Bug 100888 has been marked as a duplicate of this bug. ***
Comment 15 Chris Wilson 2017-06-27 12:33:16 UTC
*** Bug 101606 has been marked as a duplicate of this bug. ***
Comment 16 Chris Wilson 2017-06-27 17:51:36 UTC
*** Bug 101606 has been marked as a duplicate of this bug. ***
Comment 17 Chris Wilson 2017-10-06 17:08:38 UTC
*** Bug 101836 has been marked as a duplicate of this bug. ***
Comment 18 Chris Wilson 2017-10-27 08:51:12 UTC
*** Bug 103478 has been marked as a duplicate of this bug. ***
Comment 19 Chris Wilson 2017-10-31 16:50:05 UTC
*** Bug 103523 has been marked as a duplicate of this bug. ***
Comment 20 Chris Wilson 2017-12-01 16:25:59 UTC
*** Bug 104018 has been marked as a duplicate of this bug. ***
Comment 21 Chris Wilson 2018-10-22 07:51:30 UTC
*** Bug 108485 has been marked as a duplicate of this bug. ***