Bug 96497 - [SKL] GPU HANG: ecode 9:0:0x87f99ff9, in Talos [29014], reason: Ring hung, action: reset
Status: RESOLVED WORKSFORME
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 12.0
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-06-11 22:34 UTC by Armin K
Modified: 2016-11-03 16:58 UTC

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
Error log (552.01 KB, text/x-log)
2016-06-11 22:34 UTC, Armin K

Description Armin K 2016-06-11 22:34:19 UTC
Created attachment 124479
Error log

While playing The Talos Principle, my GPU hung several times. Setup: Linux 4.6.2, xorg-server-1.18.3, GNOME 3.20, xf86-video-intel git ccs (May 13th) with UXA + DRI3.

OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 520 (Skylake GT2) 
OpenGL core profile version string: 4.3 (Core Profile) Mesa 12.0.0-rc2 (git-a7649ab)
OpenGL core profile shading language version string: 4.30

Relevant parts from dmesg:

[18745.295358] [drm] stuck on render ring
[18745.297107] [drm] GPU HANG: ecode 9:0:0x87f99ff9, in Talos [29014], reason: Ring hung, action: reset
[18745.297108] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[18745.297109] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[18745.297110] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[18745.297110] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[18745.297111] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[18745.297133] ------------[ cut here ]------------
[18745.297142] WARNING: CPU: 2 PID: 31435 at drivers/gpu/drm/i915/intel_display.c:11385 intel_mmio_flip_work_func+0x6d/0x361
[18745.297142] WARN_ON(__i915_wait_request(mmio_flip->req, mmio_flip->crtc->reset_counter, false, ((void *)0), &mmio_flip->i915->rps.mmioflips))
[18745.297144] Modules linked in:

[18745.297147] CPU: 2 PID: 31435 Comm: kworker/2:3 Not tainted 4.6.2-krejzi #1
[18745.297148] Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.07 01/25/2016
[18745.297151] Workqueue: events intel_mmio_flip_work_func
[18745.297153]  0000000000000286 00000000aad189d1 ffffffff8127aef0 ffff8801d031bda0
[18745.297155]  0000000000000000 ffffffff8109c9fd ffff88000d5f89c0 ffff8801d031be00
[18745.297157]  ffff8800788ac340 ffff8800788ac340 ffff880260498300 0000000000000000
[18745.297159] Call Trace:
[18745.297162]  [<ffffffff8127aef0>] ? dump_stack+0x46/0x59
[18745.297165]  [<ffffffff8109c9fd>] ? __warn+0xc8/0xe1
[18745.297182]  [<ffffffff8109ca6b>] ? warn_slowpath_fmt+0x55/0x71
[18745.297184]  [<ffffffff813cb3dc>] ? intel_mmio_flip_work_func+0x6d/0x361
[18745.297186]  [<ffffffff810aee1a>] ? process_one_work+0x139/0x1d4
[18745.297187]  [<ffffffff810af38d>] ? worker_thread+0x1d6/0x2a1
[18745.297188]  [<ffffffff810af1b7>] ? rescuer_thread+0x2db/0x2db
[18745.297190]  [<ffffffff810b2ffa>] ? kthread+0xa5/0xad
[18745.297193]  [<ffffffff81746902>] ? ret_from_fork+0x22/0x40
[18745.297194]  [<ffffffff810b2f55>] ? init_completion+0x1d/0x1d
[18745.297196] ---[ end trace f640bd76551d7fd9 ]---
[18745.299397] drm/i915: Resetting chip after gpu hang
[18747.295538] [drm] RC6 on
[20047.296499] [drm] stuck on render ring
[20047.298633] [drm] GPU HANG: ecode 9:0:0x85dfdfff, in Talos [29014], reason: Ring hung, action: reset
[20047.298673] ------------[ cut here ]------------
[20047.298680] WARNING: CPU: 3 PID: 31948 at drivers/gpu/drm/i915/intel_display.c:11385 intel_mmio_flip_work_func+0x6d/0x361
[20047.298681] WARN_ON(__i915_wait_request(mmio_flip->req, mmio_flip->crtc->reset_counter, false, ((void *)0), &mmio_flip->i915->rps.mmioflips))
[20047.298682] Modules linked in:

[20047.298686] CPU: 3 PID: 31948 Comm: kworker/3:1 Tainted: G        W       4.6.2-krejzi #1
[20047.298687] Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.07 01/25/2016
[20047.298690] Workqueue: events intel_mmio_flip_work_func
[20047.298692]  0000000000000286 00000000483fccf9 ffffffff8127aef0 ffff880114317da0
[20047.298695]  0000000000000000 ffffffff8109c9fd ffff8800021e4540 ffff880114317e00
[20047.298697]  ffff880216f6e540 ffff880216f6e540 ffff8802604d8300 0000000000000000
[20047.298699] Call Trace:
[20047.298704]  [<ffffffff8127aef0>] ? dump_stack+0x46/0x59
[20047.298707]  [<ffffffff8109c9fd>] ? __warn+0xc8/0xe1
[20047.298710]  [<ffffffff8109ca6b>] ? warn_slowpath_fmt+0x55/0x71
[20047.298712]  [<ffffffff813cb3dc>] ? intel_mmio_flip_work_func+0x6d/0x361
[20047.298730]  [<ffffffff810aee1a>] ? process_one_work+0x139/0x1d4
[20047.298731]  [<ffffffff810af38d>] ? worker_thread+0x1d6/0x2a1
[20047.298733]  [<ffffffff810af1b7>] ? rescuer_thread+0x2db/0x2db
[20047.298735]  [<ffffffff810b2ffa>] ? kthread+0xa5/0xad
[20047.298738]  [<ffffffff81746902>] ? ret_from_fork+0x22/0x40
[20047.298740]  [<ffffffff810b2f55>] ? init_completion+0x1d/0x1d
[20047.298742] ---[ end trace f640bd76551d7fda ]---
[20047.300546] drm/i915: Resetting chip after gpu hang
[20049.300603] [drm] RC6 on
[21947.299985] [drm] stuck on render ring
[21947.301838] [drm] GPU HANG: ecode 9:0:0x86dfdff9, in Talos [29014], reason: Ring hung, action: reset
[21947.302017] ------------[ cut here ]------------
[21947.302023] WARNING: CPU: 2 PID: 32564 at drivers/gpu/drm/i915/intel_display.c:11385 intel_mmio_flip_work_func+0x6d/0x361
[21947.302024] WARN_ON(__i915_wait_request(mmio_flip->req, mmio_flip->crtc->reset_counter, false, ((void *)0), &mmio_flip->i915->rps.mmioflips))
[21947.302025] Modules linked in:

[21947.302028] CPU: 2 PID: 32564 Comm: kworker/2:0 Tainted: G        W       4.6.2-krejzi #1
[21947.302029] Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.07 01/25/2016
[21947.302032] Workqueue: events intel_mmio_flip_work_func
[21947.302033]  0000000000000286 00000000902982bb ffffffff8127aef0 ffff880003883da0
[21947.302035]  0000000000000000 ffffffff8109c9fd ffff880225e1fc00 ffff880003883e00
[21947.302037]  ffff8801451c1f40 ffff8801451c1f40 ffff880260498300 0000000000000000
[21947.302039] Call Trace:
[21947.302043]  [<ffffffff8127aef0>] ? dump_stack+0x46/0x59
[21947.302045]  [<ffffffff8109c9fd>] ? __warn+0xc8/0xe1
[21947.302047]  [<ffffffff8109ca6b>] ? warn_slowpath_fmt+0x55/0x71
[21947.302049]  [<ffffffff813cb3dc>] ? intel_mmio_flip_work_func+0x6d/0x361
[21947.302051]  [<ffffffff810aee1a>] ? process_one_work+0x139/0x1d4
[21947.302052]  [<ffffffff810af38d>] ? worker_thread+0x1d6/0x2a1
[21947.302053]  [<ffffffff810af1b7>] ? rescuer_thread+0x2db/0x2db
[21947.302055]  [<ffffffff810b2ffa>] ? kthread+0xa5/0xad
[21947.302058]  [<ffffffff81746902>] ? ret_from_fork+0x22/0x40
[21947.302059]  [<ffffffff810b2f55>] ? init_completion+0x1d/0x1d
[21947.302060] ---[ end trace f640bd76551d7fdb ]---
[21947.303921] drm/i915: Resetting chip after gpu hang
[21949.296001] [drm] RC6 on
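
For anyone triaging a log like the one above, the hang lines can be pulled out mechanically. The following is only a rough sketch: the function name `parse_gpu_hangs` and the field names are made up for illustration, and the regular expression is fitted to the exact message wording seen in this dmesg output, so it may not match other kernel versions.

```python
import re
from collections import namedtuple

# Field names are illustrative assumptions, not taken from the kernel source.
GpuHang = namedtuple("GpuHang", "timestamp ecode process pid reason action")

# Pattern fitted to the "[drm] GPU HANG: ..." lines quoted in this report.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] \[drm\] GPU HANG: "
    r"ecode \d+:\d+:(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)

def parse_gpu_hangs(dmesg_text):
    """Return one GpuHang record per 'GPU HANG' line found in dmesg_text."""
    return [
        GpuHang(
            timestamp=float(m.group("ts")),
            ecode=m.group("ecode"),
            process=m.group("proc"),
            pid=int(m.group("pid")),
            reason=m.group("reason"),
            action=m.group("action"),
        )
        for m in HANG_RE.finditer(dmesg_text)
    ]

def seconds_between_hangs(hangs):
    """Return the gaps, in seconds, between consecutive hang timestamps."""
    return [b.timestamp - a.timestamp for a, b in zip(hangs, hangs[1:])]

if __name__ == "__main__":
    # The three hang lines from the log in this report.
    sample = (
        "[18745.297107] [drm] GPU HANG: ecode 9:0:0x87f99ff9, in Talos [29014], reason: Ring hung, action: reset\n"
        "[20047.298633] [drm] GPU HANG: ecode 9:0:0x85dfdfff, in Talos [29014], reason: Ring hung, action: reset\n"
        "[21947.301838] [drm] GPU HANG: ecode 9:0:0x86dfdff9, in Talos [29014], reason: Ring hung, action: reset\n"
    )
    hangs = parse_gpu_hangs(sample)
    for hang in hangs:
        print(hang)
    print(seconds_between_hangs(hangs))
```

Run against the three hang lines in this report, it shows that the ecode differs on each hang while the process and PID stay the same.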

The error file mentioned above is attached.
Comment 1 yann 2016-09-01 12:40:46 UTC
Assigning to Mesa product.

From this error dump, the hang occurs in the render ring batch, with the active head
at 0xdc42cc8c and 0x78260000 (3DSTATE_BINDING_TABLE_POINTERS_VS) as IPEHR.

Batch extract (around 0xdc42cc8c):

0xdc42cc58:      0x78170009: 3D UNKNOWN: 3d_965 opcode = 0x7817
0xdc42cc5c:      0x00000000: MI_NOOP
0xdc42cc60:      0x00000007: MI_NOOP
0xdc42cc64:      0x00000000: MI_NOOP
0xdc42cc68:      0x00000000: MI_NOOP
0xdc42cc6c:      0x00000000: MI_NOOP
0xdc42cc70:      0x00000000: MI_NOOP
0xdc42cc74:      0xdc42fde0: UNKNOWN
0xdc42cc78:      0x00000000: MI_NOOP
0xdc42cc7c:      0x00000000: MI_NOOP
0xdc42cc80:      0x00000000: MI_NOOP
0xdc42cc84:      0x78260000: 3DSTATE_BINDING_TABLE_POINTERS_VS
0xdc42cc88:      0x00000000:    dword 1
0xdc42cc8c:      0x782a0000: 3DSTATE_BINDING_TABLE_POINTERS_PS
0xdc42cc90:      0x00004dc0:    dword 1
0xdc42cc94:      0x78500003: 3D UNKNOWN: 3d_965 opcode = 0x7850
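
The header dwords in the extract above can be split into fields to check where each command ends. This is a minimal sketch that assumes the usual Intel 3D command header layout (command type in bits 31:29, sub-type in 28:27, opcode in 26:24, sub-opcode in 23:16, and a length field in bits 7:0 holding the total dword count minus 2). The helper names are hypothetical and the layout should be confirmed against the hardware documentation.

```python
def decode_header(dword):
    """Split a 3D command header dword into fields (bit layout is an assumption)."""
    return {
        "type": (dword >> 29) & 0x7,
        "subtype": (dword >> 27) & 0x3,
        "opcode": (dword >> 24) & 0x7,
        "sub_opcode": (dword >> 16) & 0xFF,
        # Length field is assumed to store (total dwords - 2).
        "total_dwords": (dword & 0xFF) + 2,
    }

def next_command_address(address, dword):
    """Return the address just past the command whose header sits at `address`."""
    return address + 4 * decode_header(dword)["total_dwords"]

if __name__ == "__main__":
    # Header dwords and addresses copied from the batch extract in this comment.
    for addr, header in [(0xdc42cc58, 0x78170009),
                         (0xdc42cc84, 0x78260000),
                         (0xdc42cc8c, 0x782a0000)]:
        fields = decode_header(header)
        print(hex(addr), hex(header), fields,
              "next:", hex(next_command_address(addr, header)))
```

Under that assumption, the command at 0xdc42cc58 spans 11 dwords and ends exactly at 0xdc42cc84, which suggests the MI_NOOP and UNKNOWN lines in between are that command's payload rather than separate instructions.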
Comment 2 Matt Turner 2016-11-03 03:07:59 UTC
Can you try mesa-13.0.0? If you can reproduce, please capture and upload an apitrace (https://github.com/apitrace/apitrace) so that we can easily reproduce as well.
Comment 3 Armin K 2016-11-03 11:37:38 UTC
After half an hour of gameplay on mesa-13.0.0 and kernel-4.8.6, I could not reproduce the issue.
Comment 4 Matt Turner 2016-11-03 16:58:36 UTC
Thank you for testing. Please reopen if you discover the hang is still present.

