Bug 107809 - [CI][DRMTIP] igt@gem_exec_capture@capture-render - dmesg-fail - Failed assertion: !wedged / Failed to reset chip
Status: CLOSED DUPLICATE of bug 106078
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-03 14:29 UTC by Martin Peres
Modified: 2019-03-08 12:13 UTC

See Also:
i915 platform: G33
i915 features:


Attachments

Description Martin Peres 2018-09-03 14:29:25 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_97/fi-blb-e6850/igt@gem_exec_capture@capture-render.html

(gem_exec_capture:1218) igt_gt-CRITICAL: Test assertion failure function igt_force_gpu_reset, file ../lib/igt_gt.c:374:
(gem_exec_capture:1218) igt_gt-CRITICAL: Failed assertion: !wedged
Subtest capture-render failed.

[   56.264072] Setting dangerous option reset - tainting kernel
[   56.294173] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[   61.133603] i915 0000:00:02.0: Failed to reset chip
[   61.310006] i915 0000:00:02.0: i915_reset_device timed out, cancelling all in-flight rendering.
[   71.549613] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:32:pipe A] flip_done timed out
[   71.549904] [drm:intel_check_cpu_fifo_underruns [i915]] *ERROR* pipe A underrun
[   81.789613] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:32:pipe A] flip_done timed out
[   92.029605] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CONNECTOR:38:VGA-1] flip_done timed out
[  102.269606] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:28:plane A] flip_done timed out
[  112.509598] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:32:pipe A] flip_done timed out
[  122.749598] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:32:pipe A] flip_done timed out
[  132.989604] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CONNECTOR:38:VGA-1] flip_done timed out
[  143.229603] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:28:plane A] flip_done timed out
[  153.469602] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:32:pipe A] flip_done timed out
[  163.709600] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:32:pipe A] flip_done timed out
[  173.949594] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CONNECTOR:38:VGA-1] flip_done timed out
[  184.189603] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:28:plane A] flip_done timed out
[  194.429603] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:32:pipe A] flip_done timed out
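
For reading the log above, a minimal sketch of pulling timestamps out of dmesg lines to see how long the reset attempt ran and how far apart the flip_done timeouts are. The helper names (`parse_dmesg`, `first_time`) are made up for this illustration and are not part of IGT or the CI tooling; the sample lines are copied from the log.

```python
import re

# Matches "[   56.294173] message", optionally prefixed by a "<N>" log level.
LINE_RE = re.compile(r"^(?:<\d>)?\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_dmesg(text):
    """Return (timestamp_seconds, message) pairs for lines that look like dmesg."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            entries.append((float(m.group(1)), m.group(2)))
    return entries


def first_time(entries, needle):
    """Timestamp of the first message containing needle, or None if absent."""
    for ts, msg in entries:
        if needle in msg:
            return ts
    return None


# Sample lines copied from the description above.
sample = """\
[   56.294173] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[   61.133603] i915 0000:00:02.0: Failed to reset chip
[   71.549613] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:32:pipe A] flip_done timed out
[   81.789613] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:32:pipe A] flip_done timed out
"""

entries = parse_dmesg(sample)
reset_seconds = (first_time(entries, "Failed to reset chip")
                 - first_time(entries, "Resetting chip"))

# Gaps between consecutive flip_done timeouts, rounded to microseconds.
flip_times = [ts for ts, msg in entries if "flip_done timed out" in msg]
flip_gaps = [round(b - a, 6) for a, b in zip(flip_times, flip_times[1:])]

print(f"reset attempt lasted {reset_seconds:.3f} s")
print("gaps between flip_done timeouts:", flip_gaps)
```

Run against the full log, the same helpers would show that the flip_done errors keep recurring at a steady interval after the failed reset.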
Comment 1 Chris Wilson 2018-09-03 14:42:42 UTC
<5>[   56.294173] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
<7>[   56.296291] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   56.797734] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   57.300501] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   57.910710] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   58.412420] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   58.915581] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   59.526644] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   60.027767] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<7>[   60.530446] [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING
<3>[   61.133603] i915 0000:00:02.0: Failed to reset chip
<7>[   61.133777] i915_gem_set_wedged rcs0
<7>[   61.133784] i915_gem_set_wedged 	current seqno 2332c, last 1, hangcheck 2332c [4868 ms]
<7>[   61.133789] i915_gem_set_wedged 	Reset count: 0 (global 1)
<7>[   61.133821] i915_gem_set_wedged 	Requests:
<7>[   61.133885] i915_gem_set_wedged 		first  1! [4:2374a] @ 4868ms: rcs0
<7>[   61.133892] i915_gem_set_wedged 		last   1! [4:2374a] @ 4868ms: rcs0
<7>[   61.133956] i915_gem_set_wedged 	RING_START: 0x00002000
<7>[   61.133961] i915_gem_set_wedged 	RING_HEAD:  0x00000000
<7>[   61.133966] i915_gem_set_wedged 	RING_TAIL:  0x00000000
<7>[   61.133972] i915_gem_set_wedged 	RING_CTL:   0x00000400 [waiting]
<7>[   61.133978] i915_gem_set_wedged 	RING_MODE:  0x00000100
<7>[   61.133983] i915_gem_set_wedged 	ACTHD:  0x00000000_0081eac0
<7>[   61.133989] i915_gem_set_wedged 	BBADDR: 0x00000000_00000000
<7>[   61.133994] i915_gem_set_wedged 	DMA_FADDR: 0x00000000_0081eb80
<7>[   61.133999] i915_gem_set_wedged 	IPEIR: 0x00000000
<7>[   61.134005] i915_gem_set_wedged 	IPEHR: 0x00000000
<7>[   61.134039] i915_gem_set_wedged 		E 1! [4:2374a] @ 4868ms: rcs0
<7>[   61.134094] i915_gem_set_wedged 		Queue priority: -2147483648
<7>[   61.134224] i915_gem_set_wedged IRQ? 0x0 (breadcrumbs? no)
<7>[   61.134229] i915_gem_set_wedged HWSP:
<7>[   61.134237] i915_gem_set_wedged [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7>[   61.134242] i915_gem_set_wedged *
<7>[   61.134249] i915_gem_set_wedged [00c0] 0002332c 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7>[   61.134255] i915_gem_set_wedged [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7>[   61.134260] i915_gem_set_wedged *
<7>[   61.134268] i915_gem_set_wedged Idle? no

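To cross-check the "current seqno 2332c" line against the HWSP dump, a small sketch that parses one dump row and picks out a dword by index. The dword index 0x30 is an assumption made here (it is the only non-zero slot in the dump and it matches the reported seqno); `parse_row` is a hypothetical helper, not driver code.

```python
import re

# A dump row looks like "[00c0] 0002332c 00000000 ...": a 4-digit hex byte
# offset followed by 8-digit hex dwords. The dmesg timestamp in brackets does
# not match because it is not exactly four hex digits.
ROW_RE = re.compile(r"\[([0-9a-f]{4})\]((?:\s+[0-9a-f]{8})+)\s*$")

# Assumption: the seqno breadcrumb lives at dword index 0x30 of the page.
SEQNO_DWORD_INDEX = 0x30


def parse_row(line):
    """Return (byte_offset, [dwords]) for a dump row, or None if no match."""
    m = ROW_RE.search(line)
    if m is None:
        return None
    offset = int(m.group(1), 16)
    dwords = [int(tok, 16) for tok in m.group(2).split()]
    return offset, dwords


# Row copied from the log above.
row = ("<7>[   61.134249] i915_gem_set_wedged [00c0] 0002332c 00000000 "
       "00000000 00000000 00000000 00000000 00000000 00000000")

offset, dwords = parse_row(row)
first_index = offset // 4  # byte offset -> index of the row's first dword
seqno = dwords[SEQNO_DWORD_INDEX - first_index]
print(f"dword {SEQNO_DWORD_INDEX:#x} = {seqno:#x}")
```
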
That the GPU stops responding appears to be just one of those things (e.g. we do clear RING_CTL, which should undo the wait).

But the big standout is that the current seqno is 0x2332c even though we had previously reset the next seqno (i.e. the current request has seqno 1 and is believed to be completed). That the seqno is stale (and so we treat new requests as completed even before they run on hw) may well explain some of the other bogosity.
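
To illustrate why a stale value makes a fresh request look finished, here is a rough Python model of a wrap-safe 32-bit "has this seqno passed" comparison, in the spirit of the signed-difference check the driver uses. The function name and the Python form are illustrative only; the values are taken from the dump above.

```python
def seqno_passed(seq1: int, seq2: int) -> bool:
    """Wrap-safe test: is 32-bit seq1 at or after seq2?

    Equivalent to interpreting (seq1 - seq2) as a signed 32-bit integer
    and checking that it is non-negative.
    """
    diff = (seq1 - seq2) & 0xFFFFFFFF
    return diff < 0x80000000


hwsp_seqno = 0x2332C   # stale value still sitting in the status page
request_seqno = 1      # seqno of the newly submitted request

# The stale breadcrumb is "after" the new request, so the request is
# reported as complete before the hardware has executed it.
looks_complete = seqno_passed(hwsp_seqno, request_seqno)
print("request looks complete:", looks_complete)

# Had the status page been reset along with the next seqno, the same
# request would still be pending.
print("with a reset status page:", seqno_passed(0, request_seqno))
```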
Comment 2 Chris Wilson 2018-09-14 20:44:14 UTC

*** This bug has been marked as a duplicate of bug 106078 ***
Comment 3 Lakshmi 2018-10-15 11:30:00 UTC
Closed as Duplicate.
Comment 4 CI Bug Log 2019-03-08 12:13:07 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.