Bug 112042 - [CI][BAT] igt@i915_selftest@live_gem_contexts - dmesg-fail - igt_shared_ctx_exec failed with error -5
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2019-10-17 20:26 UTC by Lakshmi
Modified: 2019-11-01 06:03 UTC
i915 platform: ICL
i915 features: GEM/Other


Description Lakshmi 2019-10-17 20:26:56 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7099/fi-icl-dsi/igt@i915_selftest@live_gem_contexts.html

(i915_selftest:5079) igt_kmod-WARNING: i915/i915_gem_context_live_selftests: igt_shared_ctx_exec failed with error -5
(i915_selftest:5079) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling always-on
(i915_selftest:5079) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling DC off
(i915_selftest:5079) igt_kmod-WARNING: [drm:gen9_set_dc_state [i915]] Setting DC state from 02 to 00
(i915_selftest:5079) igt_kmod-WARNING: [drm:intel_combo_phy_init [i915]] Combo PHY A already enabled, won't reprogram it.
(i915_selftest:5079) igt_kmod-WARNING: [drm:intel_combo_phy_init [i915]] Combo PHY B already enabled, won't reprogram it.
(i915_selftest:5079) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling power well 2
(i915_selftest:5079) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling power well 3
(i915_selftest:5079) igt_kmod-WARNING: [drm:intel_power_well_enable [i915]] enabling power well 4
(i915_selftest:5079) igt_kmod-WARNING: i915: probe of 0000:00:02.0 failed with error -5
(i915_selftest:5079) igt_kmod-CRITICAL: Test assertion failure function igt_kselftest_execute, file ../lib/igt_kmod.c:548:
(i915_selftest:5079) igt_kmod-CRITICAL: Failed assertion: err == 0
(i915_selftest:5079) igt_kmod-CRITICAL: kselftest "i915 igt__31__live_gem_contexts=1 live_selftests=-1 disable_display=1 st_filter=" failed: Input/output error [5]
Subtest live_gem_contexts failed.
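
The `-5` reported by igt_shared_ctx_exec and the `[5]` in the IGT assertion above are the same errno value. As a minimal sketch, assuming a Linux host with Python available, the standard `errno` table can be used to decode it. The helper name `describe_kernel_error` is hypothetical and is not part of IGT or the kernel:

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Map a kernel-style negative return code (e.g. -5) to its errno name and message.

    Hypothetical helper for illustration only; not an IGT or i915 API.
    """
    num = abs(code)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(num, "UNKNOWN")
    return f"{name} ({num}): {os.strerror(num)}"


print(describe_kernel_error(-5))
```

On Linux, errno 5 is EIO, which matches the "Input/output error [5]" text in the kselftest failure line.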
Comment 2 Chris Wilson 2019-10-17 20:36:45 UTC
The basic pattern looks like this:

<7> [659.573044] hangcheck rcs0
<7> [659.573066] hangcheck 	Awake? 6
<7> [659.573070] hangcheck 	Hangcheck: 5954 ms ago
<7> [659.573074] hangcheck 	Reset count: 0 (global 0)
<7> [659.573077] hangcheck 	Requests:
<7> [659.573084] hangcheck 	MMIO base:  0x00002000
<7> [659.573879] hangcheck 	RING_START: 0x001d0000
<7> [659.573886] hangcheck 	RING_HEAD:  0x00000028
<7> [659.573893] hangcheck 	RING_TAIL:  0x00000068
<7> [659.573903] hangcheck 	RING_CTL:   0x00003001
<7> [659.573925] hangcheck 	RING_MODE:  0x00000000
<7> [659.574685] hangcheck 	RING_IMR: 00000000
<7> [659.574700] hangcheck 	ACTHD:  0x00000000_001773e4
<7> [659.574714] hangcheck 	BBADDR: 0x00000000_001773e5
<7> [659.574755] hangcheck 	DMA_FADDR: 0x00000000_001775c0
<7> [659.574762] hangcheck 	IPEIR: 0x00000000
<7> [659.574770] hangcheck 	IPEHR: 0xf77d32ef
<7> [659.575542] hangcheck 	Execlist status: 0x00202098 00000020, entries 12
<7> [659.575548] hangcheck 	Execlist CSB read 2, write 2, tasklet queued? no (enabled)
<7> [659.575561] hangcheck 		Active[0]: ring:{start:001d4000, hwsp:fae192c0, seqno:00000000}, rq:  20fd7:2  prio=3 @ 7541ms: [i915]
<7> [659.575567] hangcheck 		Active[1]: rq:  20fd6:4!+  prio=2 @ 7541ms: signaled
<7> [659.575734] hangcheck 		E  20fd7:2  prio=3 @ 7541ms: [i915]
<7> [659.575814] hangcheck 		Queue priority hint: 3
<7> [659.575820] hangcheck 		Q  20fd8:2  prio=3 @ 7540ms: [i915]
<7> [659.575826] hangcheck 		Q  20fd9:2  prio=3 @ 7540ms: [i915]
<7> [659.575832] hangcheck 		Q  20fda:2  prio=3 @ 7540ms: [i915]
<7> [659.575838] hangcheck 		Q  20fdb:2  prio=3 @ 7538ms: [i915]
<7> [659.575844] hangcheck 		Q  20fd7:4-  prio=2 @ 7541ms: [i915]
<7> [659.575850] hangcheck 		Q  20fd8:4  prio=2 @ 7540ms: [i915]
<7> [659.575855] hangcheck 		Q  20fd9:4  prio=2 @ 7540ms: [i915]
<7> [659.575861] hangcheck 		...skipping 2 queued requests...
<7> [659.575867] hangcheck 		Q  20fdc:2  prio=2 @ 7538ms: [i915]
<7> [659.575897] hangcheck HWSP:
<7> [659.575905] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [659.575909] hangcheck *
<7> [659.575915] hangcheck [0040] 10008002 00000000 10008002 00000000 10008002 00000020 10000014 00000060
<7> [659.575921] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000020 10000001 00000000
<7> [659.575926] hangcheck [0080] 10008002 00000040 10000014 00000040 10008002 00000060 10000014 00000060
<7> [659.575931] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000002
<7> [659.575937] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [659.575941] hangcheck *
<7> [659.575946] hangcheck Idle? no

The GPU is not executing the context that is marked as active, and the HEAD is off in nowhere land. Naturally it dies.
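
One way to read the dump above is to compare the RING_START register with the ring start recorded for Active[0]. The sketch below does that for two lines copied from the dump. The function name `ring_start_mismatch` and the choice of these two fields are assumptions made for illustration, and this is not how i915 itself detects the problem:

```python
import re

# Two lines copied from the hangcheck dump in this comment.
DUMP = """\
<7> [659.573879] hangcheck  RING_START: 0x001d0000
<7> [659.575561] hangcheck  Active[0]: ring:{start:001d4000, hwsp:fae192c0, seqno:00000000}, rq:  20fd7:2  prio=3 @ 7541ms: [i915]
"""


def ring_start_mismatch(dump: str):
    """Return (register_start, active_start, differs), or None if a field is missing.

    Hypothetical helper: compares the RING_START register value with the
    ring start printed for the Active[0] request.
    """
    hw = re.search(r"RING_START:\s*0x([0-9a-fA-F]+)", dump)
    sw = re.search(r"Active\[0\]: ring:\{start:([0-9a-fA-F]+)", dump)
    if hw is None or sw is None:
        return None
    hw_start = int(hw.group(1), 16)
    sw_start = int(sw.group(1), 16)
    return hw_start, sw_start, hw_start != sw_start


result = ring_start_mismatch(DUMP)
if result is not None:
    hw_start, sw_start, differs = result
    print(f"RING_START={hw_start:#010x} Active[0].start={sw_start:#010x} mismatch={differs}")
```

For this dump the two values are 0x001d0000 and 0x001d4000, so the comparison reports a mismatch.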
Comment 3 Francesco Balestrieri 2019-11-01 06:02:50 UTC
Last seen two weeks ago, with a repro rate of 5/142 runs (3.5%).

