Bug 111034

Summary: [CI][SHARDS] igt@i915_hangman@error-state-capture-bcs0 - fail - Failed assertion: found
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED MOVED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: HSW, SNB
i915 features: GEM/Other

Description Martin Peres 2019-07-01 09:01:52 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5073/shard-hsw7/igt@i915_hangman@error-state-capture-bcs0.html

Starting subtest: error-state-capture-bcs0
(i915_hangman:6596) CRITICAL: Test assertion failure function check_error_state, file ../tests/i915/i915_hangman.c:199:
(i915_hangman:6596) CRITICAL: Failed assertion: found
Comment 2 Chris Wilson 2019-07-01 09:48:40 UTC
<7> [807.223529] hangcheck bcs0
<7> [807.223532] hangcheck 	Awake? 1
<7> [807.223534] hangcheck 	Hangcheck: 12032 ms ago
<7> [807.223535] hangcheck 	Reset count: 0 (global 1023)
<7> [807.223537] hangcheck 	Requests:
<7> [807.223541] hangcheck 	RING_START: 0x00021000
<7> [807.223544] hangcheck 	RING_HEAD:  0x00001368
<7> [807.223546] hangcheck 	RING_TAIL:  0x00001378
<7> [807.223549] hangcheck 	RING_CTL:   0x0001f001
<7> [807.223552] hangcheck 	RING_MODE:  0x00000000
<7> [807.223554] hangcheck 	RING_IMR: ffbfffff
<7> [807.223557] hangcheck 	ACTHD:  0x00000000_7fff0d2c
<7> [807.223559] hangcheck 	BBADDR: 0x00000000_7fff052b
<7> [807.223561] hangcheck 	DMA_FADDR: 0x00000000_7fff0ec0
<7> [807.223564] hangcheck 	IPEIR: 0x00000008
<7> [807.223566] hangcheck 	IPEHR: 0x00000000
<7> [807.223570] hangcheck 		E  6:a9949-  @ 12049ms: i915_hangman[5167]
<7> [807.223572] hangcheck HWSP:
<7> [807.223575] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [807.223576] hangcheck *
<7> [807.223579] hangcheck [0100] 000a9925 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [807.223581] hangcheck [0120] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [807.223582] hangcheck *

The seqno suffered a panic attack and went backwards. And because the seqno did not point to a valid request, we ended up with no ringbuffer to inspect.
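
For anyone reading the dump later, here is a rough sketch of the comparison behind that remark. It is not the driver's code. It assumes the `6:a9949` token on the request line is context:seqno, that the first dword at HWSP offset 0100 is the seqno the engine last wrote, and that seqnos are compared as wrapping 32-bit values. The helper names (`seqno_passed`, `parse_request_seqno`, `parse_hwsp_seqno`) are made up for illustration.

```python
import re

# Two lines copied from the hangcheck dump above (log prefix dropped).
REQUEST_LINE = "E  6:a9949-  @ 12049ms: i915_hangman[5167]"
HWSP_LINE = "[0100] 000a9925 00000000 00000000 00000000"


def seqno_passed(seq1: int, seq2: int) -> bool:
    """Wrap-safe test that 32-bit seq1 is at or after seq2.

    Takes the difference modulo 2**32 and reads it as a signed 32-bit
    number, so the ordering survives counter wraparound.
    """
    diff = (seq1 - seq2) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return diff >= 0


def parse_request_seqno(line: str) -> int:
    """Return the seqno from a 'context:seqno' token (assumed layout)."""
    match = re.search(r"\b([0-9a-f]+):([0-9a-f]+)", line)
    if match is None:
        raise ValueError("no context:seqno token found")
    return int(match.group(2), 16)


def parse_hwsp_seqno(line: str) -> int:
    """Return the first dword of the HWSP row at offset 0100."""
    match = re.search(r"\[0100\]\s+([0-9a-f]{8})", line)
    if match is None:
        raise ValueError("no HWSP row at offset 0100 found")
    return int(match.group(1), 16)


request_seqno = parse_request_seqno(REQUEST_LINE)
hwsp_seqno = parse_hwsp_seqno(HWSP_LINE)

print(f"request seqno: {request_seqno:#x}")
print(f"HWSP seqno:    {hwsp_seqno:#x}")
print(f"HWSP has passed the request: {seqno_passed(hwsp_seqno, request_seqno)}")
```

Under those assumptions the HWSP value 0xa9925 sits 0x24 behind the executing request's 0xa9949, so the check reports that the request has not been passed. This only illustrates the ordering test; it does not show why the value in the HWSP failed to match a valid request.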
Comment 3 Francesco Balestrieri 2019-08-06 05:02:51 UTC
This was seen 4 times on 4 different machines in a single day, and then not again for more than a month. We should keep monitoring it, but for now it doesn't seem to warrant high priority.
Comment 4 Martin Peres 2019-11-29 19:15:24 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug on our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/322.