Bug 111034 - [CI][SHARDS] igt@i915_hangman@error-state-capture-bcs0 - fail - Failed assertion: found
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-01 09:01 UTC by Martin Peres
Modified: 2019-11-29 19:15 UTC

See Also:
i915 platform: HSW, SNB
i915 features: GEM/Other



Description Martin Peres 2019-07-01 09:01:52 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5073/shard-hsw7/igt@i915_hangman@error-state-capture-bcs0.html

Starting subtest: error-state-capture-bcs0
(i915_hangman:6596) CRITICAL: Test assertion failure function check_error_state, file ../tests/i915/i915_hangman.c:199:
(i915_hangman:6596) CRITICAL: Failed assertion: found
Comment 2 Chris Wilson 2019-07-01 09:48:40 UTC
<7> [807.223529] hangcheck bcs0
<7> [807.223532] hangcheck 	Awake? 1
<7> [807.223534] hangcheck 	Hangcheck: 12032 ms ago
<7> [807.223535] hangcheck 	Reset count: 0 (global 1023)
<7> [807.223537] hangcheck 	Requests:
<7> [807.223541] hangcheck 	RING_START: 0x00021000
<7> [807.223544] hangcheck 	RING_HEAD:  0x00001368
<7> [807.223546] hangcheck 	RING_TAIL:  0x00001378
<7> [807.223549] hangcheck 	RING_CTL:   0x0001f001
<7> [807.223552] hangcheck 	RING_MODE:  0x00000000
<7> [807.223554] hangcheck 	RING_IMR: ffbfffff
<7> [807.223557] hangcheck 	ACTHD:  0x00000000_7fff0d2c
<7> [807.223559] hangcheck 	BBADDR: 0x00000000_7fff052b
<7> [807.223561] hangcheck 	DMA_FADDR: 0x00000000_7fff0ec0
<7> [807.223564] hangcheck 	IPEIR: 0x00000008
<7> [807.223566] hangcheck 	IPEHR: 0x00000000
<7> [807.223570] hangcheck 		E  6:a9949-  @ 12049ms: i915_hangman[5167]
<7> [807.223572] hangcheck HWSP:
<7> [807.223575] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [807.223576] hangcheck *
<7> [807.223579] hangcheck [0100] 000a9925 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [807.223581] hangcheck [0120] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [807.223582] hangcheck *

The seqno suffered a panic attack and went backwards. And because the seqno did not point to a valid request, we ended up with no ringbuffer to inspect.
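
To make the mismatch in the dump above concrete, here is a minimal Python sketch that extracts the in-flight request seqno (the `6:a9949` entry) and the first dword of the HWSP row at offset 0x100, then compares them with a wrap-safe signed 32-bit comparison. Treating the dword at 0x100 as the breadcrumb seqno is an assumption read from the comment; the dump itself does not label it. The helper names and regexes are hypothetical and are not part of i915 or IGT.

```python
import re

# Two lines copied from the hangcheck dump in this report.
LOG = """\
<7> [807.223570] hangcheck \t\tE  6:a9949-  @ 12049ms: i915_hangman[5167]
<7> [807.223579] hangcheck [0100] 000a9925 00000000 00000000 00000000
"""


def seqno_passed(a: int, b: int) -> bool:
    """Return True if seqno a is at or after seqno b, allowing 32-bit wrap.

    Equivalent to checking that the signed 32-bit value of (a - b) is >= 0.
    """
    return ((a - b) & 0xFFFFFFFF) < 0x80000000


def parse_request_seqno(log: str) -> int:
    """Seqno of the first 'context:seqno' request entry followed by '@'."""
    m = re.search(r"\b[0-9a-f]+:([0-9a-f]+)\S*\s+@", log)
    if m is None:
        raise ValueError("no request entry found")
    return int(m.group(1), 16)


def parse_hwsp_seqno(log: str) -> int:
    """First dword of the HWSP row at offset 0x100 (assumed breadcrumb)."""
    m = re.search(r"\[0100\]\s+([0-9a-f]{8})", log)
    if m is None:
        raise ValueError("no HWSP row at offset 0100 found")
    return int(m.group(1), 16)


request = parse_request_seqno(LOG)
hwsp = parse_hwsp_seqno(LOG)
print(f"request seqno: {request:#x}")
print(f"HWSP seqno:    {hwsp:#x}")
print(f"HWSP has passed request: {seqno_passed(hwsp, request)}")
```

For these two lines the HWSP value is 0x24 behind the request, so the breadcrumb has not passed the request being executed. The sketch only shows that gap; it cannot show that the value was ever higher.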
Comment 3 Francesco Balestrieri 2019-08-06 05:02:51 UTC
This was seen 4 times on 4 different machines in one day, and then not again for more than a month. We should keep monitoring it, but it doesn't seem to warrant high priority.
Comment 4 Martin Peres 2019-11-29 19:15:24 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/322.