Bug 111598 - [CI][SHARDS] igt@gem_sync@basic-all - fail - Failed assertion: !"GPU hung"
Status: NEEDINFO
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: high not set
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-09-09 08:16 UTC by Martin Peres
Modified: 2019-09-10 14:33 UTC
CC List: 1 user

See Also:
i915 platform: SNB
i915 features: GEM/Other


Description Martin Peres 2019-09-09 08:16:18 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3416/shard-snb4/igt@gem_sync@basic-all.html

Starting subtest: basic-all
(gem_sync:4753) igt_aux-CRITICAL: Test assertion failure function sig_abort, file ../lib/igt_aux.c:502:
(gem_sync:4753) igt_aux-CRITICAL: Failed assertion: !"GPU hung"
Comment 1 CI Bug Log 2019-09-09 08:18:02 UTC
The CI Bug Log issue associated with this bug has been updated.

### New filters associated

* SNB: igt@gem_sync@basic-all - fail - Failed assertion: !"GPU hung"
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_3416/shard-snb4/igt@gem_sync@basic-all.html
Comment 2 Chris Wilson 2019-09-09 08:25:25 UTC
<7> [2140.194405] hangcheck rcs0
<7> [2140.194409] hangcheck 	Awake? 1
<7> [2140.194411] hangcheck 	Hangcheck: 12032 ms ago
<7> [2140.194413] hangcheck 	Reset count: 0 (global 49)
<7> [2140.194414] hangcheck 	Requests:
<7> [2140.194417] hangcheck 	MMIO base:  0x00002000
<7> [2140.194420] hangcheck 	CCID: 0x7fff610d
<7> [2140.194423] hangcheck 	RING_START: 0x00001000
<7> [2140.194425] hangcheck 	RING_HEAD:  0x00001350
<7> [2140.194428] hangcheck 	RING_TAIL:  0x00000638
<7> [2140.194432] hangcheck 	RING_CTL:   0x00003001
<7> [2140.194435] hangcheck 	RING_MODE:  0x00004040
<7> [2140.194438] hangcheck 	RING_IMR: fffffffe
<7> [2140.194440] hangcheck 	ACTHD:  0x00000000_fa101a00
<7> [2140.194443] hangcheck 	BBADDR: 0x00000000_fa104879
<7> [2140.194446] hangcheck 	DMA_FADDR: 0x00000000_fa107400
<7> [2140.194448] hangcheck 	IPEIR: 0x00000000
<7> [2140.194451] hangcheck 	IPEHR: 0x0042001e
<7> [2140.194455] hangcheck 		E  3:1e3294-  prio=2147483647 @ 13752ms: [i915]
<7> [2140.194457] hangcheck HWSP:
<7> [2140.194460] hangcheck [0000] 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [2140.194463] hangcheck [0020] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [2140.194464] hangcheck *
<7> [2140.194466] hangcheck [0100] 001e3292 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [2140.194469] hangcheck [0120] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [2140.194470] hangcheck *
<7> [2140.194475] hangcheck Idle? no
<7> [2140.194477] hangcheck Signals:
<7> [2140.194479] hangcheck 	[3:1e3294] @ 13752ms

IPEHR (0x0042001e) doesn't correspond to anything natural. This looks like another instance of the strange behaviour where the ringbuffer overshoots its TAIL?
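
To make the HEAD/TAIL observation concrete, here is a minimal sketch that decodes the ring registers from the dump above. The bit masks are assumptions based on my reading of the i915 register definitions (they are not stated in this report), and the helper names `parse_registers` and `ring_state` are hypothetical.

```python
import re

# Register lines copied from the hangcheck dump above (whitespace normalised).
DUMP = """\
hangcheck RING_START: 0x00001000
hangcheck RING_HEAD:  0x00001350
hangcheck RING_TAIL:  0x00000638
hangcheck RING_CTL:   0x00003001
"""

# ASSUMPTION: bit layouts as I understand them from the i915 driver headers.
RING_NR_PAGES = 0x001FF000  # RING_CTL: (buffer length - one page), page-granular
RING_VALID = 0x00000001     # RING_CTL: ring enabled
HEAD_ADDR = 0x001FFFFC      # RING_HEAD: offset bits (upper bits hold a wrap count)
TAIL_ADDR = 0x001FFFF8      # RING_TAIL: offset bits
PAGE_SIZE = 0x1000


def parse_registers(text):
    """Collect 'RING_xxx: 0x...' pairs from dump text into a dict of ints."""
    regs = {}
    for name, value in re.findall(r"(RING_\w+):\s+0x([0-9a-fA-F]+)", text):
        regs[name] = int(value, 16)
    return regs


def ring_state(regs):
    """Derive ring size and the two possible readings of HEAD vs TAIL."""
    size = (regs["RING_CTL"] & RING_NR_PAGES) + PAGE_SIZE
    head = regs["RING_HEAD"] & HEAD_ADDR
    tail = regs["RING_TAIL"] & TAIL_ADDR
    return {
        "enabled": bool(regs["RING_CTL"] & RING_VALID),
        "size": size,
        "head": head,
        "tail": tail,
        # Bytes by which HEAD sits beyond TAIL, if HEAD ran past it.
        "head_minus_tail": head - tail,
        # Bytes still queued, if TAIL legitimately wrapped around the ring.
        "pending_if_wrapped": (tail - head) % size,
    }


state = ring_state(parse_registers(DUMP))
for key, value in state.items():
    print(key, value if isinstance(value, bool) else hex(value))
```

Under these assumptions the ring is 0x4000 bytes long, and HEAD sits 0xd18 bytes beyond TAIL. The registers alone cannot distinguish between HEAD having run past TAIL and TAIL having wrapped with 0x32e8 bytes still queued, so this only quantifies the two readings and does not confirm the overshoot.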
Comment 3 Chris Wilson 2019-09-10 14:33:44 UTC
Stuck in

commit 0efa99dd58754d23e884b9ba41cd601f01b58c3d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Sep 9 12:30:18 2019 +0100

    drm/i915/ringbuffer: Flush writes before RING_TAIL update

to see if that makes any difference. Come back in 6-12 months' time to find out!

