Bug 100046

Summary: [BAT][HSW] igt@gem_busy@basic-hang-default
Product: DRI Reporter: Ander Conselvan de Oliveira <conselvan2>
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED WORKSFORME QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: dorota.czaplejewicz, intel-gfx-bugs
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: HSW i915 features:

Description Ander Conselvan de Oliveira 2017-03-03 10:26:26 UTC
Out	
IGT-Version: 1.17-gd91fec7 (x86_64) (Linux: 4.10.0-CI-CI_DRM_2272+ x86_64)
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [basic+0x113]
  #2 [_start+0x0]
Subtest basic-hang-default: FAIL (129.960s)

Err	
(gem_busy:7012) CRITICAL: Test assertion failure function basic, file gem_busy.c:496:
(gem_busy:7012) CRITICAL: Failed assertion: igt_seconds_elapsed(&tv) < timeout
(gem_busy:7012) CRITICAL: Hanging batch did not complete within 120s
Subtest basic-hang-default failed.
**** DEBUG ****
(gem_busy:7012) DEBUG: Test requirement passed: gem_has_ring(fd, e->exec_id | e->flags)
(gem_busy:7012) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
(gem_busy:7012) igt-dummyload-DEBUG: Test requirement passed: nengine
(gem_busy:7012) igt-debugfs-DEBUG: Opening debugfs dir /sys/kernel/debug/dri/0
(gem_busy:7012) igt-debugfs-DEBUG: i915_engine_info:
GT awake? yes
Global active requests: 1
render ring
	current seqno 167, last 168, hangcheck 0 [-85937 ms], inflight 1
	Requests:
		first  168 [1:168] prio=-2147483648 @ 129922ms: [global]
		last   168 [1:168] prio=-2147483648 @ 129922ms: [global]
		active 168 [1:168] prio=-2147483648 @ 129922ms: [global]
		[head a750, postfix a788, tail a7c0, batch 0x00000000_7fdec000]
	RING_START: 0x008cc000 [0x008cc000]
	RING_HEAD:  0x0000a778 [0x00004400]
	RING_TAIL:  0x0000a7c0 [0x0000a7c0]
	RING_CTL:   0x0001f001 []
	ACTHD:  0x00000000_7fdec000
	BBADDR: 0x00000000_7fdec001
	PP_DIR_BASE: 0x7fe00000
	PP_DIR_BASE_READ: 0x00000000
	PP_DIR_DCLV: 0xffffffff

blitter ring
	current seqno b0, last b0, hangcheck 0 [-85936 ms], inflight 0
	Requests:
	RING_START: 0x008ed000 [0x00000000]
	RING_HEAD:  0x00005280 [0x00000000]
	RING_TAIL:  0x00005280 [0x00000000]
	RING_CTL:   0x0001f001 []
	ACTHD:  0x00000000_00005280
	BBADDR: 0x00000000_7fdd9004
	PP_DIR_BASE: 0x7fe00000
	PP_DIR_BASE_READ: 0x00000000
	PP_DIR_DCLV: 0xffffffff

bsd ring
	current seqno b0, last b0, hangcheck 0 [-85936 ms], inflight 0
	Requests:
	RING_START: 0x0090e000 [0x00000000]
	RING_HEAD:  0x00005280 [0x00000000]
	RING_TAIL:  0x00005280 [0x00000000]
	RING_CTL:   0x0001f001 []
	ACTHD:  0x00000000_00005280
	BBADDR: 0x00000000_7fdd9004
	PP_DIR_BASE: 0x7fe00000
	PP_DIR_BASE_READ: 0x00000000
	PP_DIR_DCLV: 0xffffffff

video enhancement ring
	current seqno b0, last b0, hangcheck 0 [-85936 ms], inflight 0
	Requests:
	RING_START: 0x0092f000 [0x00000000]
	RING_HEAD:  0x00005280 [0x00000000]
	RING_TAIL:  0x00005280 [0x00000000]
	RING_CTL:   0x0001f001 []
	ACTHD:  0x00000000_00005280
	BBADDR: 0x00000000_7fdd9004
	PP_DIR_BASE: 0x7fe00000
	PP_DIR_BASE_READ: 0x00000000
	PP_DIR_DCLV: 0xffffffff
(gem_busy:7012) igt-debugfs-DEBUG: Opening debugfs dir /sys/kernel/debug/dri/0
(gem_busy:7012) igt-debugfs-DEBUG: i915_hangcheck_info:
Hangcheck active, fires in -127360ms
GT active? yes
render ring:
	seqno = 0 [current 167, last 168], inflight 1
	waiters? no, fake irq active? no, stalled? no
	ACTHD = 0x00000000 [current 0x7fdec000]
	action = idle(0) -85936 ms ago
	instdone read =
		INSTDONE: 0xffd7ffff
		SC_INSTDONE: 0xffffffff
		SAMPLER_INSTDONE[0][0]: 0xffffffff
		ROW_INSTDONE[0][0]: 0xffffffff
	instdone accu =
		INSTDONE: 0x00000000
		SC_INSTDONE: 0x00000000
		SAMPLER_INSTDONE[0][0]: 0x00000000
		ROW_INSTDONE[0][0]: 0x00000000
blitter ring:
	seqno = 0 [current b0, last b0], inflight 0
	waiters? no, fake irq active? no, stalled? no
	ACTHD = 0x00000000 [current 0x00005280]
	action = idle(0) -85936 ms ago
bsd ring:
	seqno = 0 [current b0, last b0], inflight 0
	waiters? no, fake irq active? no, stalled? no
	ACTHD = 0x00000000 [current 0x00005280]
	action = idle(0) -85936 ms ago
video enhancement ring:
	seqno = 0 [current b0, last b0], inflight 0
	waiters? no, fake irq active? no, stalled? no
	ACTHD = 0x00000000 [current 0x00005280]
	action = idle(0) -85936 ms ago
(gem_busy:7012) CRITICAL: Test assertion failure function basic, file gem_busy.c:496:
(gem_busy:7012) CRITICAL: Failed assertion: igt_seconds_elapsed(&tv) < timeout
(gem_busy:7012) CRITICAL: Hanging batch did not complete within 120s
****  END  ****


https://intel-gfx-ci.01.org/CI/CI_DRM_2272/fi-hsw-4770r/igt@gem_busy@basic-hang-default.html

Also:

https://intel-gfx-ci.01.org/CI/Patchwork_4034/fi-hsw-4770r/igt@gem_busy@basic-hang-default.html
Comment 1 Marta Löfstedt 2017-03-10 11:02:06 UTC
In above run I dmesg has:
[   84.105787] [IGT] gem_busy: starting subtest basic-hang-default
[  150.099693] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  150.099743] 	0-...: (1 GPs behind) idle=c4c/0/0 softirq=6778/6778 fqs=0 
[  150.099797] 	1-...: (1 GPs behind) idle=ffc/0/0 softirq=6688/6688 fqs=0 
[  150.099810] 	2-...: (0 ticks this GP) idle=264/0/0 softirq=7681/7681 fqs=0 
[  150.099823] 	3-...: (1 GPs behind) idle=7e4/0/0 softirq=6616/6616 fqs=0 
[  150.099836] 	4-...: (11 GPs behind) idle=018/0/0 softirq=6439/6439 fqs=0 
[  150.099848] 	7-...: (0 ticks this GP) idle=864/0/0 softirq=6884/6884 fqs=0 
[  150.099861] 	(detected by 6, t=65002 jiffies, g=3493, c=3492, q=1)

However, in passing runs we typically see:
[   80.657663] [IGT] gem_busy: starting subtest basic-hang-default
[   98.721063] [drm] GPU HANG: ecode 7:0:0xe757feff, in gem_busy [7019], reason: No progress on render ring, action: reset
[   98.721313] [drm:i915_reset_and_wakeup [i915]] resetting chip
[   98.721386] drm/i915: Resetting chip after gpu hang
[   98.722371] [drm:i915_gem_reset [i915]] context gem_busy[7019]/0 marked guilty (score 10) banned? no
[   98.722402] [drm:i915_gem_reset [i915]] resetting render ring to restart from tail of request 0x168
[   98.722566] [drm:intel_print_rc6_info [i915]] Enabling RC6 states: RC6 on
[   98.730562] [drm:init_workarounds_ring [i915]] render ring: Number of context specific w/a: 0
[   98.730648] [drm:intel_guc_setup [i915]] GuC fw status: path (null), fetch NONE, load NONE
[   98.730894] [IGT] gem_busy: exiting, ret=0
Comment 2 Dorota Czaplejewicz 2017-03-20 09:08:36 UTC
Nothing over 160 runs on HSW-4770R.

kernel: drm-tip: 2017y-03m-17d-23h-25m-27s
config: 2367
Comment 3 Jani Saarinen 2017-03-28 06:58:51 UTC
Statistics: Failure rate 1/141 run(s) (0%)
Whitelisting for CI

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.