Bug 94305

Summary: [BAT ILK] drv_hangman no error state collected
Product: DRI
Reporter: Tvrtko Ursulin <tvrtko.ursulin>
Component: DRM/Intel
Assignee: Tvrtko Ursulin <tvrtko.ursulin>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: high
CC: intel-gfx-bugs, ville.syrjala
Version: DRI git
Hardware: Other
OS: All
Whiteboard:
i915 platform: ILK
i915 features:
Attachments:
Description 	Flags
dmesg log 	none

Description Tvrtko Ursulin 2016-02-26 12:09:38 UTC
Results for igt@drv_hangman@error-state-basic
Overview

Result: fail

Details
Detail 	Value
Returncode 	99
Time 	0:00:00.217366
Stdout 	

IGT-Version: 1.13-gf27d295 (x86_64) (Linux: 4.5.0-rc5-gfxbench+ x86_64)
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [_assert_dfs_entry.constprop.1+0x1a7]
  #2 [__real_main291+0x3b8]
  #3 [main+0x23]
  #4 [__libc_start_main+0xf0]
  #5 [_start+0x29]
Subtest error-state-basic: FAIL (0.005s)

Stderr 	

(drv_hangman:6035) CRITICAL: Test assertion failure function _assert_dfs_entry, file drv_hangman.c:126:
(drv_hangman:6035) CRITICAL: Failed assertion: !((__extension__ (__builtin_constant_p (l) && ((__builtin_constant_p (tmp) && strlen (tmp) < ((size_t) (l))) || (__builtin_constant_p (s) && strlen (s) < ((size_t) (l)))) ? __extension__ ({ size_t __s1_len, __s2_len; (__builtin_constant_p (tmp) && __builtin_constant_p (s) && (__s1_len = strlen (tmp), __s2_len = strlen (s), (!((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) || __s1_len >= 4) && (!((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) || __s2_len >= 4)) ? __builtin_strcmp (tmp, s) : (__builtin_constant_p (tmp) && ((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) && (__s1_len = strlen (tmp), __s1_len < 4) ? (__builtin_constant_p (s) && ((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) ? __builtin_strcmp (tmp, s) : (__extension__ ({ const unsigned char *__s2 = (const unsigned char *) (const char *) (s); int __result = (((const unsigned char *) (const char *) (tmp))[0] - __s2[0]); if (__s1_len > 0 && __result == 0) { __result = (((const unsigned char *) (const char *) (tmp))[1] - __s2[1]); if (__s1_len > 1 && __result == 0) { __result = (((const unsigned char *) (const char *) (tmp))[2] - __s2[2]); if (__s1_len > 2 && __result == 0) __result = (((const unsigned char *) (const char *) (tmp))[3] - __s2[3]); } } __result; }))) : (__builtin_constant_p (s) && ((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) && (__s2_len = strlen (s), __s2_len < 4) ? (__builtin_constant_p (tmp) && ((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) ? 
__builtin_strcmp (tmp, s) : (- (__extension__ ({ const unsigned char *__s2 = (const unsigned char *) (const char *) (tmp); int __result = (((const unsigned char *) (const char *) (s))[0] - __s2[0]); if (__s2_len > 0 && __result == 0) { __result = (((const unsigned char *) (const char *) (s))[1] - __s2[1]); if (__s2_len > 1 && __result == 0) { __result = (((const unsigned char *) (const char *) (s))[2] - __s2[2]); if (__s2_len > 2 && __result == 0) __result = (((const unsigned char *) (const char *) (s))[3] - __s2[3]); } } __result; })))) : __builtin_strcmp (tmp, s)))); }) : strncmp (tmp, s, l))) == 0)
(drv_hangman:6035) CRITICAL: contents of i915_error_state: 'no error state collected' (expected not 'no error state collected'
Subtest error-state-basic failed.
**** DEBUG ****
(drv_hangman:6035) drmtest-DEBUG: Test requirement passed: fd >= 0
(drv_hangman:6035) DEBUG: dfs entry i915_error_state read 'no error state collected'
(drv_hangman:6035) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring_id)
(drv_hangman:6035) ioctl-wrappers-DEBUG: Test requirement passed: has_ban_period
(drv_hangman:6035) igt-gt-DEBUG: Test requirement passed: has_gpu_reset(fd)
(drv_hangman:6035) igt-gt-DEBUG: Test requirement passed: ctx == 0 || ring == I915_EXEC_RENDER
(drv_hangman:6035) DEBUG: dfs entry i915_error_state read 'no error state collected'
(drv_hangman:6035) CRITICAL: Test assertion failure function _assert_dfs_entry, file drv_hangman.c:126:
(drv_hangman:6035) CRITICAL: Failed assertion: !((__extension__ (__builtin_constant_p (l) && ((__builtin_constant_p (tmp) && strlen (tmp) < ((size_t) (l))) || (__builtin_constant_p (s) && strlen (s) < ((size_t) (l)))) ? __extension__ ({ size_t __s1_len, __s2_len; (__builtin_constant_p (tmp) && __builtin_constant_p (s) && (__s1_len = strlen (tmp), __s2_len = strlen (s), (!((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) || __s1_len >= 4) && (!((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) || __s2_len >= 4)) ? __builtin_strcmp (tmp, s) : (__builtin_constant_p (tmp) && ((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) && (__s1_len = strlen (tmp), __s1_len < 4) ? (__builtin_constant_p (s) && ((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) ? __builtin_strcmp (tmp, s) : (__extension__ ({ const unsigned char *__s2 = (const unsigned char *) (const char *) (s); int __result = (((const unsigned char *) (const char *) (tmp))[0] - __s2[0]); if (__s1_len > 0 && __result == 0) { __result = (((const unsigned char *) (const char *) (tmp))[1] - __s2[1]); if (__s1_len > 1 && __result == 0) { __result = (((const unsigned char *) (const char *) (tmp))[2] - __s2[2]); if (__s1_len > 2 && __result == 0) __result = (((const unsigned char *) (const char *) (tmp))[3] - __s2[3]); } } __result; }))) : (__builtin_constant_p (s) && ((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) && (__s2_len = strlen (s), __s2_len < 4) ? (__builtin_constant_p (tmp) && ((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) ? 
__builtin_strcmp (tmp, s) : (- (__extension__ ({ const unsigned char *__s2 = (const unsigned char *) (const char *) (tmp); int __result = (((const unsigned char *) (const char *) (s))[0] - __s2[0]); if (__s2_len > 0 && __result == 0) { __result = (((const unsigned char *) (const char *) (s))[1] - __s2[1]); if (__s2_len > 1 && __result == 0) { __result = (((const unsigned char *) (const char *) (s))[2] - __s2[2]); if (__s2_len > 2 && __result == 0) __result = (((const unsigned char *) (const char *) (s))[3] - __s2[3]); } } __result; })))) : __builtin_strcmp (tmp, s)))); }) : strncmp (tmp, s, l))) == 0)
(drv_hangman:6035) CRITICAL: contents of i915_error_state: 'no error state collected' (expected not 'no error state collected'
****  END  ****
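
Stripped of the glibc macro expansion, the failed assertion above ends in `strncmp (tmp, s, l)) == 0`: the test fails when the contents of the `i915_error_state` debugfs entry still begin with 'no error state collected'. A minimal Python sketch of that check follows, assuming `l` is the length of the expected string; the function name `error_state_collected` is hypothetical and not part of IGT.

```python
# Placeholder text the kernel reports when no GPU error state was captured
# (taken from the log above).
NO_ERROR_STATE = "no error state collected"


def error_state_collected(contents: str) -> bool:
    """Return True if the i915_error_state contents show a captured error state.

    Mirrors the negated prefix comparison (strncmp(...) == 0) from the log:
    contents that start with the placeholder mean nothing was captured.
    """
    return not contents.startswith(NO_ERROR_STATE)


# The value the failing run read back, as shown in the DEBUG lines.
print(error_state_collected("no error state collected"))
```

In the failing run the entry still held the placeholder after the hang was injected, so the check evaluates to false and the subtest aborts.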

Environment 	

PIGLIT_SOURCE_DIR="/opt/igt/piglit" PIGLIT_PLATFORM="mixed_glx_egl"

Command 	/opt/igt/tests/drv_hangman --run-subtest error-state-basic
dmesg
Comment 1 Chris Wilson 2016-03-02 10:03:54 UTC
This has not occurred for me on my x201s. We need a lot more information (debug=7 dmesg + ftrace) here to diagnose how the kernel apparently skipped the wait on the recursive batch.
Comment 2 yann 2016-04-21 13:28:51 UTC
tvrtko.ursulin@linux.intel.com, can you provide further information as advised by Chris: debug=7 dmesg + ftrace?

If this issue doesn't occur anymore, please resolve it as "worksforme".
Comment 3 Tvrtko Ursulin 2016-04-21 13:43:40 UTC
I can't see it happening in the CI history at /archive/results/CI_IGT_test/igt@drv_module_reload_basic.html .

I can close it, but it would be good if we knew what fixed it and put it in here as reference.
Comment 4 Chris Wilson 2016-04-21 13:54:51 UTC
I honestly don't think we broke ilk this badly that it failed to execute a batch or do relocations correctly.
Comment 5 yann 2016-04-25 09:59:52 UTC
Closing this bug as it is fixed now. Please reopen if it appears again and requires investigation.
Comment 6 Ville Syrjala 2016-05-12 11:58:04 UTC
Still seeing this on ro-ilk1-i5-650
Comment 7 Ville Syrjala 2016-05-12 11:58:16 UTC
*** Bug 95364 has been marked as a duplicate of this bug. ***
Comment 8 Daniela Prodan 2016-05-13 12:11:25 UTC
Created attachment 123670 [details]
dmesg log
Comment 9 Daniela Prodan 2016-05-13 12:11:50 UTC
Still fails on ILK:
/archive/results/CI_IGT_test/RO_CI_DRM_369/ro-ilk1-i5-650/html/ro-ilk1-i5-650@RO_CI_DRM_369@1/igt@drv_hangman@error-state-basic.html
Comment 10 Chris Wilson 2016-05-13 12:16:13 UTC
I still need at least drm.debug=7 to check that we are executing what we expect to be.
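
For anyone reproducing this, `drm.debug=7` can be given on the kernel command line or set at runtime through the `drm` module's `debug` parameter. The following is a minimal sketch of the runtime route, assuming root privileges and that the parameter is exposed at `/sys/module/drm/parameters/debug`; the helper name `set_drm_debug` is hypothetical.

```python
from pathlib import Path

# Assumed sysfs location of the drm module's debug parameter.
DRM_DEBUG_PARAM = Path("/sys/module/drm/parameters/debug")


def set_drm_debug(mask: int = 7, param: Path = DRM_DEBUG_PARAM) -> str:
    """Write the debug mask to the parameter file and return the previous value."""
    previous = param.read_text().strip()
    param.write_text(f"{mask}\n")
    return previous
```

Calling `set_drm_debug(7)` as root before rerunning `drv_hangman --run-subtest error-state-basic` should make the subsequent dmesg capture include the DRM debug messages; the returned value can be written back afterwards to restore the old setting.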
Comment 11 Jari Tahvanainen 2016-07-04 09:17:56 UTC
priority aligned for igt basic tests on gen7 to High+Critical
Comment 12 Jari Tahvanainen 2016-09-09 07:39:17 UTC
The failure has not been visible on ILK in any CI run over the last 64 execution rounds (almost 1 month).
The latest results from today are showing:
CI_DRM_1622/fi-ilk-650 - Result: pass
CI_DRM_1622/fi-ilk-m540 - Result: pass
	
IGT-Version: 1.16-ge4d74f2 (x86_64) (Linux: 4.8.0-rc5-CI-CI_DRM_1622+ x86_64)
Subtest error-state-basic: SUCCESS (10.673s)

Based on the above, I propose marking this as resolved+worksforme. Please comment if you disagree.
Comment 13 Jari Tahvanainen 2016-09-30 09:42:59 UTC
Marking as resolved, since the failure has not been visible on fi-ilk-650 or fi-ilk-m540. Unfortunately, I don't know which commit actually fixed the issue.
