Bug 101735

Summary: [BAT][ILK] igt@gem_exec_parallel@basic - Failed assertion: map[i] == i
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED WORKSFORME
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: high
CC: intel-gfx-bugs
Version: DRI git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: ILK
i915 features: GEM/Other

Description Martin Peres 2017-07-10 10:38:40 UTC
When running IGT (4258cc8 igt/kms: Do not wait for fence completion during commit) on CI_DRM_2813 on fi-ilk-650, the test igt@gem_exec_parallel@basic failed the following assertion:

(gem_exec_parallel:1762) CRITICAL: Test assertion failure function check_bo, file gem_exec_parallel.c:54:
(gem_exec_parallel:1762) CRITICAL: Failed assertion: map[i] == i
(gem_exec_parallel:1762) CRITICAL: error: 0 != 918
Subtest basic failed.
**** DEBUG ****
(gem_exec_parallel:1762) DEBUG: Test requirement passed: nengine
(gem_exec_parallel:1762) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_exec_parallel:1762) igt-debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(gem_exec_parallel:1762) DEBUG: Verifying result (pass=0, handle=1)
(gem_exec_parallel:1762) DEBUG: Verifying result (pass=1, handle=2)
(gem_exec_parallel:1762) CRITICAL: Test assertion failure function check_bo, file gem_exec_parallel.c:54:
(gem_exec_parallel:1762) CRITICAL: Failed assertion: map[i] == i
(gem_exec_parallel:1762) CRITICAL: error: 0 != 918
****  END  ****

Full logs: http://benchsrv.fi.intel.com/archive/results/CI_IGT_test/IGT_1310/fi-ilk-650/igt@gem_exec_parallel@basic.html
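
For readers unfamiliar with the failing check: the assertion `map[i] == i` expects each dword of the mapped buffer to hold its own index, and the log line `error: 0 != 918` reads as the dword at index 918 coming back as 0. The following is only a minimal sketch of that kind of verification, not the actual IGT source; the helper name `first_mismatch` and its signature are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of IGT): scan a buffer that is expected
 * to satisfy map[i] == i for every dword and return the index of the
 * first dword that does not, or -1 if the whole buffer matches.
 */
static long first_mismatch(const uint32_t *map, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (map[i] != i)
            return (long)i;
    }
    return -1;
}
```

Under this reading, a write that never reached memory (or was not yet visible to the CPU when the buffer was checked) would leave the dword at its initial value of 0, which would match the reported `0 != 918`.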
Comment 1 Chris Wilson 2017-07-11 11:36:28 UTC
One off? We have a race somewhere, my money is on the gpu relocations.
Comment 2 Martin Peres 2017-07-11 11:37:48 UTC
(In reply to Chris Wilson from comment #1)
> One off? We have a race somewhere, my money is on the gpu relocations.

So far, it is. We have had this machine in CI for a couple of months already. Maybe we can catch this more easily with extended.
Comment 3 Chris Wilson 2017-07-28 15:41:18 UTC
A missing reloc (cache flush) is still my best guess. gem_exec_whisper would be the other candidate for stressing this, but would need a forced reloc path. Hmm.
Comment 4 Martin Peres 2017-10-05 16:18:27 UTC
(In reply to Martin Peres from comment #2)
> (In reply to Chris Wilson from comment #1)
> > One off? We have a race somewhere, my money is on the gpu relocations.
> 
> So far, it is. We have had this machine in CI for a couple of months
> already. Maybe we can catch this more easily with extended.

It actually happened a second time. Here is the full history: http://benchsrv.fi.intel.com/cibuglog/?action_failures_history=-1&failures_test=igt@gem_exec_parallel@basic&failures_machine=fi-ilk-650
Comment 5 Marta Löfstedt 2017-11-01 13:41:01 UTC
Never seen again. Closing.
