Bug 102582

Summary: [BAT][GDG] Failed assertion: gem_bo_busy(fd, obj.handle) in igt@gem_exec_reloc@basic-[gtt|cpu|read]-active; clflush?
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED WORKSFORME
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: high
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: I915G
i915 features: GEM/Other

Description Martin Peres 2017-09-07 11:40:51 UTC
On CI_DRM_3054, the machine fi-gdg-551 hit the following assert when running igt@gem_exec_reloc@basic-gtt-read-active and igt@gem_exec_reloc@basic-gtt-cpu-active:

(gem_exec_reloc:1703) CRITICAL: Test assertion failure function basic_reloc, file gem_exec_reloc.c:394:
(gem_exec_reloc:1703) CRITICAL: Failed assertion: gem_bo_busy(fd, obj.handle)
Subtest basic-gtt-cpu-active failed.

Full logs:
 - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3054/fi-gdg-551/igt@gem_exec_reloc@basic-gtt-cpu-active.html
 - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3054/fi-gdg-551/igt@gem_exec_reloc@basic-gtt-read-active.html
Comment 1 Chris Wilson 2017-09-07 11:59:39 UTC
Definitely on the impossible list.
Comment 3 Chris Wilson 2017-09-11 14:12:30 UTC
And the opposite problem in #102654 (instead of the buffer not being busy, it doesn't become idle). If they are indeed two sides of the same coin, that says the spinbatch is as dodgy as a 14-sided pound coin.
Comment 5 Marta Löfstedt 2017-10-18 06:13:09 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3254/fi-gdg-551/igt@gem_exec_reloc@basic-write-gtt-active.html

(gem_exec_reloc:1681) CRITICAL: Test assertion failure function basic_reloc, file gem_exec_reloc.c:394:
(gem_exec_reloc:1681) CRITICAL: Failed assertion: gem_bo_busy(fd, obj.handle)
Subtest basic-write-gtt-active failed.
Comment 6 Marta Löfstedt 2017-11-10 06:48:01 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3329/fi-gdg-551/igt@gem_exec_reloc@basic-write-cpu-active.html

(gem_exec_reloc:1678) igt-dummyload-CRITICAL: Test assertion failure function __igt_spin_batch_new, file igt_dummyload.c:183:
(gem_exec_reloc:1678) igt-dummyload-CRITICAL: Failed assertion: gem_bo_busy(fd, spin->handle)
Subtest basic-write-cpu-active failed.
Comment 7 Chris Wilson 2017-12-21 12:05:03 UTC
We've set noclflush on the gdg cmdline, so now we wait for a reoccurrence.
Comment 8 Marta Löfstedt 2018-01-12 08:40:50 UTC
(In reply to Chris Wilson from comment #7)
> We've set noclflush on the gdg cmdline, so now we wait for a reoccurrence.

Chris, I was going to close and archive this, since the issue hasn't been seen since:
CI_DRM_3494: 2017-12-10 / 172 runs ago.

However, you seem to want more info on this, so I leave it to you to close or keep it open. I will archive from cibuglog anyways.
Comment 9 Chris Wilson 2018-02-17 11:22:37 UTC
I've no further ideas on how to determine the affected processors or why clflush isn't quite functioning correctly here. Given the limited impact, let's sweep this under the carpet and leave the noclflush w/a in place.
