On two separate machines, we got a performance issue that we never caught before. This is likely the result of some background activity while running the tests.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4409_1/fi-cfl-8109u/igt@gem_mmap_gtt@basic-wc.html
(gem_mmap_gtt:2981) CRITICAL: Test assertion failure function test_wc, file ../tests/gem_mmap_gtt.c:282:
(gem_mmap_gtt:2981) CRITICAL: Failed assertion: gtt_writes > 2*gtt_reads
(gem_mmap_gtt:2981) CRITICAL: Write-Combined writes are expected to be much faster than reads: read=171.86MiB/s, write=337.03MiB/s
Subtest basic-wc failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4370_120/fi-bsw-n3050/igt@gem_mmap_gtt@basic-wc.html
(gem_mmap_gtt:3044) CRITICAL: Test assertion failure function test_wc, file ../tests/gem_mmap_gtt.c:286:
(gem_mmap_gtt:3044) CRITICAL: Failed assertion: gtt_writes > cpu_writes/2
(gem_mmap_gtt:3044) CRITICAL: Write-Combined writes are expected to be roughly equivalent to WB writes: WC (gtt)=665.21MiB/s, WB (cpu)=1352.17MiB/s
Subtest basic-wc failed.
The DUTs had cron enabled for maintenance duties, and this was probably the extra background activity. Cron is now disabled, and periodic anacron is used instead.
Based on the last comment, this was resolved in CI, not in i915. Please re-open if it is still an issue.
This bug was noticed a month ago. Closing this bug.
This issue occurred only twice, about 40 rounds of CI_DRM execution apart. To make sure it is really fixed, we are not closing this defect yet. It has not been seen for ~300 rounds. We will keep this defect open for a few more rounds and then close it. This doesn't mean it needs a fix.
This issue was last seen 499 CI_DRM rounds ago. Closing this issue as Resolved/Fixed. Re-open if the issue persists.
And funnily enough, it came back in the next repeat run on the same BSW and on fi-byt-n2820:

Starting subtest: basic-wc
(gem_mmap_gtt:2536) CRITICAL: Test assertion failure function test_wc, file ../tests/i915/gem_mmap_gtt.c:286:
(gem_mmap_gtt:2536) CRITICAL: Failed assertion: gtt_writes > cpu_writes/2
(gem_mmap_gtt:2536) CRITICAL: Write-Combined writes are expected to be roughly equivalent to WB writes: WC (gtt)=966.15MiB/s, WB (cpu)=2058.96MiB/s
Subtest basic-wc failed.
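To show how close each of the three reported runs came to passing, here is a minimal Python sketch that re-evaluates the two inequalities printed in the "Failed assertion" lines, using the throughput figures from the logs above. The function names and the shortfall calculation are illustrative assumptions and are not taken from gem_mmap_gtt.c; only the inequalities and the numbers come from the logs.

```python
# Throughput figures (MiB/s) are copied from the failure logs in this thread.
# Function names are hypothetical; only the two inequalities come from the logs.

def wc_writes_beat_reads(gtt_writes, gtt_reads):
    """Inequality from the log: gtt_writes > 2*gtt_reads."""
    return gtt_writes > 2 * gtt_reads


def wc_writes_near_wb(gtt_writes, cpu_writes):
    """Inequality from the log: gtt_writes > cpu_writes/2."""
    return gtt_writes > cpu_writes / 2


def shortfall_percent(measured, threshold):
    """How far below the threshold the measured value landed, in percent."""
    return 100.0 * (threshold - measured) / threshold


# (label, failed check, measured WC write rate, threshold it had to exceed)
reports = [
    ("fi-cfl-8109u", "gtt_writes > 2*gtt_reads", 337.03, 2 * 171.86),
    ("fi-bsw-n3050", "gtt_writes > cpu_writes/2", 665.21, 1352.17 / 2),
    ("repeat run (BSW / fi-byt-n2820)", "gtt_writes > cpu_writes/2",
     966.15, 2058.96 / 2),
]

for label, check, measured, threshold in reports:
    print(f"{label}: {check}: measured {measured:.2f}, "
          f"needed more than {threshold:.2f} "
          f"({shortfall_percent(measured, threshold):.1f}% short)")
```

Under these numbers every run misses its threshold by a single-digit percentage, so a modest timing disturbance is enough to flip the result either way.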
Platform = All is a bit of an overstatement; so far this has happened on BSW, BYT and CFL. Updating accordingly.
A CI Bug Log filter associated to this bug has been updated:

{- All machines: igt@gem_mmap_gtt@basic-wc - fail - Failed assertion: gtt_writes > 2*gtt_reads -}
{+ BWR CFL: igt@gem_mmap_gtt@basic-wc - fail - Failed assertion: gtt_writes > 2*gtt_reads +}

No new failures caught with the new filter
A CI Bug Log filter associated to this bug has been updated:

{- All machines: igt@gem_mmap_gtt@basic-wc - fail - Failed assertion: gtt_writes > cpu_writes/2 -}
{+ BYT BSW ICL: igt@gem_mmap_gtt@basic-wc - fail - Failed assertion: gtt_writes > cpu_writes/2 +}

No new failures caught with the new filter
This had a sudden burst on BYT just today, see e.g.:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6535/fi-byt-clapper/igt@gem_mmap_gtt@basic-wc.html

Did something change that made it suddenly worse?
The secret is that the test was always meant to fail on Baytrail. It was an oddity that it didn't fail in CI; it looks like the kernel got quicker for WB (in particular), and now we are able to see the snafu consistently. The machines identified in the original report are outliers, where it is more likely that the scheduler threw off the timings.
To cancel the hijacking, drop the bug, as the test has been removed from BAT. If it occurs again on sparse runs, be sure to separate out BYT for its known HW deficiencies.