https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4028/shard-apl6/igt@perf_pmu@semaphore-wait-vcs0.html
(perf_pmu:11040) CRITICAL: Test assertion failure function sema_wait, file perf_pmu.c:433:
(perf_pmu:11040) CRITICAL: Failed assertion: (double)(val[1] - val[0]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1] - val[0]) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:11040) CRITICAL: 'val[1] - val[0]' != 'slept' (455000000.000000 not within 5.000000% tolerance of 500107097.000000)
Subtest semaphore-wait-vcs0 failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3438/shard-apl1/igt@perf_pmu@semaphore-wait-bcs0.html
(perf_pmu:2639) CRITICAL: Test assertion failure function sema_wait, file perf_pmu.c:433:
(perf_pmu:2639) CRITICAL: Failed assertion: (double)(val[1] - val[0]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1] - val[0]) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:2639) CRITICAL: 'val[1] - val[0]' != 'slept' (450000000.000000 not within 5.000000% tolerance of 500083573.000000)
Subtest semaphore-wait-bcs0 failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3445/shard-glkb6/igt@perf_pmu@semaphore-wait-vecs0.html
(perf_pmu:3452) CRITICAL: Test assertion failure function sema_wait, file perf_pmu.c:433:
(perf_pmu:3452) CRITICAL: Failed assertion: (double)(val[1] - val[0]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1] - val[0]) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:3452) CRITICAL: 'val[1] - val[0]' != 'slept' (450000000.000000 not within 5.000000% tolerance of 500341224.000000)
Subtest semaphore-wait-vecs0 failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3439/shard-apl5/igt@perf_pmu@semaphore-wait-vecs0.html
(perf_pmu:9937) CRITICAL: Test assertion failure function sema_wait, file perf_pmu.c:433:
(perf_pmu:9937) CRITICAL: Failed assertion: (double)(val[1] - val[0]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1] - val[0]) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:9937) CRITICAL: 'val[1] - val[0]' != 'slept' (450000000.000000 not within 5.000000% tolerance of 500105128.000000)
Subtest semaphore-wait-vecs0 failed.
reference: https://patchwork.freedesktop.org/series/34818/
commit c7b20c950276a41badd994324e1983760e44842b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Dec 2 00:14:52 2017 +0000

    igt/perf_pmu: Tighten semaphore-wait measurement

    Record the before/after semaphore-wait values around the sleep to try
    to reduce the inaccuracy from scheduler delays. Previously, the samples
    were taken before submitting the batch and then after synchronising its
    completion. The measurement will then be the total that the semaphore
    was being sampled, but with the extra syscalls intervening may have
    drifted from the sleep duration.

    To further reduce the disparity, wait for the batch to start executing
    before taking our samples.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=104013
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Once again, hopefully the last tweak required.
Fix included in CI_DRM_3451. This test has been flip-flopping quite a bit, so I'll let it simmer for a while.
The fix doesn't appear to work:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3463/shard-glkb6/igt@perf_pmu@semaphore-wait-rcs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3462/shard-glkb2/igt@perf_pmu@semaphore-wait-rcs0.html
(perf_pmu:1948) CRITICAL: Test assertion failure function sema_wait, file perf_pmu.c:443:
(perf_pmu:1948) CRITICAL: Failed assertion: (double)(val[1] - val[0]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1] - val[0]) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:1948) CRITICAL: 'val[1] - val[0]' != 'slept' (450000000.000000 not within 5.000000% tolerance of 500397783.000000)
Subtest semaphore-wait-rcs0 failed.
commit 4900727d35bb20028f9bd83146ec4bf78afffe30
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jan 11 07:30:31 2018 +0000

    drm/i915/pmu: Reconstruct active state on starting busy-stats

    We have a hole in our busy-stat accounting if the pmu is enabled during
    a long running batch, the pmu will not start accumulating busy-time
    until the next context switch. This then fails tests that are only
    sampling a single batch.

    v2: Count each active port just once (context in/out events are only on
    the first and last assignment to a port).
    v3: Avoid hardcoding knowledge of 2 submission ports

    Fixes: 30e17b7847f5 ("drm/i915: Engine busy time tracking")
    Testcase: igt/perf_pmu/busy-start
    Testcase: igt/perf_pmu/busy-double-start
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180111073031.14614-1-chris@chris-wilson.co.uk
Fix integrated in CI_DRM_3622; tests are green.