Summary: | [CI] igt@perf_pmu@busy-accuracy-* - fail - Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)((double)target_busy_pct / 100.0) && (double)(busy_r) >= (1.0 - (0.15)) * (double)((double)target_busy_pct / 100.0) | ||
---|---|---|---|
Product: | DRI | Reporter: | Marta Löfstedt <marta.lofstedt> |
Component: | DRM/Intel | Assignee: | Francesco Balestrieri <francesco.balestrieri> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | intel-gfx-bugs, martin.peres |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | BSW/CHT, BXT, CFL, GLK, KBL, SKL | i915 features: | Perf/PMU |
Description
Marta Löfstedt
2018-02-19 07:05:54 UTC
I don't know about pin-pointing GLKB1: Here are some OK runs from that machine:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3788/shard-glkb1/igt@perf_pmu@busy-accuracy-2-bcs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3794/shard-glkb1/igt@perf_pmu@busy-accuracy-2-bcs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3785/shard-glkb1/igt@perf_pmu@busy-accuracy-2-vecs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3786/shard-glkb1/igt@perf_pmu@busy-accuracy-50-vecs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3787/shard-glkb1/igt@perf_pmu@busy-accuracy-50-vecs0.html

This is NOT only on GLKB1:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3798/shard-glkb2/igt@perf_pmu@busy-accuracy-50-bcs0.html
(perf_pmu:1650) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1544:
(perf_pmu:1650) CRITICAL: Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)((double)target_busy_pct / 100.0) && (double)(busy_r) >= (1.0 - (0.15)) * (double)((double)target_busy_pct / 100.0)
(perf_pmu:1650) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1650) CRITICAL: 'busy_r' != '(double)target_busy_pct / 100.0' (0.420954 not within +15.000000%/-15.000000% tolerance of 0.500000)
Subtest busy-accuracy-50-bcs0 failed.

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4271/shard-glkb6/igt@perf_pmu@busy-accuracy-2-rcs0.html
(perf_pmu:1505) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1550:
(perf_pmu:1505) CRITICAL: Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)((double)target_busy_pct / 100.0) && (double)(busy_r) >= (1.0 - (0.15)) * (double)((double)target_busy_pct / 100.0)
(perf_pmu:1505) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1505) CRITICAL: 'busy_r' != '(double)target_busy_pct / 100.0' (0.016381 not within +15.000000%/-15.000000% tolerance of 0.020000)
Subtest busy-accuracy-2-rcs0 failed.
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4273/shard-apl6/igt@perf_pmu@busy-accuracy-2-bcs0.html
(perf_pmu:1586) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1550:
(perf_pmu:1586) CRITICAL: Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)((double)target_busy_pct / 100.0) && (double)(busy_r) >= (1.0 - (0.15)) * (double)((double)target_busy_pct / 100.0)
(perf_pmu:1586) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1586) CRITICAL: 'busy_r' != '(double)target_busy_pct / 100.0' (0.016973 not within +15.000000%/-15.000000% tolerance of 0.020000)
Subtest busy-accuracy-2-bcs0 failed.

We're trying a new method to see how that fares:

commit 1ecc978a69a531858ba799425770062ebeb13888 (upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Feb 20 13:00:37 2018 +0000

    igt/perf_pmu: Use a self-correcting busy pwm

patch integrated in IGT_4281 still no CI_DRM_ run but should be in CI_DRM_3820.

The frequency looks quite low for this issue, but we'll see...

Patch from Comment #5 is in:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3820/shard-apl2/igt@perf_pmu@busy-accuracy-98-vecs0.html
(perf_pmu:1953) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1545:
(perf_pmu:1953) CRITICAL: Failed assertion: (double)(1 - busy_r) <= (1.0 + (0.15)) * (double)(1 - expected) && (double)(1 - busy_r) >= (1.0 - (0.15)) * (double)(1 - expected)
(perf_pmu:1953) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1953) CRITICAL: '1 - busy_r' != '1 - expected' (0.023063 not within +15.000000%/-15.000000% tolerance of 0.019999)
Subtest busy-accuracy-98-vecs0 failed.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3824/shard-glkb6/igt@perf_pmu@busy-accuracy-98-vcs0.html
(perf_pmu:2875) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1545:
(perf_pmu:2875) CRITICAL: Failed assertion: (double)(1 - busy_r) <= (1.0 + (0.15)) * (double)(1 - expected) && (double)(1 - busy_r) >= (1.0 - (0.15)) * (double)(1 - expected)
(perf_pmu:2875) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:2875) CRITICAL: '1 - busy_r' != '1 - expected' (0.023030 not within +15.000000%/-15.000000% tolerance of 0.020000)
Subtest busy-accuracy-98-vcs0 failed.

This test is failing on CFL QA
igt@perf_pmu@busy-accuracy-2-vcs0
IGT-Version: 1.21-ga2664f8 (x86_64) (Linux: 4.16.0-rc2-drm-intel-qa-ww9-commit-01a067a+ x86_64)
(perf_pmu:2528) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1544:
(perf_pmu:2528) CRITICAL: Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)(expected) && (double)(busy_r) >= (1.0 - (0.15)) * (double)(expected)
(perf_pmu:2528) CRITICAL: Last errno: 9, Bad file descriptor
(perf_pmu:2528) CRITICAL: 'busy_r' != 'expected' (0.023442 not within +15.000000%/-15.000000% tolerance of 0.020000)
Subtest busy-accuracy-2-vcs0 failed.
**** DEBUG ****
(perf_pmu:2528) DEBUG: Test requirement passed: gem_has_execlists(gem_fd)
(perf_pmu:2528) INFO: calibration=1000000us, test=1000000us; ratio=2.00% (2500us/122500us)
(perf_pmu:2528) DEBUG: Test requirement passed: !(fd < 0 && errno == ENODEV)
(perf_pmu:2528) INFO: error=17.21% (2.34% vs 2.00%)
(perf_pmu:2528) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1544:
(perf_pmu:2528) CRITICAL: Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)(expected) && (double)(busy_r) >= (1.0 - (0.15)) * (double)(expected)
(perf_pmu:2528) CRITICAL: Last errno: 9, Bad file descriptor
(perf_pmu:2528) CRITICAL: 'busy_r' != 'expected' (0.023442 not within +15.000000%/-15.000000% tolerance of 0.020000)
(perf_pmu:2528) igt-core-INFO: Stack trace:
(perf_pmu:2528) igt-core-INFO: #0 [__igt_fail_assert+0x101]
(perf_pmu:2528) igt-core-INFO: #1 [__real_main1548+0x2bf2]
(perf_pmu:2528) igt-core-INFO: #2 [main+0x23]
(perf_pmu:2528) igt-core-INFO: #3 [__libc_start_main+0xf1]
(perf_pmu:2528) igt-core-INFO: #4 [_start+0x29]
(perf_pmu:2528) igt-core-INFO: #5 [<unknown>+0x29]
**** END

new subtest on glk:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3846/shard-glkb2/igt@perf_pmu@busy-accuracy-98-bcs0.html
(perf_pmu:2037) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1545:
(perf_pmu:2037) CRITICAL: Failed assertion: (double)(1 - busy_r) <= (1.0 + (0.15)) * (double)(1 - expected) && (double)(1 - busy_r) >= (1.0 - (0.15)) * (double)(1 - expected)
(perf_pmu:2037) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:2037) CRITICAL: '1 - busy_r' != '1 - expected' (0.023298 not within +15.000000%/-15.000000% tolerance of 0.019966)
Subtest busy-accuracy-98-bcs0 failed.
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4350/shard-glkb3/igt@perf_pmu@busy-accuracy-50-vcs0.html
(perf_pmu:1769) CRITICAL: Test assertion failure function accuracy, file ../tests/perf_pmu.c:1606:
(perf_pmu:1769) CRITICAL: Failed assertion: (double)(100.0 * busy_r) <= ((double)(100.0 * expected) + (2)) && (double)(100.0 * busy_r) >= ((double)(100.0 * expected) - (2))
(perf_pmu:1769) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1769) CRITICAL: 47.652167 not within +2.000000/-2.000000 of 50.004811! ('100.0 * busy_r' vs '100.0 * expected')
Subtest busy-accuracy-50-vcs0 failed.

*** Bug 105462 has been marked as a duplicate of this bug. ***

This test fails on CFL QA
igt@perf_pmu@busy-accuracy-2-bcs0
IGT-Version: 1.22-g5d71d77 (x86_64) (Linux: 4.16.0-rc4-drm-intel-qa-ww11-commit-73f9dfa+ x86_64)
calibration=1000000us, test=1000000us; ratio=2.00% (2500us/122500us)
0: busy 21295us, idle 1043658us: 2.00% (target: 2%)
1: busy 40179us, idle 1968831us: 2.00% (target: 2%)
error=-25.32% (1.49% vs 2.00%)
Stack trace:
#0 [__igt_fail_assert+0x101]
#1 [accuracy+0x779]
#2 [__real_main1597+0x2009]
#3 [main+0x23]
#4 [__libc_start_main+0xf1]
#5 [_start+0x29]
#6 [<unknown>+0x29]
Subtest busy-accuracy-2-bcs0: FAIL (3.083s)
Test requirement not met in function gem_require_engine, file ./../lib/igt_gt.h:120:
Test requirement: gem_has_engine(gem_fd, class, instance)
Test requirement not met in function gem_require_engine, file ./../lib/igt_gt.h:120:
Test requirement: gem_has_engine(gem_fd, class, instance)
(perf_pmu:1536) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1593:
(perf_pmu:1536) CRITICAL: Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)(expected) && (double)(busy_r) >= (1.0 - (0.15)) * (double)(expected)
(perf_pmu:1536) CRITICAL: Last errno: 9, Bad file descriptor
(perf_pmu:1536) CRITICAL: 'busy_r' != 'expected' (0.014935 not within +15.000000%/-15.000000% tolerance of 0.020000)
Subtest busy-accuracy-2-bcs0 failed.
**** DEBUG ****
(perf_pmu:1536) DEBUG: Test requirement passed: gem_has_execlists(gem_fd)
(perf_pmu:1536) INFO: calibration=1000000us, test=1000000us; ratio=2.00% (2500us/122500us)
(perf_pmu:1536) DEBUG: Test requirement passed: !(fd < 0 && errno == ENODEV)
(perf_pmu:1536) INFO: error=-25.32% (1.49% vs 2.00%)
(perf_pmu:1536) CRITICAL: Test assertion failure function accuracy, file perf_pmu.c:1593:
(perf_pmu:1536) CRITICAL: Failed assertion: (double)(busy_r) <= (1.0 + (0.15)) * (double)(expected) && (double)(busy_r) >= (1.0 - (0.15)) * (double)(expected)
(perf_pmu:1536) CRITICAL: Last errno: 9, Bad file descriptor
(perf_pmu:1536) CRITICAL: 'busy_r' != 'expected' (0.014935 not within +15.000000%/-15.000000% tolerance of 0.020000)
(perf_pmu:1536) igt-core-INFO: Stack trace:
(perf_pmu:1536) igt-core-INFO: #0 [__igt_fail_assert+0x101]
(perf_pmu:1536) igt-core-INFO: #1 [accuracy+0x779]
(perf_pmu:1536) igt-core-INFO: #2 [__real_main1597+0x2009]
(perf_pmu:1536) igt-core-INFO: #3 [main+0x23]
(perf_pmu:1536) igt-core-INFO: #4 [__libc_start_main+0xf1]
(perf_pmu:1536) igt-core-INFO: #5 [_start+0x29]
(perf_pmu:1536) igt-core-INFO: #6 [<unknown>+0x29]
**** END ****

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3932/shard-kbl6/igt@perf_pmu@busy-accuracy-50-vcs1.html
(perf_pmu:3688) CRITICAL: Test assertion failure function accuracy, file ../tests/perf_pmu.c:1606:
(perf_pmu:3688) CRITICAL: Failed assertion: (double)(100.0 * busy_r) <= ((double)(100.0 * expected) + (2)) && (double)(100.0 * busy_r) >= ((double)(100.0 * expected) - (2))
(perf_pmu:3688) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:3688) CRITICAL: 47.999050 not within +2.000000/-2.000000 of 49.999746! ('100.0 * busy_r' vs '100.0 * expected')
Subtest busy-accuracy-50-vcs1 failed.
Some stuff from the shardlist on BAT machines run:
Note overview of this run is here:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-kbl-7500u/igt@perf_pmu@busy-accuracy-50-bcs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-skl-6600u/igt@perf_pmu@busy-accuracy-50-bcs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-skl-guc/igt@perf_pmu@busy-accuracy-50-bcs0.html

Some hope of https://patchwork.freedesktop.org/series/40662/ could improve these ones, just need to re-spin it to include fewer of the proposed changes.

commit d502f055ac4500cada758876a512ac4f14b34851
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date: Wed Apr 4 10:51:52 2018 +0100

    tests/perf_pmu: Avoid RT thread for accuracy test

    Realtime scheduling interferes with execlists submission (tasklet) so try
    to simplify the PWM loop in a few ways:

    * Drop RT.
    * Longer batches for smaller systematic error.
    * More truthful test duration calculation.
    * Less clock queries.
    * No self-adjust - instead just report the achieved cycle and let the
      parent check against it.
    * Report absolute cycle error.

(In reply to Chris Wilson from comment #18)
> commit d502f055ac4500cada758876a512ac4f14b34851
> Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Date: Wed Apr 4 10:51:52 2018 +0100
>
> tests/perf_pmu: Avoid RT thread for accuracy test
>
> Realtime scheduling interferes with execlists submission (tasklet) so try
> to simplify the PWM loop in a few ways:
>
> * Drop RT.
> * Longer batches for smaller systematic error.
> * More truthful test duration calculation.
> * Less clock queries.
> * No self-adjust - instead just report the achieved cycle and let the
> parent check against it.
> * Report absolute cycle error.

Still visible every single run...
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_43/fi-bxt-j4205/igt@perf_pmu@busy-accuracy-50-rcs0.html
(perf_pmu:1698) CRITICAL: Test assertion failure function accuracy, file ../tests/perf_pmu.c:1651:
(perf_pmu:1698) CRITICAL: Failed assertion: (double)(100.0 * busy_r) <= ((double)(100.0 * expected) + (2)) && (double)(100.0 * busy_r) >= ((double)(100.0 * expected) - (2))
(perf_pmu:1698) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1698) CRITICAL: 51.924130 not within +2.000000/-2.000000 of 49.911359! ('100.0 * busy_r' vs '100.0 * expected')
Subtest busy-accuracy-50-rcs0 failed.

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_43/fi-kbl-guc/igt@perf_pmu@busy-accuracy-50-vecs0.html
(perf_pmu:1262) CRITICAL: Test assertion failure function accuracy, file ../tests/perf_pmu.c:1651:
(perf_pmu:1262) CRITICAL: Failed assertion: (double)(100.0 * busy_r) <= ((double)(100.0 * expected) + (2)) && (double)(100.0 * busy_r) >= ((double)(100.0 * expected) - (2))
(perf_pmu:1262) CRITICAL: Last errno: 9, Bad file descriptor
(perf_pmu:1262) CRITICAL: 54.022868 not within +2.000000/-2.000000 of 49.843047! ('100.0 * busy_r' vs '100.0 * expected')
Subtest busy-accuracy-50-vecs0 failed.

Also seen on GLK:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4468/shard-glk6/igt@perf_pmu@busy-accuracy-50-rcs0.html
(perf_pmu:1295) CRITICAL: Test assertion failure function accuracy, file ../tests/perf_pmu.c:1652:
(perf_pmu:1295) CRITICAL: Failed assertion: (double)(100.0 * busy_r) <= ((double)(100.0 * expected) + (2)) && (double)(100.0 * busy_r) >= ((double)(100.0 * expected) - (2))
(perf_pmu:1295) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1295) CRITICAL: 52.936172 not within +2.000000/-2.000000 of 50.592464! ('100.0 * busy_r' vs '100.0 * expected')
Subtest busy-accuracy-50-rcs0 failed.
Also seen on BSW:
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_95/fi-bsw-kefka/igt@perf_pmu@busy-accuracy-50-rcs0.html
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_95/fi-bsw-kefka/igt@perf_pmu@busy-accuracy-50-bcs0.html
(perf_pmu:1270) CRITICAL: Test assertion failure function accuracy, file ../tests/perf_pmu.c:1655:
(perf_pmu:1270) CRITICAL: Failed assertion: (double)(100.0 * busy_r) <= ((double)(100.0 * expected) + (2)) && (double)(100.0 * busy_r) >= ((double)(100.0 * expected) - (2))
(perf_pmu:1270) CRITICAL: Last errno: 2, No such file or directory
(perf_pmu:1270) CRITICAL: 52.409599 not within +2.000000/-2.000000 of 49.659115! ('100.0 * busy_r' vs '100.0 * expected')
Subtest busy-accuracy-50-bcs0 failed.

Once more into the breach,

commit 1754cbd35005605a80b06d808b4f891555a151cd
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Aug 8 14:20:26 2018 +0100

    igt/perf_pmu: Aim for a fixed number of iterations for calibrating accuracy

    Our observation is that the systematic error is proportional to the
    number of iterations we perform; the suspicion is that it directly
    correlates with the number of sleeps. Reduce the number of iterations,
    to try and keep the error in check.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

(In reply to Chris Wilson from comment #23)
> Once more into the breach,
>
> commit 1754cbd35005605a80b06d808b4f891555a151cd
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Wed Aug 8 14:20:26 2018 +0100
>
> igt/perf_pmu: Aim for a fixed number of iterations for calibrating
> accuracy
>
> Our observation is that the systematic error is proportional to the
> number of iterations we perform; the suspicion is that it directly
> correlates with the number of sleeps. Reduce the number of iterations,
> to try and keep the error in check.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

That finally fixed it! Thanks :) |
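For readers comparing the two kinds of assertion quoted in the logs above, here is a minimal sketch of the arithmetic they express. It is not the IGT source: the function names `within_relative`, `within_absolute` and `relative_error_pct` are hypothetical, and the bounds are transcribed from the "Failed assertion" text only (a relative +/-15% tolerance in the earlier logs, an absolute +/-2 percentage points in the later ones). The sample values are copied from the failure messages in this report.

```python
def within_relative(measured, expected, tolerance=0.15):
    """Relative check, as in the earlier logs:
    (1 - tol) * expected <= measured <= (1 + tol) * expected."""
    return (1.0 - tolerance) * expected <= measured <= (1.0 + tolerance) * expected


def within_absolute(measured_pct, expected_pct, slack=2.0):
    """Absolute check, as in the later logs:
    expected - slack <= measured <= expected + slack (both in percent)."""
    return expected_pct - slack <= measured_pct <= expected_pct + slack


def relative_error_pct(measured, expected):
    """Signed relative error in percent, matching the 'error=' INFO lines."""
    return 100.0 * (measured - expected) / expected


if __name__ == "__main__":
    # Values taken from the failure logs quoted in this report.
    print(within_relative(0.420954, 0.5))          # busy-accuracy-50-bcs0
    print(within_relative(0.023442, 0.02))         # busy-accuracy-2-vcs0
    print(within_absolute(47.652167, 50.004811))   # busy-accuracy-50-vcs0
    print(relative_error_pct(0.023442, 0.02))      # log reports error=17.21%
```

Under these assumptions the two checks scale very differently with the target: for a 50% target the relative form accepts 42.5% to 57.5% while the absolute form accepts only 48% to 52%, and for a 2% target the relative form accepts just 1.7% to 2.3%.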