Bug 103886 - [CI] igt@perf_pmu@multi-client-vecs0 - fail - Failed assertion: (double)(val[1]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1]) >= (1.0 - (tolerance)) * (double)(slept)
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Whiteboard: ReadyForDev
Duplicates: 103926 103931
 
Reported: 2017-11-24 09:25 UTC by Marta Löfstedt
Modified: 2017-11-27 10:07 UTC (History)
i915 platform: GLK
i915 features: Perf/OA



Description Marta Löfstedt 2017-11-24 09:25:48 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3377/shard-glkb6/igt@perf_pmu@multi-client-vecs0.html

(perf_pmu:4132) CRITICAL: Test assertion failure function multi_client, file perf_pmu.c:650:
(perf_pmu:4132) CRITICAL: Failed assertion: (double)(val[1]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1]) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:4132) CRITICAL: 'val[1]' != 'slept' (190889525.000000 not within 5.000000% tolerance of 166780280.000000)
Subtest multi-client-vecs0 failed.
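
For reference, the tolerance check in the log above can be sketched as below. This is a minimal illustration of the arithmetic only, not the code from perf_pmu.c: the helper name `within_tolerance` and its signature are assumptions made for this example. With the reported numbers, the upper bound is 1.05 * 166780280 = 175119294, and the measured 190889525 is above it, so the check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an IGT API) with the same shape as the failed
 * assertion: returns true when val lies within +/- tolerance of ref.
 * tolerance is a fraction, e.g. 0.05 for the 5% reported in the log.
 */
static bool within_tolerance(uint64_t val, uint64_t ref, double tolerance)
{
	assert(tolerance >= 0.0);

	return (double)val <= (1.0 + tolerance) * (double)ref &&
	       (double)val >= (1.0 - tolerance) * (double)ref;
}
```

Calling `within_tolerance(190889525, 166780280, 0.05)` returns false, which matches the failure reported for val[1] and slept.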

Note: this failure was previously filed under bug 103857. However, that bug was fixed for many other tests and machines, so the remaining failure is filed separately here.
Comment 1 Chris Wilson 2017-11-24 13:40:13 UTC
Final cleanup (we hope):

commit 8ee4f19c47031f23340055da4d9f2af537de23f4
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Fri Nov 24 09:49:59 2017 +0000

    drm/i915/pmu: Stop averaging with the previous sample
    
    Averaging with the previous sample brings a small statistical improvement
    to sampling counters, but can leak a little bit of state from a current
    client to the next, which blurs the border between past and present for
    observing clients.
    
    This is because on event enable clients record the current counter value
    and use it as reference, but with rapid off-on event cycles, and due to the
    delayed nature of sampling timer self-disarm, the previous sample value does
    not get cleared under these circumstances.
    
    The solution is to stop averaging with the previous sample. This has a small
    downside of losing some precision with short and spiky signals, but the
    alternatives look too complicated for the benefit.
    
    Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20171124094959.10725-1-tvrtko.ursulin@linux.intel.com
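
To illustrate the effect the commit message describes, here is a minimal sketch of a sampling counter with and without averaging. It is not the i915 implementation: the struct `sample_counter`, both function names, and the units are assumptions chosen for this example. The point is only that a stale previous sample contributes to the first accumulation seen by a new client when averaging is used, and does not when each sample is taken on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sampling counter; not the i915 data structure. */
struct sample_counter {
	uint64_t total; /* accumulated (value * period) */
	uint32_t prev;  /* last sampled value, used only when averaging */
};

/* Averaging variant: accumulate the mean of the previous and current sample. */
static void add_sample_averaged(struct sample_counter *c, uint32_t val,
				uint32_t period_ns)
{
	assert(c);

	c->total += ((uint64_t)c->prev + val) / 2 * period_ns;
	c->prev = val;
}

/* Non-averaging variant: accumulate the current sample only. */
static void add_sample(struct sample_counter *c, uint32_t val,
		       uint32_t period_ns)
{
	assert(c);

	c->total += (uint64_t)val * period_ns;
}
```

For example, if `prev` still holds 100 from an earlier client and the first sample for the new client is 0 over a period of 1000, the averaging variant adds 50000 to `total`, while the non-averaging variant adds 0.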
Comment 2 Chris Wilson 2017-11-27 09:22:53 UTC
*** Bug 103932 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2017-11-27 09:22:58 UTC
*** Bug 103929 has been marked as a duplicate of this bug. ***
Comment 4 Chris Wilson 2017-11-27 09:23:09 UTC
*** Bug 103926 has been marked as a duplicate of this bug. ***
Comment 5 Chris Wilson 2017-11-27 09:23:40 UTC
*** Bug 103931 has been marked as a duplicate of this bug. ***
Comment 6 Marta Löfstedt 2017-11-27 10:07:37 UTC
OK, all the tests are now green. I will close and archive this bug and vet the tests.

