Bug 103929 - [CI] igt@perf_pmu@rc6 - fail - Failed assertion: (double)(idle - prev) <= (1.0 + (tolerance)) * (double)(slept) && (double)(idle - prev) >= (1.0 - (tolerance)) * (double)(slept)
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Marta Löfstedt
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-27 08:40 UTC by Marta Löfstedt
Modified: 2017-12-07 09:50 UTC

See Also:
i915 platform: KBL
i915 features: Perf/OA



Description Marta Löfstedt 2017-11-27 08:40:03 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3379/shard-kbl2/igt@perf_pmu@rc6.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3380/shard-kbl7/igt@perf_pmu@rc6.html


(perf_pmu:2963) CRITICAL: Test assertion failure function test_rc6, file perf_pmu.c:1006:
(perf_pmu:2963) CRITICAL: Failed assertion: (double)(idle - prev) <= (1.0 + (tolerance)) * (double)(slept) && (double)(idle - prev) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:2963) CRITICAL: 'idle - prev' != 'slept' (1887054080.000000 not within 5.000000% tolerance of 2000207863.000000)
Subtest rc6 failed.
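
For readers unfamiliar with the check, the assertion text in the log can be read as a symmetric tolerance band around the time slept. The following is a minimal sketch reconstructed from that assertion text only; `within_tolerance` is a hypothetical name, not the macro used in `perf_pmu.c`.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the logged assertion:
 *   measured <= (1 + tolerance) * reference &&
 *   measured >= (1 - tolerance) * reference
 * `tolerance` is a fraction, so 0.05 means 5%.
 */
bool within_tolerance(double measured, double reference, double tolerance)
{
    assert(tolerance >= 0.0 && tolerance < 1.0);

    return measured <= (1.0 + tolerance) * reference &&
           measured >= (1.0 - tolerance) * reference;
}
```

With the values from this failure, `within_tolerance(1887054080.0, 2000207863.0, 0.05)` is false: the lower bound is 0.95 * 2000207863, roughly 1900197470, and the measured residency falls below it.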
Comment 1 Chris Wilson 2017-11-27 09:22:58 UTC

*** This bug has been marked as a duplicate of bug 103886 ***
Comment 2 Marta Löfstedt 2017-11-30 08:30:23 UTC
reproduced:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3413/shard-kbl2/igt@perf_pmu@rc6.html


(perf_pmu:1511) CRITICAL: Test assertion failure function test_rc6, file perf_pmu.c:1006:
(perf_pmu:1511) CRITICAL: Failed assertion: (double)(idle - prev) <= (1.0 + (tolerance)) * (double)(slept) && (double)(idle - prev) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:1511) CRITICAL: 'idle - prev' != 'slept' (1808775680.000000 not within 5.000000% tolerance of 2001331815.000000)
Subtest rc6 failed.

It was also failing on:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3407/shard-kbl5/igt@perf_pmu@rc6.html
and 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3412/shard-kbl4/igt@perf_pmu@rc6.html

but that was shadowed by the 4.15.0.rc1 fire.
Comment 3 Chris Wilson 2017-11-30 08:55:23 UTC
Once more into the breach (sleepily pushed by accident):

commit ba6c4e6e94f43857c6fa13e5c4cfddad780dc42b (upstream/master, origin/master, origin/HEAD)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 30 07:48:46 2017 +0000

    igt/perf_pmu: Increase delay for rc6 to start
    
    I was thinking of the RC6 threshold parameter, but needed to consider
    the RC6 evaluation interval instead. RC6 doesn't enable until activity
    is below the threshold inside an evaluation interval, therefore we need
    to wait at least 2 EI after idling before we can expect RC6 to be
    enabled.
    
    Fixes: 55a17bc2d040 ("igt/perf_pmu: Reduce arbitrary delays before rc6")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
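
One way to read the "at least 2 EI" reasoning in the commit message above is with a simplified model. Assume the hardware evaluates activity at the end of fixed-length intervals and only enters RC6 after an interval with no activity in it. Under that assumption, the interval in which the GPU went idle is tainted by earlier activity, so the first clean interval is the following one. The function name, the zero-activity threshold, and the interval alignment are all assumptions made for illustration; none of this is taken from the driver or from the commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (assumption): evaluation intervals of length `ei` start at
 * multiples of `ei`, and RC6 begins at the end of the first interval that
 * contains no activity at all.
 *
 * `idle_offset` is how far into the current interval the GPU became idle.
 * Returns the time from going idle until RC6 can start, in the same units.
 */
uint64_t rc6_wait_after_idle(uint64_t ei, uint64_t idle_offset)
{
    assert(ei > 0 && idle_offset < ei);

    /* Idling exactly on a boundary: the current interval is already clean. */
    if (idle_offset == 0)
        return ei;

    /* Otherwise finish the tainted interval, then wait one full clean one. */
    return (ei - idle_offset) + ei;
}
```

In this model the wait approaches, but never exceeds, 2 * `ei` when the GPU goes idle just after an interval boundary, which is consistent with the commit waiting two evaluation intervals.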
Comment 4 Marta Löfstedt 2017-11-30 09:09:23 UTC
There is no IGT run for this yet, so I will have to wait to close it again...
Comment 5 Marta Löfstedt 2017-11-30 13:40:41 UTC
Fix integrated in CI_DRM_3415
Comment 6 Marta Löfstedt 2017-12-01 09:22:47 UTC
This is looking green from CI_DRM_3415, I will close
Comment 7 Marta Löfstedt 2017-12-05 08:09:55 UTC
Meh:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3452/shard-kbl5/igt@perf_pmu@rc6.html
	

(perf_pmu:1553) CRITICAL: Test assertion failure function test_rc6, file perf_pmu.c:1029:
(perf_pmu:1553) CRITICAL: Failed assertion: (double)(idle - prev) <= (1.0 + (tolerance)) * (double)(slept) && (double)(idle - prev) >= (1.0 - (tolerance)) * (double)(slept)
(perf_pmu:1553) CRITICAL: 'idle - prev' != 'slept' (1317283840.000000 not within 5.000000% tolerance of 2000169218.000000)
Subtest rc6 failed.
Comment 8 Chris Wilson 2017-12-05 09:45:31 UTC
(In reply to Marta Löfstedt from comment #7)
> Meh:
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3452/shard-kbl5/igt@perf_pmu@rc6.html
>
> (perf_pmu:1553) CRITICAL: Test assertion failure function test_rc6, file perf_pmu.c:1029:
> (perf_pmu:1553) CRITICAL: Failed assertion: (double)(idle - prev) <= (1.0 + (tolerance)) * (double)(slept) && (double)(idle - prev) >= (1.0 - (tolerance)) * (double)(slept)
> (perf_pmu:1553) CRITICAL: 'idle - prev' != 'slept' (1317283840.000000 not within 5.000000% tolerance of 2000169218.000000)
> Subtest rc6 failed.

The previous ones were just outside the tolerance; this one is about 65% of the target, which points to something more serious than the earlier systematic sampling errors.
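
The comparison between the three failures in this report can be reproduced from the logged numbers alone. This is a small sketch with hypothetical helper names; the values are presumably nanoseconds, given that each test slept for about 2e9 units, but the log does not state the unit.

```c
#include <assert.h>
#include <stdint.h>

/* Measured RC6 residency as a percentage of the time slept. */
double residency_percent(uint64_t idle_delta, uint64_t slept)
{
    assert(slept > 0);
    return 100.0 * (double)idle_delta / (double)slept;
}

/* Time slept that was not accounted as RC6 residency (0 if none). */
uint64_t residency_shortfall(uint64_t idle_delta, uint64_t slept)
{
    return slept > idle_delta ? slept - idle_delta : 0;
}
```

Applied to the logs, the failures from the description and comment 2 come out at roughly 94.3% and 90.4% of the time slept, while the one from comment 7 is roughly 65.9%, a shortfall of 682885378 units out of 2000169218.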
Comment 9 Chris Wilson 2017-12-05 13:08:47 UTC
Next attempt,

commit 55da075197d1d7810043655d952e60a20b9e8fa2 (HEAD, upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 5 10:37:13 2017 +0000

    igt/perf_pmu: Replace hard-coded sleep before rc6 with a probe
    
    Instead of trying to sleep for 2 evaluations intervals and then assuming
    that rc6 is working, poll the rc6 residency instead.
    
    v2: dce
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=103929
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
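
The idea in the commit message above, probing the counter rather than sleeping for a fixed time, can be sketched generically as follows. This is not the code from that commit: `wait_for_counter_advance`, the callback type, and the `fake_counter` stand-in are hypothetical names used for illustration, and the real test reads the residency through the PMU rather than through a callback.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: returns the current counter value (e.g. RC6 residency). */
typedef uint64_t (*read_counter_fn)(void *ctx);

/* Monotonic clock reading in nanoseconds. */
uint64_t monotonic_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Poll until the counter moves away from its initial value, or give up after
 * timeout_ns. Returns true if the counter advanced, false on timeout.
 */
bool wait_for_counter_advance(read_counter_fn read_counter, void *ctx,
                              uint64_t poll_ns, uint64_t timeout_ns)
{
    struct timespec delay;
    uint64_t deadline, start;

    assert(read_counter != NULL && poll_ns > 0);

    delay.tv_sec = (time_t)(poll_ns / 1000000000ull);
    delay.tv_nsec = (long)(poll_ns % 1000000000ull);

    deadline = monotonic_ns() + timeout_ns;
    start = read_counter(ctx);

    do {
        nanosleep(&delay, NULL);
        if (read_counter(ctx) != start)
            return true;
    } while (monotonic_ns() < deadline);

    return false;
}

/* Stand-in counter for demonstration: flat for `flat_reads` reads, then ticking. */
struct fake_counter {
    unsigned reads;
    unsigned flat_reads;
};

uint64_t fake_read(void *ctx)
{
    struct fake_counter *fc = (struct fake_counter *)ctx;

    return fc->reads++ < fc->flat_reads ? 0 : fc->reads;
}
```

A caller would start the timed measurement only after `wait_for_counter_advance` returns true, so the sleep is not charged for the time the hardware took to settle. A false return separates "RC6 never started" from "RC6 started late", which a fixed sleep cannot do.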
Comment 10 Marta Löfstedt 2017-12-07 09:50:45 UTC
Fix included in CI_DRM_3459, has been green since, closing

