Bug 76877 - [BYT] igt/pm_rc6_residency/*-accuracy isn't an accurate testcase.
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: lowest minor
Assignee: Jesse Barnes
QA Contact: Intel GFX Bugs mailing list
Duplicates: 85391 85392 85902
Reported: 2014-04-01 04:03 UTC by wendy.wang
Modified: 2017-10-06 14:38 UTC


Attachments
Debug patch (613 bytes, patch)
2015-05-28 07:49 UTC, Ander Conselvan de Oliveira

Description wendy.wang 2014-04-01 04:03:03 UTC
Test on BYT with kernel: 3.14.0-rc8_drm-intel-nightly_19f629_20140331+
Run the script below:

	BEFORE=`cat /sys/class/drm/card0/power/rc6_residency_ms`
	time sleep 10	
	AFTER=`cat /sys/class/drm/card0/power/rc6_residency_ms`	
	echo BEFORE = $BEFORE	
	echo AFTER = $AFTER	
	DELTA=$((AFTER - BEFORE))	
	echo DELTA = $DELTA	
	echo "scale=2; ($DELTA / 1000) / 10" | bc -l
	
Result:

real 0m10.002s
user 0m0.000s
sys 0m0.002s
BEFORE = 1281484
AFTER = 1292928
DELTA = 11444
1.14

So something appears to be wrong with the sysfs RC6 residency counter on BYT: it advanced by 11444 ms during a 10 s sleep.

I cannot reproduce this issue on the HSW and IVB platforms.
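
One way to take the `time`/`sleep` approximation out of the picture is to timestamp the two counter reads directly. The following is a minimal sketch of such a variant of the script above, assuming a Linux system that provides /proc/uptime; `RC6_FILE`, `now_ms`, `residency_ratio` and `measure` are illustrative names and are not part of igt. The clock is sampled outside the two counter reads, so the measured elapsed time bounds the interval between the reads from above (up to the 10 ms resolution of /proc/uptime), and a ratio well above 1 then cannot be attributed to scheduling delay.

```shell
#!/bin/sh
# Sketch only: compare the RC6 residency delta against measured elapsed time.
# RC6_FILE and all helper names are illustrative assumptions, not igt code.
RC6_FILE=${RC6_FILE:-/sys/class/drm/card0/power/rc6_residency_ms}

# Uptime in milliseconds, read from /proc/uptime (Linux only, 10 ms resolution).
now_ms() {
    awk '{ printf "%.0f\n", $1 * 1000 }' /proc/uptime
}

# residency_ratio DELTA_MS ELAPSED_MS -> prints the ratio with two decimals.
residency_ratio() {
    awk -v d="$1" -v e="$2" 'BEGIN { printf "%.2f\n", d / e }'
}

# measure SECONDS: sample the clock, then the counter, sleep, and sample
# both again in the reverse order so the clock brackets the counter reads.
measure() {
    t0=$(now_ms); before=$(cat "$RC6_FILE")
    sleep "$1"
    after=$(cat "$RC6_FILE"); t1=$(now_ms)
    echo "DELTA = $((after - before)) ms, ELAPSED = $((t1 - t0)) ms"
    residency_ratio "$((after - before))" "$((t1 - t0))"
}

# Only run the measurement when the counter is actually exposed.
if [ -r "$RC6_FILE" ]; then
    measure 10
fi
```

Feeding the numbers from this report into the helper, `residency_ratio 11444 10002` prints 1.14, matching the `bc` result above.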
Comment 1 Daniel Vetter 2014-04-11 14:40:13 UTC

*** This bug has been marked as a duplicate of bug 76542 ***
Comment 2 Ben Widawsky 2014-04-16 16:12:58 UTC
Daniel, I do not believe this is a duplicate. In this bug they enter RC6, but the residency counter ends up with a longer time than the system had slept.
Comment 3 Daniel Vetter 2014-05-19 16:33:41 UTC
Is this still an issue with latest kernel+igt?
Comment 4 liulei 2014-05-20 00:52:58 UTC
(In reply to comment #3)
> Is this still an issue with latest kernel+igt?

Yes, I used the same script on -nightly_f79ba7_20140519 and the issue still reproduces:

real    0m10.001s
user    0m0.000s
sys     0m0.002s
BEFORE = 2198119
AFTER = 2209562
DELTA = 11443
1.14
Comment 5 Guo Jinxian 2014-06-19 08:00:22 UTC
This bug also reproduces on HSW with the latest -fixes (223a6f2b975ab35d93270ea1d4fb6e0ac6b27fe6):

BEFORE = 221873
AFTER = 226900
DELTA = 5027
1.00540
Comment 6 Jesse Barnes 2014-06-25 19:57:48 UTC
I'm not seeing this here on my BYT or HSW.  Does it happen every time for you when idle?  Is the screen on or off?

I'm not sure if these regs are guaranteed to be accurate; they may be saved & restored by the PUnit with some sort of compensating value added, so I'm not sure if this is a real bug or not.
Comment 7 Jesse Barnes 2014-06-26 17:06:48 UTC
There are a few things going on here:
  - the rc6 residency counters don't have a high precision (some of the data feeding into it is only updated at 1ms intervals)
  - reading from userspace introduces potential error, since the reads may occur further apart than the "time" and "sleep" commands indicate due to scheduling etc.

So I'm closing this as WONTFIX.  We'll have to tolerate some imprecision in these counters.  Since they're for debug only, that should be fine.
Comment 8 Daniel Vetter 2014-11-27 19:11:09 UTC
*** Bug 85902 has been marked as a duplicate of this bug. ***
Comment 9 Daniel Vetter 2014-11-27 19:13:08 UTC
I've recently looked at this together with Tim Gore from vpg and the testcase checks for something that just doesn't work. This isn't just inaccuracy in counters but a broken testcase.

So reopening. One solution might be to revert the testcase to its state before the accuracy checks were added.
Comment 10 wendy.wang 2015-01-12 01:07:48 UTC
I checked with Jesse, and he agreed that "the time and sleep commands won't give a script enough precision to evaluate the residency difference accurately."

So please revert the rc6_residency check patch, thanks.
Comment 11 Ben Widawsky 2015-01-12 01:54:10 UTC
I think Guo conflated the issue. The numbers reported can't be higher than the actual time slept, regardless of clock granularity. Okay, if you round up sometimes, it can be off by a little, but the amount we're off is actually huge. 

In other words, what Guo reported in #5 is likely just that: approximation error, which can easily be fixed with a tiny bit of math to give some amount of fudge in the positive direction. However, the original numbers that Wendy reported don't come from multiple reads; there are exactly 2 reads, and the application completed in exactly 0m10.002s. I don't believe this explanation accounts for the 1.14:


real 0m10.002s
user 0m0.000s
sys 0m0.002s
BEFORE = 1281484
AFTER = 1292928
DELTA = 11444
1.14
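
The distinction drawn here can be sketched as a check that tolerates a small positive fudge and flags anything beyond it. This is only an illustration: `check_residency` is a hypothetical helper, `FUDGE_MS=50` is an arbitrary tolerance rather than the value igt uses, and the 5 s sleep for comment 5 is an assumption inferred from its DELTA and ratio.

```shell
#!/bin/sh
# Sketch only: allow a small positive fudge, flag anything beyond it.
# FUDGE_MS is a hypothetical tolerance, not the value used by igt.
FUDGE_MS=${FUDGE_MS:-50}

# check_residency DELTA_MS SLEPT_MS
# Prints a verdict and returns 0 when the excess is within FUDGE_MS.
check_residency() {
    excess=$(( $1 - $2 ))
    if [ "$excess" -le "$FUDGE_MS" ]; then
        echo "ok: excess ${excess} ms"
    else
        echo "too high: excess ${excess} ms"
        return 1
    fi
}

check_residency 5027 5000 || true    # comment 5, assuming a 5 s sleep
check_residency 11444 10002 || true  # original report: 2 reads, 10.002 s
```

With these inputs the first call prints "ok: excess 27 ms" and the second prints "too high: excess 1442 ms": the first excess is of the size that rounding can produce, while the second is more than a second beyond the time slept.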
Comment 12 Rodrigo Vivi 2015-01-23 00:17:35 UTC
*** Bug 85391 has been marked as a duplicate of this bug. ***
Comment 13 Rodrigo Vivi 2015-01-23 00:17:47 UTC
*** Bug 85392 has been marked as a duplicate of this bug. ***
Comment 14 Jesse Barnes 2015-03-10 19:01:23 UTC
Does this still happen now that Deepak's patch to add some fuzz has been applied?  If that's not enough we may need to increase the fuzz a bit more, since there's not much we can do about incorrect residency counters (which are really just a debug feature anyway).
Comment 15 wendy.wang 2015-03-11 06:34:48 UTC
(In reply to Jesse Barnes from comment #14)
> Does this still happen now that Deepak's patch to add some fuzz has been
> applied?  If that's not enough we may need to increase the fuzz a bit more,
> since there's not much we can do about incorrect residency counters (which
> are really just a debug feature anyway).

Jesse, could you give the link to Deepak's patch so that we can check it? Thanks.
Comment 16 wendy.wang 2015-03-19 05:54:19 UTC
Updated RC6 residency counter measurement on the BYT platform, using the latest igt version:

root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./pm_rc6_residency
IGT-Version: 1.10-ga172676 (x86_64) (Linux: 4.0.0-rc4_drm-intel-nightly_3753ea_20150318+ x86_64)
The residency counter: 1.125574
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:130:
Failed assertion: (flag_counter != 0) && (counter_result <=1)
Sysfs RC6 residency counter is inaccurate.
Subtest rc6-accuracy failed.
**** DEBUG ****
The residency counter: 1.125574
Test requirement passed: !(flag_support == 0)
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:130:
Failed assertion: (flag_counter != 0) && (counter_result <=1)
Sysfs RC6 residency counter is inaccurate.
****  END  ****
Subtest rc6-accuracy: FAIL (0.000s)
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:101:
Failed assertion: diff <= (SLEEP_DURATION + RC6_FUDGE)
Diff was too high. That is unpossible
Subtest media-rc6-accuracy failed.
**** DEBUG ****
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:101:
Failed assertion: diff <= (SLEEP_DURATION + RC6_FUDGE)
Diff was too high. That is unpossible
****  END  ****
Subtest media-rc6-accuracy: FAIL (0.000s)
This platform doesn't support RC6p
Subtest rc6p-accuracy: SKIP (0.001s)
This platform doesn't support RC6pp
Subtest rc6pp-accuracy: SKIP (0.000s)
Comment 17 lu hua 2015-05-28 02:32:04 UTC
It still fails on BYT, but it works well on other platforms.
output on BYT:
IGT-Version: 1.10-g308b0e8 (x86_64) (Linux: 4.1.0-rc3_drm-intel-nightly_056608_20150519+ x86_64)
Residency in rc6 or deeper state: 3340 ms (ratio to expected duration: 1.10)
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:109:
Failed assertion: ratio > 0.9 && ratio <= 1
Sysfs RC6 residency counter is inaccurate.
Stack trace:
  #0 [__igt_fail_assert+0xf1]
  #1 [readit+0x0]
  #2 [__real_main178+0x10d]
  #3 [main+0x29]
  #4 [__libc_start_main+0xf5]
  #5 [_start+0x29]
  #6 [<unknown>+0x29]
Subtest rc6-accuracy failed.
**** DEBUG ****
Test requirement passed: !(!(rc6_mask & RC6_ENABLED))
Residency in rc6 or deeper state: 3340 ms (ratio to expected duration: 1.10)
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:109:
Failed assertion: ratio > 0.9 && ratio <= 1
Sysfs RC6 residency counter is inaccurate.
****  END  ****
Subtest rc6-accuracy: FAIL (0.002s)
Residency in media_rc6 or deeper state: 3341 ms (ratio to expected duration: 1.10)
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:109:
Failed assertion: ratio > 0.9 && ratio <= 1
Last errno: 22, Invalid argument
Sysfs RC6 residency counter is inaccurate.
Stack trace:
  #0 [__igt_fail_assert+0xf1]
  #1 [readit+0x0]
  #2 [__real_main178+0x190]
  #3 [main+0x29]
  #4 [__libc_start_main+0xf5]
  #5 [_start+0x29]
  #6 [<unknown>+0x29]
Subtest media-rc6-accuracy failed.
**** DEBUG ****
Test requirement passed: !(!((rc6_mask & RC6_ENABLED) && (IS_VALLEYVIEW(devid) || IS_CHERRYVIEW(devid))))
Residency in media_rc6 or deeper state: 3341 ms (ratio to expected duration: 1.10)
Test assertion failure function residency_accuracy, file pm_rc6_residency.c:109:
Failed assertion: ratio > 0.9 && ratio <= 1
Last errno: 22, Invalid argument
Sysfs RC6 residency counter is inaccurate.
****  END  ****
Subtest media-rc6-accuracy: FAIL (0.003s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:209:
Test requirement: !(!(rc6_mask & RC6P_ENABLED))
Last errno: 22, Invalid argument
Subtest rc6p-accuracy: SKIP (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:214:
Test requirement: !(!(rc6_mask & RC6PP_ENABLED))
Last errno: 22, Invalid argument
Subtest rc6pp-accuracy: SKIP (0.000s)
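
The assertion that fails in the log above, `ratio > 0.9 && ratio <= 1`, can be restated as a small standalone check. This is a sketch rather than the igt implementation: `ratio_ok` is a hypothetical helper, and the millisecond values in the usage lines are made-up illustrations, not measurements from this bug.

```shell
#!/bin/sh
# Sketch only: the ratio window from the failed assertion, outside igt.
# ratio_ok RESIDENCY_MS ELAPSED_MS -> exit status 0 when 0.9 < ratio <= 1.
ratio_ok() {
    awk -v r="$1" -v e="$2" 'BEGIN { x = r / e; exit !(x > 0.9 && x <= 1) }'
}

# Illustrative values only (not taken from the logs in this report).
ratio_ok 2950 3000 && echo "2950/3000: pass"
ratio_ok 3300 3000 || echo "3300/3000: fail (counter ran ahead of the clock)"
ratio_ok 2000 3000 || echo "2000/3000: fail (too little time spent in RC6)"
```

The lower bound catches a GPU that did not stay in RC6 while idle; the upper bound catches a counter that advances faster than wall-clock time, which is the failure reported on BYT.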
Comment 18 Ander Conselvan de Oliveira 2015-05-28 07:49:33 UTC
Created attachment 116110
Debug patch

Could you run the test again with the attached patch applied on igt? That should provide a bit of additional information.
Comment 19 Imre Deak 2015-05-28 08:44:23 UTC
This should be fixed by the following patch, could you give it a try?:

http://lists.freedesktop.org/archives/intel-gfx/2015-May/066871.html
Comment 20 wendy.wang 2015-05-29 12:01:20 UTC
(In reply to Imre Deak from comment #19)
> This should be fixed by the following patch, could you give it a try?:
> 
> http://lists.freedesktop.org/archives/intel-gfx/2015-May/066871.html

Yes, this patch fixes the issue on BYT; other platforms also pass.

root@x-byt01:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./pm_rc6_residency
IGT-Version: 1.10-gf497238 (x86_64) (Linux: 4.1.0-rc5_kcloud_b03656_20150529+ x86_64)
Residency in rc6 or deeper state: 3001 ms (ratio to expected duration: 0.98)
Subtest rc6-accuracy: SUCCESS (0.000s)
Residency in media_rc6 or deeper state: 3001 ms (ratio to expected duration: 0.98)
Subtest media-rc6-accuracy: SUCCESS (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:209:
Test requirement: !(!(rc6_mask & RC6P_ENABLED))
Subtest rc6p-accuracy: SKIP (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:214:
Test requirement: !(!(rc6_mask & RC6PP_ENABLED))
Subtest rc6pp-accuracy: SKIP (0.000s)

root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./pm_rc6_residency
IGT-Version: 1.10-gf497238 (x86_64) (Linux: 4.1.0-rc5_kcloud_b03656_20150529+ x86_64)
Residency in rc6 or deeper state: 3009 ms (ratio to expected duration: 0.99)
Subtest rc6-accuracy: SUCCESS (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:204:
Test requirement: !(!((rc6_mask & RC6_ENABLED) && (IS_VALLEYVIEW(devid) || IS_CHERRYVIEW(devid))))
Subtest media-rc6-accuracy: SKIP (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:209:
Test requirement: !(!(rc6_mask & RC6P_ENABLED))
Subtest rc6p-accuracy: SKIP (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:214:
Test requirement: !(!(rc6_mask & RC6PP_ENABLED))
Subtest rc6pp-accuracy: SKIP (0.000s)

[root@x-ivb9 tests]# ./pm_rc6_residency
IGT-Version: 1.10-g308b0e8 (x86_64) (Linux: 4.1.0-rc5_kcloud_b03656_20150529+ x86_64)
Residency in rc6 or deeper state: 2975 ms (ratio to expected duration: 0.98)
Subtest rc6-accuracy: SUCCESS (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:204:
Test requirement: !(!((rc6_mask & RC6_ENABLED) && (IS_VALLEYVIEW(devid) || IS_CHERRYVIEW(devid))))
Subtest media-rc6-accuracy: SKIP (0.000s)
Residency in rc6p or deeper state: 2975 ms (ratio to expected duration: 0.98)
Subtest rc6p-accuracy: SUCCESS (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:214:
Test requirement: !(!(rc6_mask & RC6PP_ENABLED))
Subtest rc6pp-accuracy: SKIP (0.000s)
Comment 21 Imre Deak 2015-05-29 12:27:55 UTC
(In reply to wendy.wang from comment #20)
> (In reply to Imre Deak from comment #19)
> > This should be fixed by the following patch, could you give it a try?:
> > 
> > http://lists.freedesktop.org/archives/intel-gfx/2015-May/066871.html
> 
> Yes, this patch can fix this issue on BYT, other platforms also pass.

Thanks for testing it. Note that the status should be changed to 'RESOLVED/FIXED' only when the fix gets merged. The above fix is still lacking a reviewed-by, so I'll follow up on it on the mailing list. Meanwhile reopening this.
Comment 22 Ander Conselvan de Oliveira 2015-06-15 10:35:08 UTC
Fixed in -nightly by

commit 66c826a1754c07012e29fbe9be7013e92a5acbac
Author: Imre Deak <imre.deak@intel.com>
Date:   Mon Jun 1 10:32:01 2015 +0300

    drm/i915/vlv: fix RC6 residency time calculation
Comment 23 ye.tian 2015-06-16 02:20:05 UTC
Verified with the latest drm-intel-nightly_8207b9_20150616+ kernel on BDW: this problem has been fixed, so closing.

output info:
------------------------
root@x-byt01:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./pm_rc6_residency
IGT-Version: 1.11-gc83299d (x86_64) (Linux: 4.1.0-rc8_drm-intel-nightly_8207b9_20150616+ x86_64)
Residency in rc6 or deeper state: 3000 ms (ratio to expected duration: 0.98)
Subtest rc6-accuracy: SUCCESS (0.000s)
Residency in media_rc6 or deeper state: 3001 ms (ratio to expected duration: 0.98)
Subtest media-rc6-accuracy: SUCCESS (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:209:
Test requirement: !(!(rc6_mask & RC6P_ENABLED))
Subtest rc6p-accuracy: SKIP (0.000s)
Test requirement not met in function __real_main178, file pm_rc6_residency.c:214:
Test requirement: !(!(rc6_mask & RC6PP_ENABLED))
Subtest rc6pp-accuracy: SKIP (0.000s)
Comment 24 Elizabeth 2017-10-06 14:38:50 UTC
Closing old verified bug.