Bug 48756

Summary: [IVB]I-G-T/sysfs_rc6_residency fail
Product: DRI
Component: DRM/Intel
Status: CLOSED FIXED
Severity: normal
Priority: medium
Version: unspecified
Hardware: Other
OS: All
Reporter: Guang Yang <guang.a.yang>
Assignee: Ben Widawsky <ben>
QA Contact:
CC: ben, chris, daniel, eugeni, florian, jbarnes
Whiteboard:
i915 platform:
i915 features:
Attachments:
  chiefriver booting dmesg (flags: none)

Description Guang Yang 2012-04-16 01:34:34 UTC
System Environment:
--------------------------
Platform:        Ivybridge
Kernel: (drm-intel-next-queued)75a24eab7c21f76620878ebfc51f8a754411b741
Bug detailed description:
-------------------------
   On the IVB platform, running sysfs_rc6_residency from intel-gpu-tools fails.
   On desktop, the error is: Diff was too high. That is unpossible.
   On mobile, the error is: GPU was not in RC6 long enough. Check that the GPU is as idle as possible (ie. no X, running and running no other tests)
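
For readers unfamiliar with the test, the two messages above suggest a check of roughly the following shape: sample the RC6 residency counter, sleep, sample again, and compare the difference with the time slept. The sketch below is only an illustration of that idea, not the actual intel-gpu-tools code; the function name, the fudge value, and the minimum ratio are assumptions made for this example.

```python
def classify_residency(before_ms, after_ms, slept_ms,
                       fudge_ms=500, min_ratio=0.9):
    """Classify an RC6 residency delta measured across a sleep.

    fudge_ms and min_ratio are assumed values for illustration,
    not the thresholds used by the real test.
    """
    diff = after_ms - before_ms
    # The GPU cannot have spent more time in RC6 than the wall-clock
    # time that elapsed (plus some slack for measurement jitter).
    if diff > slept_ms + fudge_ms:
        return "Diff was too high. That is unpossible."
    # An idle GPU should have spent most of the interval in RC6.
    if diff < slept_ms * min_ratio:
        return "GPU was not in RC6 long enough."
    return "ok"
```

Under these assumed thresholds, a counter that does not advance at all during the sleep produces the second message, which matches the mobile failure reported here.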
Comment 1 Daniel Vetter 2012-04-16 01:51:06 UTC
Quick check: Does this also happen when you run the test alone, on an otherwise completely idle system (not even X running)?
Comment 2 Guang Yang 2012-04-16 01:56:15 UTC
(In reply to comment #1)
> Quick check: Does this also happen when you run the test alone, on an otherwise
> completely idle system (not even X running)?
I booted the system on the chiefriver platform and then ran that case alone; the error is:
GPU was not in RC6 long enough. Check that the GPU is as idle as possible (ie. no X, running and running no other tests)
Comment 3 Ben Widawsky 2012-04-16 08:44:12 UTC
Can you post the following
dmesg after boot
cat /sys/module/i915/parameters/i915_enable_rc6
cat /sys/kernel/debug/dri/0/i915_drpc_info
Comment 4 Guang Yang 2012-04-16 18:48:31 UTC
Created attachment 60132 [details]
chiefriver booting dmesg

   Here is the dmesg after booting on chiefriver. Running the case alone shows the same error as before:
  GPU was not in RC6 long enough. Check that the GPU is as idle as possible (ie. no X, running and running no other tests)
  cat /sys/module/i915/parameters/i915_enable_rc6 returns: -1
  cat /sys/kernel/debug/dri/0/i915_drpc_info returns:
     RC information accurate: yes
     Video Turbo Mode: yes
     HW control enabled: yes
     SW control enabled: no
     RC1e Enabled: no
     RC6 Enabled: yes
     Deep RC6 Enabled: yes
     Deepest RC6 Enabled: no
     Current RC state: RC6
     Core Power Down: no
     RC6 "Locked to RPn" residency since boot: 0
     RC6 residency since boot: 0
     RC6+ residency since boot: 137415387
     RC6++ residency since boot: 0

   RC1e is disabled; should I enable it?
Comment 5 Ben Widawsky 2012-04-16 20:08:19 UTC
(In reply to comment #4)
> Created attachment 60132 [details]
> chiefriver booting dmesg
> 
>    Here is the dmesg after booting on chiefriver,run the case alone shows the
> same error before:
>   GPU was not in RC6 long enough. Check that the GPU is as idle as possible
> (ie. no X, running and running no other tests)
>   do cat /sys/module/i915/parameters/i915_enable_rc6,the result is: -1
>   do cat /sys/kernel/debug/dri/0/i915_drpc_info the result is:
>      RC information accurate: yes
>      Video Turbo Mode: yes
>      HW control enabled: yes
>      SW control enabled: no
>      RC1e Enabled: no
>      RC6 Enabled: yes
>      Deep RC6 Enabled: yes
>      Deepest RC6 Enabled: no
>      Current RC state: RC6
>      Core Power Down: no
>      RC6 "Locked to RPn" residency since boot: 0
>      RC6 residency since boot: 0
>      RC6+ residency since boot: 137415387
>      RC6++ residency since boot: 0
> 
>    RC1e is disabled,should I enable it?

You do not need to enable RC1e.

2 things:
1. I'm confused/worried that you have deep RC6 enabled. It shouldn't make the test fail, but it also shouldn't be enabled. You can try setting the parameter i915_enable_rc6=1, and checking the test results. Even if this passes, we have to dig a bit further.
2. Run cat /sys/kernel/debug/dri/0/i915_drpc_info multiple times and see if the residency numbers are changing.
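
One way to compare two reads of i915_drpc_info without eyeballing them is sketched below. It assumes the output format shown in comment 4 (lines ending in "residency since boot: <number>"); the helper names parse_residency and changed_counters are made up for this example and are not part of any tool.

```python
import re

# Matches lines such as 'RC6+ residency since boot: 137415387'.
_RESIDENCY = re.compile(r'\s*(.+?) residency since boot:\s*(\d+)\s*$')


def parse_residency(text):
    """Return {state name: counter value} for every residency line."""
    counters = {}
    for line in text.splitlines():
        m = _RESIDENCY.match(line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def changed_counters(first, second):
    """Return {state name: delta} for counters that differ between reads."""
    a, b = parse_residency(first), parse_residency(second)
    return {name: b[name] - a[name]
            for name in a
            if name in b and b[name] != a[name]}
```

To use it, read /sys/kernel/debug/dri/0/i915_drpc_info twice (as root) a few seconds apart and pass both strings to changed_counters; an empty result means none of the residency counters moved between the two reads.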
Comment 6 Guang Yang 2012-04-17 02:31:59 UTC
> You do not need to enable RC1e.
> 2 things:
> 1. I'm confused/worried that you have deep RC6 enabled. It shouldn't make the
> test fail, but it also shouldn't be enabled. You can try setting the parameter
> i915_enable_rc6=1, and checking the test results. Even if this passes, we have
> to dig a bit further.
> 2.. do cat /sys/kernel/debug/dri/0/i915_drpc_info multiple times and see if the
> residency numbers are changing.

 Results for the 2 things:
 1. I booted the kernel with i915.i915_enable_rc6=1; the case still fails.
 2. I ran cat several times; the "RC6+ residency since boot" value changes over time.
Comment 7 Ben Widawsky 2012-04-17 11:57:41 UTC
(In reply to comment #6)
> > You do not need to enable RC1e.
> > 2 things:
> > 1. I'm confused/worried that you have deep RC6 enabled. It shouldn't make the
> > test fail, but it also shouldn't be enabled. You can try setting the parameter
> > i915_enable_rc6=1, and checking the test results. Even if this passes, we have
> > to dig a bit further.
> > 2.. do cat /sys/kernel/debug/dri/0/i915_drpc_info multiple times and see if the
> > residency numbers are changing.
> 
>  do with 2 things:
>  1. I boot the kernel with setting i915.i915_enable_rc6=1, the case still
> fails.
>  2. I have cat it sevel times, the numbers of RC6+ residency since boot is
> changing by time.

Can you display the sha of the intel-gpu-tools HEAD? Or just confirm that the following commit is in your i-g-t: ee014dbb8d98ada16a4829ff9878af6d4a06dcad

I will try to reproduce this locally meanwhile.
Comment 8 Guang Yang 2012-04-17 18:44:38 UTC
(In reply to comment #7)
> Can you display the sha of the intel-gpu-tools HEAD? Or just confirm that the
> following commit is in your i-g-t: ee014dbb8d98ada16a4829ff9878af6d4a06dcad
> I will try to reproduce this locally meanwhile.
  Ben, I reset my I-G-T to commit ee014dbb8d98ada16a4829ff9878af6d4a06dcad,
then tried the case with the kernel booted with i915_enable_rc6=1;
the case still fails.
Comment 9 Ben Widawsky 2012-04-17 20:00:14 UTC
This occurs with both "make test" and running the test directly?
Comment 10 Guang Yang 2012-04-17 20:14:06 UTC
(In reply to comment #9)
> This occurs with both "make test" and running the test directly?
 Yes, the case fails with both methods.
Comment 11 Eugeni Dodonov 2012-04-17 20:31:42 UTC
Unfortunately, until Haswell, we don't have that much control over RC6 activation in our driver. GPU decides on its own when to enter the RC6 state. Sometimes it is enough to run some sleep commands to put it into idle, sometimes GPU decides that it is not enough and stays awake. This also gets worse when different BIOS implementations define different VIDs thresholds and timeouts, so it is hard to get a deterministic figure here.

I have one new theory that if we run the 'xset dpms force off' perhaps it would tell the GPU to stop doing anything and could force it to idle (as in theory we do turn off the pipe this way), but I don't have enough arguments to confirm this yet.
Comment 12 Eugeni Dodonov 2012-04-17 20:46:03 UTC
(Updating to rephrase the previous comment and avoid potential ambiguity I noticed)

Unfortunately, with the way the i915 driver interacts with rc6 as of now, we don't have that much control over RC6 activation in our driver, as we leave the control of it to the GPU. I still intend to modify this after the initial Haswell patches land.
Comment 13 Ben Widawsky 2012-04-17 20:53:58 UTC
Eugeni, are you aware of other registers that may give a clue what is going on? It could definitely be threshold related.
Comment 14 Ben Widawsky 2012-04-20 14:03:09 UTC
Please check if this is fixed in the latest drm-intel-next-queued
Comment 15 Daniel Vetter 2012-04-21 06:10:27 UTC
Actually it should be fixed:

commit 0e31d4ebb0dd5bbad7f998f23958710d7d96c7ed
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Apr 20 11:50:01 2012 -0700

    drm/i915: rc6 residency (fix the fix)
Comment 16 Guang Yang 2012-04-23 18:24:33 UTC
(In reply to comment #14)
> Please check if this is fixed in the latest drm-intel-next-queued
 I have tried a kernel with commit 0e31d4ebb0dd5bbad7f998f23958710d7d96c7ed
 on desktop and mobile platforms; the bug has been fixed.
Comment 17 Florian Mickler 2012-07-01 03:51:01 UTC
A patch referencing this bug report has been merged in Linux v3.5-rc1:

commit a85d4bcb8a0cd5b3c754f98ff91ef2b9b3a73bc5
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Fri Apr 20 11:50:01 2012 -0700

    drm/i915: rc6 residency (fix the fix)
Comment 18 Elizabeth 2017-10-06 14:50:24 UTC
Closing old verified.
