Bug 106188

Summary: Can't successfully set pstates in pp_od_clk_voltage
Product: DRI
Reporter: tempel.julian
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED NOTABUG
QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:

Description tempel.julian 2018-04-23 12:49:36 UTC
Hello,
I specified "amdgpu.ppfeaturemask=0xffffffff" as a boot parameter so I could access "/sys/class/drm/card0/device/pp_od_clk_voltage".
The pstate table for source and memory clocks looks correct.

When I run "echo "s 7 1209 900" /sys/class/drm/card0/device/pp_od_clk_voltage", it returns "s 7 1209 900 /sys/class/drm/card0/device/pp_od_clk_voltage".
When I run "echo "c" /sys/class/drm/card0/device/pp_od_clk_voltage" afterwards, it returns "c /sys/class/drm/card0/device/pp_od_clk_voltage".

However, the change is not applied. When I do "cat /sys/class/drm/card0/device/pp_od_clk_voltage", it still says "7:       1196Mhz       1006 mV".
And when I run "watch -n 0.5  cat /sys/kernel/debug/dri/0/amdgpu_pm_info", it reports
"       1196 MHz (SCLK)
        981 mV (VDDGFX)
".

Am I making a mistake somewhere or should it work like this?

I also tried "echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level" and setting pstates 5-7, but that didn't help either.
In the documentation, I read that pp_od_clk_voltage should also include OD_range, but it's not there for me.

Linux drm-next-4.18-wip 4.16.1.52132fd03
MSI RX 560 Aero ITX

Thanks!
Comment 1 Alex Deucher 2018-04-23 15:04:04 UTC
(In reply to tempel.julian from comment #0)
> Hello,
> I specified "amdgpu.ppfeaturemask=0xffffffff" as a boot parameter so I could
> access "/sys/class/drm/card0/device/pp_od_clk_voltage".
> The pstate table for source and memory clocks looks correct.
> 
> When I run "echo "s 7 1209 900"
> /sys/class/drm/card0/device/pp_od_clk_voltage", it returns "s 7 1209 900
> /sys/class/drm/card0/device/pp_od_clk_voltage".
> When I run "echo "c" /sys/class/drm/card0/device/pp_od_clk_voltage"
> afterwards, it returns "c /sys/class/drm/card0/device/pp_od_clk_voltage".
> 

Are you redirecting to the file? Something like the following should work:
echo "s 7 1209 900" > /sys/class/drm/card0/device/pp_od_clk_voltage


> However, the change is not applied. When I do "cat
> /sys/class/drm/card0/device/pp_od_clk_voltage", it still says "7:      
> 1196Mhz       1006 mV".
> And when I run "watch -n 0.5  cat /sys/kernel/debug/dri/0/amdgpu_pm_info",
> it reports
> "       1196 MHz (SCLK)
>         981 mV (VDDGFX)
> ".
> 
> Am I making a mistake somewhere or should it work like this?
> 
> I also tried "echo "manual" >
> /sys/class/drm/card0/device/power_dpm_force_performance_level" and setting
> pstates 5-7, but that didn't help either.

You have to set manual mode before you can manually edit the state.  You also have to be root (or have permission) to write to these files.
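
As a minimal sketch of that order of operations (manual mode first, then a redirected write, then the "c" commit used elsewhere in this report), something like the following could be used. The helper name set_sclk_state and the OD_FILE/LEVEL_FILE variables are hypothetical; the default paths are the ones quoted in this report, and the write-permission check is only a convenience.

```shell
#!/bin/sh
# Sketch only. Paths default to the ones from this report and can be
# overridden, e.g. with scratch files for a dry run.
OD_FILE="${OD_FILE:-/sys/class/drm/card0/device/pp_od_clk_voltage}"
LEVEL_FILE="${LEVEL_FILE:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"

# set_sclk_state STATE MHZ MV   (hypothetical helper)
set_sclk_state() {
    # Writing needs root (or equivalent permission) on these files.
    if [ ! -w "$OD_FILE" ] || [ ! -w "$LEVEL_FILE" ]; then
        echo "no write access to $OD_FILE or $LEVEL_FILE (are you root?)" >&2
        return 1
    fi
    # 1. Switch to manual mode before editing any state.
    echo "manual" > "$LEVEL_FILE" || return 1
    # 2. Redirect into the file; without '>' echo only prints its arguments.
    echo "s $1 $2 $3" > "$OD_FILE" || return 1
    # 3. Commit the edited table.
    echo "c" > "$OD_FILE"
}
```

From a root shell this would be called as "set_sclk_state 7 1209 900".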

> In the documentation, I read that pp_od_clk_voltage should also include
> OD_range, but it's not there for me.

The patch is on the mailing list, but hasn't been committed yet.
https://patchwork.freedesktop.org/patch/217812/
Comment 2 tempel.julian 2018-04-23 16:12:25 UTC
Thanks for your response.

I somehow forgot the ">", whoops. It now works almost as desired, thanks!

However, there is a small problem left: It doesn't switch into pstate 7.
I set up:
echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level

echo "s 0 214 721" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 1 389 721" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 2 844 725" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 3 1009 818" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 4 1079 875" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 5 1134 880" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 6 1179 890" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 7 1209 900" > /sys/class/drm/card0/device/pp_od_clk_voltage

echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage

But according to /sys/kernel/debug/dri/0/amdgpu_pm_info, pstate 6 (1179 MHz, 890 mV) is the highest state the GPU actually reaches.
I set up Unigine Valley to run at 720p with 8xMSAA, so power consumption is too low to make the GPU throttle down to a lower state; it definitely should switch to pstate 7.

I also tried making s 7 and s 6 the same, but then it's just limited to an even lower pstate. So I can't reach my desired 1209 MHz at 900 mV.

It also seems that setting power_dpm_force_performance_level to manual isn't actually required; at least it doesn't make any difference here (or I'm missing something).
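
For reference, the per-state writes listed above could be condensed into a loop. This is only a sketch: apply_sclk_table is a hypothetical helper that reads "state MHz mV" lines from stdin, writes each one to the given file, commits with "c", and echoes what it wrote.

```shell
#!/bin/sh
# apply_sclk_table FILE   (hypothetical helper)
# Reads "state MHz mV" lines from stdin and writes them one at a time.
apply_sclk_table() {
    od_file="$1"
    while read -r state mhz mv; do
        # Skip empty lines.
        [ -n "$state" ] || continue
        echo "s $state $mhz $mv" > "$od_file" || return 1
        echo "s $state $mhz $mv"
    done
    # Commit the whole table once all states are written.
    echo "c" > "$od_file" || return 1
    echo "c"
}

# Example call (as root), using two of the states from this comment:
# apply_sclk_table /sys/class/drm/card0/device/pp_od_clk_voltage <<'EOF'
# 6 1179 890
# 7 1209 900
# EOF
```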


Btw: Since I'm already bugging you with this :) :
Is there a way to also increase the maximum allowed power consumption?
There are /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap and /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap_max, which contain the maximum ASIC power (at least it looks that way to me), but I haven't been able to alter them successfully, e.g. to raise the limit of my Baffin GPU to 60 watts.
Comment 3 tempel.julian 2018-04-23 16:39:26 UTC
I figured out that the cause of the described behavior is that it doesn't actually allow me to increase the GPU clock.
So I can set "echo "s 7 1194 900" > /sys/class/drm/card0/device/pp_od_clk_voltage" and it works correctly.
But when I set e.g. "echo "s 7 1199 900" > /sys/class/drm/card0/device/pp_od_clk_voltage", the clock doesn't go above pstate 6.

Is this restriction on purpose? It's not there with Wattman on Windows.
Comment 4 tempel.julian 2018-04-27 18:45:49 UTC
Ok, it's simple. To unlock higher clocks, I simply have to set "/sys/class/drm/card0/device/pp_sclk_od" as well.
To sum up: For my 1209MHz 900mV I have to do the following:

echo "2" > /sys/class/drm/card0/device/pp_sclk_od
echo "s 7 1209 900" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage

Works like a charm, thanks for the "Wattman" functionality!
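
A possible explanation for why "2" is enough, stated as an assumption and not something confirmed in this report: if pp_sclk_od is interpreted as a percentage on top of the default top sclk (1196 MHz in the table above), the smallest whole percentage that covers 1209 MHz can be computed as in this sketch. The helper name min_od_percent is hypothetical.

```shell
#!/bin/sh
# min_od_percent BASE_MHZ TARGET_MHZ   (hypothetical helper)
# ASSUMPTION: pp_sclk_od is a percentage over the default top sclk.
min_od_percent() {
    base="$1"
    target="$2"
    if [ "$target" -le "$base" ]; then
        echo 0
        return 0
    fi
    # Integer ceiling of (target - base) * 100 / base.
    echo $(( ((target - base) * 100 + base - 1) / base ))
}

min_od_percent 1196 1209   # prints 2
```

Under that assumption, 1209 MHz is about 1.09 % above 1196 MHz, which rounds up to the "2" written to pp_sclk_od above.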

However, I am still not able to raise the maximum allowed power consumption of the GPU.
It's clear that "/sys/class/drm/card0/device/hwmon/hwmon0/power1_cap" sets the actual power limit and thus I need to raise it. But the problem is:
- I can't set it higher than power1_cap_max, only values below work.
- power1_cap_max equals power1_cap, so raising power consumption is actually completely forbidden
- Despite being root, I don't have write access to power1_cap_max, unlike power1_cap

So it seems I'm stuck and there is no way of increasing the maximum allowed power consumption, which is not a problem with Wattman on Windows.

I suppose either I'm missing a boot parameter to unlock write access to power1_cap_max, or there might be a bug in the amdgpu driver.
Any help appreciated.
Comment 5 tempel.julian 2018-04-29 18:43:33 UTC
I'm closing this issue now, as my remaining problem is no longer related to the initial one. I'm going to create a new ticket for it instead.
Comment 6 Alex Deucher 2018-05-08 22:18:25 UTC
(In reply to tempel.julian from comment #4)
> However, I am still not able to raise the maximum allowed power consumption
> of the GPU.
> It's clear that "/sys/class/drm/card0/device/hwmon/hwmon0/power1_cap" sets
> the actual power limit and thus I need to raise it. But the problem is:
> - I can't set it higher than power1_cap_max, only values below work.
> - power1_cap_max equals power1_cap, so raising power consumption is
> actually completely forbidden
> - Despite being root, I don't have write access to power1_cap_max, unlike
> power1_cap

The max supported power cap is defined by the OEM in a vbios table and we limit the max to that value.  power1_cap_min and power1_cap_max are read-only, so you can't change them.  They define the range of available values.
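
A rough sketch of how a script could respect that range before writing power1_cap. The helper name set_power_cap is hypothetical; the file names are the ones discussed here, and the hwmon power files hold values in microwatts.

```shell
#!/bin/sh
# set_power_cap HWMON_DIR WATTS   (hypothetical helper)
# Refuses values outside the read-only range [power1_cap_min, power1_cap_max].
set_power_cap() {
    dir="$1"
    want=$(( $2 * 1000000 ))    # hwmon power values are in microwatts
    min=$(cat "$dir/power1_cap_min") || return 1
    max=$(cat "$dir/power1_cap_max") || return 1
    if [ "$want" -lt "$min" ] || [ "$want" -gt "$max" ]; then
        echo "requested $2 W is outside the allowed range of $min to $max uW" >&2
        return 1
    fi
    echo "$want" > "$dir/power1_cap"
}

# Example call (as root):
# set_power_cap /sys/class/drm/card0/device/hwmon/hwmon0 40
```

A request above power1_cap_max, such as the 60 watts mentioned in comment 2, is rejected by this helper without touching power1_cap when the vbios limit is lower.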
