Bug 106306

Summary: amdgpu CIK power management issues (overdrive and wattman)
Product: DRI
Reporter: grmat
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
Severity: normal
Priority: medium
CC: malcolmlewis
Version: DRI git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments:
dmesg output pp_num_states (flags: none)

Description grmat 2018-04-28 22:21:44 UTC
Created attachment 139211
dmesg output pp_num_states

Testing an upstream kernel (4.17.0-rc2-ga27fc14219f2) because of the newly introduced "wattman" functionality. On my CIK hardware (Hawaii/R9 290X), none of the sysfs and hwmon entries work. Writing to sysfs files such as pp_sclk_od and pp_od_clk_voltage succeeds (no error), but the hardware/driver does not react to the changes. Reading sysfs/hwmon files such as power1_cap_max, pp_table, and pp_od_clk_voltage prints no output. Reading pp_num_states even segfaults (the dmesg output of the corresponding error is attached).
Comment 1 grmat 2018-05-03 17:36:31 UTC
Overdrive doesn't work with the current stable kernel (4.16.6) either. The clock remains unchanged (as seen in amdgpu_pm_info), and reading the value back gives 0:

> echo 5 | sudo tee /sys/class/drm/card0/device/pp_sclk_od
5
> cat /sys/class/drm/card0/device/pp_sclk_od
0

I can't currently tell when this issue appeared, or whether the feature ever worked with this hardware, as I have never used it before.
Comment 2 Alex Deucher 2018-05-08 22:09:19 UTC
CI parts default to the legacy dpm support; you need to enable powerplay to use the new APIs. Boot with amdgpu.dpm=1 on the kernel command line in grub to force the driver to use powerplay rather than the old dpm code.
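
To confirm after a reboot that the option actually reached the kernel, the boot command line can be inspected. This is a minimal sketch for a POSIX shell; has_param is a hypothetical helper, not part of any existing tool, and it only looks at the boot command line (an option set through a modprobe configuration file would not show up here).

```shell
# has_param is a hypothetical helper: it succeeds if the exact parameter $2
# appears as a space-separated word in the command line string $1.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# /proc/cmdline holds the command line the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if has_param "$(cat /proc/cmdline)" "amdgpu.dpm=1"; then
        echo "amdgpu.dpm=1 is on the kernel command line"
    else
        echo "amdgpu.dpm=1 is not on the kernel command line"
    fi
fi
```

The match is on whole words, so amdgpu.dpm=10 or a differently named option would not count as a hit.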
Comment 3 grmat 2018-05-10 21:57:22 UTC
Thanks, I wasn't aware of that.

The old overdrive functionality indeed works with amdgpu.dpm=1 on 4.16. However, it doesn't on 4.17. And wattman functionality doesn't work at all; pp_od_clk_voltage prints nothing and doesn't accept anything. Is wattman even implemented for CI?

Also, would you consider enabling amdgpu.dpm for CI by default? A big advantage I found is that the VRAM/MC finally drops back to the lower dpm state when idle while driving a WQHD 144 Hz monitor. That saves a solid 40 watts in desktop usage.
Comment 4 grmat 2018-07-04 15:00:43 UTC
> wattman functionality doesn't work at all;
> pp_od_clk_voltage prints nothing and doesn't accept anything. Is wattman
> even implemented for CI?

@Alex Deucher: Currently on 4.17.3 w/ dpm and the situation hasn't changed. Is the wattman functionality supposed to work with CIK or not? Do you need any additional info?
Comment 5 Alex Deucher 2018-07-05 16:27:01 UTC
You need to make sure the PP_OVERDRIVE_MASK flag (bit 14) is set.  E.g., amdgpu.ppfeaturemask=0xfffd7fff
See the documentation in the driver for more:
https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c?h=amd-staging-drm-next#n469
Also, if the vbios on the particular board does not support overclocking, the functionality is not exposed on 4.17, IIRC.
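
Whether a given mask value has the overdrive bit set can be checked with shell arithmetic. This is a minimal sketch, assuming bit 14 as stated above; overdrive_enabled is a hypothetical helper, and the sysfs path is an assumption that the loaded amdgpu module exposes this parameter as readable, which has not been verified for the kernels discussed here.

```shell
# overdrive_enabled is a hypothetical helper: it succeeds if bit 14 (0x4000)
# is set in the mask given as $1 (decimal or 0x-prefixed hex).
overdrive_enabled() {
    [ $(( $1 & (1 << 14) )) -ne 0 ]
}

# Assumed location of the live parameter; only present while amdgpu is loaded.
param=/sys/module/amdgpu/parameters/ppfeaturemask
if [ -r "$param" ]; then
    if overdrive_enabled "$(cat "$param")"; then
        echo "overdrive bit (14) is set"
    else
        echo "overdrive bit (14) is clear"
    fi
fi
```

For example, overdrive_enabled 0xfffd7fff succeeds, while a mask with that bit cleared, such as 0xffffbfff, does not.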
Comment 6 Alex Deucher 2018-07-05 17:38:58 UTC
Append amdgpu.ppfeaturemask=0xfffd7fff to the kernel command line in grub.
Comment 7 grmat 2018-07-06 00:17:47 UTC
Thank you for your answer.

I've already set the ppfeaturemask accordingly.

cat /sys/class/drm/card0/device/pp_od_clk_voltage

returns nothing and 

[root] echo "s 7 1000 1200" > /sys/class/drm/card0/device/pp_od_clk_voltage

prints "Invalid argument"

When I try to write to the file, the following is logged to the kernel ring buffer:

amdgpu: [powerplay] OverDrive feature not enabled
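
For reference, the documentation linked in comment 5 describes the write format, as far as I understand it, as a domain letter (s for sclk, m for mclk) followed by the level, the clock in MHz, and the voltage in mV. The sketch below only checks that a line has this shape before it is written; od_line is a hypothetical helper. A well-formed line does not avoid the "OverDrive feature not enabled" message above, which comes from the driver.

```shell
# od_line is a hypothetical helper: it prints a pp_od_clk_voltage line
# "<s|m> <level> <clock MHz> <voltage mV>" if the arguments look valid,
# and fails without output otherwise.
od_line() {
    # $1 must be the domain letter: s (sclk) or m (mclk).
    case "$1" in
        s|m) ;;
        *)   return 1 ;;
    esac
    # $2, $3 and $4 must be non-empty strings of decimal digits.
    for n in "$2" "$3" "$4"; do
        case "$n" in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    printf '%s %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Example: build the line used in the comment above (nothing is written to sysfs).
od_line s 7 1000 1200
```

The output could then be redirected into pp_od_clk_voltage as root; the values are taken from the comment above and are not a recommendation.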
Comment 8 Martin Peres 2019-11-19 08:37:14 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/369.
