* Ubuntu 17.04, kernel 4.10
* Gigabyte Radeon™ RX 480 WINDFORCE 8G rev 1.0
* http://www.gigabyte.com/Graphics-Card/GV-RX480WF2-8GD-rev-10
* https://gist.github.com/anonymous/2e8964de6e8bf37d3a3b52dc7d213078

1. On Windows 10 and macOS the fans can be parked at 0 RPM when the card is idle. With amdgpu, fan control only works with pwm1_enable=1 (manual), and even then the rotation speed stays static. How hard can this be to implement? Basic fan throttling is already in place; what is missing is reading the zero-fan-idle settings from the EFI VBIOS so the fans stop automatically at idle. From the EFI VBIOS, the macOS and Windows drivers detect that the card supports 0% fan speed (silent idle), but Linux amdgpu does not. A lazy workaround is to write 0 to /sys/class/drm/card0/device/hwmon/hwmon1/pwm1, but under load that risks burning out the card. The NVIDIA driver supports fan speed control. pwm1_enable values:
* 0 = NONE — card controls PWM?
* 1 = MANUAL
* 2 = AUTO — OS controls PWM?

2. With the fan parked at 0 RPM, the sensor still reports rotation (lm-sensors), even though the fan and LED have physically stopped:

# echo 0 | tee /sys/class/drm/card0/device/hwmon/hwmon1/pwm1
# sensors
amdgpu-pci-0100
Adapter: PCI adapter
fan1:         875 RPM
temp1:        +38.0°C  (crit = +0.0°C, hyst = +0.0°C)

fan1: 875 RPM?
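For reference, the throttling logic that a zero-fan-idle mode would need is tiny. Below is a sketch of such a fan curve in shell; the 45 °C / 80 °C thresholds are invented for illustration, not values read from the VBIOS:

```shell
#!/bin/sh
# Illustrative fan curve only -- the 45/80 C thresholds are made up for this
# sketch, not taken from the VBIOS. Maps a temperature in degrees C to a PWM
# duty value in the 0-255 range used by pwm1, with a zero-fan idle zone.
temp_to_pwm() {
    t=$1
    if [ "$t" -lt 45 ]; then
        echo 0                              # fan parked while idle
    elif [ "$t" -ge 80 ]; then
        echo 255                            # full speed when hot
    else
        echo $(( (t - 45) * 255 / 35 ))     # linear ramp between the two
    fi
}

temp_to_pwm 38    # → 0
temp_to_pwm 62    # → 123
```

A real implementation would read these thresholds from the VBIOS fan table instead of hardcoding them.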
$ cat /sys/class/drm/card0/device/power_dpm_force_performance_level
auto
$ cat /sys/class/drm/card0/device/power_dpm_state
performance
$ cat /sys/class/drm/card0/device/hwmon/hwmon1/pwm1
81
$ cat /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_min
0
$ cat /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_max
255
$ echo 0 | sudo tee /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_enable
$ cat /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_enable
1
$ echo 0 | sudo tee /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_enable
$ cat /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_enable
1
$ echo 2 | sudo tee /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_enable
$ cat /sys/class/drm/card0/device/hwmon/hwmon1/pwm1_enable
1
$ lsb_release -dcr
Description:	Ubuntu 17.04
Release:	17.04
Codename:	zesty
$ uname -rmv
4.10.0-19-generic #21-Ubuntu SMP Thu Apr 6 17:04:57 UTC 2017 x86_64
$ DRI_PRIME=1 glxinfo | grep string
server glx vendor string: SGI
server glx version string: 1.4
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD POLARIS10 (DRM 3.9.0 / 4.10.0-19-generic, LLVM 4.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 17.0.3
OpenGL core profile shading language version string: 4.50
OpenGL version string: 3.0 Mesa 17.0.3
OpenGL shading language version string: 1.30
OpenGL ES profile version string: OpenGL ES 3.1 Mesa 17.0.3
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
$ LC_ALL=C dmesg -Tx | grep -E "drm|radeon"
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Initialized
kern :info : [Tue Apr 11 00:03:04 2017] [drm] amdgpu kernel modesetting enabled.
kern :info : [Tue Apr 11 00:03:04 2017] fb: switching to amdgpudrmfb from EFI VGA
kern :info : [Tue Apr 11 00:03:04 2017] [drm] initializing kernel modesetting (POLARIS10 0x1002:0x67DF 0x1458:0x22DF 0xC7).
kern :info : [Tue Apr 11 00:03:04 2017] [drm] register mmio base: 0xEFE00000
kern :info : [Tue Apr 11 00:03:04 2017] [drm] register mmio size: 262144
kern :info : [Tue Apr 11 00:03:04 2017] [drm] doorbell mmio base: 0xE0000000
kern :info : [Tue Apr 11 00:03:04 2017] [drm] doorbell mmio size: 2097152
kern :info : [Tue Apr 11 00:03:04 2017] [drm] probing gen 2 caps for device 8086:1901 = 261ad03/e
kern :info : [Tue Apr 11 00:03:04 2017] [drm] probing mlw for device 8086:1901 = 261ad03
kern :info : [Tue Apr 11 00:03:04 2017] [drm] UVD is enabled in VM mode
kern :info : [Tue Apr 11 00:03:04 2017] [drm] VCE enabled in VM mode
kern :info : [Tue Apr 11 00:03:04 2017] [drm] GPU post is not needed
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Detected VRAM RAM=8192M, BAR=256M
kern :info : [Tue Apr 11 00:03:04 2017] [drm] RAM width 256bits GDDR5
kern :info : [Tue Apr 11 00:03:04 2017] [drm] amdgpu: 8192M of VRAM memory ready
kern :info : [Tue Apr 11 00:03:04 2017] [drm] amdgpu: 8192M of GTT memory ready.
kern :info : [Tue Apr 11 00:03:04 2017] [drm] GART: num cpu pages 2097152, num gpu pages 2097152
kern :info : [Tue Apr 11 00:03:04 2017] [drm] PCIE GART of 8192M enabled (table at 0x0000000000040000).
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Driver supports precise vblank timestamp query.
kern :info : [Tue Apr 11 00:03:04 2017] [drm] amdgpu: irq initialized.
kern :info : [Tue Apr 11 00:03:04 2017] [drm] AMDGPU Display Connectors
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Connector 0:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DP-1
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   HPD6
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   Encoders:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]     DFP1: INTERNAL_UNIPHY2
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Connector 1:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DP-2
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   HPD4
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DDC: 0x4870 0x4870 0x4871 0x4871 0x4872 0x4872 0x4873 0x4873
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   Encoders:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]     DFP2: INTERNAL_UNIPHY2
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Connector 2:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DP-3
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   HPD1
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DDC: 0x486c 0x486c 0x486d 0x486d 0x486e 0x486e 0x486f 0x486f
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   Encoders:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]     DFP3: INTERNAL_UNIPHY1
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Connector 3:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   HDMI-A-1
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   HPD5
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   Encoders:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]     DFP4: INTERNAL_UNIPHY1
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Connector 4:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DVI-D-1
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   HPD3
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   DDC: 0x487c 0x487c 0x487d 0x487d 0x487e 0x487e 0x487f 0x487f
kern :info : [Tue Apr 11 00:03:04 2017] [drm]   Encoders:
kern :info : [Tue Apr 11 00:03:04 2017] [drm]     DFP5: INTERNAL_UNIPHY
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Found UVD firmware Version: 1.79 Family ID: 16
kern :info : [Tue Apr 11 00:03:04 2017] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 0 succeeded in 15 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 1 succeeded in 28 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 2 succeeded in 28 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 3 succeeded in 13 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 4 succeeded in 13 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 5 succeeded in 13 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 6 succeeded in 14 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 7 succeeded in 13 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 8 succeeded in 13 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 9 succeeded in 6 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 10 succeeded in 6 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] ring test on 11 succeeded in 1 usecs
kern :info : [Tue Apr 11 00:03:04 2017] [drm] UVD initialized successfully.
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ring test on 12 succeeded in 10 usecs
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ring test on 13 succeeded in 5 usecs
kern :info : [Tue Apr 11 00:03:05 2017] [drm] VCE initialized successfully.
kern :info : [Tue Apr 11 00:03:05 2017] [drm] fb mappable at 0xD136F000
kern :info : [Tue Apr 11 00:03:05 2017] [drm] vram apper at 0xD0000000
kern :info : [Tue Apr 11 00:03:05 2017] [drm] size 8294400
kern :info : [Tue Apr 11 00:03:05 2017] [drm] fb depth is 24
kern :info : [Tue Apr 11 00:03:05 2017] [drm]    pitch is 7680
kern :info : [Tue Apr 11 00:03:05 2017] fbcon: amdgpudrmfb (fb0) is primary device
kern :info : [Tue Apr 11 00:03:05 2017] amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 0 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 1 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 2 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 3 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 4 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 5 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 6 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 7 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 8 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 9 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 10 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 11 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] ib test on ring 12 succeeded
kern :info : [Tue Apr 11 00:03:05 2017] [drm] Initialized amdgpu 3.9.0 20150101 for 0000:01:00.0 on minor 0
You should try kernel 4.12. There is some progress with pwm1_enable: now we have not only "1" which is manual control but "0" (full speed) and "2" (this should be FW control, although it doesn’t look so for me: fans are still spinning and pwm1 reads 0 or 122-124 randomly). Still have to use userspace control (mode "1").
(In reply to Sergey Kochneff from comment #1)
> You should try kernel 4.12. There is some progress with pwm1_enable: now we
> have not only "1" which is manual control but "0" (full speed) and "2" (this
> should be FW control, although it doesn’t look so for me: fans are still
> spinning and pwm1 reads 0 or 122-124 randomly). Still have to use userspace
> control (mode "1").

I use 4.13.0-041300rc3-generic, but I still see no difference between pwm1_enable=1 and pwm1_enable=2. 4.13.0-041300rc3-generic defaults to pwm1_enable=2. I still need to run a daemon that monitors temperature and load and sets the fan speed accordingly.

$ lsb_release -drc
Description:	Ubuntu 17.04
Release:	17.04
Codename:	zesty
$ uname -rso
Linux 4.13.0-041300rc3-generic GNU/Linux
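For anyone else stuck on manual mode, the daemon mentioned above can be reduced to one step like the sketch below. The hwmon path and the curve thresholds are assumptions, not amdgpu defaults, and leaving the fan parked under load is risky, so treat this as illustration only:

```shell
#!/bin/sh
# One iteration of a hypothetical userspace fan daemon for pwm1_enable=1
# (manual mode). Reads temp1_input (millidegrees C), applies an invented
# 45-80 C linear curve, and writes pwm1. Requires root; keep the loop alive
# or the fan may stay parked while the card heats up.
fan_step() {
    hwmon=$1                                        # e.g. /sys/class/drm/card0/device/hwmon/hwmon1
    t=$(( $(cat "$hwmon/temp1_input") / 1000 ))     # millidegrees -> degrees C
    if [ "$t" -lt 45 ]; then p=0                    # zero-fan idle (risky!)
    elif [ "$t" -ge 80 ]; then p=255                # full speed
    else p=$(( (t - 45) * 255 / 35 )); fi           # linear ramp
    echo "$p" > "$hwmon/pwm1"
}

# Typical use (as root):
#   while true; do fan_step /sys/class/drm/card0/device/hwmon/hwmon1; sleep 2; done
```

The fancontrol script shipped with lm-sensors implements the same idea more robustly, including restoring automatic control on exit.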
I believe this is related, though it might be a separate issue. I have an Asus RX 550. With the amdgpu driver the fan runs at what I believe is 100% all the time, and reporting doesn't work. I can change the pwm1_enable setting between 1 and 2, but it makes no difference to the fan.

$ cat /sys/class/drm/card0/device/hwmon/hwmon0/pwm1
cat: pwm1: No such device

Attempting to set pwm1 results in no change.

$ sensors
amdgpu-pci-2400
Adapter: PCI adapter
fan1:           N/A
temp1:        +45.0°C  (crit = +0.0°C, hyst = +0.0°C)

$ lsb_release -drc
Description:	Arch Linux
Release:	rolling
Codename:	n/a
$ uname -rso
Linux 4.12.10-1-ARCH GNU/Linux
$ ls /sys/class/drm/card*/device/hwmon/hwmon*/pwm*
$ ls /sys/class/drm/card*/device/hwmon/hwmon*/pwm*
/sys/class/drm/card0/device/hwmon/hwmon0/pwm1
/sys/class/drm/card0/device/hwmon/hwmon0/pwm1_enable
/sys/class/drm/card0/device/hwmon/hwmon0/pwm1_max
/sys/class/drm/card0/device/hwmon/hwmon0/pwm1_min
Lucas, the Asus RX-550 (and apparently cards from all manufacturers for this chipset) doesn't have any fan control. See my comment here: https://bugs.freedesktop.org/show_bug.cgi?id=97556#c7
Dimitrios, Good to know. Looks like my short term solution became a long term one. I unplugged the builtin fan and ziptied on another fan and connected it to a motherboard fan header. I might come back and extend the original fan header to reach the motherboard.
Yeah, I documented the workaround too: https://forum-en.msi.com/index.php?topic=298468.0

The root cause may be related to how amdgpu handles being unable to read its PowerPlay settings: some motherboard BIOSes (old AMI) don't set up MMIO BARs properly, and Intel submitted a patch that enforces restrictions on memory address regions in UEFI.
(In reply to Luke McKee from comment #8) > Yeah I documented the workaround too: > https://forum-en.msi.com/index.php?topic=298468.0 > Please stop posting this on every bug report.
(In reply to Alex Deucher from comment #9) > (In reply to Luke McKee from comment #8) > > Yeah I documented the workaround too: > > https://forum-en.msi.com/index.php?topic=298468.0 > > > > Please stop posting this on every bug report. That page is confusing and not likely related to any of these.
In this case it was on topic. The link explains how to use the fancontrol script from lm_sensors to work around fan control issues. When I first posted here I saw on another ticket that dc=1 fixed the fancontrol issues. I finally got dc=1 working, and it still doesn't resolve the DPM fancontrol issues on my platform.

https://github.com/kobalicek/amdtweak
As root:
# ./amdtweak --card 0 --verbose --extract-bios /tmp/amdbios.bin
fails. The sysfs also shows that the powerplay tables are not proper.

[ 4969.713277] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000c3fff window]
[ 4969.713283] caller pci_map_rom+0x66/0xf0 mapping multiple BARs
[ 4969.713289] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

If it can't read its powerplay table because it can't read the BIOS, maybe that's why there are all these problems.

(In reply to Alex Deucher from comment #9)
> Please stop posting this on every bug report.

https://bugs.freedesktop.org/show_bug.cgi?id=100666#c0
Also, when the users above on this ticket grepped their dmesg, they wouldn't have seen any powerplay messages, because they grepped for radeon instead of amdgpu:

[ 10.124232] amdgpu: [powerplay] failed to send message 309 ret is 254
[ 10.124248] amdgpu: [powerplay] failed to send pre message 14e ret is 254

Maybe Denis could confirm or deny whether this is in his dmesg?
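The grep point is easy to verify. The diagnostics earlier in this report filtered dmesg with "drm|radeon", which indeed skips powerplay lines; the sample line below mimics real amdgpu output rather than coming from a live system:

```shell
#!/bin/sh
# Shows why the "drm|radeon" filter used earlier in this report misses
# amdgpu powerplay errors. The sample line stands in for real dmesg output.
line='amdgpu: [powerplay] failed to send message 309 ret is 254'
echo "$line" | grep -qE 'drm|radeon'       && echo 'old filter: match' || echo 'old filter: no match'
echo "$line" | grep -qE 'amdgpu|powerplay' && echo 'new filter: match' || echo 'new filter: no match'
# Prints:
#   old filter: no match
#   new filter: match
```

On a real system the equivalent check would be `dmesg | grep -Ei 'amdgpu|powerplay'`.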
(In reply to Alex Deucher from comment #10)
> > Please stop posting this on every bug report.
>
> That page is confusing and not likely related to any of these.

You obviously know about this, sir:
https://bugs.freedesktop.org/attachment.cgi?id=135739
https://bugs.freedesktop.org/show_bug.cgi?id=98798
A new Intel patch has caused a regression back to the behaviour in this old ticket. It's using pci_info, not dev_info, now.
(In reply to Luke McKee from comment #11)
> In this case it was on topic. The link explains how to use fancontrol script
> from lm_sensors to work around fan control issues. I saw on another ticket
> when I first posted here that dc=1 fixed the fancontrol issues. Finally I
> got dc=1 working and still it doesn't resolve the dpm fancontrol issues on
> my platform.

dc and powerplay are largely independent. It's generally not likely that one will affect the other.

> https://github.com/kobalicek/amdtweak
> as root
> # ./amdtweak --card 0 --verbose --extract-bios /tmp/amdbios.bin
> fails. The sysfs shows that the powerplay tables are not proper too.

I'm not familiar with that tool or how it goes about attempting to fetch the vbios. The driver uses several mechanisms to fetch it depending on the platform. It's possible that tool does something weird to fetch the vbios, and it's possible that tool incorrectly interprets some of the vbios tables.

> [ 4969.713277] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000c3fff window]
> [ 4969.713283] caller pci_map_rom+0x66/0xf0 mapping multiple BARs
> [ 4969.713289] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

This last message is from the pci subsystem and is harmless. If the driver were not able to load the vbios, it would fail to load.

> If it can't read its powerplay table because it can't read the bios maybe
> that's why there are all these problems.

The driver is able to load the vbios image just fine. If it wasn't able to, or if there was a major problem with one of the tables, the driver would fail to load.

> (In reply to Alex Deucher from comment #9)
> > Please stop posting this on every bug report.
> https://bugs.freedesktop.org/show_bug.cgi?id=100666#c0
> Also the users above on this ticket, when they grepped their dmesg,
> wouldn't have seen any powerplay messages because they grepped radeon
> instead of amdgpu
>
> [ 10.124232] amdgpu: [powerplay] failed to send message 309 ret is 254
> [ 10.124248] amdgpu: [powerplay] failed to send pre message 14e ret is 254

There are lots of reasons an smu message might fail. Just because you see an smu message failure does not mean you are seeing the same issue as someone else. It's like a GPU hang: there are lots of potential root causes.
Alex, thanks for your help.

How it gets the ROM is probably the same as this shell script, which uses the PCI method listed on that GitHub link in the last comment:

# To read ROM you first need to write `1` to it, then read it, and then write
# `0` to it as described in the documentation. The reason is that the content
# is not provided by default; by writing `1` to it you are telling the driver
# to make it accessible.
CARD_ID=0
CARD_ROM="/sys/class/drm/card${CARD_ID}/device/rom"
FILE_ROM="amdgpu-rom.bin"

echo 1 > $CARD_ROM
cat $CARD_ROM > $FILE_ROM
echo 0 > $CARD_ROM
echo "Saved as ${FILE_ROM}"

--
Output:
cat: /sys/class/drm/card0/device/rom: Input/output error
[Not] Saved as amdgpu-rom.bin

Are there any other user-space-accessible methods to extract or write the ROM on Linux? For now I'm only comparing the PP table to other ROMs, not modifying it. Maybe powerplay is an AtomBIOS issue. If it's still broken in the 4.16-rc1 version I'm trying out now, I'll open a ticket.

For your reference, this is the ticket that claims powerplay DPM is fixed in newer kernels / with dc=1 in 4.15:
https://bugs.freedesktop.org/show_bug.cgi?id=100443#c37
(In reply to Luke McKee from comment #14)
> Alex thanks for your help.
>
> How it gets the rom is probably the same as this shell script using the pci
> method listed on that github link in the last comment.
>
> # To read ROM you first need to write `1` to it, then read it, and then write
> # `0` to it as described in the documentation. The reason is that the content
> # is not provided by default; by writing `1` to it you are telling the driver
> # to make it accessible.
> CARD_ID=0
> CARD_ROM="/sys/class/drm/card${CARD_ID}/device/rom"
> FILE_ROM="amdgpu-rom.bin"
>
> echo 1 > $CARD_ROM
> cat $CARD_ROM > $FILE_ROM
> echo 0 > $CARD_ROM
> echo "Saved as ${FILE_ROM}"
>
> --
> output:
> cat: /sys/class/drm/card0/device/rom: Input/output error
> [Not] Saved as amdgpu-rom.bin

That should generally work for desktop discrete cards. You need to be root, however.

> Is there any other user-space accessible methods to extract / write the rom
> in Linux? Now only focusing on comparing the pp table to other roms not
> modifying it.

You can read the amdgpu_vbios file in debugfs. That will dump the copy of the vbios that the driver is using.

> Maybe the powerplay is an atom-bios issue perhaps. If it's still broken in
> this 4.16-rc1 version I'm trying out now I'll open a ticket.
>
> For your reference this is the ticket that claims powerplay dpm is fixed in
> newer kernels / dc=1 in 4.15.
> https://bugs.freedesktop.org/show_bug.cgi?id=100443#c37

There's no confirmation that specifically enabling dc fixed it. Anyway, we are cluttering up this bug with potentially unrelated information. Please file a new bug for your issue and we can discuss it there.
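Whichever dump method is used, the first two bytes of the image show immediately whether the read worked: a valid PCI option ROM starts with the 0x55 0xAA signature, and the dmesg earlier in this report shows 0xffff when the read fails. A small sketch of that check (the debugfs path follows the standard amdgpu layout and requires root):

```shell
#!/bin/sh
# Sanity-check a dumped VBIOS image: a valid PCI option ROM starts with the
# 0x55 0xAA signature. Returns success (0) iff the file begins with it.
check_rom_sig() {
    sig=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Typical use (as root, with debugfs mounted):
#   cp /sys/kernel/debug/dri/0/amdgpu_vbios /tmp/vbios.bin
#   check_rom_sig /tmp/vbios.bin && echo "valid ROM header" || echo "bad ROM header"
```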
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/152.