Bug 105718

Summary: amdgpu reported fan speed looks too high (dual fan Sapphire Pulse Vega 56 and Sapphire RX 5700 XT)
Product: DRI
Reporter: Shmerl <shtetldik>
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
Severity: normal
Priority: medium
CC: mikhail.v.gavrilov, thomas
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)

Description Shmerl 2018-03-23 16:55:03 UTC
I have a Sapphire Pulse Vega 56, which ships with two fans. According to this video: https://www.youtube.com/watch?v=3mKSrSluxbM&t=1m25s
under normal conditions the fans should spin at under 650 RPM. However, at idle sensors reports around 1280 RPM for me, which looks like double that value:

amdgpu-pci-2f00
Adapter: PCI adapter
fan1: 1281 RPM
temp1: +29.0°C (crit = +0.0°C, hyst = +0.0°C)

pwm1 is set to auto:

sudo cat /sys/class/drm/card0/device/hwmon/hwmon3/pwm1_enable
2

Is this a misdetection, or is it supposed to be like that? The actual fans are pretty quiet.

System: Debian testing, x86_64 (kernel 4.15.4).
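
For anyone who wants to collect the same readings without lm-sensors, here is a minimal Python sketch that reads the fan attributes straight from the hwmon directory. The names `read_fan_state` and `PWM_MODES` are made up for illustration. The mode meanings (0 = no fan speed control, 1 = manual, 2 = automatic) are how I understand the amdgpu hwmon documentation, and the hwmon index (hwmon3 here) will differ between systems.

```python
from pathlib import Path

# Meaning of pwm1_enable for amdgpu, as I understand the kernel docs (assumption).
PWM_MODES = {0: "no fan speed control", 1: "manual", 2: "automatic"}


def _read_int(path):
    """Return the integer stored in a sysfs file, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def read_fan_state(hwmon_dir):
    """Collect fan1_input, pwm1 and the decoded pwm1_enable mode from one hwmon dir."""
    hwmon = Path(hwmon_dir)
    mode = _read_int(hwmon / "pwm1_enable")
    return {
        "fan_rpm": _read_int(hwmon / "fan1_input"),
        "pwm": _read_int(hwmon / "pwm1"),
        "mode": PWM_MODES.get(mode, "unknown"),
    }


if __name__ == "__main__":
    # Path taken from the report above; adjust card and hwmon numbers as needed.
    print(read_fan_state("/sys/class/drm/card0/device/hwmon/hwmon3"))
```

Unreadable or missing attributes come back as None rather than raising, so the same call works on cards that do not expose every file.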
Comment 1 Wilko Bartels 2018-11-16 11:25:47 UTC
(In reply to Shmerl from comment #0)
> I have Sapphire Pulse Vega 56, which ships with two fans. According to this:
> https://www.youtube.com/watch?v=3mKSrSluxbM&t=1m25s
> in a normal scenario, fans should spin at under 650 RPM. However sensors
> report for idle load around 1280 RPM for me, which looks like a double value
> of the above:
> 
> amdgpu-pci-2f00
> Adapter: PCI adapter
> fan1: 1281 RPM
> temp1: +29.0°C (crit = +0.0°C, hyst = +0.0°C)
> 
> pwm1 is set to auto:
> 
> sudo cat /sys/class/drm/card0/device/hwmon/hwmon3/pwm1_enable
> 2
> 
> Is this a misdetection, or it's supposed to be like that? Actual fans are
> pretty silent.
> 
> System: Debian testing, x86_64 (kernel 4.15.4).

I notice the same on my Vega Pulse.
In your example I'm pretty sure the fans are actually off, as they should be at such a low temperature. The RPM reading seems to work as long as the speed is going up, but when the temperature falls below 55°C the fans stop, and the RPM value is no longer updated and stays at its last reading.
Comment 2 Wilko Bartels 2018-11-16 11:31:15 UTC
*** Bug 108352 has been marked as a duplicate of this bug. ***
Comment 3 wedens13 2019-07-23 05:56:17 UTC
I have exactly the same issue with a Sapphire Pulse Vega 56. It also reports an unreasonably high value (around 3000 RPM) as the maximum fan speed.

I found this thread: https://www.hwinfo.com/forum/Thread-Solved-SAPPHIRE-PULSE-Radeon%E2%84%A2-RX-Vega56-8G-HBM2-Wrong-FAN-RPM
Apparently this issue was solved on Windows with an AMD driver update.
Comment 4 Shmerl 2019-08-20 18:12:16 UTC
By the way, after I had used this card for a while, newer firmware fixed the fan curves that were getting stuck, though the top-level RPMs remain the same. Over time I stopped considering them too high. I think they are appropriate for the card.
Comment 5 Shmerl 2019-09-06 04:50:04 UTC
Actually, I've noticed another similar issue. I just got a Sapphire Pulse RX 5700 XT. It's also dual fan.

According to this: https://www.gamersnexus.net/hwreviews/3498-sapphire-rx-5700-xt-pulse-review

The top fan speed (at high load with the higher-performance BIOS setting) is around 1570 RPM, while sensors reports a max of 3200 RPM for me!

And a fan load of about 50% is shown as around 1600 RPM!

So I'd say something is definitely off. It's almost as if the values from both fans are added up instead of showing the actual speed of one.
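
A quick way to sanity-check the "added up" theory is to divide each reported value by the figure from the corresponding review. The sketch below only does that arithmetic with numbers already quoted in this bug; `ratio_to_expected` is a made-up helper name, and the 650 RPM figure is an upper bound ("under 650"), so the first ratio is a lower estimate.

```python
def ratio_to_expected(reported_rpm, expected_rpm):
    """How many times larger the reported RPM is than the reference figure."""
    return reported_rpm / expected_rpm


# (value reported by sensors, figure from the linked video/review)
samples = {
    "Vega 56 idle": (1281, 650),
    "RX 5700 XT max": (3200, 1570),
}

for label, (reported, expected) in samples.items():
    print(f"{label}: {ratio_to_expected(reported, expected):.2f}x")
# Vega 56 idle: 1.97x
# RX 5700 XT max: 2.04x
```

Both ratios land close to 2, which is consistent with a doubled reading but does not prove it, since the reference figures come from different measurement setups.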
Comment 6 Shmerl 2019-09-06 06:03:17 UTC
Hm, it could be radeon-profile misreporting the fan load percentage by using the max value of 3200 reported by sensors. That's where I got the 50% I mentioned above.
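
To illustrate how much the percentage depends on the chosen maximum, here is a small sketch. It assumes the tool simply divides the current RPM by a maximum; I have not checked how radeon-profile actually computes it, and `fan_percent` is a hypothetical helper.

```python
def fan_percent(rpm, max_rpm):
    """Fan load as a percentage of max_rpm (assumed formula, not radeon-profile's code)."""
    if max_rpm <= 0:
        raise ValueError("max_rpm must be positive")
    return 100.0 * rpm / max_rpm


# Same 1600 RPM reading against the two candidate maxima mentioned in this bug.
print(fan_percent(1600, 3200))            # 50.0  (max reported by sensors)
print(round(fan_percent(1600, 1570), 1))  # 101.9 (max from the review)
```

With the sensors maximum the reading looks like half load, while against the review figure the same reading would already be slightly above full speed.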

I just ran The Witcher without vsync (Wine+dxvk), which consistently loads the GPU to almost 100%.

Here is what I see:

From sensors:

amdgpu-pci-0f00
Adapter: PCI adapter
vddgfx:       +1.12 V
fan1:        1892 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +76.0°C  (crit = +118.0°C, hyst =  +0.0°C)
                       (emerg = +80000.0°C)
junction:     +94.0°C  (crit = +80000.0°C, hyst =  +0.0°C)
                       (emerg = +80000.0°C)
mem:          +88.0°C  (crit = +80000.0°C, hyst =  +0.0°C)
                       (emerg = +80000.0°C)
power1:      182.00 W  (cap = 195.00 W)

From: /sys/kernel/debug/dri/0/amdgpu_pm_info

GPU Temperature: 76 C
GPU Load: 99 %
MEM Load: 50 %

So the fan shows around 1900 RPM at full load and 76°C. That is closer to the 1570 RPM described in the Gamers Nexus review. So maybe it's correct?

It would be good to get some input from AMD, from those who know something about VBIOS and Navi firmware.
Comment 7 Andrew Sheldon 2019-09-09 06:30:11 UTC
(In reply to Shmerl from comment #5)
> Actually, I've noticed another similar issue. I just got Sapphire Pulse RX
> 5700 XT. It's also dual fan.
> 
> According to this:
> https://www.gamersnexus.net/hwreviews/3498-sapphire-rx-5700-xt-pulse-review
> 
> The top level of fan rotation (at high load and more performance BIOS
> setting) is around 1570 rpm. While sensors report that max is 3200 rpm for
> me!
> 
> And something like 50% load (of the fan) is shown as around 1600 RPM!
> 
> So I'd say something is definitely off. It's almost like values from both
> fans are added up, instead of showing actual one.

You could use UPP (https://github.com/sibradzic/upp) to see what your powerplay tables report as the maximum fan speed. On the MSI Evoke 5700 XT, mine reports a FanMaximumRpm of 3200. I believe that in the case of the Evoke, some benchmarks have shown it can get near 3200 RPM at max speed.

I don't know if the driver impacts the values here, or if this is strictly based on the BIOS. But it might be helpful to see if that max matches what the sensors are reporting.
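
A minimal sketch of that comparison, assuming you have already read the FanMaximumRpm value off UPP by hand and that the card's hwmon directory exposes `fan1_max` (the source of the "max" shown by sensors). `max_matches` is a made-up helper and does not call UPP itself.

```python
from pathlib import Path


def max_matches(hwmon_dir, table_max_rpm):
    """Compare hwmon's fan1_max with a powerplay-table maximum supplied by the caller.

    Returns (matches, reported_max), where reported_max is the value in fan1_max.
    """
    reported = int((Path(hwmon_dir) / "fan1_max").read_text().strip())
    return reported == table_max_rpm, reported
```

For example, `max_matches("/sys/class/drm/card0/device/hwmon/hwmon3", 3200)` would tell you whether the sensors maximum is just the table value passed through; the hwmon path is only an example and varies per system.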
Comment 8 Martin Peres 2019-11-19 08:33:18 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/335.
