Summary: [NVCF] PWM fan speed too high
Product: xorg
Component: Driver/nouveau
Status: RESOLVED MOVED
Severity: normal
Priority: medium
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Invalid Invalid <gianni>
Assignee: Nouveau Project <nouveau>
QA Contact: Xorg Project Team <xorg-team>
CC: kpschrage, lars, pepko94
Description
Invalid Invalid
2014-07-04 07:33:44 UTC
Created attachment 102244 [details]
Dmesg
Created attachment 102245 [details]
vbios.rom
Created attachment 102246 [details]
vbios.rom
Created attachment 102247 [details]
xorg.log
*** Bug 80900 has been marked as a duplicate of this bug. ***
Created attachment 102251 [details]
sensors.nouveau
Created attachment 102252 [details]
sensors.nvidia_levels
Sorry, I forgot to mention: the temperature is around 40-50°C in both cases. This is not a heavy-duty machine, so it is usually dead silent.
Created attachment 102255 [details]
nvidia-smi output
Created attachment 102256 [details]
lspci -vv output
Thanks for all this information. It seems like NVIDIA changed the default minimum temperature before increasing the fan speed (http://code.woboq.org/linux/linux/drivers/gpu/drm/nouveau/core/subdev/therm/fan.c.html#195). Based on your information, I would guess it is set to 60°C instead of the 40°C I saw on some cards. It is very possible that they set a different value per chipset.

I'll check again with a recent version of the blob what the default temperature is at which the fan speed starts being increased, but I don't think the fix should be to bump the value (which I will set to the minimum value I found across the boards). The proper fix would be to let users change this value in sysfs.

I'll keep the bug open as a reminder, but my time is very limited this summer. If you feel like writing the patch (in nouveau_hwmon.c), I would gladly review it! Thanks for reporting :)
Created attachment 102433 [details]
PATCH: add LINEAR_MIN and LINEAR_MAX to sysfs
Tentative patch.
NOTE: It compiles, but I've not tested it yet, since at the moment I don't have an NVIDIA machine around.
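To make the discussion concrete, here is a minimal sketch of a linear fan ramp between two temperature bounds, which is the behaviour the LINEAR_MIN/LINEAR_MAX values are meant to tune. The function name, its argument order, and the duty-cycle bounds are illustrative assumptions; this is not the nouveau implementation.

```shell
# Hypothetical helper: linear fan ramp between two temperatures.
# Usage: linear_duty TEMP LINEAR_MIN LINEAR_MAX DUTY_MIN DUTY_MAX
# All arguments are integers (degrees C and percent).
linear_duty() {
    temp=$1; tmin=$2; tmax=$3; dmin=$4; dmax=$5
    if [ "$temp" -le "$tmin" ]; then
        echo "$dmin"            # below the ramp: stay at minimum duty
    elif [ "$temp" -ge "$tmax" ]; then
        echo "$dmax"            # above the ramp: maximum duty
    else
        # Integer linear interpolation between (tmin,dmin) and (tmax,dmax)
        echo $(( dmin + (temp - tmin) * (dmax - dmin) / (tmax - tmin) ))
    fi
}
```

With an assumed 30-100% duty range and a 90°C upper bound (both made up for this example), `linear_duty 50 40 90 30 100` yields 44 while `linear_duty 50 60 90 30 100` yields 30. A card idling at 40-50°C would therefore spin faster with a 40°C lower bound than with the 60°C one guessed above.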
Comment on attachment 102433 [details]
PATCH: add LINEAR_MIN and LINEAR_MAX to sysfs

Thanks, this patch is perfectly sound! However, it doesn't follow the sysfs interface of hwmon as defined here: https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface

I don't particularly like this interface, but I've tried to stay as close to it as possible. One possibility could be to use the trip points to expose linear_min/max; however, we would be bending the definition of trip point by a long shot (we are not supposed to scale linearly between trip points, at least that's how NVIDIA defines trip points; that's a question worth asking the hwmon guys).

If we are to keep your patch more or less intact, you would have to move the end result to nouveau_sysfs.c. You would also need to change the names to temp1_fan_linear_min/max, to improve the clarity of what values are expected in there :) Finally, you would need to update the documentation (http://cgit.freedesktop.org/nouveau/linux-2.6/tree/Documentation/thermal/nouveau_thermal).

Thanks again for your interest in fixing this!

The problem I see with trip points is that they set a fixed PWM value when the sensors detect a certain temperature. The nouveau driver instead raises the fan speed "continuously" after a certain temperature is reached (would we need infinite trip points for that?). Please correct any error in my understanding here :)

Maybe we can use the trip points as a start/end point and then move autonomously, disregarding the point[0-*]_pwm value?

For now, if you agree, I will move the sysfs values as suggested (and change the documentation), hopefully before/during next weekend. So this bug will still have an almost "clean" patch for anyone (me included) who wants it. Then I will ask around on the lm-sensors mailing list (is it the correct one?). Hopefully someone can point me to a better solution.

Thanks for the review! :)

(In reply to comment #14)
Yes, you understood my review and have a perfect todo list! Good luck with it :)
Created attachment 102562 [details]
PATCH v2 [1/2]: add LINEAR_MIN and LINEAR_MAX to sysfs
Created attachment 102563 [details]
PATCH v2 [1/2]: add LINEAR_MIN and LINEAR_MAX to sysfs
Created attachment 102564 [details]
PATCH v2 [2/2]: add new attributes to documentation
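The objection to trip points raised above is that they describe a step function rather than a ramp. A minimal sketch of that step behaviour, assuming a hypothetical `trip_point_duty` helper and a made-up `TEMP:DUTY` list format (neither comes from hwmon or nouveau):

```shell
# Hypothetical helper: fixed-PWM trip points (a step function).
# Usage: trip_point_duty TEMP "TEMP1:DUTY1 TEMP2:DUTY2 ..."
# Points must be listed in ascending temperature order.
trip_point_duty() {
    temp=$1
    duty=0                       # assumed duty below the first trip point
    for point in $2; do
        t=${point%%:*}           # text before the colon: temperature
        d=${point#*:}            # text after the colon: duty
        # The last trip point at or below TEMP wins
        if [ "$temp" -ge "$t" ]; then
            duty=$d
        fi
    done
    echo "$duty"
}
```

With the example points `"40:30 60:60 80:100"`, the duty stays at 30 for every temperature from 40 to 59 and jumps to 60 at exactly 60. Approximating a continuous ramp this way would need one trip point per degree, which is the "infinite trip points" concern in the comment.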
Those new patches should (hopefully) be cleaner than the previous one.

(In reply to comment #19)
> Those new patches should (hopefully) be cleaner than the previous one.

They are, and they seem to work as expected. I would now like to hear back from the hwmon guys. Could you contact them (please CC: nouveau@lists.freedesktop.org)?
Created attachment 102785 [details]
Hwmon email
Email sent. Text in attachment for reference.
(In reply to comment #21)
> Created attachment 102785 [details]
> Hwmon email
>
> Email sent. Text in attachment for reference.

Any update on this?

Sorry, but I find myself short on time currently. I will start working on this again as soon as I can.

(In reply to comment #23)
> Sorry but I find myself short on time since currently.
> I will start working on this again as soon as I can.

Ok, good luck! TTYL then ;)

It seems that I have been struck by this very bug as well, for several months now.

Fedora 21, kernel-3.17.7-300.fc21.x86_64 (but it started on Fedora 20, around kernel-3.15.5-200.fc20.x86_64)

Graphics card:
01:00.0 VGA compatible controller: NVIDIA Corporation GF108 [GeForce GT 630] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 8a90
        Flags: bus master, fast devsel, latency 0, IRQ 43
        Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
        Memory at e8000000 (64-bit, prefetchable) [size=128M]
        Memory at f0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at e000 [size=128]
        Expansion ROM at f7000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nouveau
        Kernel modules: nouveau

It seems as if the problems started with this commit:
http://lists.freedesktop.org/archives/nouveau/2014-March/016589.html
At least, reverting this patch in the nouveau source is a workaround; see the discussion on Red Hat Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=1121331

BTW, isn't bug 84721 a duplicate of this one? It is the very same NVIDIA controller as mine that is affected.

(In reply to K.-P. Schrage from comment #25)
> [...]
> BTW, isn't bug 84721 a duplicate of this one? It is the very same NVIDIA
> controller as mine that is affected.

It depends on your problem. If by "too high" you mean the fan is at 100% constantly, then the bug is a duplicate. If you mean the default fan speed is 35% instead of 30%, then the bugs are not related and this is a bug hijacking :p Please send your vbios in your answer and I will tell you whether or not to open a new bug report.

I am currently in the process of moving to another home, but I can still write code for you to try :)
Created attachment 111288 [details]
GeForce GT 630 vbios.rom
vbios.rom: cat /sys/kernel/debug/dri/0/vbios.rom >vbios.rom
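For anyone else asked to attach a vbios, the `cat` command above can be wrapped with a couple of sanity checks. This is only a sketch: `dump_vbios` is a hypothetical helper name, and the default source path is simply the one used in this comment (card index 0), which may differ on other machines.

```shell
# Hypothetical wrapper around the cat command above.
# Usage: dump_vbios [SOURCE] [DEST]
dump_vbios() {
    src=${1:-/sys/kernel/debug/dri/0/vbios.rom}
    dest=${2:-vbios.rom}
    if [ ! -r "$src" ]; then
        echo "cannot read $src (is debugfs mounted, are you root?)" >&2
        return 1
    fi
    cat "$src" > "$dest" || return 1
    # An empty dump is useless to attach to a bug report
    [ -s "$dest" ]
}
```

Calling `dump_vbios` with no arguments behaves like the original one-liner, except that it returns a non-zero status when the source is unreadable or the resulting file is empty.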
(In reply to Martin Peres from comment #26)
> > BTW, isn't bug 84721 a duplicate of this one? It is the very same NVIDIA
> > controller as mine that is affected.
>
> It depends on your problem. If by "too high", you mean fan at 100%
> constantly, then the bug is a duplicate. If you mean the default fan speed
> is 35% instead of 30%, then the bugs are not related and this is a bug
> hijacking :p Please send your vbios in your answer and I'll or will not
> advise you to open a new bug report.
>
> I am currently in the process of moving to another home, but I can still
> write code for you to try :)

Thanks for caring ... yes, the fan is running at full speed constantly.

I am the original poster of the Fedora bug report mentioned by K.-P. Schrage, https://bugzilla.redhat.com/show_bug.cgi?id=1121331

I have the exact same problem as the original poster of this bug report, but with a different graphics card. That is, the fan on the card works nicely with kernels below 3.15, but all kernels from 3.15 onwards peg the fan.

Reverting the commit mentioned by K.-P. Schrage, http://lists.freedesktop.org/archives/nouveau/2014-March/016589.html, solves the issue for me. The fan returns to how it behaved before kernel 3.15.

I am using an NVIDIA GF106GL (Quadro 2000) card, lspci:
01:00.0 VGA compatible controller: NVIDIA Corporation GF106GL [Quadro 2000] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: NVIDIA Corporation Device 084a
        Flags: bus master, fast devsel, latency 0, IRQ 30
        Memory at f0000000 (32-bit, non-prefetchable) [size=32M]
        Memory at e0000000 (64-bit, prefetchable) [size=128M]
        Memory at e8000000 (64-bit, prefetchable) [size=64M]
        I/O ports at e000 [size=128]
        Expansion ROM at f2000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nouveau
        Kernel modules: nouveau

At the moment I am using the following kernel:
Linux tux 3.17.7-300.fc21.x86_64 #1 SMP Wed Dec 17 03:08:44 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

vbios.rom will follow...
Created attachment 111308 [details]
vbios.rom from NVIDIA Corporation GF106GL
Any news on this issue? I just had to re-compile the module for yet another kernel update... Is more information needed to solve the issue?

(In reply to Lars E Pettersson from comment #31)
> Any news on this issue? I just had to re-compile the module for yet another
> kernel update...

Yes, for me as well, the issue has reached the 3.18 kernel line (now on 3.18.3-201.fc21.x86_64).

Hey guys, I may finally have managed to reproduce your bug. To check that, I need you to install envytools and send me the result of the following command:
nvapeek e114 10

When selecting the manual fan management mode, you should be able to bring down the fan speed by running:
nvapoke e118 80000005
or
nvapoke e120 80000005

In any case, please open a separate bug report as this bug clearly is not related to your bug.

(In reply to Martin Peres from comment #33)

Hello, Martin,

thank you for caring!

# nvapeek e114 10
0000e114: 0000021c 000000d8 00000001 00000000

(that's with the latest nouveau driver from darktama)

After enabling manual fan control mode (pwm1_enable = 1), 'nvapoke e118 80000005' audibly reduces the fan speed somewhat, but it still seems to be too high (pwm1 shows a value of 0).

FWIW, after 'nvapoke e118 80000005', the output of the nvapeek command has changed to
0000e114: 0000021c 00000005 00000001 00000000

(In reply to K.-P. Schrage from comment #34)
Ok, try to boot with nouveau blacklisted, then run nvapeek e114 10 again and send me the result. We may be on to something here.

(In reply to Martin Peres from comment #35)
> Ok, try to boot with nouveau blacklisted then run nvapeek e114 10 again and
> send me the result. We may be on to something here.

# nvapeek e114 10
0000e114: 0000021c 00000002 00000001 00000000

(nouveau hopefully killed: blacklist.conf, grub commandline, dracut)

(In reply to K.-P. Schrage from comment #36)
It worked. Set fan management to manual, then nvapoke e118 80000002, and you'll get quietness.

I kind of have the same problem with the other nvc1 I have access to at work. I'll be digging into this to fix this problem for good. Good that I got access to this board; your bug would have been a mystery otherwise... Thanks!

(In reply to Martin Peres from comment #37)
> It worked. Set fan management to manual then nvapoke e118 80000002 and
> you'll get quietness.

Yes, it works (Silence Is Golden ...)

I put the two lines
echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1/pwm1_enable
/usr/local/bin/nvapoke e118 80000002
into /etc/rc.d/rc.local (don't know any other way to make this workaround permanent).

Thanks again!

OK, let's see...

# ./nvapeek e114 10
0000e114: 0000021c 000000f9 00000001 00000000

Sometimes I get the following answer:
0000e114: 0000021c 000000fe 00000001 00000000

I then select manual control and do the following:
[root@tux nva]# ./nvapoke e118 80000005
[root@tux nva]# ./nvapeek e114 10
0000e114: 0000021c 00000005 00000001 00000000

Silent fan. Check with the 'sensors' command. RPM is 0! Temperature rising! Set fan control back to the old setting and the fan returns to normal.

Restart with the module blacklisted.

[root@tux nva]# ./nvapeek e114 10
0000e114: 0000021c 000000a2 00000001 00000000

Restart with the nouveau module in place again.

You asked K.-P. Schrage to use 'nvapoke e118 80000002'. As I have 000000a2, I instead tried the following:
[root@tux nva]# ./nvapoke e118 800000a2

(If I try 'nvapoke e118 80000002' the RPM goes down to 0 rpm, so I think that 800000a2 is correct for me to use. The RPM then stays slightly above 2000 rpm and the temperature is at about 60-65 degC, as it was before the change in the 3.15 kernel.)

Not sure what all these numbers mean though... :)

(In reply to K.-P. Schrage from comment #38)
...
> I put the two lines
> echo 1 >
> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1/pwm1_enable
> /usr/local/bin/nvapoke e118 80000002
> into /etc/rc.d/rc.local (don't know any other way to make this workaround
> permanent).

$ udevadm info -a -p /sys/class/drm/card0 | grep -m2 'ATTRS{device}\|ATTRS{vendor}'
ATTRS{device}=="0xabcd"
ATTRS{vendor}=="0x1234"

/etc/udev/rules.d/10-unladen-swallow.rules
ACTION=="add", ATTRS{vendor}=="0x1234", ATTRS{device}=="0xabcd", RUN+="/bin/sh -c '/bin/echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1/pwm1_enable'", RUN+="/usr/local/bin/nvapoke e118 80000002"

(In reply to Lars E Pettersson from comment #39)
...
> Not sure what all these numbers means though... :)

'nvapeek' might be something like "nvidia peek(read)" and 'nvapoke' might be something like "nvidia poke(write)", MMIO regs.

Addresses 'e114' & 'e118' with their 'values' should fall into this range:
- G80:GF100 MMIO map
  0x00e000 all PNVIO GPIOs, I2C buses, PWM fan control, and other external devices
OR
- GF100+ MMIO map
  0x00e000 all PNVIO GPIOs, I2C buses, PWM fan control, and other external devices

How Martin reached the actual addresses and values, he knows better. :)

Ref.
PEEK and POKE
http://en.wikipedia.org/wiki/PEEK_and_POKE
MMIO
http://en.wikipedia.org/wiki/Memory-mapped_I/O
MMIO register ranges
http://envytools.readthedocs.org/en/latest/hw/mmio.html
PNVIO: external device interface
http://envytools.readthedocs.org/en/latest/hw/io/pnvio.html
Pokémon
http://en.wikipedia.org/wiki/Pokémon

(In reply to poma from comment #40)
> /etc/udev/rules.d/10-unladen-swallow.rules
> [...]

Thanks, Poma, this rule (all in one line, with my appropriate device and vendor IDs) seems to work correctly, but it only reduces the fan speed for a second or so during the boot process; then the speed is up again, and the value that nvapoke writes is overwritten (800000d8 instead of 80000002). Perhaps this rule comes up too early, but changing the prefix number from 10 to e.g. 99 doesn't help.

(In reply to K.-P. Schrage from comment #42)
Exactly, these are one-liners.

There is no GPU fan here, so I tested the CPU fan, and this is how it works:
/etc/udev/rules.d/10-cpu-fan-manual-mode.rules
RUN+="/bin/sh -c '/bin/echo 1 > /sys/devices/platform/it87.656/pwm1_enable'"

Therefore remove the ACTION & ATTRS part, so it runs unconditionally.

Before you reboot, check with:
# udevadm trigger

(In reply to poma from comment #43)
Now my 10-...rules file in /etc/udev/rules.d/ looks like this:
RUN+="/bin/sh -c '/bin/echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1/pwm1_enable'", RUN+="/usr/local/bin/nvapoke e118 80000002"

During boot, it sounds as if it gets triggered several times (noise-silence-noise-silence), but it ends up with a silent GPU fan when the graphical desktop has started.

Startup logs are flooded with messages like:

nvapoke:473 conflicting memory types e8000000-f0000000 uncached-minus<->write-combining
reserve_memtype failed [mem 0xe8000000-0xefffffff], track uncached-minus, req uncached-minus

(In reply to K.-P. Schrage from comment #44)
# rm /etc/udev/rules.d/10-gpu-fan.rules

And try this:
/etc/systemd/system/gpu-fan.service
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Unit]
Description=GPU fan lower speed

[Service]
Type=oneshot
ExecStart=/bin/sh -c '/bin/echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1/pwm1_enable'
ExecStart=/usr/local/bin/nvapoke e118 80000002
StandardOutput=null
StandardError=null

[Install]
WantedBy=basic.target
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# systemctl enable gpu-fan.service
# systemctl start gpu-fan.service
# systemctl status gpu-fan.service

If everything is OK:
# systemctl reboot
...

BOOT
...
# systemctl status gpu-fan.service

(In reply to poma from comment #45)
After reboot:
[root@linux_keller kp]# systemctl status gpu-fan.service
● gpu-fan.service - GPU fan lower speed
   Loaded: loaded (/etc/systemd/system/gpu-fan.service; enabled)
   Active: inactive (dead) since Sa 2015-03-14 11:40:03 CET; 33s ago
  Process: 683 ExecStart=/usr/local/bin/nvapoke e118 80000002 (code=exited, status=0/SUCCESS)
  Process: 672 ExecStart=/bin/sh -c /bin/echo 1 > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1/pwm1_enable (code=exited, status=0/SUCCESS)
 Main PID: 683 (code=exited, status=0/SUCCESS)

Mär 14 11:40:03 linux_keller systemd[1]: Starting GPU fan lower speed...
Mär 14 11:40:03 linux_keller systemd[1]: Started GPU fan lower speed.
---------------
BUT: Fan speed is high, and register e118 shows the value 000000d8, not 00000002 as expected. I have to restart the service manually to calm down the fan.

Let me be honest, poma: I am very grateful for all your help, but I think I'll stick to the old-school rc.local method, which seems rather straightforward to me (although even that is now governed by systemd and not so old-school anymore).

(In reply to K.-P. Schrage from comment #46)
No problemos. ;)

Sorry guys, I'm back at the problem ... again ...

I really want to fix this upstream!

(In reply to Martin Peres from comment #48)
> Sorry guys, I'm back at the problem ... again ...
>
> I really want to fix this upstream!

Fine. Tell me if I can supply any more information.

Any news on this issue? I removed the lines mentioned in comment 39 to see what happens, and up goes the fan speed. I.e. the problem seems to still exist... :(

Running kernel at the moment is:
Linux tux.home.rpz 4.4.6-300.fc23.x86_64 #1 SMP Wed Mar 16 22:10:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

The bug is still there. I am now running Fedora 24 with the following kernel:
Linux tux.home.rpz 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

The fan noise is annoying... Any news on this issue?
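The nvapeek/nvapoke exchange earlier in this thread follows a simple pattern: read `nvapeek e114 10` with nouveau blacklisted, take the second data word, and write it back to e118 with the top bit set. A minimal sketch of that value computation follows. Two things here are assumptions rather than documented facts: that the second word printed is the content of e118 (inferred from it reading 00000005 right after `nvapoke e118 80000005`), and that the top bit is required (every value written in this thread has it, but its meaning is not explained). `poke_value_from_peek` is a hypothetical helper name.

```shell
# Hypothetical helper: build the nvapoke value from a saved nvapeek line.
# Usage: poke_value_from_peek "0000e114: 0000021c 00000002 00000001 00000000"
poke_value_from_peek() {
    # Split on whitespace: $1 is the address label, $2 is the word at e114,
    # $3 is the word at e118 (assumed, see the note above)
    set -- $1
    # Set the top bit, as every nvapoke value in this thread does
    printf '%08x\n' $(( 0x$3 | 0x80000000 ))
}
```

Fed the two blacklisted readings posted above, this gives 80000002 for the GT 630 and 800000a2 for the Quadro 2000, matching what each reporter ended up writing. The values are clearly per-card: 80000002 stopped the Quadro 2000 fan entirely, so a value taken from someone else's card should not be reused.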
Just an update that the bug is still there. Running kernel at the moment:

Linux tux.home.rpz 4.8.11-200.fc24.x86_64 #1 SMP Mon Nov 28 19:36:57 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Will this bug be fixed?

(In reply to Lars E Pettersson from comment #52)
> Just an update that the bug is still there. Running kernel at the moment:
>
> Linux tux.home.rpz 4.8.11-200.fc24.x86_64 #1 SMP Mon Nov 28 19:36:57 UTC
> 2016 x86_64 x86_64 x86_64 GNU/Linux
>
> Will this bug be fixed?

Hey, I have not forgotten you. I found the information in the table needed to do the right thing for your fan, and I sort of managed to make sense of it... but I am apparently unable to make a model of what the proprietary driver does :s

It is so frustrating because I compute the right value most of the time, but when I don't, the error is quite catastrophic.

I will now swallow my pride and ask for help from Nvidia :s

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/116.