Bug 66177 - pwm1 value not restored during hibernate/restore cycle in the event of manual fan management mode
Summary: pwm1 value not restored during hibernate/restore cycle in the event of manual...
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: Martin Peres
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
: 66022 (view as bug list)
Depends on:
Blocks:
 
Reported: 2013-06-25 22:27 UTC by Mr-4
Modified: 2013-08-13 02:58 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
Restore the pwm1_enable and pwm1 (in case of manual fan management) after hibernate/restore (1.49 KB, patch)
2013-06-25 22:30 UTC, Mr-4
no flags Details | Splinter Review
set fan_mode and fan_duty at boot up (2.17 KB, patch)
2013-06-25 22:32 UTC, Mr-4
no flags Details | Splinter Review
[PATCH 1/5] drm/nouveau/therm: Set the correct pwm_mode upon resume (977 bytes, patch)
2013-08-09 08:06 UTC, Martin Peres
no flags Details | Splinter Review
[PATCH 2/5] drm/nouveau/fan: restore the PWM value on resume when using the manual or auto mode (1.91 KB, patch)
2013-08-09 08:07 UTC, Martin Peres
no flags Details | Splinter Review
[PATCH 3/5] drm/nouveau/timer: restore the time on resume (2.83 KB, patch)
2013-08-09 08:08 UTC, Martin Peres
no flags Details | Splinter Review
[PATCH 4/5] drm/nouveau/timer: add a way to cancel alarms (3.55 KB, patch)
2013-08-09 08:08 UTC, Martin Peres
no flags Details | Splinter Review
[PATCH 5/5] drm/nouveau/therm: survive to suspend/resume cycles (5.08 KB, patch)
2013-08-09 08:09 UTC, Martin Peres
no flags Details | Splinter Review

Description Mr-4 2013-06-25 22:27:11 UTC
When I have done the following:

echo 1 > /sys/class/hwmon/hwmon0/pwm1_enable 
echo 30 > /sys/class/hwmon/hwmon0/pwm1 

pwm1=30
pwm1_enable=1
pwm1_max=100
pwm1_min=0
temp1_auto_point1_pwm=100
temp1_auto_point1_temp=90000
temp1_auto_point1_temp_hyst=3000
temp1_crit=115000
temp1_crit_hyst=2000
temp1_emergency=135000
temp1_emergency_hyst=5000
temp1_input=41000
temp1_max=95000
temp1_max_hyst=3000
update_rate=1000

and then do hibernate/restore, I get this:

pwm1=100
pwm1_enable=1
pwm1_max=100
pwm1_min=0
temp1_auto_point1_pwm=100
temp1_auto_point1_temp=90000
temp1_auto_point1_temp_hyst=3000
temp1_crit=115000
temp1_crit_hyst=2000
temp1_emergency=135000
temp1_emergency_hyst=5000
temp1_input=37000
temp1_max=95000
temp1_max_hyst=3000
update_rate=1000

The "pwm1_enable" value is restored, thanks to the patch proposed by Emil Velikov in Bug 66022, but that doesn't restore the pwm1 value in case of manual fan management. I am attaching 2 patches to this report:

1. which fixes this bug and restores everything ("pwm1_enable", as well as "pwm1"); and

2. a separate patch, allowing fan_mode and fan duty (percentage) to be specified at the command line, allowing me to set my fan speed immediately and not waiting for OS scripts to do that for me as was the case up until now.

Since the automatic fan management is, well, lets just say incomplete, that's currently the best option to not have my fan wail like a jet engine every time I boot up and use my machine.

Patches follow...
Comment 1 Mr-4 2013-06-25 22:30:21 UTC
Created attachment 81430 [details] [review]
Restore the pwm1_enable and pwm1 (in case of manual fan management) after hibernate/restore

This patch restores pwm1_enable as well as pwm1 (in case of manual fan management) values after hibernate/restore.
Comment 2 Mr-4 2013-06-25 22:32:02 UTC
Created attachment 81431 [details] [review]
set fan_mode and fan_duty at boot up

The following patch allows fan_mode and fan_duty (percentage) to be specified at the kernel command line
Comment 3 Martin Peres 2013-06-26 15:37:03 UTC
The real solution would be to restore all sysfs attributes after reboot. The proposed solution doesn't seem scalable :s However, I fully agree something should be done here!

I would rather see an automatic fan management by default and some overrides to disable it though. Please use the "config" parameter to specify them instead of adding parameters to nouveau, it seems to be how darktama wants things to work.

Automatic fan management is supposed to work on all cards < NVC0 (which has it at boot time, by default). I discussed enabling it by default for Linux 3.11 but it seems we'll need to wait until 3.12.
Comment 4 Mr-4 2013-06-26 22:49:29 UTC
(In reply to comment #3)
> The real solution would be to restore all sysfs attributes after reboot.
Indeed, I agree.

> The proposed solution doesn't seem scalable :s
I never claimed that it is - the patch was submitted (note that it wasn't formal submission either) barely as a guidance - it-works-for-me (tm) (;

> However, I fully agree something should be done here!
Yep.

> I would rather see an automatic fan management by default and some overrides
> to disable it though. Please use the "config" parameter to specify them
> instead of adding parameters to nouveau, it seems to be how darktama wants
> things to work.
How do I "use the config" exactly? And who is "darktama" - is that Ben Skeggs?

> Automatic fan management is supposed to work on all cards < NVC0 (which has
> it at boot time, by default). I discussed enabling it by default for Linux
> 3.11 but it seems we'll need to wait until 3.12.
I am, quite frankly, amazed that something so raw and incomplete as the automatic fan management could make it into the mainstream kernel code - Ben is usually thorough as far as these things are concerned and I don't know how this slipped through.

The "automatic" mode doesn't work - far from it, as evident from my other submission (and an additional feedback provided) - see Bug #66022 (and Comment #5 in particular).
Comment 5 Martin Peres 2013-06-28 15:40:49 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > The real solution would be to restore all sysfs attributes after reboot.
> Indeed, I agree.
> 
> > The proposed solution doesn't seem scalable :s
> I never claimed that it is - the patch was submitted (note that it wasn't
> formal submission either) barely as a guidance - it-works-for-me (tm) (;

Ok.

> How do I "use the config" exactly? And who is "darktama" - is that Ben
> Skeggs?

Yes, Darktama is Ben Skeggs. The config option is a string that is used to pass multiple parameters.

> > Automatic fan management is supposed to work on all cards < NVC0 (which has
> > it at boot time, by default). I discussed enabling it by default for Linux
> > 3.11 but it seems we'll need to wait until 3.12.
> I am, quite frankly, amazed that something so raw and incomplete as the
> automatic fan management could make it into the mainstream kernel code - Ben
> is usually thorough as far as these things are concerned and I don't know
> how this slipped through.
> 
> The "automatic" mode doesn't work - far from it, as evident from my other
> submission (and an additional feedback provided) - see Bug #66022 (and
> Comment #5 in particular).

Well, I agree that it is my fault that I may not have thought about the suspend/resume cycle. Thanks for reporting that. However, I'm afraid you are hit by another bug, a bug in the ALARM mechanism, which explains why sometimes, the fan speed isn't updated anymore. This is very worrying and may force us to move to using Linux's periodic tasks instead.

Thanks for this review. I'll work on fixing this ASAP. I can work on the sysfs restoration but I have no card with a PWM fan with me here. I may then require you to test some stuff.
Comment 6 Mr-4 2013-06-28 16:14:48 UTC
(In reply to comment #5)
> Well, I agree that it is my fault that I may not have thought about the
> suspend/resume cycle. Thanks for reporting that. However, I'm afraid you are
> hit by another bug, a bug in the ALARM mechanism, which explains why
> sometimes, the fan speed isn't updated anymore. This is very worrying and
> may force us to move to using Linux's periodic tasks instead.
It is indeed a serious bug as the average user may not be as careful as I was or may not necessarily pay attention to the card temperature after hibernate/restore, particularly if the "auto" fan management was working prior to that. If that happens, the video card is almost certain to burn out!

> Thanks for this review. I'll work on fixing this ASAP. I can work on the
> sysfs restoration but I have no card with a PWM fan with me here. I may then
> require you to test some stuff.
Have I ever let you down? (;

When you've made your first call for testing of the therm features over 18 months ago, I responded and did quite a lot of some testing (you can't deny that, can you?), while also benefiting from the "auto" fan management which was, at the time, working perfectly. It was simply sublime and was a great pity that it didn't make it into the main nouveau repository at the end (I used your git thermal fork for absolute ages).

So, I'll be able to help you here with the testing as well, as long as there is willingness on your and Ben's part to fix these, in my view, very serious bugs.
Comment 7 Mr-4 2013-06-28 16:36:47 UTC
Sorry, one other thing I forgot to mention, which should be pretty obvious, given what I reported so far:

The "auto" fan management seems to work as expected when I do *not* use it after hibernate/restore.

In other words, when I boot up my machine (which, by default, has fan management disabled) and then select "auto" fan management with "echo 2 > ../pwm1_enable", then things are OK - the fan duty (pwm1) constantly hovers between 20 and 28, depending on my card temperature, keeping that temperature under control. No problems there.

Things are then *really* screwed up when I do hibernate/restore as reported previously. So, I suspect the hibernate/restore does something to the nouveau driver which prevents the auto mode from working properly.
Comment 8 Martin Peres 2013-06-28 18:08:32 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > Well, I agree that it is my fault that I may not have thought about the
> > suspend/resume cycle. Thanks for reporting that. However, I'm afraid you are
> > hit by another bug, a bug in the ALARM mechanism, which explains why
> > sometimes, the fan speed isn't updated anymore. This is very worrying and
> > may force us to move to using Linux's periodic tasks instead.
> It is indeed a serious bug as the average user may not be as careful as I
> was or may not necessarily pay attention to the card temperature after
> hibernate/restore, particularly if the "auto" fan management was working
> prior to that. If that happens, the video card is almost certain to burn out!

Well, luckily, this isn't true :) PTHERM sends the kernel some IRQs whenever temperature reaches some threshold. The first one is fan-boost which puts the fan to 100%. The last one turns off the computer.

This feature shouldn't be affected by resume/suspend.

> 
> > Thanks for this review. I'll work on fixing this ASAP. I can work on the
> > sysfs restoration but I have no card with a PWM fan with me here. I may then
> > require you to test some stuff.
> Have I ever let you down? (;
> 
> When you've made your first call for testing of the therm features over 18
> months ago, I responded and did quite a lot of some testing (you can't deny
> that, can you?), while also benefiting from the "auto" fan management which
> was, at the time, working perfectly. It was simply sublime and was a great
> pity that it didn't make it into the main nouveau repository at the end (I
> used your git thermal fork for absolute ages).

Hey hey, thanks for being supportive. I indeed recognized your nick straight away.

The new model is nicer than it used to be from a dev point of view. As a user, I agree the previous system was better. I'll try to improve on that as time permits :)

> 
> So, I'll be able to help you here with the testing as well, as long as there
> is willingness on your and Ben's part to fix these, in my view, very serious
> bugs.

I agree. I am away from France for a few more months, but I can use my laptop and fake a fan to check everything is right. I'll see what I can do in a timely fashion.
Comment 9 Martin Peres 2013-06-28 18:10:45 UTC
(In reply to comment #7)
> Sorry, one other thing I forgot to mention, which should be pretty obvious,
> given what I reported so far:
> 
> The "auto" fan management seems to work as expected when I do *not* use it
> after hibernate/restore.
> 
> In other words, when I boot up my machine (which, by default, has fan
> management disabled) and then select "auto" fan management with "echo 2 >
> ../pwm1_enable", then things are OK - the fan duty (pwm1) constantly hovers
> between 20 and 28, depending on my card temperature, keeping that
> temperature under control. No problems there.

That's what I tested and it indeed worked beautifully. But I only tested that on my desktop PCs and I never suspend them. Silly me...

> 
> Things are then *really* screwed up when I do hibernate/restore as reported
> previously. So, I suspect the hibernate/restore does something to the
> nouveau driver which prevents the auto mode from working properly.

Again, what worries me the most is the fact that the fan stop being updated randomly. That's a SERIOUS bug! For the sysfs bug, I'll try to come up with an acceptable solution for both users and devs.
Comment 10 Martin Peres 2013-08-09 08:06:41 UTC
Created attachment 83879 [details] [review]
[PATCH 1/5] drm/nouveau/therm: Set the correct pwm_mode upon resume
Comment 11 Martin Peres 2013-08-09 08:07:18 UTC
Created attachment 83880 [details] [review]
[PATCH 2/5] drm/nouveau/fan: restore the PWM value on resume when  using the manual or auto mode
Comment 12 Martin Peres 2013-08-09 08:08:14 UTC
Created attachment 83881 [details] [review]
[PATCH 3/5] drm/nouveau/timer: restore the time on resume
Comment 13 Martin Peres 2013-08-09 08:08:39 UTC
Created attachment 83882 [details] [review]
[PATCH 4/5] drm/nouveau/timer: add a way to cancel alarms
Comment 14 Martin Peres 2013-08-09 08:09:14 UTC
Created attachment 83883 [details] [review]
[PATCH 5/5] drm/nouveau/therm: survive to suspend/resume cycles
Comment 15 Martin Peres 2013-08-09 08:12:39 UTC
Hey Mr-4,

Sorry for the late. These 5 patches should fix all your issues. Could you please test them? I tested them on my nvd9 but I had to upload a modified vbios + do some additional changes to nouveau to make it believe I had a pwm fan.

Keep me up to date :)
Comment 16 Mr-4 2013-08-10 16:27:51 UTC
(In reply to comment #15)
> Keep me up to date :)
Quick testing (4-5 hibernate/resume cycles + a bit of tweaking with the /sys/class/... options) shows me that everything works well as it should.

I have one suggestion, however: could you expose the fan "mode" and the fan pwm values as kernel parameters please, so that I could specify the fan mode and, optionally, the pwm value (if fan mode is manual) at startup when the kernel starts.

This, if implemented, will have the benefit of applying the fan mode, as well as the pwm value *immediately* upon kernel boot or restore from hibernate, and not wait until the whole system is started (less time to hear my jet engines).

I have a small patch I use here, which does just that - it has been working flawlessly (without your last patches applied and only in "manual" fan mode, obviously, as the auto mode was borked) from the time I created this bug report, so if there is an interest, I will post it, just let me know.
Comment 17 Mr-4 2013-08-10 16:29:23 UTC
Forgot to add - I'll continue to test the 5 patches (particularly in "auto" mode as this is one I am most-likely going to use here) and will report any issues I encounter.
Comment 18 Martin Peres 2013-08-10 23:37:11 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > Keep me up to date :)
> Quick testing (4-5 hibernate/resume cycles + a bit of tweaking with the
> /sys/class/... options) shows me that everything works well as it should.

Wonderful :) Thanks for the testing, I'll send the patches to the Nouveau ML with you as a tester. Can you remind me your full name please?

> 
> I have one suggestion, however: could you expose the fan "mode" and the fan
> pwm values as kernel parameters please, so that I could specify the fan mode
> and, optionally, the pwm value (if fan mode is manual) at startup when the
> kernel starts.

Can you check that /etc/sysctl.conf isn't exactly what you are asking for? I am not keen on adding kernel parameters that would duplicate what the sysfs interface provides.
Comment 19 Mr-4 2013-08-11 10:59:23 UTC
(In reply to comment #18)
> Wonderful :) Thanks for the testing, I'll send the patches to the Nouveau ML
> with you as a tester. Can you remind me your full name please?
Dash Four

> > I have one suggestion, however: could you expose the fan "mode" and the fan
> > pwm values as kernel parameters please, so that I could specify the fan mode
> > and, optionally, the pwm value (if fan mode is manual) at startup when the
> > kernel starts.
> Can you check that /etc/sysctl.conf isn't exactly what you are asking for?
Interesting. I've never considered that approach. A few questions:

How do I specify the settings in that file (what are they called)?

Also, are these settings activated *immediately* upon boot (at the same time as the kernel parameters are processed?) or are they processed right at the end when the whole system is loaded?

Are sysctl.conf settings processed before or after rc.* (rc.local or rc.sysinit)?

Finally, are these options effective upon restore from hibernate and is the order in which they are processed the same as during boot?

The reason I ask all this is because when I specify the fan mode and pwm values as kernel parameters using my own patch, these are processed *immediately* during boot or restore from hibernate. In other words, my jet engines are gone quiet as soon as I start my machine, so I don't have to wait long.

If I don't specify these settings as kernel parameters, then these are processed right at the end when the system is started (this could be anything from 15 seconds to up to 1.5 minutes), during which my nVidia fan is at full pelt blasting out and that's annoying.

> I am not keen on adding kernel parameters that would duplicate what the sysfs
> interface provides.
I understand that, but we already have a "precedent" set with using the perflvl and the nvidia /sys/class write protection mode (=7777, I think) kernel settings, so this isn't something new.
Comment 20 Martin Peres 2013-08-11 12:33:52 UTC
(In reply to comment #19)
> How do I specify the settings in that file (what are they called)?
> 
> Also, are these settings activated *immediately* upon boot (at the same time
> as the kernel parameters are processed?) or are they processed right at the
> end when the whole system is loaded?
> 
> Are sysctl.conf settings processed before or after rc.* (rc.local or
> rc.sysinit)?

It is supposed to be called quite early at boot. Maybe even too early and you'll have to load Nouveau earlier than this (by adding it to your ramfs).

When exactly is it run depends on your distro.
> 
> Finally, are these options effective upon restore from hibernate and is the
> order in which they are processed the same as during boot?

sysctl is just a tool to do the writes to sysfs for you. Nothing fancy. So, hibernate should work too.

> 
> The reason I ask all this is because when I specify the fan mode and pwm
> values as kernel parameters using my own patch, these are processed
> *immediately* during boot or restore from hibernate. In other words, my jet
> engines are gone quiet as soon as I start my machine, so I don't have to
> wait long.
> 
> If I don't specify these settings as kernel parameters, then these are
> processed right at the end when the system is started (this could be
> anything from 15 seconds to up to 1.5 minutes), during which my nVidia fan
> is at full pelt blasting out and that's annoying.

I get that. The proper solution is coming soon though, automatic fan management activated at boot time. Ben has a patch for that in his private tree and we have been discussing about activating it by default in the next release.
This is why I would like to thank you for your testing!

> 
> > I am not keen on adding kernel parameters that would duplicate what the sysfs
> > interface provides.
> I understand that, but we already have a "precedent" set with using the
> perflvl and the nvidia /sys/class write protection mode (=7777, I think)
> kernel settings, so this isn't something new.

Yeah, well, the perflvl_wr is unfortunate and the same can be said about the perflvl one (setting a perflvl at boot time). Those will go away when we have stable reclocking.
Comment 21 Martin Peres 2013-08-13 02:51:09 UTC
The patches have been accepted. Thanks for reporting and testing!
Comment 22 Martin Peres 2013-08-13 02:58:16 UTC
*** Bug 66022 has been marked as a duplicate of this bug. ***


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.