Created attachment 41436 [details] while true ; do powertop -d -t 5|grep kworker ; echo "--" ; done #output Chipset: 945GM Kernel version: 2.6.36.2 Arch: i686 xorg-server: 1.9.2 mesa / intel-dri: 7.9.0.git20101207 xf86-video-intel: 2.13.0 libdrm version: 2.4.22 Linux distribution: Arch linux Machine model: Asus 1005HA Display Connector: LVDS Reproducible: Always Step to reproduce: ------------------- 1 start powertop in a terminal 2 observe kworker wakeups per seconds (hopefully you have low values) 3 browse, read some documents... use the pc for a while 4 observe kworker wakeups per seconds (it goes and stays to until 300,xx wakeups per second) 5 kill the processes one by one and see if the problem goes away 6 see that the only way to lower kworker wakeups is to kill/restart X The time needed to reproduce the bug appears to be random, sometimes it happens even after 10..15minutes of use, it goes to 300 and stays there. I've attached the output of this command: while true ; do powertop -d -t 5|grep kworker ; echo "--" ; done As you can see, after about 170 iterations, there's another kworker thread: 45,3% (300,2) kworker/0:2 I have to say that i strongly use suspend to disk feature, maybe it is related(?)
A perf top or record would be useful to see what's spinning inside the worker thread. Watching gem_objects, polling intel_gpu_dump, would give some clues if there is continuing rendering. But perf is probably the best source of information here.
Created attachment 41445 [details] perf top output while kworker wakes up 300 times per second
Created attachment 41446 [details] perf top output while kworker wakes up 20 times per second
I launched perf top and pasted the output while kworker was at 300wakes/s, next, i killed and started X again and kworker was at about 20wakes/s and pasted the output of perf top in another attachment. I never used perf before, if you need more info, i will try to provide them, thanks.
Created attachment 41449 [details] perf top output while kworker wakes up 20 times per second wrong file
Judging by that X is frequently polling during the 300 wakeup/s (note the amount of time spent in read_hpet which I associate with poll()). Doesn't seem -intel driver related at all. strace -p `pidof X` during the high frequency would confirm that and may give another clue. But it doesn't explain why powertop associated the wakeups with kworker/0. A 'perf record -f -g -a' so that the callgraphs are captured as well may also help. Similar 'perf timechart' might help as that is geared towards trying to figure out relationships between context switches and wakeups.
Created attachment 41453 [details] strace X, perf record and perf timechart record output on 300wakes/sec
Created attachment 41454 [details] strace X, perf record and perf timechart record output on 20..40 wakes/sec
Made two attachments (one when the wakes per seconds are around 20..40 and the other when they are around 300)with outputs from: strace -p `pidof X` timechart record perf record -f -g -a
Ah, you need the local debug files to translate the perf *record into something meaningful. Can you run 'perf report -i 300.perf.data-perf_record_f_g_a.txt -g fr,5 | head 500 > perf-300.txt' The strace suggests a lot of ioctl activity (which of course is impenetrable to strace) but ioctl(9, 0xc010645b, 0xbfa73fd0) = 0 ioctl(9, 0xc0106464, 0xbfa74054) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 9, 0x1259a) = 0xb6609000 ioctl(9, 0x400c645f, 0xbfa74064) = 0 ioctl(9, 0xc010645b, 0xbfa73fd0) = 0 is indicative of our driver (and fallback rendering to boot, or if you are lucky pixmap upload). So it does look like our driver is active during the 300 wakeup/s and also that it is rendering at that time.
Created attachment 41455 [details] perf report -i 300.perf.data-perf_record_f_g_a.txt -g fr,5 | head 500 > perf-300.txt Output from: perf report -i 300.perf.data-perf_record_f_g_a.txt -g fr,5 | head 500 > perf-300.txt
This is extremely odd: 24.01% lspci libc-2.12.1.so [.] 0x740af (with further lspci symbols, i.e. lspci is running and hooging the cpu which is very odd). The cpu time for kthread/0 is actually spent sending the timer/scheduler tick. So that would seem to just be context switches. Did you think of just checking top to see what was running at that time? I've no idea what to make of that lspci is running.
(In reply to comment #12) > This is extremely odd: > > 24.01% lspci libc-2.12.1.so [.] 0x740af > > (with further lspci symbols, i.e. lspci is running and hooging the cpu which is > very odd). > > The cpu time for kthread/0 is actually spent sending the timer/scheduler tick. > So that would seem to just be context switches. > > Did you think of just checking top to see what was running at that time? I've > no idea what to make of that lspci is running. that lspci is just a case, i have powertop in background which calls lspci periodically, that's not the problem. But down in the tree there's swapper->read_hpet
Seems that somehow suspend to disk -> resume (tried both tuxonice and uswsusp methods) is related and necessary for the bug to be reproduced. Now this netbook is up from a complete boot for about 30 minutes and kworker is still at 30W/s.
Switching to new powertop from git, i've more information about this. it shows this at first place: Usage Events/s Category Description 2,0 ms/s 289,5 kWork flush_to_ldisc As i understood, flush_to_ldisc() has something to do with ttys, i'm not sure anymore this issue is related to Xorg/intel, but still the only way i found to stop it from waking up the cpu is to kill X. Do you have some clue? Thank you.
I just downgraded to: intel-dri-7.8.2 (was 7.9.0.git20101207) xf86-video-intel-2.12.0 (was 2.13) xorg-server-1.9.0 (was 1.9.2) Now the netbook is up from hibernation for about 10 Hours and the issue seems to be gone (crossing my fingers...) So it appears to be something related after all.
Hmm, don't recognise that at all. Can you now upgrade and recheck in this order: intel-dri (mesa), xf86-video-intel, xserver. Given the wakeups seems to be flush_to_ldisc, I would expect the xserver to include the culprit. Then it's just a matter of bisecting the xserver...
(In reply to comment #17) > Hmm, don't recognise that at all. > > Can you now upgrade and recheck in this order: > > intel-dri (mesa), > xf86-video-intel, > xserver. > > Given the wakeups seems to be flush_to_ldisc, I would expect the xserver to > include the culprit. Then it's just a matter of bisecting the xserver... Now seems to be fine with: intel-dri: 7.9.0.git20101207 xf86-video-intel: 2.13.0 xorg-server 1.9.0 For the reference, this gave the problem: intel-dri: 7.9.0.git20101207 xf86-video-intel: 2.13.0 xorg-server 1.9.2 * So one would say that the xorg-server is the culprit, but still on different hardware (nvidia-ion chipset with the same packages and using hibernation) issue is not present. I will try to upgrade xorg-server 1.9.3 in two weeks. Thanks for your support!
(In reply to comment #18) > So one would say that the xorg-server is the culprit, but still on different > hardware (nvidia-ion chipset with the same packages and using hibernation) > issue is not present. Don't forget that the extensions supported (and therefore code paths exercised) between nvidia and the OS drivers are different. The prime example being RandR and hotplug/udev. But yes, that should narrow down the source of the error.
(In reply to comment #17) > Hmm, don't recognise that at all. > > Can you now upgrade and recheck in this order: > > intel-dri (mesa), > xf86-video-intel, > xserver. > > Given the wakeups seems to be flush_to_ldisc, I would expect the xserver to > include the culprit. Then it's just a matter of bisecting the xserver... Sorry for the late answer. Today i fully upgraded my system and i had: xorg-server 1.9.4 xf86-video-intel 2.14.0 (For the 3d part now i'm using mesa/gallium from git because i've better 3d performance in compiz.) Same issue as before, and if i just downgrade xorg-server and it's dependancies to 1.9.0 the issue is solved. Everything points to xorg-server now, do you have some advices to bisect it? (I never did something like that) Thanks.
I can confirm this on my T400 with Intel graphics on KDE 4. I am NOT using suspend2disk, only suspend2ram. akurei@joel: ~ $ yaourt -Qi xorg-server | grep Version Version : 1.10.1-1 akurei@joel: ~ $ yaourt -Qi xf86-video-intel | grep Version Version : 2.15.0-1 akurei@joel: ~ $ yaourt -Qi intel-dri | grep Version Version : 7.10.2-2 akurei@joel: ~ $ yaourt -Qi libdrm | grep Version Version : 2.4.25-1 akurei@joel: ~ $ uname -a Linux joel 2.6.38-ARCH #1 SMP PREEMPT Fri May 13 09:24:47 CEST 2011 x86_64 Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz GenuineIntel GNU/Linux
Can't say when, because i didn't tried every version, but now, with xorg-server 1.10.3.901 it seems that the issue is gone.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.