Bug 30364

Summary: [945GME] poor 3d performance in deep c-states
Product: DRI Reporter: Antonio Orefice <kokoko3k>
Component: DRM/Intel Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: low CC: anarsoul, b.buschinski, jbarnes, jeramy.smith, lambchop468, linux, maxijac, mcepl, rodrigo.vivi, sergio.callegari
Version: unspecified   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description | Flags
xorg log | none
dmesg log | none
xorg.conf | none
ICH7 LPC debug driver | none
Use PM QoS latency to prevent dropping below C2 on Atom | none
Use PM QoS latency to keep CPU from dropping below C1 when vblanks enabled | none
WARNs from using "Use PM QoS latency to keep CPU from dropping below C1 when vblanks enabled" | none
Use PM QoS to prevent C-State starvation of gen3 GPU | none
Use PM QoS to prevent C-State starvation of gen3 GPU for 2.6.37 | none
Twiddle INSTPM bit11 | none
Move INSTPM bit twiddling to intel_mark_busy | none

Description Antonio Orefice 2010-09-24 10:23:54 UTC
Created attachment 38937 [details]
xorg log

Chipset: 945GM
Kernel version: 2.6.35 (same problem in 2.6.34 but not in 2.6.33)
Arch: i686
xorg-server: 1.8.1.902
mesa / intel-dri: 7.8.2
xf86-video-intel: 2.12.0
libdrm version: 2.4.21

Linux distribution: Arch linux (similar issues reported for fedora too)
Machine model: Asus 1005HA
Display Connector: LVDS (happens on VGA too)

Reproducible: Always


Steps to reproduce:
-------------------
Compile mesa demos
Launch teapot
Observe the framerate
Move the mouse around
Observe the framerate jumping
(in my case it went from 14~125 to 30 just by putting my finger on the touchpad)

Roll back to kernel 2.6.33
Launch teapot
Observe that there is no difference in framerate if you move the mouse around, notice the framerate is "high" (30fps for me) in any case.

Even though just changing the kernel version makes the bug disappear, I think it is more logical to file the bug report here.
If I made a mistake, I apologize.
Comment 1 Antonio Orefice 2010-09-24 10:24:55 UTC
Created attachment 38938 [details]
dmesg log
Comment 2 Antonio Orefice 2010-09-24 10:26:22 UTC
Created attachment 38939 [details]
xorg.conf
Comment 3 Antonio Orefice 2010-09-24 10:32:28 UTC
Sorry for the typo:
"(in my case it went from 14~125[..]
Is:
"(in my case it went from 14~15[..]
Comment 4 Chris Wilson 2010-09-24 10:41:03 UTC
Swapbuffer vs interrupts.
Comment 5 Chris Wilson 2010-09-24 10:42:39 UTC
Is the teapot fullscreen? Are page-flips enabled? Is something disabling the interrupts on your system?
Comment 6 Jesse Barnes 2010-09-24 10:44:52 UTC
Looks like the processor c-state issue we have on some other 945 machines.  If you boot with processor.max_cstate=1 does the problem go away?

The issue is that we rely on vblank interrupts arriving at the correct frequency, and on some platforms when the CPU is in a deep sleep state, it won't wake up when a vblank interrupt arrives, but it will wake up when other device interrupts arrive.  That's why you see the performance increase when you move the mouse.

I still don't know the root cause, but if the above works for you, then it's a duplicate of a known bug at least.
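
To confirm that the workaround is actually in effect on a running system, the boot parameter can be read back from the kernel command line. A minimal sketch, assuming the command line is available as text (on Linux it is exposed in /proc/cmdline); the helper name max_cstate_from_cmdline is made up for illustration:

```python
import re
from typing import Optional


def max_cstate_from_cmdline(cmdline: str) -> Optional[int]:
    """Return the processor.max_cstate value from a kernel command line,
    or None if the parameter is not present."""
    match = re.search(r"(?:^|\s)processor\.max_cstate=(\d+)(?=\s|$)", cmdline)
    return int(match.group(1)) if match else None


def running_kernel_max_cstate() -> Optional[int]:
    """Look up the parameter for the currently running kernel."""
    with open("/proc/cmdline") as f:
        return max_cstate_from_cmdline(f.read())
```

Calling running_kernel_max_cstate() after a reboot should return 1 if the option was picked up, and None if the bootloader entry was not changed.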
Comment 7 Antonio Orefice 2010-09-24 11:13:12 UTC
(In reply to comment #5)
> Is the teapot fullscreen? Are page-flips enabled? Is something disabling the
> interrupts on your system?

I don't know how to make teapot run in fullscreen mode; the best I could do was to launch it from xterm in an empty X screen without any WM, and the problem persists.
With compiz enabled (and unredirect fullscreen windows) I was able to make it fullscreen with a shortcut; the problem persists.

Page flips were disabled on my system because of stability issues (enabled in kernel, disabled in X). I recompiled the driver to enable them again for a while and retried, with no success.

I can't say whether something is blocking interrupts on my system, sorry. Anyway, listening to an mp3 in the background helped a bit (fps went from 15 to 20 in teapot).
Comment 8 Antonio Orefice 2010-09-24 11:19:15 UTC
(In reply to comment #6)
> Looks like the processor c-state issue we have on some other 945 machines.  If
> you boot with processor.max_cstate=1 does the problem go away?
> 
> The issue is that we rely on vblank interrupts arriving at the correct
> frequency, and on some platforms when the CPU is in a deep sleep state, it
> won't wake up when a vblank interrupt arrives, but it will wake up when other
> device interrupts arrive.  That's why you see the performance increase when you
> move the mouse.
> 
> I still don't know the root cause, but if the above works for you, then it's a
> duplicate of a known bug at least.

I tried that boot option and the problem disappeared.
Unfortunately, and as expected, that makes my netbook power hungry: using powertop I noticed that it went from ~6.5..7W to ~8W+ just idling, and expected battery life dropped from about ~10hrs to ~8.

Out of curiosity, is a vblank interrupt still needed when one doesn't need (or doesn't care) about vsync?

At least, thank you very much for answering and clarifying things. At this point it is clear that this is a duplicate bug; could you mark it as a duplicate of the right one?
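
The power and battery figures above are consistent with each other, which a quick calculation shows. A minimal sketch, assuming the usable battery energy is the same in both cases and that the powertop readings are representative averages; the function name is made up for illustration:

```python
def scaled_battery_life(hours_before: float, watts_before: float,
                        watts_after: float) -> float:
    """Estimate battery life after a change in average power draw,
    assuming the usable battery energy (Wh) stays the same."""
    energy_wh = hours_before * watts_before
    return energy_wh / watts_after


# Reported idle draw was ~6.5..7 W before and ~8 W with max_cstate=1.
for watts in (6.5, 7.0):
    hours = scaled_battery_life(10.0, watts, 8.0)
    print(f"{watts} W -> 8.0 W: about {hours:.1f} h")
```

Both ends of the reported range land between 8 and 9 hours, in line with the drop from about 10 to about 8 hours.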
Comment 9 Jesse Barnes 2010-09-24 11:24:52 UTC
(In reply to comment #8)
> I tried that boot option and the problem disappeared.
> Unfortunately, and as expected that thing makes my netbook power hungry, using
> powertop i noticed that it went from ~6.5..7W to ~8W+ just idling, and expected
> uptime battery life dropped from about ~10hrs to ~8.

Yeah, it's unfortunate.  I don't think they see this problem on Windows because they probably can't reach a deep enough sleep state to be affected (Windows and its applications tend to have lots of timers running that keep the CPU awake).

> Out of curiosity, is a vblank interrupt still needed when one doesn't need (or
> doesn't care) about vsync?

Yes, if you don't have apps waiting for vsync or doing buffer swaps, you shouldn't need the vblank interrupt (the kernel will shut it off).  But anything using GL will do buffer swaps and thus need the vsync interrupt, unless you disable it entirely using vblank_mode=0 in your dri configuration file (.drirc or /etc/drirc iirc).

> At least, thank you very much for answering and clarifying things, at this point
> it is clear that this is a duplicate bug, if could you mark it to the right
> one?

Actually I don't think we have a bug open on this, so we'll use this one. :)  All the discussion of this so far has just been on the mailing lists.
Comment 10 Vasily Khoruzhick 2010-09-24 12:10:10 UTC
Jesse, what about the pm_qos stuff mentioned on the mailing list?
Comment 11 Jesse Barnes 2010-09-24 12:29:11 UTC
I don't have a tool to set that from userspace, and I didn't see a good way of doing it from within the kernel, but I expect it just limits the processor max c state, just like the boot param.

Another thing to try, that worked on my aspireone, is to boot with maxcpus=1.
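
For experimenting from userspace, the kernel's PM QoS interface exposes a /dev/cpu_dma_latency device: a process writes a 32-bit latency bound in microseconds and the request stays active for as long as the file is kept open. A minimal sketch under those assumptions; the helper names are made up for illustration, and which bound corresponds to which C-state depends on the machine:

```python
import os
import struct
from contextlib import contextmanager

CPU_DMA_LATENCY = "/dev/cpu_dma_latency"  # PM QoS character device


def encode_latency(usec: int) -> bytes:
    """Pack a latency bound in microseconds as a native 32-bit signed integer."""
    return struct.pack("i", usec)


@contextmanager
def cpu_latency_limit(usec: int, path: str = CPU_DMA_LATENCY):
    """Hold a CPU wakeup-latency request for the duration of the block.

    The request is dropped when the descriptor is closed, so it is kept
    open until the block exits. Opening the device normally needs root.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(usec))
        yield
    finally:
        os.close(fd)
```

Running the test program inside "with cpu_latency_limit(N):" for a small N should have a similar effect to the boot parameter, but only while the program runs.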
Comment 12 Antonio Orefice 2010-09-24 13:50:25 UTC
(In reply to comment #9)

Anyway, the same driver on kernel 2.6.33 performs just fine for me (low power consumption and right vblank interrupts), so i think this problem has definitely a solution lying around.
Comment 13 Jesse Barnes 2010-09-24 13:54:45 UTC
2.6.33 doesn't support vblank events, so you wouldn't be able to run the code that exposes this problem.  I'm sure the interrupt issue still exists on 2.6.33 though, you just don't see it because you're not running code that's sensitive to interrupt latency.
Comment 14 Vasily Khoruzhick 2010-09-27 06:36:41 UTC
I tried using ShadowFB as workaround, and found that it works _much_ better with KDE 4.5 and latest intel driver :) (at least konsole is not jerky)
Comment 15 Antonio Orefice 2010-09-27 07:31:09 UTC
(In reply to comment #13)
> 2.6.33 doesn't support vblank events, so you wouldn't be able to run the code
> that exposes this problem.  I'm sure the interrupt issue still exists on 2.6.33
> though, you just don't see it because you're not running code that's sensitive
> to interrupt latency.

Please excuse my ignorance and the probably stupid question, but what are the advantages (if any) of running that code?
I'm asking because I didn't notice any performance or tearing difference with 2.6.35+processor.max_cstate=1 compared to 2.6.33.
Comment 16 Jesse Barnes 2010-09-27 09:07:15 UTC
The new code has some potential performance benefits (it allows page flipping and won't waste GPU time on frames that won't be displayed), and adds back several missing GL features.

You can get the same behavior with current code as in 2.6.33 by disabling the new features.  You can do this by setting vblank_mode=0 in your environment or drirc config file.
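
Setting the variable in the environment of a single program is the quickest way to try this without touching drirc. A minimal sketch, assuming Mesa's vblank_mode environment variable with its values 0 to 3 (0 never syncs, 3 always syncs); the helper names are made up for illustration:

```python
import os
import subprocess
from typing import Dict, List, Optional


def vblank_env(mode: int, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with Mesa's vblank_mode set."""
    if mode not in (0, 1, 2, 3):
        raise ValueError("vblank_mode must be 0, 1, 2 or 3")
    env = dict(os.environ if base is None else base)
    env["vblank_mode"] = str(mode)
    return env


def run_with_vblank_mode(cmd: List[str], mode: int) -> int:
    """Run a GL program with the given vblank_mode; return its exit status."""
    return subprocess.run(cmd, env=vblank_env(mode)).returncode
```

For example, run_with_vblank_mode(["glxgears"], 0) runs glxgears without waiting for vblank, which sidesteps the interrupt latency issue for that one process.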
Comment 17 Antonio Orefice 2010-09-27 09:35:19 UTC
(In reply to comment #16)
> The new code has some potential performance benefits (it allows page flipping
> and won't waste GPU time on frames that won't be displayed), and adds back
> several missing GL features.
> 
> You can get the same behavior with current code as in 2.6.33 by disabling the
> new features.  You can do this by setting vblank_mode=0 in your environment or
> drirc config file.

I just read this answer by Vasily Khoruzhick on the mailing list:
"That doesn't help, glxgears shows ~1000fps, but it's output is jerky"

Anyway thank you for the suggestion, i'll try by myself as soon as possible.
Comment 18 Chris Wilson 2010-09-30 04:09:51 UTC
(In reply to comment #17)
> I just read this answer by Vasily Khoruzhick on the mailing list:
> "That doesn't help, glxgears shows ~1000fps, but it's output is jerky"
> 
> Anyway thank you for the suggestion, i'll try by myself as soon as possible.

If I've got it right, that should be fixed on -next with the per-process throttling.
Comment 19 Antonio Orefice 2010-09-30 04:35:26 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > I just read this answer by Vasily Khoruzhick on the mailing list:
> > "That doesn't help, glxgears shows ~1000fps, but it's output is jerky"
> > 
> > Anyway thank you for the suggestion, i'll try by myself as soon as possible.
> 
> If I've got it right, that should be fixed on -next with the per-process
> throttling.

Can't understand fully what you said, but let's wait for the next release then.
Comment 20 Vasily Khoruzhick 2010-09-30 05:32:12 UTC
(In reply to comment #18)
> (In reply to comment #17)
> > I just read this answer by Vasily Khoruzhick on the mailing list:
> > "That doesn't help, glxgears shows ~1000fps, but it's output is jerky"
> > 
> > Anyway thank you for the suggestion, i'll try by myself as soon as possible.
> 
> If I've got it right, that should be fixed on -next with the per-process
> throttling.

Please give a link to commit/patch when it's ready. Thanks
Comment 21 Vasily Khoruzhick 2010-10-04 08:43:48 UTC
(In reply to comment #18)
> If I've got it right, that should be fixed on -next with the per-process
> throttling.

Tried drm-intel-next from today, bug still remains.
Comment 22 Jesse Barnes 2010-10-11 10:34:59 UTC
Created attachment 39347 [details] [review]
ICH7 LPC debug driver

Can you load this driver and tell me what it outputs?  I wonder if BM_BREAK_EN is 0 on your machine as well...
Comment 23 Jesse Barnes 2010-10-11 10:43:50 UTC
This patch on top of the last attachment should let the CPU wake up much more frequently, assuming the break reg is 0. Give it a try and see if it helps your performance problem.

diff --git a/drivers/platform/x86/intel_lpc.c b/drivers/platform/x86/intel_lpc.c
index d3c5ef5..3be93c1 100644
--- a/drivers/platform/x86/intel_lpc.c
+++ b/drivers/platform/x86/intel_lpc.c
@@ -50,6 +50,8 @@ static int lpc_probe(struct pci_dev *dev, const struct pci_dev
        dev_err(&dev->dev, "ACPI_CX_STATE_CONF: 0x%02x\n", cxstate);
        dev_err(&dev->dev, "ACPI_BM_BREAK_EN: 0x%02x\n", break_en);
 
+       pci_write_config_byte(dev, ACPI_BM_BREAK_EN, 0xf3);
+
 out:
        return ret;
 }
Comment 24 Vasily Khoruzhick 2010-10-11 10:47:44 UTC
[  565.573458] intel lpc 0000:00:1f.0: ACPI_CX_STATE_CONF: 0x1c
[  565.573464] intel lpc 0000:00:1f.0: ACPI_BM_BREAK_EN: 0x00
Comment 25 Antonio Orefice 2010-10-13 01:21:07 UTC
(In reply to comment #23)
> This patch on top of the last attachment should let the CPU wake up much more
> frequently, assuming the break reg is 0, give it a try and see if it helps your
> performance problem.

I haven't tried the patch yet because I'm not so familiar with kernel patching and we need this netbook daily.

But I was wondering whether it is possible (and how) to use setpci to try different configurations of the BM_BREAK_EN register at runtime.

Thank you very much for your efforts.
Comment 26 Vasily Khoruzhick 2010-10-18 11:42:04 UTC
Bug is reproducible on the following machines:

Lenovo 3000 N100 laptop, Core 2 Duo T5500 CPU, 
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03), pciid: 8086:27a2

Acer Aspire AOA110 netbook, Atom N270 CPU,
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GME Express Integrated Graphics Controller (rev 03), pciid: 8086:27ae
Comment 27 Vasily Khoruzhick 2010-10-18 12:44:02 UTC
Also reproducible on Acer extensa 5513 laptop, with C2D T5500 CPU,
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03), pciid: 8086:27a2
Comment 28 Oleksij Rempel 2010-10-31 11:20:06 UTC
Just to add my two cents:
I have the same issue on Atom + 945GM.
If I add more load on the CPU, the frame rate grows too.
The processor.max_cstate option didn't change anything; powertop shows there is still C4 (maybe some other kernel bug).

maxcpus=1 solves the problem: it works with C4, power saving and better performance.

So how about a problem with the scheduler or IRQ balancing on SMP?

I'll test the patch from Jesse ASAP.
Comment 29 Oleksij Rempel 2010-10-31 11:31:17 UTC
The patch from comment 23 does not make any difference for me. Disabling SMP is the best configuration for me.
Comment 30 Artem S. Tashkinov 2010-11-02 05:06:09 UTC
This bug probably affects Intel HD Graphics too:

glxgears with idle CPU:

4925 frames in 5.0 seconds = 984.918 FPS
4941 frames in 5.0 seconds = 988.052 FPS
4996 frames in 5.0 seconds = 999.137 FPS
4973 frames in 5.0 seconds = 994.512 FPS

glxgears with 100% loaded CPU (one thread only):

7544 frames in 5.0 seconds = 1508.685 FPS
7458 frames in 5.0 seconds = 1491.536 FPS
7378 frames in 5.0 seconds = 1475.574 FPS
7415 frames in 5.0 seconds = 1482.973 FPS

roughly 50%(!) faster.
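
The "roughly 50%" figure can be checked directly from the two runs above. A minimal sketch that parses glxgears output lines and compares the mean frame rates; the sample text is copied from this comment and the helper name is made up for illustration:

```python
import re
from statistics import mean
from typing import List

FPS_LINE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def fps_samples(output: str) -> List[float]:
    """Extract the FPS value from every glxgears report line."""
    return [float(m.group(3)) for m in FPS_LINE.finditer(output)]


IDLE = """
4925 frames in 5.0 seconds = 984.918 FPS
4941 frames in 5.0 seconds = 988.052 FPS
4996 frames in 5.0 seconds = 999.137 FPS
4973 frames in 5.0 seconds = 994.512 FPS
"""

LOADED = """
7544 frames in 5.0 seconds = 1508.685 FPS
7458 frames in 5.0 seconds = 1491.536 FPS
7378 frames in 5.0 seconds = 1475.574 FPS
7415 frames in 5.0 seconds = 1482.973 FPS
"""

speedup = mean(fps_samples(LOADED)) / mean(fps_samples(IDLE))
print(f"loaded / idle = {speedup:.2f}")
```

The ratio comes out at about 1.50, matching the estimate.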
Comment 31 Antonio Orefice 2010-12-01 04:58:28 UTC
Today I tried with 2.6.36, and obviously the results are the same, so I'm still using 2.6.33.
For me disabling a core or hyperthreading is not an option due to the higher power consumption and the shorter battery life.

If I understood properly, the issue still appears to be unresolved and the hypotheses made so far haven't led to anything really useful.

I understand that the new code aims to provide a "gain" in performance, but my proposal now is to add some kind of workaround for the specific chipsets that expose the problem, so that at least their users will be able to upgrade to newer kernels without suffering any performance "loss".

Could such a thing be done in the video driver itself, or does it require patches or special config options in the kernel?
Comment 32 Chris Wilson 2010-12-08 08:18:12 UTC
Created attachment 40923 [details] [review]
Use PM QoS latency to prevent dropping below C2 on Atom

Proof-of-principle?
Comment 33 Artem S. Tashkinov 2011-01-05 00:35:32 UTC
(In reply to comment #32)
> Created an attachment (id=40923) [details]
> Use PM QoS latency to prevent dropping below C2 on Atom
> 
> Proof-of-principle?

This patch helped only marginally (10% better than without it in idle mode):

$ glxgears (power savings on, CPU running @ 1.2GHz)
5601 frames in 5.0 seconds = 1120.104 FPS
5612 frames in 5.0 seconds = 1122.275 FPS
5603 frames in 5.0 seconds = 1120.483 FPS
5606 frames in 5.0 seconds = 1121.091 FPS
5587 frames in 5.0 seconds = 1117.238 FPS

$ glxgears (power savings off, CPU running @ 3.2GHz)
7089 frames in 5.0 seconds = 1417.741 FPS
7068 frames in 5.0 seconds = 1413.511 FPS
7082 frames in 5.0 seconds = 1416.285 FPS
7079 frames in 5.0 seconds = 1415.792 FPS
7057 frames in 5.0 seconds = 1411.390 FPS

P.S. I have Intel HD 1st generation graphics.
Comment 34 Antonio Orefice 2011-01-05 00:54:42 UTC
As the .drirc configuration file is finally honoured in the latest intel-dri/mesa (I have 7.9.0.git20101207), setting vblank_mode=0 (as explicitly suggested by Jesse Barnes) now works and the issue is gone for me.
Strangely enough, i can't see any tearing in glxgears. 

I know this is a workaround, but on such poor hardware enabling vsync would be a bad idea anyway.
Comment 35 Vasily Khoruzhick 2011-01-05 01:13:21 UTC
(In reply to comment #32)
> Created an attachment (id=40923) [details]
> Use PM QoS latency to prevent dropping below C2 on Atom
> 
> Proof-of-principle?

As I stated on IRC, it does not help in my case - glxgears still shows 30-40fps instead of 60. I want to note that it's not only a tearing/vblank issue: response to user actions in KDE with effects enabled is not good (it was much better earlier).
Comment 36 Alexander Lam 2011-01-06 11:34:17 UTC
Created attachment 41720 [details] [review]
Use PM QoS latency to keep CPU from dropping below C1 when vblanks enabled

(In reply to comment #32)
> Created an attachment (id=40923) [details]
> Use PM QoS latency to prevent dropping below C2 on Atom
> 
> Proof-of-principle?

Here is a variant of that patch I tried that does fix the issue on my hardware:
Acer Aspire One 9" Netbook AOA150, 945GSE and Intel N270 Processor

It does produce a few WARNs because I am calling pm_qos_add_request from an interrupt disabled context. (also attached)

testcase used is vblank_mode=2 glxgears
Comment 37 Alexander Lam 2011-01-06 11:35:19 UTC
Created attachment 41721 [details]
WARNs from using "Use PM QoS latency to keep CPU from dropping below C1 when vblanks enabled"
Comment 38 Chris Wilson 2011-01-09 03:38:04 UTC
*** Bug 32916 has been marked as a duplicate of this bug. ***
Comment 39 Chris Wilson 2011-01-09 04:10:57 UTC
Created attachment 41796 [details] [review]
Use PM QoS to prevent C-State starvation of gen3 GPU 

Raise you a work function.
Comment 40 Vasily Khoruzhick 2011-01-09 05:58:48 UTC
(In reply to comment #39)
> Created an attachment (id=41796) [details]
> Use PM QoS to prevent C-State starvation of gen3 GPU 
> 
> Raise you a work function.

It does not apply on top of 2.6.37, could you please prepare version for stable kernel?
Comment 41 Alexander Lam 2011-01-09 13:16:00 UTC
Created attachment 41814 [details] [review]
Use PM QoS to prevent C-State starvation of gen3 GPU for 2.6.37

(In reply to comment #40)
> (In reply to comment #39)
> > Created an attachment (id=41796) [details]
> > Use PM QoS to prevent C-State starvation of gen3 GPU 
> > 
> > Raise you a work function.
> 
> It does not apply on top of 2.6.37, could you please prepare version for stable
> kernel?

Chris's patch mangled to work with 2.6.37 (two changes, s/irq_lock/user_irq_lock/ in two places)
Comment 42 Alexander Lam 2011-01-09 13:21:25 UTC
(In reply to comment #39)
> Created an attachment (id=41796) [details]
> Use PM QoS to prevent C-State starvation of gen3 GPU 
> 
> Raise you a work function.

Confirming that this works on 2.6.37 on:

Acer Aspire One 9" Netbook AOA150, 945GSE and Intel N270 Processor

testcase
vblank_mode=2 glxgears

(I probably should test with -next but don't have time at the moment)
Comment 43 Vasily Khoruzhick 2011-01-09 23:34:49 UTC
(In reply to comment #41)
> Created an attachment (id=41814) [details]
> Use PM QoS to prevent C-State starvation of gen3 GPU for 2.6.37
>
> Chris's patch mangled to work with 2.6.37 (two changes,
> s/irq_lock/user_irq_lock/ in two places)

Thanks, looks like it works.
Comment 44 Vasily Khoruzhick 2011-01-10 02:52:28 UTC
(In reply to comment #43)

> Thanks, looks like it works.

But it does not work after update to xf86-video-intel-2.14.0 :( 20-30 fps in glxgears instead of 60.
Comment 45 Alexander Lam 2011-01-10 10:12:23 UTC
(In reply to comment #44)
> (In reply to comment #43)
> 
> > Thanks, looks like it works.
> 
> But it does not work after update to xf86-video-intel-2.14.0 :( 20-30 fps in
> glxgears instead of 60.

I'm not seeing this with xf86-video-intel-2.14.0

Hmm...

libdrm-git version: bad5242a
xf86-video-intel version: 2.14.0
mesa version: 7.10
xorg-server: 1.9.3.901-1
kernel: (not vanilla) 2.6.37 + patch in attachment 41814 [details] [review]
Comment 46 Jesse Barnes 2011-01-14 14:06:32 UTC
Reassigning back to Chris; doesn't look like we'll be able to find a hardware solution to this one.
Comment 47 Chris Wilson 2011-01-25 04:30:27 UTC
I've applied Alexander's patch to drm-intel-next, so please give that branch a thorough testing!
Comment 48 Chris Wilson 2011-02-01 04:47:06 UTC
Tentatively closing with the patch landing in -next.

Things to look out for:

1. fps stuttering (i.e. the reoccurrence of the original bug);

2. obscene power consumption;

3. aliens.
Comment 49 Chris Wilson 2011-02-05 02:22:38 UTC
Created attachment 42959 [details] [review]
Twiddle INSTPM bit11

New patch time!
Comment 50 Oleksij Rempel 2011-03-03 01:10:38 UTC
I tested the last patch (replace vblank PM QoS with "Interrupt-Based AGPBUSY#").

It brings back the first issue, fps stuttering.
Power usage is OK.
Comment 51 Chris Wilson 2011-03-03 02:58:44 UTC
Created attachment 44065 [details] [review]
Move INSTPM bit twiddling to intel_mark_busy

How about with this patch?
Comment 52 Oleksij Rempel 2011-03-03 05:03:42 UTC
no noticeable difference.
Comment 53 Alexander Lam 2011-03-08 09:56:49 UTC
(In reply to comment #51)
> Created an attachment (id=44065) [details]
> Move INSTPM bit twiddling to intel_mark_busy
> 
> How about with this patch?

plain drm-intel-next (47ae63e) with and without this patch resulted in missing vblanks & stuttery glxgears.

As discussed on IRC, my BIOS doesn't set INSTPM_AGPBUSY_DIS (INSTPM bit 11), so this won't fix it anyway.
Comment 54 Chris Wilson 2011-06-05 22:15:19 UTC
*** Bug 37966 has been marked as a duplicate of this bug. ***
Comment 55 Chris Wilson 2012-05-09 02:29:24 UTC
This might be interesting:

http://cgit.freedesktop.org/~danvet/drm/log/?h=better-gpu_cpufreq
Comment 56 Rodrigo Vivi 2012-12-13 19:16:56 UTC
Is this issue still present on newer kernels? What is the latest kernel on which this issue was seen?

Has anyone tested this better-gpu_cpufreq branch?
Comment 57 Vasily Khoruzhick 2012-12-13 19:38:08 UTC
Still here on 3.6, will test on 3.7 as soon as it gets into the Arch Linux repos.
Comment 58 Chris Wilson 2012-12-13 20:40:01 UTC
No need, it's a known design feature of the power management hardware. The only question is whether we can find an acceptable workaround.
Comment 59 Chris Wilson 2013-01-26 11:14:15 UTC
*** Bug 59895 has been marked as a duplicate of this bug. ***
Comment 60 sergio.callegari 2013-01-26 15:49:04 UTC
Thanks for pointing out so quickly the status of Bug 59895 as a duplicate of this one! This thread was an interesting read.
Comment 61 Daniel Vetter 2013-11-18 17:44:59 UTC
I guess it's time to give up - the only approach with restricting the deep sleep states resulted in horrid power consumption figures ... Just wiggle your mouse a bit :(
Comment 63 Oleksij Rempel 2014-02-07 06:40:22 UTC
I'll be able to test them in 2-3 weeks.
Comment 64 Chris Wilson 2014-02-11 12:09:01 UTC
As a reminder to myself, my only surviving non-pnv machine (915gm) has a processor that does not support C-states (only speedstep). I tried the patches and only keeping the CPU at maximum is sufficient to hit glxgears vrefresh.
Comment 65 Oleksij Rempel 2014-02-22 08:09:00 UTC
So, I can test it.
Is there any place where I can pull all the patches together? On top of which branch should I test?
Comment 66 Ville Syrjala 2014-02-22 16:52:02 UTC
(In reply to comment #65)
> So, i can test it.
> Are there any place where i can pull all patches together? On top of which
> branch should i test?

I pushed the patches here:
git://gitorious.org/vsyrjala/linux.git agpbusy

I also reorganized them so it's easy to revert the top commit, which is something you might as well try in case there's no improvement with the branch as is.
Comment 67 Oleksij Rempel 2014-02-23 07:50:31 UTC
Hmm... I do not see noticeable changes.
I tested these patches on Ubuntu 13.10 with the Unity/compiz desktop.
Glxgears shows the same performance before and after the patches - about 58fps.
C4ATM usage seems to be identical too.

Do you have any suggestions for what I should test?
Comment 68 Chris Wilson 2014-02-23 07:57:16 UTC
It would be easier to reproduce on a bare X.

If you do from a vt:

sudo service lightdm stop
sudo Xorg -ac -noreset & sleep 3; DISPLAY=:0 xterm

then launch glxgears from the xterm, does it show the behaviour we need to fix?
i.e. runs at below refresh rate unless there is another source of interrupts (e.g. wiggling the mouse)?

If you can reproduce that, we can begin to test the patches.
Comment 69 Oleksij Rempel 2014-02-23 08:44:22 UTC
No, I can't reproduce the initial bug.
After powertop optimisation I get about 20 wk/s, just to make sure the system is idle.
On plain Xorg I get 125fps, without any glitches.
With and without the patches I get the same results.
Comment 70 Chris Wilson 2014-02-23 08:54:22 UTC
(In reply to comment #69)
> On plain Xorg i get 125fps. Without any glitches. 
> With and without patches i get same results.

Ah, that's broken - we are not using vsync. Presumably it failed to get permission to open /dev/dri/card0 and so is using indirect rendering (which does not respect vsync).

Try "LIBGL_DEBUG=1 glxinfo" and see if (a) reports indirect rendering and (b) why.
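
Part (a) of that check can be scripted by looking for the "direct rendering:" line that glxinfo prints. A minimal sketch, assuming the usual "direct rendering: Yes" / "direct rendering: No ..." output format; the helper name is made up for illustration:

```python
import re
from typing import Optional


def direct_rendering(glxinfo_output: str) -> Optional[bool]:
    """Return True or False from glxinfo's "direct rendering:" line,
    or None if the line is missing."""
    match = re.search(r"^direct rendering:\s*(Yes|No)\b",
                      glxinfo_output, re.MULTILINE)
    if match is None:
        return None
    return match.group(1) == "Yes"
```

If this returns False, frame rates above the refresh rate (such as the 125fps seen here) do not say anything about vblank handling, because the indirect path does not wait for vsync.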
Comment 71 Oleksij Rempel 2014-02-23 09:09:33 UTC
You were right, there was no access to dri.
Now i tested it with sudo glxgears.
So results are absolutely unusable. with moving mouse fps will drop to 5fps. With moving mouse - 60fps. Results are same, before and after this patch set.
Comment 72 Oleksij Rempel 2014-02-23 09:33:32 UTC
Typo in previous comment:
without mouse - 5fps
with mouse - 60fps
Comment 73 Oleksij Rempel 2014-02-23 11:42:27 UTC
If it would somehow help, I can give ssh access to this machine.
Comment 74 Ville Syrjala 2014-02-24 14:20:15 UTC
(In reply to comment #71)
> You were right, there was no access to dri.
> Now i tested it with sudo glxgears.
> So results are absolutely unusable. with moving mouse fps will drop to 5fps.
> With moving mouse - 60fps. Results are same, before and after this patch set.

Hmm. Was it running fullscreen or under a GL compositor that page flips?

Something like: 'vblank_mode=3 glxgears -fullscreen' should force it to do what we want, assuming your wm isn't totally crap.
Comment 75 Chris Wilson 2014-02-24 14:34:20 UTC
(In reply to comment #74)
> (In reply to comment #71)
> > You were right, there was no access to dri.
> > Now i tested it with sudo glxgears.
> > So results are absolutely unusable. with moving mouse fps will drop to 5fps.
> > With moving mouse - 60fps. Results are same, before and after this patch set.
> 
> Hmm. Was it running fullscreen or under a GL compositor that page flips?

Windowed under bare X.

> Something like: 'vblank_mode=3 glxgears -fullscreen' should force it to do
> what we want, assuming your wm isn't totally crap.

We don't need to force fullscreen to cause us to lose vblank interrupts whilst the processor is asleep (and so render very slowly).
Comment 76 Ville Syrjala 2014-02-24 17:45:55 UTC
(In reply to comment #75)
> We don't need to force fullscreen to cause us to lose vblank interrupts
> whilst the processor is asleep (and so render very slowly).

Oh right. Not sure where I got the idea that we wouldn't use vblank irqs unless fullscreen.

After thinking about this for a while I started to question why we're frobbing the AGPBUSY bit all the time. It won't force an exit from C3 unless there's a pending interrupt, so we should just be able to leave it on all the time.

I pushed that idea here:
git://gitorious.org/vsyrjala/linux.git agpbusy2

I guess the chances of it working are slim, but might as well try.
Comment 77 Oleksij Rempel 2014-02-24 19:39:26 UTC
kernel 3.13.0-00966-gec441a0, same result. 5-10fps on idle system, and 60fps with moving mouse.
Comment 78 Daniel Vetter 2014-11-04 15:40:59 UTC
Yeah I guess that's it, time to give up on this one. Wiggling the mouse or running with wayland should fix this.

Thanks for reporting this bug and testing ideas, sorry that we couldn't make this work :(
Comment 79 Ville Syrjala 2019-03-25 07:15:44 UTC
I got fed up with my 945gm not being capable of 60fps glxgears.

commit d938da6b132a2d6addeba4c57a67ec3c07824843
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Fri Mar 22 20:08:03 2019 +0200

    drm/i915: Disable C3 when enabling vblank interrupts on i945gm

The main difference compared to the older pm_qos attempts is that I found a way to dig out the exact c3 disable latency, so we should have a reasonable guarantee that we do disable c3 but not c2. The power cost of not using c3 seems to be about 0.7W on my machine (with the display on), so this isn't exactly cheap :(

I did spend quite a bit of time at some point digging through the chipset docs (such as they are). It's been a while since I did that but I'll try to summarize what I recall; Gen3 introduced some kind of new mechanism by which the gmch can wake up the CPU. The old AGPBUSY/PM_BUSY involved the ICH as well IIRC, whereas the new mechanism supposedly does not. IIRC the new mechanism already appears in the i915gm docs, but my theory is that i945gm is where it actually got into use and either it is broken or we're missing some magic undocumented bit somewhere. I did try (blindly if necessary) poking at various registers that seemed relevant. Alas, I was unable to find a magic bit to make C3+vblank interrupts cooperate.
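
The idea of a latency bound that rules out C3 but still allows C2 can be explored from userspace with the cpuidle sysfs files. This is a rough sketch of the idea only, not the code from the commit: it assumes the standard cpuidle layout (stateN/name and stateN/latency, in microseconds) and that an idle state is skipped when its exit latency exceeds the requested bound; the helper names are made up for illustration, and state names vary between cpuidle drivers.

```python
from pathlib import Path
from typing import List, Optional, Tuple

CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def read_cpuidle_states(base: Path = CPUIDLE_DIR) -> List[Tuple[str, int]]:
    """Read (name, exit latency in microseconds) for each idle state of CPU 0."""
    states = []
    for state_dir in sorted(base.glob("state[0-9]*")):
        name = (state_dir / "name").read_text().strip()
        latency = int((state_dir / "latency").read_text())
        states.append((name, latency))
    return states


def disable_latency(states: List[Tuple[str, int]], name: str) -> Optional[int]:
    """Largest latency bound (us) that still rules out the named state.

    A state is ruled out only when its exit latency exceeds the bound,
    so the bound is one below that latency (never below 0).
    Returns None if no state has that name.
    """
    for state_name, latency in states:
        if state_name == name:
            return max(latency - 1, 0)
    return None
```

With hypothetical latencies of 1 us for C2 and 57 us for C3, disable_latency(states, "C3") gives 56, which excludes C3 while leaving C2 available.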
