Bug 39202 - FPS - KDE desktop effects with 3.0 rc6 kernel
Summary: FPS - KDE desktop effects with 3.0 rc6 kernel
Status: RESOLVED NOTABUG
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r600 (show other bugs)
Version: git
Hardware: All Linux (All)
: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-13 12:11 UTC by Siganderson
Modified: 2011-10-28 06:25 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
xorg log (46.49 KB, text/plain)
2011-07-13 15:26 UTC, Siganderson
Details
xorg conf (532 bytes, application/octet-stream)
2011-07-13 15:26 UTC, Siganderson
Details
dmesg (51.87 KB, text/x-log)
2011-07-13 15:27 UTC, Siganderson
Details
provisional patch to try as an experiment (4.30 KB, patch)
2011-10-27 15:27 UTC, Ilija Hadzic
Details | Splinter Review

Description Siganderson 2011-07-13 12:11:53 UTC
The FPS when desktop effects are activated in KDE 4.6.4 under a 3.0 rc6 kernel are the half of the FPS under a 2.6.38 kernel.
The experience with the 2.6.38 kernel is smoother and generally more fluid.

2.6.38 -> 60 to 120 FPS with vsync on
3.0 rc6 -> 30 to 60 FPS with vsync on

I don't know if this is a regression.
Comment 1 Siganderson 2011-07-13 12:25:57 UTC
video card: ATI Radeon HD 5670 512 MB Ram
Comment 2 Alex Deucher 2011-07-13 13:22:55 UTC
Please attach your xorg log and config and dmesg output.
Comment 3 Siganderson 2011-07-13 15:26:34 UTC
Created attachment 49061 [details]
xorg log
Comment 4 Siganderson 2011-07-13 15:26:57 UTC
Created attachment 49062 [details]
xorg conf
Comment 5 Siganderson 2011-07-13 15:27:24 UTC
Created attachment 49063 [details]
dmesg
Comment 6 Michel Dänzer 2011-08-09 04:15:45 UTC
(In reply to comment #6)
> 2.6.38 -> 60 to 120 FPS with vsync on

vsync can't actually be effective if the framerate is higher than the refresh rate...

Does

 Option "EnablePageFlip" "off"

and possibly also

 Option "SwapbuffersWait" "off"

restore the previous performance?
Comment 7 Siganderson 2011-08-09 14:29:20 UTC
(In reply to comment #6)

I tried the first option alone and together with the second option. In both cases the previous performance was not restored.

The test this time was done under ubuntu 11.10 alpha 3 with the 3.0 kernel and mesa 7.11-devel.
Comment 8 Siganderson 2011-08-09 14:30:56 UTC
sorry... it's not ubuntu but kubuntu (kde 4.6.7)
Comment 9 Siganderson 2011-08-09 14:31:50 UTC
(In reply to comment #8)
> sorry... it's not ubuntu but kubuntu (kde 4.6.7)

uff.. XD
kde 4.7
Comment 10 Michel Dänzer 2011-08-10 00:49:08 UTC
Other than bisecting the kernel, I don't have any more good ideas for figuring out what caused this.
Comment 11 Alex Deucher 2011-08-10 06:31:28 UTC
I don't really see how the frame rate could have been higher than the refresh rate if vsync was enabled unless there was a bug in the vsync code in older kernels.
Comment 12 Siganderson 2011-08-10 08:59:56 UTC
(In reply to comment #11)
> I don't really see how the frame rate could have been higher than the refresh
> rate if vsync was enabled unless there was a bug in the vsync code in older
> kernels.

I'm not an expert like you, but I think that the fps is going from 120 to 60 and that should be possible as 120 is multiple of 60 (my refresh rate). Is this correct?
If you need, I can bisect in some days.
Comment 13 Siganderson 2011-08-12 02:09:13 UTC
The interesting thing is that when I disable vsync (and I'm sure that the option has effect as I see the "tearing" while watching videos) the FPS doesn't go beyond 60 (3.0 kernel). There is like a wall that can't be overtaken.
Comment 14 Alex Deucher 2011-08-12 06:17:44 UTC
Are you setting the vblank_mode=0 env var?
Comment 15 Michel Dänzer 2011-08-15 02:49:11 UTC
(In reply to comment #13)
> The interesting thing is that when I disable vsync (and I'm sure that the
> option has effect as I see the "tearing" while watching videos) the FPS doesn't
> go beyond 60 (3.0 kernel). There is like a wall that can't be overtaken.

What exactly does 'disable vsync' mean here? Is this still with page flipping and SwapBuffersWait disabled?
Comment 16 Siganderson 2011-08-15 10:59:21 UTC
(In reply to comment #15)
> (In reply to comment #13)
> > The interesting thing is that when I disable vsync (and I'm sure that the
> > option has effect as I see the "tearing" while watching videos) the FPS doesn't
> > go beyond 60 (3.0 kernel). There is like a wall that can't be overtaken.
> 
> What exactly does 'disable vsync' mean here? Is this still with page flipping
> and SwapBuffersWait disabled?

I'm not setting vblank_mode=0 but I'm setting SwapBuffersWait and PageFlipping to false (I disabled even the vsync option in kde desktop effects settings)
Comment 17 Siganderson 2011-08-15 11:04:02 UTC
to summarize:

OPTIONS:
PageFlipping false
SwapBuffersWait false
vsync in kde desktop effects settings disabled

RESULTS:
using 3.0 kernel    -> 60 fps as maximum and vsync seems to be actually disabled
using 2.6.38 kernel -> 100/120 fps as maximum and vsync seems to be actually disabled
Comment 18 Michel Dänzer 2011-08-16 00:41:32 UTC
If vblank_mode=0 doesn't make a difference either, I'm afraid you'll really need to bisect.
Comment 19 Siganderson 2011-08-16 01:49:02 UTC
(In reply to comment #18)
> If vblank_mode=0 doesn't make a difference either, I'm afraid you'll really
> need to bisect.


I did a rough bisect (tagging as bad even all the uncompilable commits and tagging as good every 2.6.38-rc commit without testing it) and got a result.
Is it better to label an uncompilable commit as bad or to skip it?

dcdf3a293522e6ef09d8b3650ac1ceec56438e5d is the first bad commit
commit dcdf3a293522e6ef09d8b3650ac1ceec56438e5d
Author: Roman Zippel <zippel@linux-m68k.org>
Date:   Tue Nov 18 21:25:17 2008 +0100

    m68k: Add helper function handle_kernel_fault()
    
    Add helper function handle_kernel_fault() in signal.c, so frame_extra_sizes
    can become static, and to avoid future code duplication.
    
    Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>

:040000 040000 dd4a90c01a056ed7ac20f83518bf6c041c6190ad 1cd025a7a61a8e0c872bdc34ce039bc18e6e91d0 M      arch
Comment 20 Michel Dänzer 2011-08-16 01:54:25 UTC
(In reply to comment #19)
> I did a rough bisect (tagging as bad even all the uncompilable commits and
> tagging as good every 2.6.38-rc commit without testing it)

Beware that even commits which say 2.6.38-rc might actually have landed in mainline after 2.6.38, so you should really test those as well.

> Is it better to label an uncompilable commit as bad or to skip it?

You have to skip them, as you can't tell if they're affected by the problem.

So this means you tried vblank_mode=0?
Comment 21 Siganderson 2011-08-16 02:14:04 UTC
(In reply to comment #20)
> You have to skip them, as you can't tell if they're affected by the problem.

Ok I will redo it from the beginning.

> So this means you tried vblank_mode=0?

No, where should I put vblank_mode=0 to see the effect it does into kde?
Comment 22 Michel Dänzer 2011-08-16 02:34:28 UTC
(In reply to comment #21)
> [...] where should I put vblank_mode=0 to see the effect it does into kde?

Wherever kwin is started. E.g. you can try restarting it manually:

vblank_mode=0 kwin --replace
Comment 23 Siganderson 2011-08-16 04:07:27 UTC
(In reply to comment #22)
> vblank_mode=0 kwin --replace

I just did it and it has the same effect of turning off vsync from the kde desktop effects settings panel: the FPS doesn't go beyond 60/61 and vsync is actually disabled.

I'm going to bisect between 2.6.38 and 2.6.39-rc1.
Comment 24 Siganderson 2011-08-16 10:24:35 UTC
(In reply to comment #23)
> I'm going to bisect between 2.6.38 and 2.6.39-rc1.


I started but even after 1 hour of git bisect skip I continue to have kernels that don't compile:

"arch/x86/kernel/entry_32.S:1422: Error: .size expression for apf_page_fault does not evaluate to a constant"
Comment 25 Siganderson 2011-08-18 02:39:36 UTC
After a hell of attempts I'm at this point:
git bisect start
git bisect bad 40471856f2e38e9bfa8d605295e8234421110dd6
git bisect good 047f61c5d16d617e1df2deea5c5d5fb2a1b41c95
Bisecting: 352 revisions left to test after this (roughly 9 steps)

The bad thing is that I can't compile many kernels in this range because of this error:
arch/x86/kernel/entry_32.S:1422: Error: .size expression for apf_page_fault does not evaluate to a constant

So I launched git bisect visualize and gave a look to the most interesting commits. There is a sequence of commits in drm, some of them related to radeon and vblank.
Could be one of these the bad one? (I can't compile them because of that error)

Is there a way to see if one of these commits caused the problem without compiling the kernel?
Comment 26 Siganderson 2011-08-18 02:45:12 UTC
here it is an image of the commits
http://postimage.org/image/3ymp1bxg/
Comment 27 Michel Dänzer 2011-08-18 03:00:16 UTC
(In reply to comment #25)
> There is a sequence of commits in drm, some of them related to radeon
> and vblank.

I guess it's most likely commit 19b01b5fbf0b78930b3b06ee6080539c17b5d1fd ('drm/kernel: vblank wait on crtc > 1') that makes the difference. Without that commit, the X driver completely disables vsync functionality with your card, as it can't work on some of the CRTCs.

What I don't understand is why the suggested workarounds don't restore the previous behaviour... did you leave all of them (disabling page flipping and SwapBuffersWait — verify in Xorg.0.log that they're really disabled — and vblank_mode=0) enabled at the same time? Missing any one of them could result in vsync being effectively enabled.
Comment 28 Siganderson 2011-08-18 04:09:57 UTC
I just booted a 2.6.38 kernel before that commit with these results:

cat /var/log/Xorg.0.log | grep Pageflipping
[    12.197] (II) RADEON(0): KMS Pageflipping: disabled
cat /var/log/Xorg.0.log | grep SwapBuffers
[    12.197] (II) RADEON(0): SwapBuffers wait for vsync: disabled

vsync in kde desktop effetcts settings is unticked (disabled)

glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
9872 frames in 5.0 seconds = 1974.097 FPS
10436 frames in 5.0 seconds = 2087.059 FPS

vblank_mode=0 glxgears
ATTENTION: default value of option vblank_mode overridden by environment.
ATTENTION: default value of option vblank_mode overridden by environment.
10005 frames in 5.0 seconds = 1999.635 FPS
10654 frames in 5.0 seconds = 2130.778 FPS


the max fps is 60!!!
but if I enable vsync in kde desktop effects settings or doing "vblank_mode=1 kwin --replace" the fps goes to 100!!!
In these conditions vblank_mode=0 glxgears and vblank_mode=1 glxgears have the same output (2000 FPS).
This has no sense.
Comment 29 Siganderson 2011-08-18 05:04:29 UTC
and this is the last attempt, when I booted a newer kernel:

cat /var/log/Xorg.0.log | grep Pageflipping
[    37.812] (II) RADEON(0): KMS Pageflipping: disabled
cat /var/log/Xorg.0.log | grep SwapBuffers
[    37.812] (II) RADEON(0): SwapBuffers wait for vsync: disabled

glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
301 frames in 5.0 seconds = 60.112 FPS
300 frames in 5.0 seconds = 59.986 FPS


vblank_mode=0 glxgears
ATTENTION: default value of option vblank_mode overridden by environment.
ATTENTION: default value of option vblank_mode overridden by environment.
16145 frames in 5.0 seconds = 3228.899 FPS
15667 frames in 5.0 seconds = 3133.278 FPS

then I set the vsync to off in kde desktop effects settings:

glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
301 frames in 5.0 seconds = 60.072 FPS
301 frames in 5.0 seconds = 60.015 FPS

vblank_mode=0 glxgears
ATTENTION: default value of option vblank_mode overridden by environment.
ATTENTION: default value of option vblank_mode overridden by environment.
14340 frames in 5.0 seconds = 2867.876 FPS
14858 frames in 5.0 seconds = 2971.596 FPS

doing vblank_mode=0 kwin --replace has the same effect than setting off vsync in kde desktop effects settings
Comment 30 Siganderson 2011-08-18 05:06:35 UTC
I even tried do compile a 2.6.39 kernel with the old drm.h  drm_ioctl.c  drm_irq.c files (when the incriminated commit was not yet applied) and as expected I had 100 FPS in kde when vsync is on in kde destkop effects settings.
Comment 31 Ilija Hadzic 2011-08-18 06:20:18 UTC
(In reply to comment #28)

> 
> the max fps is 60!!!
> but if I enable vsync in kde desktop effects settings or doing "vblank_mode=1
> kwin --replace" the fps goes to 100!!!
> In these conditions vblank_mode=0 glxgears and vblank_mode=1 glxgears have the
> same output (2000 FPS).
> This has no sense.

I am assuming that your DDX (xf86-video-ati library) is 6.14.2 (latest). If you
use that DDX and a kernel prior to crtc>1 commit, then the behavior of glxgears
is an expected one. Prior to the offending commit, vblank synchronization for
crtcs that are higher than 1 didn't work. System was synchronizing on wrong crtc numbers and the appearance of working was mostly an accident than design. The fix included both the kernel and user space patches and a check was added to DDX against the kernel capabilities. So what happens in your case, is that when you boot up the old kernel and probably keep the new DDX, the DDX decides that it's missing a kernel capability and decides not to syncronize to vblank no matter what you set vblank_mode for. That's why you see 2000+ fps in both cases.

When it comes to KDE, I am not familiar enough to comment, but I speculate that KDE actually bypasses the DDX and calls vblank synchronization ioctl directly through libdrm. KDE would in that case be responsible for specifying the correct CRTC and the CRTC number depends on which connector you are. I see you are using an Evergreen GPU that has 6 crtcs so you could be sitting on a crtc whose number is greater than 1. If that's so, and assuming that KDE indeed specifies correct crtc, it could very well be that your lower FPS with the newer kernel is correct because that's what CRTC ticks at. With older kernel
you may be synchronizing to an incorrect CRTC that just so happens to tick at 100fps, but that's an accident. Try changing the connector you use on your evergreen part, use the kernel *with* the offending patch and *with* synchronization turned on and see if the FPS changes with the connector. If you find one that gives you 100fps, then that's probably the one on which the old broken kernel picked up the synchronization from.

BTW, on the part you are using, I have noticed that if you connect all three connectors, HDMI will be CRTC 0, VGA will be CRTC 1 and DVI will be CRTC 2.
In that case, an old kernel will be broken on DVI (which is the one that
people are most likely to look at)
Comment 32 Siganderson 2011-08-18 07:19:37 UTC
(In reply to comment #31)
> BTW, on the part you are using, I have noticed that if you connect all three
> connectors, HDMI will be CRTC 0, VGA will be CRTC 1 and DVI will be CRTC 2.
> In that case, an old kernel will be broken on DVI (which is the one that
> people are most likely to look at)

Your suppositions are right. I always used the git master xf86-video-ati.
I can connect only the VGA connector as I have only one LCD monitor without DVI or HDMI slots. The video card has VGA, HDMI and DVI connectors.
Maybe the old kernel is broken even on VGA (crtc1) (the patch says crtc>1 and not crtc>=1)?
Comment 33 Ilija Hadzic 2011-08-18 08:50:00 UTC
(In reply to comment #32)

> I can connect only the VGA connector as I have only one LCD monitor without DVI
> or HDMI slots. The video card has VGA, HDMI and DVI connectors.

If you have some spare change in your pocket you can get some HDMI-to-VGA
and DVI-to-VGA converter dongles to try out (they are no more than 
5-10 bucks a pop) and trying out different connectors might give us
some good datapoints.

> Maybe the old kernel is broken even on VGA (crtc1) (the patch says crtc>1 and
> not crtc>=1)?

No. crtc>1 is correct message associated with the patch, 
but the user space also plays the role here. 
Specifically the old kernel could only distinguish between primary and
secondary crtc and treated primary as 0 and secondary as 1. User space
on the other hand could at its own discretion map the real crtc number
to primary or secondary. Prior to the patch, radeon DDX called 0 
primary and everything else secondary so all crtcs >=1 mapped 
to 1, but 1 was correctly mapped.

Fixed radeon DDX maps all crtcs correctly, but it refuses to use
vblank synchronization with old kernel. New kernel on the other 
hand still understands and accepts primary/secondary notation
for backward compatibility with old DDX, but with the understanding
that it may wait on wrong CRTC if user space is old.

How KDE (assuming that it calls the vblank ioctl directly) determines
the CRTC number is anyone's guess but if it does similar primary/secondary
mapping of high crtcs like DDX did, things will look strange.
We need to find that out, so can you please check which CRTC does the
glxgears use in your case and which CRTC does the KDE use? You can do that
by enabling debug in drm (echo 7 > /sys/module/drm/parameters/debug)
and searching the kernel log for "waiting on vblank count %d, crtc %d\n"
string that comes from drm_wait_vblank function in drm_irq.c. Make sure
that you use the kernel *with* the crtc>1 patch because
the one without will not properly handle high crtcs.

Beware that the log will be very verbose once you turn the debug on. 
If you are overwhelmed by other messages you could temporary change
the relevant DRM_DEBUG statement in drm_irq.c into DRM_INFO and 
recompile the kernel; then you won't have to turn on the debug 
parameter. Observe the log and try to figure out which CRTCs are
used by whom. If it turns out that the one from KDE is different
from the one from glxgears, then we need to figure out which one
is wrong and why (in that case the problem will point to userland).

BTW, I may have mispoken in my previous post. Looking back at my logs
it's the VGA that typically sits on CRTC2 and DVI is CRTC1, so by using
the VGA port you may very easily be sitting on a CRTC number that is
mishandled in older kernels. On the other hand CRTCs are dynamically
routed to connectors and if you have only one connector hooked up 
to the monitor, the driver *should* in most cases route everything
to CRTC0, but that part of the code is quite complex and it is not
excluded that you indeed end up on CRTC2. We'll see that after you
do the experiment I described.
Comment 34 Siganderson 2011-08-18 09:52:25 UTC
(In reply to comment #33)

I booted into kde 4.7.0 with a 3.1.0-rc2 kernel and then I did:
echo 7 > /sys/module/drm/parameters/debug
and then I launched glxgears (and didn't close it).
Then I did:
dmesg | grep waiting
[  329.830313] [drm:drm_wait_vblank], waiting on vblank count 19676, crtc 0
[  329.834052] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
[  329.834068] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
[  329.846916] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
[  329.846974] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
[  329.850710] [drm:drm_wait_vblank], waiting on vblank count 19678, crtc 0
[  329.850726] [drm:drm_wait_vblank], waiting on vblank count 19678, crtc 0
[  329.863578] [drm:drm_wait_vblank], waiting on vblank count 19678, crtc 0
[  329.863636] [drm:drm_wait_vblank], waiting on vblank count 19678, crtc 0
[  329.867367] [drm:drm_wait_vblank], waiting on vblank count 19679, crtc 0
[  329.867383] [drm:drm_wait_vblank], waiting on vblank count 19679, crtc 0
[  329.881225] [drm:drm_wait_vblank], waiting on vblank count 19679, crtc 0
[  329.881282] [drm:drm_wait_vblank], waiting on vblank count 19679, crtc 0
[  329.884030] [drm:drm_wait_vblank], waiting on vblank count 19680, crtc 0
[  329.884051] [drm:drm_wait_vblank], waiting on vblank count 19680, crtc 0
[  329.896884] [drm:drm_wait_vblank], waiting on vblank count 19680, crtc 0
[  329.896942] [drm:drm_wait_vblank], waiting on vblank count 19680, crtc 0
[  329.900679] [drm:drm_wait_vblank], waiting on vblank count 19681, crtc 0
[  329.900695] [drm:drm_wait_vblank], waiting on vblank count 19681, crtc 0
[  329.914005] [drm:drm_wait_vblank], waiting on vblank count 19681, crtc 0
[  329.914064] [drm:drm_wait_vblank], waiting on vblank count 19681, crtc 0
[  329.917325] [drm:drm_wait_vblank], waiting on vblank count 19682, crtc 0
[  329.917341] [drm:drm_wait_vblank], waiting on vblank count 19682, crtc 0
[  329.930179] [drm:drm_wait_vblank], waiting on vblank count 19682, crtc 0
[  329.930237] [drm:drm_wait_vblank], waiting on vblank count 19682, crtc 0
[  329.933976] [drm:drm_wait_vblank], waiting on vblank count 19683, crtc 0
[  329.933992] [drm:drm_wait_vblank], waiting on vblank count 19683, crtc 0
[  329.946842] [drm:drm_wait_vblank], waiting on vblank count 19683, crtc 0
[  329.946901] [drm:drm_wait_vblank], waiting on vblank count 19683, crtc 0
[  329.950634] [drm:drm_wait_vblank], waiting on vblank count 19684, crtc 0
[  329.950650] [drm:drm_wait_vblank], waiting on vblank count 19684, crtc 0
[  329.963491] [drm:drm_wait_vblank], waiting on vblank count 19684, crtc 0
[  329.963550] [drm:drm_wait_vblank], waiting on vblank count 19684, crtc 0
[  329.967284] [drm:drm_wait_vblank], waiting on vblank count 19685, crtc 0
[  329.967300] [drm:drm_wait_vblank], waiting on vblank count 19685, crtc 0
[  329.981137] [drm:drm_wait_vblank], waiting on vblank count 19685, crtc 0
[  329.981195] [drm:drm_wait_vblank], waiting on vblank count 19685, crtc 0
[  329.983936] [drm:drm_wait_vblank], waiting on vblank count 19686, crtc 0
[  329.983952] [drm:drm_wait_vblank], waiting on vblank count 19686, crtc 0
[  329.996802] [drm:drm_wait_vblank], waiting on vblank count 19686, crtc 0
[  329.996860] [drm:drm_wait_vblank], waiting on vblank count 19686, crtc 0
[  330.000626] [drm:drm_wait_vblank], waiting on vblank count 19687, crtc 0
[  330.000641] [drm:drm_wait_vblank], waiting on vblank count 19687, crtc 0
[  330.013921] [drm:drm_wait_vblank], waiting on vblank count 19687, crtc 0
[  330.014069] [drm:drm_wait_vblank], waiting on vblank count 19687, crtc 0
[  330.017355] [drm:drm_wait_vblank], waiting on vblank count 19688, crtc 0
[  330.017371] [drm:drm_wait_vblank], waiting on vblank count 19688, crtc 0
[  330.030240] [drm:drm_wait_vblank], waiting on vblank count 19688, crtc 0
[  330.030298] [drm:drm_wait_vblank], waiting on vblank count 19688, crtc 0
[  330.033901] [drm:drm_wait_vblank], waiting on vblank count 19689, crtc 0
[  330.033917] [drm:drm_wait_vblank], waiting on vblank count 19689, crtc 0
[  330.046785] [drm:drm_wait_vblank], waiting on vblank count 19689, crtc 0
[  330.046843] [drm:drm_wait_vblank], waiting on vblank count 19689, crtc 0
[  330.050570] [drm:drm_wait_vblank], waiting on vblank count 19690, crtc 0
[  330.050586] [drm:drm_wait_vblank], waiting on vblank count 19690, crtc 0
[  330.063416] [drm:drm_wait_vblank], waiting on vblank count 19690, crtc 0
[  330.063474] [drm:drm_wait_vblank], waiting on vblank count 19690, crtc 0
[  330.067206] [drm:drm_wait_vblank], waiting on vblank count 19691, crtc 0
[  330.067222] [drm:drm_wait_vblank], waiting on vblank count 19691, crtc 0
[  330.081058] [drm:drm_wait_vblank], waiting on vblank count 19691, crtc 0
[  330.081116] [drm:drm_wait_vblank], waiting on vblank count 19691, crtc 0
[  330.083858] [drm:drm_wait_vblank], waiting on vblank count 19692, crtc 0
[  330.083874] [drm:drm_wait_vblank], waiting on vblank count 19692, crtc 0
[  330.096738] [drm:drm_wait_vblank], waiting on vblank count 19692, crtc 0
[  330.096797] [drm:drm_wait_vblank], waiting on vblank count 19692, crtc 0
[  330.100548] [drm:drm_wait_vblank], waiting on vblank count 19693, crtc 0
[  330.100564] [drm:drm_wait_vblank], waiting on vblank count 19693, crtc 0
[  330.113823] [drm:drm_wait_vblank], waiting on vblank count 19693, crtc 0
[  330.113881] [drm:drm_wait_vblank], waiting on vblank count 19693, crtc 0
[  330.117207] [drm:drm_wait_vblank], waiting on vblank count 19694, crtc 0
[  330.117223] [drm:drm_wait_vblank], waiting on vblank count 19694, crtc 0
[  330.130173] [drm:drm_wait_vblank], waiting on vblank count 19694, crtc 0
[  330.130231] [drm:drm_wait_vblank], waiting on vblank count 19694, crtc 0
[  330.133861] [drm:drm_wait_vblank], waiting on vblank count 19695, crtc 0
[  330.133877] [drm:drm_wait_vblank], waiting on vblank count 19695, crtc 0
[  330.146738] [drm:drm_wait_vblank], waiting on vblank count 19695, crtc 0
[  330.146797] [drm:drm_wait_vblank], waiting on vblank count 19695, crtc 0
[  330.150492] [drm:drm_wait_vblank], waiting on vblank count 19696, crtc 0
[  330.150508] [drm:drm_wait_vblank], waiting on vblank count 19696, crtc 0
[  330.163345] [drm:drm_wait_vblank], waiting on vblank count 19696, crtc 0
[  330.163403] [drm:drm_wait_vblank], waiting on vblank count 19696, crtc 0
[  330.167131] [drm:drm_wait_vblank], waiting on vblank count 19697, crtc 0
[  330.167146] [drm:drm_wait_vblank], waiting on vblank count 19697, crtc 0
[  330.179991] [drm:drm_wait_vblank], waiting on vblank count 19697, crtc 0
[  330.180097] [drm:drm_wait_vblank], waiting on vblank count 19697, crtc 0
[  330.183791] [drm:drm_wait_vblank], waiting on vblank count 19698, crtc 0
[  330.183807] [drm:drm_wait_vblank], waiting on vblank count 19698, crtc 0
[  330.196681] [drm:drm_wait_vblank], waiting on vblank count 19698, crtc 0
[  330.196740] [drm:drm_wait_vblank], waiting on vblank count 19698, crtc 0
[  330.200478] [drm:drm_wait_vblank], waiting on vblank count 19699, crtc 0
[  330.200495] [drm:drm_wait_vblank], waiting on vblank count 19699, crtc 0
[  330.213779] [drm:drm_wait_vblank], waiting on vblank count 19699, crtc 0
[  330.213837] [drm:drm_wait_vblank], waiting on vblank count 19699, crtc 0
[  330.217176] [drm:drm_wait_vblank], waiting on vblank count 19700, crtc 0
[  330.217193] [drm:drm_wait_vblank], waiting on vblank count 19700, crtc 0
[  330.230146] [drm:drm_wait_vblank], waiting on vblank count 19700, crtc 0
[  330.230204] [drm:drm_wait_vblank], waiting on vblank count 19700, crtc 0
[  330.233746] [drm:drm_wait_vblank], waiting on vblank count 19701, crtc 0
[  330.233761] [drm:drm_wait_vblank], waiting on vblank count 19701, crtc 0


I can read this in the xrandr --verbose output:
VGA-0 connected 1280x1024+0+0 (0x63) normal (normal left inverted right x axis y axis) 338mm x 270mm
        Identifier: 0x57
        Timestamp:  40354
        Subpixel:   no subpixels
        Gamma:      1.0:1.0:1.0
        Brightness: 1.0
        Clones:    
        CRTC:       0
        CRTCs:      0 1 2 3 4 5


Everything seems to be crtc 0 (and I have only the vga cable connected to the  monitor vga output).

I have Pageflipping off, SwapBuffersWait off.
Comment 35 Siganderson 2011-08-18 10:01:59 UTC
I was curious to test even with the kernel before the patch and I found that dmesg | grep waiting does not show anything. Maybe the debug line wasn't added yet.
Comment 36 Ilija Hadzic 2011-08-18 10:32:41 UTC
(In reply to comment #34)
> (In reply to comment #33)
> 
> I booted into kde 4.7.0 with a 3.1.0-rc2 kernel and then I did:
> echo 7 > /sys/module/drm/parameters/debug
> and then I launched glxgears (and didn't close it).
> Then I did:
> dmesg | grep waiting
> [  329.830313] [drm:drm_wait_vblank], waiting on vblank count 19676, crtc 0
> [  329.834052] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
> [  329.834068] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
> 
> ----------------- [SNIP]-----------------------------
>
> Everything seems to be crtc 0 (and I have only the vga cable connected to the 
> monitor vga output).
> 
> I have Pageflipping off, SwapBuffersWait off.

CRTC0 (routed to the connected connector) is correct when only one cable is connected. Does the frame rate reported by glxgears match your monitor refresh rate ? ... 60 ? If so then this is correct and that *should* be the value reported by KDE as well when you turn on desktop effects. Is it ? (turn 
SwapBuffersWait and PageFlipping back on). If that's also 60 (exactly 60 and
equal to your monitor refresh rate, *not* lower) then your system is behaving 
as it should and anything you saw differently before was wrong.

If it's lower than 60, then the only thing I can think of is that something
slows down your system so much that it can't complete whatever it needs to do
between two buffer swap requests and is actually missing vblanks. Following
that logic, the explanation for 100 you saw in the old kernel would be
that it wasn't synchronizing at all and 100 was the top speed it could do
in a tight loop without any waiting. That's also consistent with what
you said in your next post, that you don't see the "waiting on vblank"
messages from the kernel.
Comment 37 Ilija Hadzic 2011-08-18 10:37:28 UTC
(In reply to comment #35)
> I was curious to test even with the kernel before the patch and I found that
> dmesg | grep waiting does not show anything. Maybe the debug line wasn't added
> yet.

That message was there before the patch. So the fact that you don't see
it means that it is really not even trying to synchronize with vblank.

Most likely DDX is (as it should) refusing to call the ioctl due
to the old kernel.
Comment 38 Siganderson 2011-08-18 10:47:59 UTC
(In reply to comment #36 and #37)

If I launch glxgears the FPS is 60/59.
When desktop effects are active and repeatedly minimize and unminimize a window the FPS drops to 15 while with the old kernel before the patch it dropped to 30/25 and was better.
This can be seen even by eye.
What can I do to improve this situation?
Comment 39 Siganderson 2011-08-18 10:50:22 UTC
(In reply to comment #38)
> (In reply to comment #36 and #37)
> 
> If I launch glxgears the FPS is 60/59.
> When desktop effects are active and repeatedly minimize and unminimize a window
> the FPS drops to 15 while with the old kernel before the patch it dropped to
> 30/25 and was better.
> This can be seen even by eye.
> What can I do to improve this situation?

Except the first line I'm speaking about the KDE fps.
glxgears seems to stay on 60.
Comment 40 Ilija Hadzic 2011-08-18 11:23:12 UTC
(In reply to comment #38)
> (In reply to comment #36 and #37)
> 
> If I launch glxgears the FPS is 60/59.
> When desktop effects are active and repeatedly minimize and unminimize a window
> the FPS drops to 15 while with the old kernel before the patch it dropped to
> 30/25 and was better.
> This can be seen even by eye.
> What can I do to improve this situation?

60/59 is OK, one frame plus minus is just a rounding effect when glxgears
calculates the frame rate. So based on everything you said, I think that there
is no bug here, that it's just that your system can't keep up when you
turn on desktop effects. The root cause could be totally unrelated to
graphics subsystem.

Some explanation. Every graphics application (glxgears, KDE with desktop
effects your favorite game) is a loop that looks something like this:

while (1) {
  // do something related to application
  this();
  that();

  // dump some OpenGL commands on GPU
  glThis();
  glThat();

  // swap buffers
  glXSwapBuffers();

}

When vblank is turned on, the process will block waiting for the next vblank.
If you are running KDE with desktop effects and glxgears at the same time, 
these are separate processes that OS is scheduling. If both of them
can do their this(); and that(); in time that is much shorter than 
1/screen_refresh_rate, then you see the FPS equal to the refresh rate 
and it's steady. If one process executes this() and that() for too long,
then it misses the vblank and you see reduced (and maybe erratic) frame 
rate. Notie that max. time per loop iteration must be shorter that the
frame interval time. If it's somethimes shorter and sometimes longer
then vblanks are sometimes missed, sometimes not.

So with newer kernel vblanks work correctly, DDX uses them, but this() 
and that() in your KDE is unfortunately too slow (probably on the
boundary of frame interval). Note that glxgears is a different process 
whose this() and that() are doing almost nothing so it can top at much 
higher rate (probably determined by the rate at which the GPU can 
process its rendering command). KDE on the other hand is probably limited
by what the host is doing.

Now when you removed my patch from the kernel (or when you were running
2.6.38 kernel) but you still used the latest DDX, the DDX just decided to
not attempt the synchronization because it was missing the kernel capability
(handling of CRTCs > 1 .... your problem had nothing to do with that, but
missing capability was the reason for DDX to run unsynchronized). In this
situation both KDE and glxgears became tight loops with no wait
at the swap call. glxgears topped at 2000-something iterations of the loop
per second and that was the reported frame rate, while KDE topped at much
lower rate (100-something). Visually, that looked better for you because
you didn't have the effect of sometimes completing the loop iteration in
time and sometimes not (and thus missing the vblanks).

So the root cause is that you have to find out what is causing KDE to
slow down. It can be many reasons from a simply slow machine to 
some I/O device monopolizing the bus or some process causing KDE
to stall. 

You may want to try selectively disabling some effects and
until you get to the subset of effects that leave you with an acceptable
performance, but you should also search for causes outside the graphics
subsystem.
Comment 41 Siganderson 2011-08-18 11:43:07 UTC
(In reply to comment #40)
Your explaination is very clear, complete and useful.
This surprise me as i.e. in windows 7 it doesn't matter how many applications and what is the load on the cpu: the minimization of a window remains smooth.
However my system has a core 2 duo cpu (e6550) at 2.33 Ghz and unfortunately I don't know other people with a similar system to do comparisons. I should think that everything in my system is going well as I did a clean install and did not notice anything weird with other non-graphical applications. I remember that in the past I already tried to set off every effect but the fps calculator and the minimize effect: the result is the same.
I'm going to try to close as many process as I can to see if there will be some difference.
Comment 42 Siganderson 2011-08-19 09:45:58 UTC
(In reply to comment #41)

I couldn't improve the things so I decided to leave kde and move towards gnome3 that seems much faster.
Comment 43 Ilija Hadzic 2011-10-26 16:57:22 UTC
(In reply to comment #34)
> (In reply to comment #33)
> 
> I booted into kde 4.7.0 with a 3.1.0-rc2 kernel and then I did:
> echo 7 > /sys/module/drm/parameters/debug
> and then I launched glxgears (and didn't close it).
> Then I did:
> dmesg | grep waiting
> [  329.830313] [drm:drm_wait_vblank], waiting on vblank count 19676, crtc 0
> [  329.834052] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
> [  329.834068] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
> [  329.846916] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0
> [  329.846974] [drm:drm_wait_vblank], waiting on vblank count 19677, crtc 0

I know I closed this as "not-a-bug", but I have recently discovered an issue in DRM module that can cause severe blocking if multiple processes are issuing the vblank wait concurrently, so I have a second thought. Given that you have multiple instances of waiting for the same vblank count, I wonder that may be an issue in your case.

Are you a game for a little more testing (can't promise, success, but I want to try) ? If so, can you repeat the test in which you dumped the above log, but also find out the PID of the processes that are making the wait vblank calls (you can do it by either hacking up the DRM_DEBUG statement that prints the above to also print the PID of the current process) or by just examining the kernel log and trying to identify through which ioctl the drm_wait_vblank was entered and reading the PID from the log.

If you have multiple processes issuing the ioctl, then this patch may work for you (it's still being debated and I have to revise it to reflect some comments I received about the patch, but it's usable for experimentation purpose):

http://lists.freedesktop.org/archives/dri-devel/2011-October/015635.html
Comment 44 Siganderson 2011-10-27 13:36:58 UTC
I'll try the patch.
Comment 45 Siganderson 2011-10-27 15:05:52 UTC
mmm... I wanted to try it but I still don't understand how to apply it (what version of files, where to get them and what command to run). I'm used with patchfiles and patch -pNum < patchfilename.
Comment 46 Ilija Hadzic 2011-10-27 15:27:15 UTC
Created attachment 52833 [details] [review]
provisional patch to try as an experiment
Comment 47 Ilija Hadzic 2011-10-27 15:31:31 UTC
(In reply to comment #45)
> mmm... I wanted to try it but I still don't understand how to apply it (what
> version of files, where to get them and what command to run). I'm used with
> patchfiles and patch -pNum < patchfilename.

I have attached the patch for you to this bug report. Download it and save it as
a .patch file.

If you are using a kernel that you checked out from git, you can use 
'git am name_of_the_patch.patch'

If you are using stock kernel that came with your distribution (Ubuntu ?) then
go to the top of your kernel source directory (you must have it installed first) and apply the patch as 'patch -p1 name_of_the_patch.patch'

After that you have to recompile and reinstall your kernel (look at this page
for Ubuntu-specific instructions https://help.ubuntu.com/community/Kerne/Compile).

The patch is meant for the latest kernel in Git repository, but I have just tried it on 2.6.39 kernel (which I understand is what you are using) and it applies fine.
Comment 48 Ilija Hadzic 2011-10-27 15:33:33 UTC
> first) and apply the patch as 'patch -p1 name_of_the_patch.patch'

Typo, it should be


patch -p1 < name_of_the_patch.patch
        ^^^^^

but you probably already know that
Comment 49 Siganderson 2011-10-28 05:39:21 UTC
(In reply to comment #48)
> > first) and apply the patch as 'patch -p1 name_of_the_patch.patch'
> 
> Typo, it should be
> 
> 
> patch -p1 < name_of_the_patch.patch
>         ^^^^^
> 
> but you probably already know that

I tried the patch with a 3.0.4 kernel (latest stable).
It works and gives something like 2 or 4 fps gain (before the patch the fps dropped to 24-26 while repeatedly minimizing a window, now it drops to 26-28). Do you need even the test with the logs?
Comment 50 Siganderson 2011-10-28 05:42:37 UTC
I forgot to say that without touching anything in the window there are no changes (60 fps... obviously 60 is the maximum when vsync is on).
Comment 51 Ilija Hadzic 2011-10-28 06:25:40 UTC
(In reply to comment #49)
> (In reply to comment #48)
> > > first) and apply the patch as 'patch -p1 name_of_the_patch.patch'
> > 
> > Typo, it should be
> > 
> > 
> > patch -p1 < name_of_the_patch.patch
> >         ^^^^^
> > 
> > but you probably already know that
> 
> I tried the patch with a 3.0.4 kernel (latest stable).
> It works and gives something like 2 or 4 fps gain (before the patch the fps
> dropped to 24-26 while repeatedly minimizing a window, now it drops to 26-28).
> Do you need even the test with the logs?

I guess then your problem is not the same as the bug I am trying to address
with this patch. A few fps gain is probably just an (unrelated) incindental 
finding. At this time, I don't have new theories other than what I postulated
back in August. Sorry.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.