Bug 6260

Summary: [EXA] fbBresFillDash software fallback gets hit
Product: xorg
Reporter: Marcin Kurek <morgoth6>
Component: Server/Acceleration/EXA
Assignee: Xorg Project Team <xorg-team>
Status: RESOLVED WORKSFORME
QA Contact: Xorg Project Team <xorg-team>
Severity: enhancement
Priority: high
Version: 7.0.0
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                        Flags
Sample conky configuration file    none
xorg.conf                          none
Full oprofile log                  none
Current xorg.conf                  none

Description Marcin Kurek 2006-03-14 22:49:25 UTC
On my system I can observe small glitches on XVideo and DRI output in
some situations.

I first noticed it when I tried to play a bit with EXA. On my system I normally
have some program displaying various information on the root window (conky or
superkaramba), such as CPU usage and system logs. When one of these programs
is running, the output from, for example, mplayer or glxgears is a little
jumpy and shows a regular small glitch. My guess is that the glitch
appears whenever conky or superkaramba updates its contents.

I don't think this is a CPU problem, because the CPU load from glxgears is quite
low and I doubt that conky or superkaramba can eat a whole 1GHz G4 CPU on a
simple update of window contents.

I can observe the same problem when I disable EXA and enable XAA, but I think
it's much more visible with EXA. It is also present on both KDE 3.5 and Gnome
2.13 desktops.

When I quit conky/superkaramba, glxgears and mplayer run smoothly (hmm, maybe
the problem is still present, but I definitely can't see it). I also
discovered another workaround, for conky only: disabling DBE usage. When I
disable double buffering the problem is gone too (but of course the conky
window starts to flicker like hell). I think this could be a problem with the
DBE extension, but I can't be sure. If more information is needed, just
tell me what is needed.

Currently I use a Radeon 9000 card (128MB/128BIT/VIVO) on a 1GHz G4 CPU
(Pegasos 2) with 1GB of RAM. I updated the Radeon drivers to CVS 2006-03-03.
Comment 1 Marcin Kurek 2006-03-14 22:52:09 UTC
Created attachment 4933 [details]
Sample conky configuration file

Sample conky (http://conky.sourceforge.net/) configuration file to reproduce
the problem. This one has double buffering enabled; to disable it, edit the
double_buffer section in this file.
Comment 2 Marcin Kurek 2006-03-14 22:53:08 UTC
Created attachment 4934 [details]
xorg.conf

My system xorg configuration file.
Comment 3 Marcin Kurek 2006-03-14 22:55:13 UTC
I think profiler output would be nice too, but I am not sure how to produce it.
Is there any guide about that on the net?
Comment 4 Marcin Kurek 2006-03-16 07:35:46 UTC
It seems this is not related to Radeon or PPC only. I am currently in front of
an x86 machine and I can reproduce this problem perfectly here.

This is a P4 2.8GHz with 512MB of memory and a Via Unichrome VM800. This machine
has the same Xorg version as mine (7.0) with today's OpenChrome SVN snapshot.

Playing video in mplayer using the xv output and the gl2 output gives the same
glitches here. The same goes for glxgears. I also tested it using the plain x11
output driver and the video is jumpy too.
This is not visible all the time, mainly with large moving scenes, but with
glxgears it's easy to notice.
Comment 5 Michel Dänzer 2006-03-16 20:29:51 UTC
conky probably triggers a software fallback in the X server. There's not much
that can be done about that causing glitches.

For profiling, I can recommend oprofile, or sysprof (on x86*).
Comment 6 Marcin Kurek 2006-03-16 21:39:21 UTC
The same on SuperKaramba. I wonder whether this problem is related to Xorg or
to Conky/SuperKaramba? Can you explain a bit more about that? I will try to
recompile xorg-server and profile it, but only if it makes sense. This
would take quite a long time on my system.
Comment 7 Michel Dänzer 2006-03-16 22:12:34 UTC
A software fallback is ultimately a missing feature (or possibly even a bug) in
the X server, but it's possible that the applications could do better as well.
It might be hard to get useful profiling data as the problem is usually caused
by relatively short bursts of activity. At the very least, you'll probably need
to restrict the profiling to the X server. On the bright side, the profiling
tools I mentioned don't require a rebuild unless the binaries don't have any
useful symbols.
Comment 8 Marcin Kurek 2006-03-17 18:56:27 UTC
I see. I will try to profile it as soon as possible and we will see, but IMHO
this is not a software fallback. My reasoning: why can I easily reproduce it
on a machine like a P4 2.8GHz? It's a really fast machine, and probably no
software code path would cause such a heavy display glitch on a simple glxgears
window. Am I right? I am not an expert here, so I could of course be wrong.
Comment 9 Michel Dänzer 2006-03-18 04:34:03 UTC
The CPU speed isn't necessarily very important for the effect of software
fallbacks. For one, software rendering is usually limited by the access to video
RAM, which is much slower than access to system RAM. More importantly though,
software fallbacks require draining the graphics card command queue and caches,
etc. The effect is probably particularly noticeable with glxgears as it renders
so many frames that even just a short delay is a noticeable irregularity.
Similarly, with video playback, you probably notice very quickly when a single
frame is delayed slightly.
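
To make the argument above concrete, here is a rough cost model in C. It is
only a sketch: the function names and all inputs (sync time, bytes touched,
VRAM bandwidth, frame rate) are hypothetical, and nothing here is measured or
taken from the X server. The point is simply that CPU clock speed does not
appear in the estimate, while the frame rate does.

```c
#include <assert.h>

/* Estimated stall caused by one software fallback, in microseconds.
 * All parameters are hypothetical inputs, not measured values:
 *   sync_us            time to drain the GPU command queue and caches
 *   bytes              amount of video RAM the CPU has to touch
 *   vram_bytes_per_us  effective CPU <-> video RAM bandwidth
 * Note that the CPU clock speed is not a parameter at all. */
double fallback_stall_us(double sync_us, double bytes, double vram_bytes_per_us)
{
    assert(vram_bytes_per_us > 0.0);
    return sync_us + bytes / vram_bytes_per_us;
}

/* The same stall expressed in frame intervals at a given frame rate.
 * The higher the frame rate, the more frames a fixed stall covers. */
double stall_in_frames(double stall_us, double fps)
{
    assert(fps > 0.0);
    return stall_us * fps / 1e6;
}
```

With any fixed stall, stall_in_frames() grows linearly with the frame rate,
which would be one way to read why a free-running glxgears makes the
irregularity so easy to spot.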
Comment 10 Marcin Kurek 2006-03-25 01:38:06 UTC
Another point for you. It's really hard to trace what causes these small
lockups (glitches), but I did a small profile of EXA here. I reported long ago
that EXA is terribly slow here, and it seems that when EXA is enabled, Xorg
spends almost 50% of its time in the fbBlt() function in libfb.
Comment 11 Michel Dänzer 2006-05-08 23:16:02 UTC
fbBlt is a software fallback. It would be interesting if you could find out what
triggers it, e.g. if you could get call traces from the profile data.

Is this still an issue with EXA and the radeon driver from CVS HEAD?

BTW, is there a reason why you disable DMAForXv? Does the Xv behaviour improve
if you enable it?

Also, does not enabling ColorTiling help? With it and the DRI enabled, the
radeon driver currently has to do expensive calls into the kernel to set up
hardware surfaces for byte swapping with software fallbacks on big endian machines.
Comment 12 Marcin Kurek 2006-05-12 10:48:46 UTC
Hmmm, sorry about the delay. I have no idea why, but I didn't get any email
notification about your reply.

Anyway, I am compiling today's CVS snapshot of xorg-server, mesa and the ati
drivers, and we will see whether it's still present or not. I will try
disabling ColorTiling too.

DMAForXv is disabled for a reason. I read that it should speed up XVideo, but
on my machine this option causes a speed loss on files with a high resolution
(for example DVD movies). The image is jumpy and generally this looks similar to
the problem caused by superkaramba/conky in double-buffer mode. Anyway, as far
as I know DMA for GFX memory on the Pegasos II is a bit slow, so maybe this is
the reason.
Comment 13 Marcin Kurek 2006-05-12 11:57:46 UTC
The problem is still reproducible with today's CVS snapshot too, and disabling
ColorTiling didn't help, at least for this problem.

Anyway, if CT is slower on big-endian machines, maybe it would be a good idea to
add a note about that to the man page? What do you think?
Comment 14 Marcin Kurek 2006-06-15 00:45:18 UTC
OK, I finally found some free time to recompile xorg-server && mesa &&
xf86-video-ati with debug symbols and do a small profiling session with oprofile.

It seems this is still a problem, as I can see ~65% of Xorg CPU time is spent in
the libfb.so/fbBlt24() function on my system.
Comment 15 Marcin Kurek 2006-06-15 00:45:58 UTC
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name               symbol name
50587    62.0357  libfb.so                 fbBlt24
15530    19.0447  libc-2.4.so              (no symbols)
2075      2.5446  libfb.so                 fbBresFillDash
768       0.9418  libexa.so                ExaOffscreenMarkUsed
577       0.7076  libfb.so                 fbCompositeSolidMask_nx8x0565
371       0.4550  libexa.so                exaOffscreenAlloc
357       0.4378  Xorg                     _fini
272       0.3336  libexa.so                exaOffscreenFree
237       0.2906  libexa.so                exaPixmapIsOffscreen
228       0.2796  libfb.so                 fbBltOne
207       0.2538  Xorg                     miRegionOp
165       0.2023  libexa.so                exaPixmapIsPinned
163       0.1999  libexa.so                exaDoMigration
155       0.1901  radeon_drv.so            RADEONDRISwapContext
138       0.1692  Xorg                     miComputeCompositeRegion
136       0.1668  libexa.so                exaComposite
135       0.1656  radeon_drv.so            R200PrepareCompositeCP
131       0.1606  radeon_drv.so            RADEONDRIGetVersion
127       0.1557  libexa.so                exaCopyArea
126       0.1545  radeon_drv.so            RADEONCopyCP
125       0.1533  libexa.so                exaFillRegionTiled
124       0.1521  libexa.so                exaGetOffscreenPixmap
116       0.1423  libfb.so                 fbBresSolid32
110       0.1349  libexa.so                exaGlyphs
105       0.1288  libexa.so                exaGetDrawablePixmap
103       0.1263  radeon_drv.so            RADEONDRIScreenInit
100       0.1226  libexa.so                exaCopyNtoN
100       0.1226  libexa.so                exaCreatePixmap
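
The share of samples per image can be totalled from a listing like the one
above with a few lines of C. This is a sketch, not part of oprofile:
samples_for_image() is a made-up helper, and it assumes rows in the
"samples  %  image name  symbol name" layout shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sum the "samples" column over all rows whose image name equals `image`.
 * `report` is the text of a flat listing, one row per line.
 * Rows that do not start with a sample count (e.g. the header) are skipped,
 * as are rows too long for the local buffer. */
long samples_for_image(const char *report, const char *image)
{
    long total = 0;
    const char *line = report;

    assert(report != NULL && image != NULL);
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");
        char row[256];

        if (len < sizeof row) {
            long samples;
            char name[64];

            /* Copy one line so sscanf cannot read into the next row. */
            memcpy(row, line, len);
            row[len] = '\0';
            /* The percentage column is parsed but not stored (%*f). */
            if (sscanf(row, "%ld %*f %63s", &samples, name) == 2 &&
                strcmp(name, image) == 0)
                total += samples;
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return total;
}
```

Applied to the rows quoted above, the libfb.so entries (fbBlt24,
fbBresFillDash, fbCompositeSolidMask_nx8x0565, fbBltOne and fbBresSolid32) add
up to 53583 samples.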
Comment 16 Marcin Kurek 2006-06-15 00:46:41 UTC
Created attachment 5913 [details]
Full oprofile log
Comment 17 Marcin Kurek 2006-06-15 00:47:23 UTC
Created attachment 5914 [details]
Current xorg.conf
Comment 18 Marcin Kurek 2006-06-15 01:07:29 UTC
I guess it would be a good idea to find out what calls this function. I looked
at the oprofile documentation but was unable to find anything about call traces.
How do I produce such information?

I still have unstripped versions of the Xorg components, so I think I can try to
help.
Comment 19 Michel Dänzer 2006-06-15 01:50:32 UTC
opreport -c will include call graph information, but this requires that the
oprofile kernel module can actually record it, which it traditionally couldn't
on PPC.

Looking at the profile, it's most likely related to fbBresFillDash, i.e. dashed
lines. These are typically used by legacy apps, and it's unlikely that EXA will
ever accelerate them. So your best bet might be to configure or change conky not
to use dashed lines and/or to update its display less frequently.
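
For illustration, drawing a dashed line in software comes down to a per-pixel
loop like the one below. This is a toy Bresenham rasterizer, not the fb code:
ToyFb, toy_dashed_line() and the one-byte-per-pixel buffer are all
hypothetical. It only shows why this kind of drawing means the CPU touching
individual pixels one at a time.

```c
#include <assert.h>
#include <stdlib.h>

enum { FB_W = 16, FB_H = 8 };

/* Hypothetical in-memory framebuffer, one byte per pixel for simplicity. */
typedef struct {
    unsigned char px[FB_H][FB_W];
} ToyFb;

/* Draw a dashed line from (x0,y0) to (x1,y1) using Bresenham's algorithm.
 * on_len pixels are drawn, then off_len pixels are skipped, repeating.
 * Pixels outside the buffer are clipped. Returns the pixels written. */
int toy_dashed_line(ToyFb *fb, int x0, int y0, int x1, int y1,
                    int on_len, int off_len)
{
    assert(fb != NULL && on_len > 0 && off_len >= 0);

    int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    int phase = 0;      /* position inside the current on/off period */
    int written = 0;

    for (;;) {
        /* Every pixel needs its own dash test and its own memory write. */
        if (phase < on_len &&
            x0 >= 0 && x0 < FB_W && y0 >= 0 && y0 < FB_H) {
            fb->px[y0][x0] = 1;
            written++;
        }
        phase = (phase + 1) % (on_len + off_len);

        if (x0 == x1 && y0 == y1)
            break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
    return written;
}
```

In a real X client the dash style is selected through the GC line attributes
(LineOnOffDash rather than LineSolid), so switching to solid lines avoids this
kind of path entirely.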
Comment 20 Marcin Kurek 2006-06-15 02:29:24 UTC
Stupid me, I forgot I am using -fomit-frame-pointer by default, and now I am
forced to recompile xorg again. Anyway, in the meantime I took a short look at
the xorg sources and in general there are not many places where fbBlt() is used.

exa/exa_accel.c/exaPutImage():224 But I think it's not this one, because radeon
has an accelerated UploadToScreen, so this probably won't happen.

miext/rootless/rootlessWindow.c/StartFrameResize():919 Hmmmm, this seems to
fall back to fbBlt() if:

            if (copy_rect_width * copy_rect_height >
                        rootless_CopyBytes_threshold &&
                SCREENREC(pScreen)->imp->CopyBytes)

fails. We will see soon, I hope. This hardware is slow enough that compiling
Xorg takes more than 1.5h ...
Comment 21 Michel Dänzer 2006-06-15 02:53:44 UTC
You could rebuild only fb and maybe exa for a start... the Xorg server doesn't
use the rootless code, and fbBresFillDash also uses fbBlt indirectly, which is
why I suspect it's related to that.
Comment 22 Marcin Kurek 2006-06-15 03:46:15 UTC
Hmmmm, weird. opreport -cl --demangle=smart `which X` doesn't show me
anything useful, even though xorg is compiled with frame pointers.

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name               symbol name
-------------------------------------------------------------------------------
50587    31.6699  libfb.so                 fbBlt24
  50587    100.000  libfb.so                 fbBlt24 [self]
-------------------------------------------------------------------------------
......

I wonder what I am missing this time.
Comment 23 Michel Dänzer 2006-06-15 03:59:06 UTC
See the first paragraph of comment #19.
Comment 24 Marcin Kurek 2006-06-15 04:28:32 UTC
Hmmm, thanks for the information, I missed that comment. I guess we can close
this bug now then. Is that right?
Comment 25 Eric Anholt 2006-12-28 13:48:51 UTC
Changing the summary and moving this down to enhancement.  I could imagine us
doing acceleration for horizontal/vertical dashed lines, which is what I assume
the app is using, so I'm not just closing it outright.
Comment 26 Daniel Stone 2007-02-27 01:30:54 UTC
Sorry about the phenomenal bug spam, guys.  Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Comment 27 Matt Turner 2010-12-02 19:03:14 UTC
Any update? Is this still an issue?
Comment 28 Jeremy Huddleston Sequoia 2011-10-09 03:27:36 UTC
No response.  Closing.
