Bug 6179 - Very poor performance compared to 6.8.2
Status: RESOLVED FIXED
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: 7.0.0
Hardware: Other Linux (All)
Importance: high critical
Assignee: Xorg Project Team
Reported: 2006-03-08 14:21 UTC by David Andruczyk
Modified: 2007-02-22 14:28 UTC

Description David Andruczyk 2006-03-08 14:21:21 UTC
I just upgraded my Gentoo 2006.0 system to Xorg 7.0 via the Gentoo modular X
upgrade guide, as I've suffered from some pretty severe X problems with Xorg
6.8.2 on my laptop (IBM ThinkPad T30, Radeon 7500 M7, 16 MB).

With 7.0 it still suffers from severe stability problems, especially with the
drm modules loaded.

Details:
I use my app "eXtace" (http://extace.sf.net) a lot for audio analysis, and it's
also an excellent X11 2D tester, as it pushes massive amounts of pixels, lines
and polygons around (it uses NO OpenGL). As a comparison, a slower (1.2 GHz
Athlon) PC with an NVIDIA adapter (a lowly TNT2) can run eXtace at 38 fps
at 1024x768 resolution at about 10-12% CPU usage, pushing about 75,000 lines
per second to the X server (vert FFT mode). In contrast, the ThinkPad has
a 1.8 GHz P4, and it can't run it at ALL when the drm and radeon kernel
modules are loaded. With them unloaded it can barely run the same application
(eXtace) at 900x400 pixels at over 17 fps without completely maxing out the
CPU. The "X" process is the CPU eater, hogging 95-99% of the CPU. Under 6.8.2
I could get it up to about 28-30 fps at the same resolution, but the CPU would
STILL be maxed out at 100%.

In nearly every case when the drm modules are loaded, just starting eXtace
causes X to crash, and the console is never restored; a remote login via ssh
and a restart of X can restore the display. The Xorg.0.log reports nothing
when X crashes. The kernel is 2.6.15, and the drm modules are x11-drm-20051223,
which is the latest available on Gentoo.
Comment 1 Erik Andren 2006-04-17 21:13:41 UTC
Could you perform a remote backtrace of the problem and post it here? 
Comment 2 Erik Andren 2006-06-16 03:04:22 UTC
Is this still an issue using a current version of xorg and the ati driver?
Comment 3 Felipe Contreras 2006-07-05 01:00:10 UTC
(In reply to comment #2)
> Is this still an issue using a current version of xorg and the ati driver?

I'm not exactly sure how to test this more precisely, but Xorg is always the
second most CPU-intensive application on my system. This is the case with both
the official FC5 ati radeon driver and the latest one in git (yesterday), with
both XAA and EXA, and with or without drm and dri.

For example, in 5 hours of use, top reports that firefox has a CPU time of
35:48 and Xorg has 29:08. I don't know if this is meaningful at all, but my
system has felt way slower for a while, and I strongly think it's because of
the ati driver.
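
To put the two top figures in proportion, here is a rough sketch of the arithmetic. It assumes the times are minutes:seconds, that the session lasted exactly 5 hours, and that there is a single CPU; the function names are made up for illustration.

```python
def cpu_time_to_seconds(stamp: str) -> int:
    """Convert a 'MM:SS' CPU time, as quoted from top above, into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def cpu_share(stamp: str, wall_seconds: int) -> float:
    """Percentage of one CPU used over wall_seconds of wall-clock time."""
    return 100.0 * cpu_time_to_seconds(stamp) / wall_seconds


# Assumption: the session lasted exactly 5 hours on a single CPU.
SESSION_SECONDS = 5 * 3600

for name, stamp in (("firefox", "35:48"), ("Xorg", "29:08")):
    print(f"{name}: {cpu_share(stamp, SESSION_SECONDS):.1f}% of one CPU")
```

Under those assumptions both processes average roughly a tenth of one CPU over the session, so an average like this cannot show whether X saturates the CPU during short bursts.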
Comment 4 Michel Dänzer 2006-07-17 05:38:43 UTC
Nothing can be done about this without at least a profile showing where the CPU
time is spent.
Comment 5 Felipe Contreras 2006-07-19 01:28:52 UTC
Running OProfile while MPlayer was playing a DVD, I got the following:

/opt/ati/lib/xorg/modules/drivers/radeon_drv.so:

Profiling through timer interrupt
samples  %        symbol name
4929     89.6997  RADEONPutImage
467       8.4986  RADEONWaitForFifoFunction
84        1.5287  RADEONWaitForIdleMMIO
3         0.0546  RADEONBlockHandler
2         0.0364  RADEONEngineFlush
2         0.0364  RADEONINPLL
1         0.0182  .plt
1         0.0182  RADEONAllocateMemory
1         0.0182  RADEONDisplayVideo
1         0.0182  RADEONLoadCursorARGB
1         0.0182  RADEONOUTPLL
1         0.0182  RADEONPllErrataAfterIndex
1         0.0182  RADEONSetupForScreenToScreenCopyMMIO
1         0.0182  RenderCallback
Comment 6 Michel Dänzer 2006-07-19 02:02:04 UTC
(In reply to comment #5)
> Running OProfile while MPlayer was playing a DVD, I got the following:

Note that this bug report is about 2D rendering performance, not XVideo.

> samples  %        symbol name
> 4929     89.6997  RADEONPutImage

That said, try enabling the DRI, or make sure write combining is enabled for the
framebuffer.
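
One possible way to check the write-combining part on Linux is to look for a "write-combining" entry in /proc/mtrr. The sketch below only filters the text of that file; the helper names are hypothetical, the exact line layout differs between kernel versions (so nothing else is parsed), and whether a listed range really covers the Radeon framebuffer still has to be compared by hand against the framebuffer address in Xorg.0.log.

```python
from pathlib import Path


def write_combining_entries(mtrr_text: str) -> list[str]:
    """Return the MTRR entries whose memory type is write-combining."""
    return [
        line.strip()
        for line in mtrr_text.splitlines()
        if "write-combining" in line
    ]


def read_mtrr(path: str = "/proc/mtrr") -> str:
    """Read the MTRR table, or return '' if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


entries = write_combining_entries(read_mtrr())
if entries:
    print("write-combining ranges:")
    for entry in entries:
        print("  " + entry)
else:
    print("no write-combining MTRR entry found")
```

An empty result only means no MTRR of that type is set; it does not say why the X server or kernel did not set one.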
Comment 7 Felipe Contreras 2006-07-27 22:44:41 UTC
Thanks, that helped a lot.

Now I have been gathering profile data with normal X usage and this is what I get:

These are the most used Xorg binaries:

   137794  5.4858 /opt/xorg/lib/xorg/modules/libfb.so
    11953  0.4759 /opt/xorg/bin/Xorg
    11774  0.4687 /opt/xorg/lib/xorg/modules/libexa.so
     8098  0.3224 /opt/ati/lib/xorg/modules/drivers/radeon_drv.so

And these are the most used functions of libfb.so:

93800    67.4675  fbRasterizeEdges
23351    16.7957  fbFetch_x8r8g8b8
11220     8.0702  fbFetch_a8
2447      1.7601  fbCompositeSolidMask_nx8x8888mmx
2200      1.5824  fbCompositeSrc_8888RevNPx8888mmx
1663      1.1961  mmxCombineOverU
1178      0.8473  fbBlt
630       0.4531  mmxCombineMaskU
460       0.3309  fbCopyAreammx
431       0.3100  fbCompositeSolidMask_nx8888x8888Cmmx
277       0.1992  fbSolidFillmmx
201       0.1446  fbFetch
189       0.1359  fbCompositeGeneral
114       0.0820  fbStore_x8r8g8b8

I have EXA and composite enabled, and I compiled my X server. I can try with the
old ones, but I remember libfb.so:fbRasterizeEdges was still by far the most
used function.

I'm also wondering whether one library is doing a lot of memcpy calls, so that
the results appear in libc-2.4.so and not in an Xorg binary. I don't know how to
check that.

If I can provide more useful information, don't hesitate to ask for it; I'll be
glad to help.
Comment 8 Michel Dänzer 2006-07-29 08:59:48 UTC
(In reply to comment #7)
> I have EXA and composite enabled, and I compiled my X server. I can try with the
> old ones, but I remember libfb.so:fbRasterizeEdges was still by far the most
> used function.

It accounts for less than 4% overall though, so it's unlikely to be the problem.
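
For reference, the "less than 4%" figure can be reproduced from the two percentages in comment 7. This is only a sketch of the arithmetic; the constant and function names are made up for illustration.

```python
# Percentages as reported in comment 7.
LIBFB_PCT_OF_ALL_SAMPLES = 5.4858    # share of all samples spent in libfb.so
RASTERIZE_PCT_OF_LIBFB = 67.4675     # share of libfb.so samples in fbRasterizeEdges


def overall_share(image_pct: float, symbol_pct: float) -> float:
    """Share of all samples taken by one symbol, given its share of an image."""
    return image_pct * symbol_pct / 100.0


share = overall_share(LIBFB_PCT_OF_ALL_SAMPLES, RASTERIZE_PCT_OF_LIBFB)
print(f"fbRasterizeEdges: {share:.2f}% of all samples")
```

That works out to about 3.7% of all samples.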

> I'm also wondering whether one library is doing a lot of memcpy calls, so that
> the results appear in libc-2.4.so and not in an Xorg binary.

That's indeed quite likely.

> I don't know how to check that.

You can try opreport -c, but it requires the oprofile kernel module to have the
capability of recording call graphs, and I don't know if the kernel's copy of it
has that.

If you can reproduce a situation where the X server uses up (almost) all CPU
cycles, the profile should be clear even for libc.

Other than that, make sure oprofile can find the libc symbols. Unfortunately, I
don't know how to achieve that with Gentoo.
Comment 9 Bret Towe 2006-07-29 11:22:32 UTC
I saw this bug while skimming the reports and played with eXtace
on my Athlon 64. I can't say whether I had any regressions from 6.8,
but I do see with 7.1 that perhaps libfb also needs an SSE copy function.
After running for a while I saw the below at the top of the usage.
The video card I have in use is an r300 9600.

samples  %        image name               app name                 symbol name
1390789  40.0518  libc-2.3.6.so            Xorg                     memcpy
257246    7.4081  libfb.so                 Xorg                     fbCopyAreammx
Comment 10 Bret Towe 2006-07-29 12:56:30 UTC
And here is the same from an Athlon XP 2400+ laptop with an igp320m,
doing similar work with eXtace:

samples  %        image name               app name                 symbol name
1048989  50.1806  libc-2.3.6.so            Xorg                     (no symbols)
616208   29.4776  libfb.so                 Xorg                     fbCopyAreammx
35236     1.6856  libfb.so                 Xorg                     fbSolidFillmmx
Comment 11 Michel Dänzer 2006-07-30 05:02:35 UTC
Is that with EXA or XAA?

With EXA, you may want to try current xf86-video-ati git and Option "AccelDFS".
Comment 12 Bret Towe 2006-07-30 13:44:08 UTC
Yes, with EXA, and now with AccelDFS on the amd64.
It still looks like some SSE would help.
I won't bother with the laptop since it's on Xorg 7; I'll test it when it gets
upgraded.
Also, the wireframe flickers for me on both computers.
Would this be the app's fault or the driver's?
It's very annoying and makes it hard to watch (OK, enjoy it ;)

samples  %        image name               app name                 symbol name
2071728  75.3174  libc-2.3.6.so            Xorg                     memcpy
165089    6.0018  libfb.so                 Xorg                     fbCopyAreammx
51964     1.8891  oprofiled                oprofiled                for_one_sfile
37472     1.3623  libfb.so                 Xorg                     fbSolidFillmmx
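
To compare libc against libfb in a listing like the one above, the rows can be summed per image. This is a small sketch that assumes the five-column layout shown in this comment (samples, %, image name, app name, symbol name); the function name is made up, and the rows are copied from the listing above.

```python
from collections import defaultdict


def share_by_image(report: str) -> dict[str, float]:
    """Sum the '%' column per image name for five-column opreport rows."""
    totals: dict[str, float] = defaultdict(float)
    for line in report.splitlines():
        # Split into at most 5 fields so that a symbol such as
        # "(no symbols)" stays in one piece.
        fields = line.split(None, 4)
        if len(fields) != 5 or not fields[0].isdigit():
            continue  # skip the header row and blank lines
        _samples, percent, image, _app, _symbol = fields
        totals[image] += float(percent)
    return dict(totals)


COMMENT_12_ROWS = """\
samples  %        image name               app name                 symbol name
2071728  75.3174  libc-2.3.6.so            Xorg                     memcpy
165089    6.0018  libfb.so                 Xorg                     fbCopyAreammx
51964     1.8891  oprofiled                oprofiled                for_one_sfile
37472     1.3623  libfb.so                 Xorg                     fbSolidFillmmx
"""

for image, pct in sorted(share_by_image(COMMENT_12_ROWS).items(),
                         key=lambda item: item[1], reverse=True):
    print(f"{pct:8.4f}  {image}")
```

For these rows memcpy in libc accounts for roughly ten times the samples of the two libfb functions combined.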
Comment 13 Bret Towe 2006-08-14 22:02:15 UTC
I've been messing with this more and found that it seems to hate being resized
(at least under fluxbox).
After resizing to the size you want, hide most of it off screen for a few seconds
and CPU usage drops. Bring it back on screen and the CPU will spike a bit,
then level off at a sane range.
Well, saner than what it was giving :)
Comment 14 Bret Towe 2006-08-15 19:58:45 UTC
I forgot to say that when the app is first loaded it also needs to be hidden for
a while before CPU usage drops.
Then of course you have to hide it every time it's resized.
Also, the larger the window, the longer it needs off screen.
Comment 15 Michel Dänzer 2006-08-22 03:20:58 UTC
Not sure SSE vs. MMX would make a big difference. If a significant part of the
rendering has to be done in software, you've pretty much lost... or if you mean
memcpy could use SSE, you may be right, but that would be a libc issue.
Comment 16 Bret Towe 2006-08-22 20:20:13 UTC
memcpy is using SSE, from what little googling I did
to get a basic idea of how hard it would be to try out an SSE version of
fbSolidFillmmx and whether it would help or not.
I'm not sure how much it would help either, but I don't think it would
hurt, and if nothing else that little extra CPU time saved on a laptop
could be the difference between changing frequencies and consuming more
power or not.

But that aside, I don't think it's the issue here, as per my comments #13 and #14.
Comment 17 Timo Jyrinki 2007-02-22 14:28:24 UTC
Marking broken (status null/blank) bugs in xorg with no activity in a long time as fixed. Please reopen if you think it's necessary, but first do a search if a similar bug report is already filed and in a NEW/ASSIGNED state. These bugs do not currently show in most search results as they do not have any status.

Sorry for this janitorial spam, you know where to send hate mails to when your inbox gets full of bugs you're subscribed to.

