Bug 23085 - Radeon KMS: XGetImage very slow, as is Firefox scrolling
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: xf86-video-ati maintainers
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-08-01 17:09 UTC by Joel Feiner
Modified: 2009-08-22 07:54 UTC

See Also:
i915 platform:
i915 features:


Attachments
Oprofile report while scrolling in Firefox (20.19 KB, text/plain)
2009-08-02 09:13 UTC, Joel Feiner
Oprofile report while running x11perf -getimage100 (12.94 KB, text/plain)
2009-08-02 09:14 UTC, Joel Feiner
Relevant output from /var/log/Xorg.0.log with DEBUG_FALLBACK = 1 (446.59 KB, text/plain)
2009-08-02 09:26 UTC, Joel Feiner
Logging output for EXA fallbacks when doing x11perf -getimage100 (203.38 KB, text/plain)
2009-08-02 09:28 UTC, Joel Feiner
SysProf (2.6.31-rc5-git3 w/ DDX 6.12.99+git20090804.bd03977): Firefox3 while playing Flash10 video (202.27 KB, image/jpeg)
2009-08-04 22:20 UTC, Sedat Dilek
[Crappy] SysProf (2.6.31-rc5-git2-with-drmfixes20090804 w/ DDX 6.12.99+git20090723.2afc46f): Firefox3 while playing Flash10 video (172.91 KB, image/jpeg)
2009-08-04 22:24 UTC, Sedat Dilek
Oprofile report for Firefox scrolling after updated xf86-video-ati (21.22 KB, text/plain)
2009-08-06 07:15 UTC, Joel Feiner

Description Joel Feiner 2009-08-01 17:09:31 UTC
I finally managed to get KMS working on my Gentoo laptop.  I have a Radeon Mobility X300 in a ThinkPad T43.  I am using the latest Git of the entirety of the X11 stack (drivers, libs, proto, server, etc.) as of late morning August 1.  I am also using Linus's kernel tree for 2.6.31 to get the DRM driver for KMS.

Running x11perf -getimage500 results in about 7 per second and -getimage100 is about 100 something.  Using non-KMS (older build of X11 by a few weeks, but still from Git) results in -getimage500 producing around 115 per second and -getimage100 around 9500 per second.  I have a feeling this may be responsible for the slow scrolling in Firefox and other apps.  CPU time is almost all in user-space according to my CPU monitor and oprofile, when it decides to work, confirms this.  Seems to me that this would indicate a software fallback of some sort?  I'm not sure how to diagnose it beyond that since oprofile continues to fail in even more miraculous ways for me.

I am using all defaults for radeon driver options in my xorg.conf (yes, I'm still using one :/).
Comment 1 Michel Dänzer 2009-08-02 07:20:28 UTC
If Firefox (or any other app, for that matter) uses XGetImage for scrolling, it's broken and needs to be fixed.

The problem could indeed be due to software fallbacks though. It could really be interesting to see profiles obtained with sysprof or oprofile. Please stick to a single operation as much as possible during any profile run.

You could also try enabling the fallback debugging code in xserver/exa or the driver and see if anything sticks out. But note that some of those debugging messages can be red herrings without backing analysis, so don't jump to conclusions.
Comment 2 Joel Feiner 2009-08-02 09:13:44 UTC
Created attachment 28264
Oprofile report while scrolling in Firefox
Comment 3 Joel Feiner 2009-08-02 09:14:06 UTC
Created attachment 28265
Oprofile report while running x11perf -getimage100
Comment 4 Joel Feiner 2009-08-02 09:20:41 UTC
(In reply to comment #1)
> If Firefox (or any other app, for that matter) uses XGetImage for scrolling,
> it's broken and needs to be fixed.
> 

I've heard that Firefox uses XGetImage when rendering "native" widgets, i.e., that it renders the widgets using the native GTK+ libraries and then uses XGetImage to grab the rendered widget and then uses that image to actually draw the widgets.  That's why I thought there might be some relationship between scrolling (pages have widgets) and XGetImage.

> The problem could indeed be due to software fallbacks though. It could really
> be interesting to see profiles obtained with sysprof or oprofile. Please stick
> to a single operation as much as possible during any profile run.
> 

I attached two oprofile reports and unlike last time I tried to file a bug report, they actually look valid.

Note that I rebuilt my X stack from Git this morning (latest master) and the kernel code is also up to date, using this Git URL: http://www.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git (if you need a reference point).  The reports include line numbers so I thought that information might be useful.

> You could also try enabling the fallback debugging code in xserver/exa or the
> driver and see if anything sticks out. But note that some of those debugging
> messages can be red herrings without backing analysis, so don't jump to
> conclusions.
> 

I'll attach my Xorg.0.log with the fallback enabled (have to restart X server for that...)
Comment 5 Joel Feiner 2009-08-02 09:26:14 UTC
Created attachment 28266
Relevant output from /var/log/Xorg.0.log with DEBUG_FALLBACK = 1

I tried to grab the section of Xorg.0.log generated while scrolling in Firefox (using tail -f and ctrl-C).  The first part of the output may be related to switching windows from Konsole to Firefox, but afterwards, I scrolled for about 10 seconds, so there should be more than enough data for scrolling related fallbacks.
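A log of this size is easier to read once repeated messages are grouped. The following is a minimal sketch, assuming that fallback messages contain the word "fallback"; the exact message format is not shown in this report, so the keyword, the helper name `tally_fallbacks`, and the sample lines in the usage note are all assumptions.

```python
import re
from collections import Counter


def tally_fallbacks(lines, keyword="fallback"):
    """Count log lines containing `keyword`, grouping lines that differ only in numbers.

    The keyword is an assumption about how fallback messages are worded.
    """
    counts = Counter()
    for line in lines:
        if keyword.lower() not in line.lower():
            continue
        # Replace hex and decimal numbers with "N" so that messages which
        # differ only in coordinates or sizes fall into the same bucket.
        normalized = re.sub(r"0x[0-9a-fA-F]+|\d+", "N", line.strip())
        counts[normalized] += 1
    return counts


def print_top(path, limit=10):
    """Print the most frequent fallback messages found in the log at `path`."""
    with open(path, errors="replace") as log:
        for message, count in tally_fallbacks(log).most_common(limit):
            print(f"{count:8d}  {message}")
```

Calling `print_top("/var/log/Xorg.0.log")` would list the ten most frequent messages. As noted in comment 1, a high count is only a hint about where to look, not proof that a given fallback causes the slowdown.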
Comment 6 Joel Feiner 2009-08-02 09:28:17 UTC
Created attachment 28267
Logging output for EXA fallbacks when doing x11perf -getimage100
Comment 7 Joel Feiner 2009-08-02 09:31:56 UTC
One last note: 2d performance is generally on the slow side, especially with Qt4 apps.  I see much slower redraws in all apps to handle exposure events.  Qt4 is particularly bad, with this process taking upwards of a second or more.  Window resizing seems to be okay, though.  Performance when running a compositing manager (xcompmgr in my case) is about the same as without one.  KWin's compositor is still slow, but also not slower than it was on the older version of X (non-KMS, non-DRI2).
Comment 8 Clemens Eisserer 2009-08-02 13:43:09 UTC
> I've heard that Firefox uses XGetImage when rendering "native" widgets, i.e.,
Yes, it has to do this because GTK's theme engine design is badly broken.
The widget is rendered *twice*, once over a white background and once over a black one, and both results are read back (2x XGetImage) and compared in userspace to work out which areas are transparent.
Hopefully they'll change it to a pure cairo-based interface in GTK3, however as it seems now there isn't a lot of movement in its development :-/
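The white/black comparison described in this comment can be sketched for a single pixel. This is an illustration of the general technique under the usual compositing model, not Firefox's actual code; the function names are hypothetical. If a channel with premultiplied value `p` and coverage `a` is drawn over black, the result is `p`; over white it is `p + (1 - a) * 255`. The difference between the two renderings therefore yields the coverage.

```python
def recover_alpha(over_white, over_black):
    """Recover coverage (0.0 to 1.0) for one 8-bit channel.

    over_white - over_black equals (1 - alpha) * 255 under the
    compositing model assumed here.
    """
    diff = max(0, min(255, over_white - over_black))
    return 1.0 - diff / 255.0


def recover_pixel(white_rgb, black_rgb):
    """Return a premultiplied (r, g, b, a) tuple for one pixel.

    The colour seen over black is already the premultiplied colour,
    so only alpha has to be computed. Averaging the three channels is
    a choice made for this sketch.
    """
    alphas = [recover_alpha(w, b) for w, b in zip(white_rgb, black_rgb)]
    alpha = sum(alphas) / len(alphas)
    r, g, b = black_rgb
    return (r, g, b, round(alpha * 255))
```

A fully opaque pixel looks identical in both renderings, so the difference is zero and alpha comes out as 255; a fully transparent pixel shows pure white in one and pure black in the other, giving alpha 0. The cost is that every widget has to be drawn and read back twice.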
Comment 9 Michel Dänzer 2009-08-03 09:12:47 UTC
Now that you guys mention this, I remember hearing about it before... I guess my mind keeps trying to forget about it because it's so wrong. :} I guess this must mean Firefox 3 isn't usable on remote displays...

Anyway, XGetImage performance will become better once someone gets around to implementing an accelerated DownloadFromScreen hook again for KMS. There was something in the Git branch kms-support which hasn't been ported to the new KMS support in the master branch yet.

In the long run, we may be able to make XGetImage perform even better with KMS than would ever be possible without by using TTM user buffers.
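For readers unfamiliar with the hook: as I understand the EXA interface, DownloadFromScreen is asked to copy a rectangle of a pixmap into a caller-supplied system-memory buffer that has its own pitch. The sketch below is only a CPU-side model of that contract in Python; the real hook is a C function in the driver, and an accelerated version would have the GPU perform the copy. The function and parameter names here are illustrative.

```python
def download_from_screen(src, src_pitch, bytes_per_pixel, x, y, w, h, dst, dst_pitch):
    """Copy a w x h pixel rectangle at (x, y) from `src` into `dst`, row by row.

    `src` and `dst` are bytearrays; the pitches are in bytes per row.
    Returns True on success, mirroring the Bool result of the C hook.
    """
    row_bytes = w * bytes_per_pixel
    for row in range(h):
        src_off = (y + row) * src_pitch + x * bytes_per_pixel
        dst_off = row * dst_pitch
        # Equal-length slice assignment keeps the size of dst unchanged.
        dst[dst_off:dst_off + row_bytes] = src[src_off:src_off + row_bytes]
    return True
```

Without such a hook, the same copy is done by the CPU reading directly from memory that is slow to read, which is consistent with the userspace CPU time reported above.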
Comment 10 Sedat Dilek 2009-08-04 22:20:15 UTC
Created attachment 28359
SysProf (2.6.31-rc5-git3 w/ DDX 6.12.99+git20090804.bd03977): Firefox3 while playing Flash10 video
Comment 11 Sedat Dilek 2009-08-04 22:21:51 UTC
With Jerome Glisse's patch "radeon/kms: add simple DownloadFromScreen implementation" to DDX and Linux kernel 2.6.31-rc5-git3 (contains drm-fixes 20090804), ExaGetImage CPU usage dropped from approx. 70-75% to approx. 25% (see attached screenshot).

[1] <http://cgit.freedesktop.org/xorg/driver/xf86-video-ati/commit/?id=22074cf0e58fddba743924532625e6fca49b6bdc>
Comment 12 Joel Feiner 2009-08-04 22:23:05 UTC
(In reply to comment #11)
> With Jerome Glisse's patch "radeon/kms: add simple DownloadFromScreen
> implementation" to DDX and Linux kernel 2.6.31-rc5-git3 (contains drm-fixes
> 20090804), ExaGetImage CPU usage dropped from approx. 70-75% to approx. 25%
> (see attached screenshot).
> 
> [1]
> <http://cgit.freedesktop.org/xorg/driver/xf86-video-ati/commit/?id=22074cf0e58fddba743924532625e6fca49b6bdc>
> 

I may give that patch a try.  I saw it come through yesterday (or was it today?), but I was afraid to try it given Dave Airlie's concerns on the mailing list about the state of DFS (and UTS) under KMS.
Comment 13 Sedat Dilek 2009-08-04 22:24:24 UTC
Created attachment 28360
[Crappy] SysProf (2.6.31-rc5-git2-with-drmfixes20090804 w/ DDX 6.12.99+git20090723.2afc46f): Firefox3 while playing Flash10 video
Comment 14 Jerome Glisse 2009-08-06 03:18:22 UTC
DFS code was recently added to DDX master. Could you try it and report whether it helps for you?
Comment 15 Joel Feiner 2009-08-06 07:14:12 UTC
(In reply to comment #14)
> DFS code was recently added to DDX master. Could you try it and report whether
> it helps for you?
> 

x11perf -getimage500 now reports 30/sec and -getimage100 reports 770/sec.

Firefox scrolling is still slow, but it doesn't seem as bad as before (that's a subjective statement, but I have no better way to measure it).  I have attached a new oprofile profile made while scrolling in Firefox.  It may or may not be useful.
Comment 16 Joel Feiner 2009-08-06 07:15:00 UTC
Created attachment 28400
Oprofile report for Firefox scrolling after updated xf86-video-ati
Comment 17 Joel Feiner 2009-08-07 07:35:01 UTC
I rebuilt from Git this morning and the scrolling problem has been much reduced (it is about as fast as before).  I saw that there were commits to fix a few issues in the DFS hook and add the UTS hook.  One of those two seemed to help.

Getimage performance is still on the low side, with:
-getimage500: 30/sec
-getimage100: 770/sec (is usually around 5000/sec)

And (shm)putimage is also pretty low compared to what I usually get:
-putimage500: 80/sec (is usually around 150/sec)
-shmput500: 91/sec (is usually around 400/sec)
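To put the figures in this comment on a common scale, a short calculation expresses each current rate as a percentage of the usual one. Only the numbers quoted above are used; the dictionary and function names are illustrative.

```python
# (current ops/sec, usual ops/sec) as quoted in this comment.
measured = {
    "getimage100": (770, 5000),
    "putimage500": (80, 150),
    "shmput500": (91, 400),
}


def percent_of_usual(current, usual):
    """Return the current rate as a percentage of the usual rate."""
    return 100.0 * current / usual


def summarize(results):
    """Map each test name to its percentage of the usual rate."""
    return {name: percent_of_usual(cur, usual) for name, (cur, usual) in results.items()}
```

`summarize(measured)` gives roughly 15% for -getimage100, 53% for -putimage500 and 23% for -shmput500, so the read-back path is still the furthest from its usual performance.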
Comment 18 Michel Dänzer 2009-08-07 08:24:48 UTC
(In reply to comment #17)
> I rebuilt from Git this morning and the scrolling problem has been much reduced
> (it is about as fast as before).

So this report can be resolved? (You can do that yourself :)

> And (shm)putimage is also pretty low compared to what I usually get:
> -putimage500: 80/sec (is usually around 150/sec)
> -shmput500: 91/sec (is usually around 400/sec)

Clearly there's still room for improvement, but I'm confident we can address that with time.
Comment 19 Joel Feiner 2009-08-22 07:54:17 UTC
Closing as requested.

