Testcase: Open an xterm, start a screen session with a few windows (two are enough), and then cycle through the windows. It's awfully slow.

BTW: Running the same screen session on the Linux console (radeondrmfb) is fast.

Most CPU cycles are burned in:
* fbGlyph32 (by fbImageGlyphBlt by ExaCheckImageGlyphBlt by damageText by damageImageText16 by doImageText by ImageText by ProcImageText16 by Dispatch)
* memcpy (by RADEONCopySwap by RADEONDownloadFromScreenCS (and others) by exaCopyDirty by exaCopyDirtyToSys (and others) by exaPrepareAccessReg_mixed by exaPrepareAccess by ExaCheckImageGlyphBlt [and so on, see above])

It would be nice to have this function use 2D acceleration; it's really important for usability.

Regards, Bodo

PS: If you need dmesg etc., just ask. The xorg.conf is the same as in bug #35196.
The fonts you are using in your xterm are hitting unaccelerated paths. Try changing the font. Alternatively, you can try either of the following options in the Device section of your xorg.conf and see if one of them helps:

Option "NoAccel" "True"

or

Option "ColorTiling" "False"
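For reference, a minimal sketch of where such an option would sit in xorg.conf. The Identifier value is a hypothetical placeholder, and Driver "radeon" is an assumption based on the driver discussed in this report; only one of the two options is meant to be tried at a time.

```
Section "Device"
    # Hypothetical identifier; keep whatever your existing xorg.conf uses.
    Identifier "Radeon"
    # Assumed driver name for the xf86-video-ati driver.
    Driver     "radeon"
    # Try one option at a time:
    Option     "NoAccel" "True"
    # Option   "ColorTiling" "False"
EndSection
```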
Created attachment 44478
EXA: Avoid GPU memory readback on ImageGlyphBlt fallback.

Does this xserver patch help?
(In reply to comment #2)
> Created an attachment (id=44478)
> EXA: Avoid GPU memory readback on ImageGlyphBlt fallback.
>
> Does this xserver patch help?

Yes, that's a MAJOR improvement. ;)

I'll do a test run with callgrind later today (or tomorrow) and report back if something is still burning CPU in exa_unaccel. With the patch applied, however, using the system is actually fun again.

Setting status to RESOLVED (FIXED) for now; if the need arises, I can reopen it anyway.

Regards, Bodo

@Alex: Thank you for your suggestions. I didn't have time to test them yet, but as the problem seems to be solved by Michel's patch, I'll skip them. No offense meant ;)
The patch hasn't been applied yet.
Hello Michel,

A new callgrind run shows that fbGlyph32 (with the same trace) is still the winner. It is called less often by fbImageGlyphBlt than before, but it still seems to be a bottleneck.

memcpy is now in second place. It is called indirectly not only by RADEONDownloadFromScreenCS but by RADEONUploadToScreenCS as well. Before the patch, RADEONDownloadFromScreenCS and RADEONUploadToScreenCS both called memcpy for about the same amount of work. After the patch, RADEONDownloadFromScreenCS calls it for only about half the work, making RADEONUploadToScreenCS the clear winner. The rest of the trace remains the same.

So the patch improves things greatly (as I said yesterday), but there is still a lot of potential for optimization, if not an outright need for it, especially for users with slower systems than mine.

Regards, Bodo

PS: I didn't notice any crashes or other side effects after applying the patch.
I think this is a duplicate of this issue I filed: https://bugs.freedesktop.org/show_bug.cgi?id=34486
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/issues/16.