Commit f8fb9312d791af1f77020e8c2d35bb30841ed9aa ("RADEONPrepareAccess_CS: fallback to DFS when pixmap is in VRAM") introduces serious sluggishness with KDE4 editor components and apps. For instance, in Kate, switching between open files or scrolling file content is extremely sluggish and makes Xorg eat 80-90% CPU. Kwrite shows the same behavior. I have noticed no other slowdowns apart from the KDE editor component.
Are you using bitmap fonts? Or do you have antialiasing off for the fonts in use? This is one case where the PrepareAccess path would be faster because it is write-only access. Unfortunately, PrepareAccess doesn't know whether access will be write-only. And this kind of write-only access is a bit different from what EXA normally considers write-only. EXA normally only considers an operation write-only if a whole region will be written. Here the glyphs are written but I assume the pixels in-between are not touched.
I use EFont / ETL bitmaps as editor fonts. With TTFs, for instance the default Monospace, it becomes much less noticeable, although not as smooth as before.
Please attach the full Xorg.0.log.
Created attachment 39290: Xorg.0.log with ddx-radeon-git. See attached Xorg.0.log. Nothing relevant about this issue as far as I can see...
I see what I assume is something similar when using bitmap fonts with Konsole. (I'm not detecting any slowness with other fonts, even when not "Smooth"ed.) xterm and emacs don't show any issues with bitmap fonts, so it seems that Konsole is using a different path.
Created attachment 39332: oprofile CPU_CLK_UNHALTED sample counts per symbol.
Created attachment 39333: oprofile CPU_CLK_UNHALTED callgraph (gzip compressed). Most samples are in kernel space. User space samples indicate many DownloadFromScreen calls induced by damagePolyText16. I'm assuming the kernel space calls are also due to DFS.
There is a reasonable amount of region work in the samples. Given that PolyText16 is used here, I wonder whether the DamagePendingRegion is quite complex, which could make its intersection with CopyReg in exaCopyDirty complex, leading to multiple DownloadFromScreen calls. Option "EXAOptimizeMigration" "off" in the Device section seems to work around the issue. Perhaps the (RegionNumRects(pValidDst) > 10) path or similar could also be used when RegionNumRects(pending_damage) is large. On the driver side, DFS could perhaps skip the GPU blit for small rectangles, or maybe consider moving the BO to GTT. I think the way to get the best performance for this particular case would be to modify the EXA/driver ABI so that EXA can indicate to PrepareAccess (or an equivalent) that it only needs write access. However, I suspect that would require significant auditing and perhaps refactoring of EXA, and likely other DIX code, to know which operations are write-only.
*** Bug 30785 has been marked as a duplicate of this bug. ***
Option "EXAOptimizeMigration" "off" does not work around the problems for me. (The main sluggishness here is not in KDE but with the combination of e16 and Eterm, but there are also other symptoms.)
I can confirm: Option "EXAOptimizeMigration" "off" doesn't work around the problem. It only makes the affected cases somewhat less sluggish, but not as smooth as they were before commit f8fb9312d791af1f77020e8c2d35bb30841ed9aa.
*** Bug 32282 has been marked as a duplicate of this bug. ***
A data point for the "breaks gimp perspective tool" issue originally reported as #32282: that problem only occurs on my Radeon 7500; my 9200SE is fine with 6.13.2.
I have just made a rather odd discovery using the current master HEAD DDX and the latest Mesa from git master on Linux 2.6.37-rc8, with X.org server 1.9.3 RC-2 from Debian experimental (1.9.2.902). The bitmap font I prefer for Konsole is Terminus, and it comes in a few fixed sizes such as Regular 9, Regular & Bold 11, Regular & Bold 12. I used to run Konsole with Terminus Bold 11 until a few days ago, when I figured out that it messed up the TrueType fallback for characters not in Terminus' glyph set, such as Japanese hiragana. I now use Terminus Bold 12 instead, which seems to work fine with the aforementioned characters. Tonight I decided to give the latest DDX revision a try to see if the drm/radeon changes in Linux 2.6.37 could make a difference regarding this bug. It turns out that with Terminus Bold 12, X uses less than 10% CPU while idle [*], just like Terminus Bold 11 did with the DDX revision previous to commit f8fb9312d791af1f77020e8c2d35bb30841ed9aa "RADEONPrepareAccess_CS: fallback to DFS when pixmap is in VRAM". Performance drops again if I switch back to Terminus Bold 11, with CPU usage skyrocketing above 50%. I can't perceive much difference between the two font sizes other than line spacing. Perhaps this has something to do with this bug: overlapping glyph regions, maybe? I'm not sure whether this helps in fixing the bug, but I thought I'd mention it just in case. [*] "idle" here stands for a maximized Konsole window on a 1280x800 display with 'top' running in the foreground in the active tab.
Does the patch from bug 35197 help?
Created attachment 44845: EXA: Avoid GPU memory readback for PolyGlyphBlt fallbacks. Karl pointed out that the patch from the other bug wouldn't help, as this hits a different path, so here's another patch for that path, which you can try instead of or in addition to the other patch.
Neither patch (44478, 44845) helps with this particular issue, alone or combined; there's no noticeable improvement as far as I can see. Test environment is Kate / Kwrite on KDE 4.4 with Fixed [EFont].
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/issues/12.