|Summary:||Performance issue using texture from pixmap (tfp) glx extension on 945|
|Product:||xorg||Reporter:||Chris Lord <chrislord.net>|
|Component:||Driver/intel||Assignee:||Chris Wilson <chris>|
|Status:||RESOLVED FIXED||QA Contact:||Xorg Project Team <xorg-team>|
|Priority:||high||CC:||diegoe, josh, kai.kasurinen|
Description Chris Lord 2009-12-01 05:40:12 UTC
Created attachment 31621: A rubbish tfp performance test

When using texture-from-pixmap on an X Pixmap, there is a huge performance drop when the pixmap size approaches 1024. With the attached test (requires Clutter), on my Thinkpad X60s, there's a 50% reduction of frame-rate between tfp on a pixmap of 1000x768 and 1024x768.

The attached patch, provided by Eric Anholt, fixes the performance issue, although I notice that some glyphs in gtk2 apps get corrupted after coming up from suspend with it applied.

This is an issue for Moblin, as the Moblin web browser uses tfp to take advantage of cairo xrender acceleration and to avoid slow-down with windowless plugins (which require an X drawable). With this issue, we can't realistically ship the Moblin web browser.

(A side-note: the attached test is just a quick rubbish test I hacked up to demonstrate the problem using a similar method as the web browser. It doesn't have determinate behaviour and the numbers are just meant to be a rough guide.)

Some relevant discussion from IRC:

Nov 26 17:10:18 <anholt> I would expect that linear things rounding to 1024 (so, that's what... 1009-1024?) would get very unhappy on dual-channel due to the page misses to the same channel for each 2x2 subspan.
...
Nov 26 18:06:12 <anholt> Cwiiis: we want to be tiling X pixmaps. we aren't today. that diff isn't complete, since you can hit sw fallbacks that in the right conditions may end up rendering wrong
Nov 26 18:06:32 <anholt> there's also the complication of fence management on 915, so we may or may not ever tile x pixmaps
Nov 26 18:07:06 <Cwiiis> anholt: So windows are tiled but pixmaps aren't?
Nov 26 18:07:28 <ickle_> only the frontbuffer on X is tiled
Nov 26 18:07:32 <ickle_> (currently)
Nov 26 18:07:51 <anholt> the back and depth buffers of windows and the scanout buffer are tiled but pixmaps aren't, including composite extension backing store
...
Nov 26 18:11:20 <anholt> app windows end up tiled if their storage is in the scanout buffer. if they're composited, they aren't.
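To make the IRC point about linear 1024-wide pixmaps more concrete, here is a rough Python sketch of where the pixels of one 2x2 subspan land in memory for a linear layout versus an X-tiled one. Everything in it is an assumption for illustration and not taken from the driver: 32-bit pixels, 4 KiB pages, X tiles of 512 bytes by 8 rows laid out row-major across the surface, and hypothetical function names. It does not model channel interleaving or address swizzling; it only reports which page each pixel falls in.

```python
# Illustrative sketch only; all constants below are assumptions.
PAGE = 4096    # assumed page size in bytes
BPP = 4        # assumed bytes per pixel (32-bit)
TILE_W = 512   # assumed X-tile width in bytes
TILE_H = 8     # assumed X-tile height in rows (512 * 8 = one 4 KiB tile)


def linear_offset(x, y, pitch):
    """Byte offset of pixel (x, y) in a linear surface with the given pitch."""
    return y * pitch + x * BPP


def xtiled_offset(x, y, pitch):
    """Byte offset of pixel (x, y) in an X-tiled surface.

    Assumes pitch is a multiple of TILE_W and tiles are stored row-major.
    """
    tiles_per_row = pitch // TILE_W
    tile_x, in_x = divmod(x * BPP, TILE_W)
    tile_y, in_y = divmod(y, TILE_H)
    tile_index = tile_y * tiles_per_row + tile_x
    return tile_index * TILE_W * TILE_H + in_y * TILE_W + in_x


def pages_for_subspan(offset_fn, x, y, pitch):
    """Set of page numbers touched by the 2x2 subspan whose top-left is (x, y)."""
    return {
        offset_fn(px, py, pitch) // PAGE
        for py in (y, y + 1)
        for px in (x, x + 1)
    }


pitch = 1024 * BPP  # a 1024-pixel-wide pixmap: each row is exactly one page
linear_pages = pages_for_subspan(linear_offset, 10, 10, pitch)
tiled_pages = pages_for_subspan(xtiled_offset, 10, 10, pitch)
print(len(linear_pages), len(tiled_pages))
```

Under these assumptions the two rows of the subspan sit in two different pages at the same in-page offset in the linear layout, while in the tiled layout all four pixels fall inside a single 4 KiB tile.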
Comment 1 Chris Lord 2009-12-01 05:41:22 UTC
Created attachment 31622: A patch that 'fixes' the performance problem
Comment 2 Eric Anholt 2009-12-01 10:10:23 UTC
Basically what remains for this patch is making sure that we only use gtt mapping for fallbacks to the tiled BOs, to work around the a6 swizzling madness.
Comment 3 Carl Worth 2009-12-02 08:52:04 UTC
(In reply to comment #2)
> Basically what remains for this patch is making sure that we only use gtt
> mapping for fallbacks to the tiled BOs, to work around the a6 swizzling
> madness.

Thanks Eric, I'll work on improving this patch, and will look forward to guidance from you on doing that.

-Carl
Comment 4 Joshua Lock 2009-12-03 03:41:58 UTC
It seemed worth noting that we have been having performance issues with Moblin on large panels (netbooks plugged into a 22" 1920*1200 display, say). Using the attached "fix" seems to help. Once the CPU has settled after boot we can now observe animations on the toolbar icons, whereas before they were not seen.
Comment 5 Chris Wilson 2010-03-23 10:37:11 UTC
Created attachment 34370: Current patch to switch default to TILING_X
Comment 6 Carl Worth 2010-03-24 08:45:58 UTC
Chris is planning to commit a fix for this issue today. Reassigning to him. -Carl
Comment 7 Chris Wilson 2010-03-24 09:48:31 UTC
commit 2eec53d0b9232970fe3d03ce6c8940ebeea44bee
Author: Chris Wilson <email@example.com>
Date: Tue Mar 23 17:28:22 2010 +0000

uxa: Default to using TILING_X for pixmaps.

On memory constrained hardware, tiling is vital for good performance as it minimizes cache misses. The downside is that for older hardware (which often suffers from the lack of bandwidth) requires the use of fences for many operations, which are in short supply and so may cause shorter batchbuffers. However our batch buffers are typically short and so this is unlikely to be a concern and not affect the performance wins.

A quick bit of testing suggests the effect is inconclusive on firefox/i945:

                  linear     tiled
  xcb             205.470    206.219
  xcb-render-0.0  404.704    388.413
  xlib            166.410    170.805

A secondary effect of the patch is to workaround a G31 specific hang when attempting to use linear 2048x2048 surfaces. Bonus!
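As a quick check on the "inconclusive" reading of the firefox/i945 figures in the commit message, the relative change from linear to tiled can be worked out as below. This is only arithmetic on the three quoted pairs. The commit message does not state the units, so the sketch makes no claim about which direction is better, and the names `results` and `relative_change` are illustrative.

```python
# Figures copied from the commit message: backend -> (linear, tiled).
results = {
    "xcb": (205.470, 206.219),
    "xcb-render-0.0": (404.704, 388.413),
    "xlib": (166.410, 170.805),
}


def relative_change(linear, tiled):
    """Change from linear to tiled, as a percentage of the linear figure."""
    return (tiled - linear) / linear * 100.0


for backend, (linear, tiled) in results.items():
    print(f"{backend}: {relative_change(linear, tiled):+.2f}%")
```

The three changes come out at roughly +0.4%, -4.0% and +2.6%, with no consistent sign across backends.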
Comment 8 Diego Escalante Urrelo 2010-03-27 14:35:20 UTC
Hey Chris, this was really life-changing for my performance (855GM) but it seems that after resuming from suspend performance degrades. Nothing special in Xorg.0.log nor dmesg. Perhaps something is missing?
Comment 9 Chris Wilson 2010-03-27 14:51:55 UTC
Any visual corruption? Can you measure performance with and without this patch, and after resume? Is that indicative that tiling is no more? Can you also watch /sys/kernel/debug/dri/0/i915_gem_fence_regs before and after resume. Are we using fences afterwards? Lots of room for potential mischief here. Probably deserves a new bug, though.
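One way to do the before/after comparison of the fence registers is to save the contents of the debugfs file on each side of a suspend cycle and diff the two snapshots. The sketch below is a minimal, hypothetical helper for that. The path is the one quoted in the comment above; reading it normally needs root and a mounted debugfs, and the function names are made up for illustration. It makes no assumption about the format of the file and simply compares it line by line.

```python
import difflib

# Path as quoted in the comment; requires root and a mounted debugfs.
FENCE_REGS = "/sys/kernel/debug/dri/0/i915_gem_fence_regs"


def read_snapshot(path=FENCE_REGS):
    """Return the current contents of the fence register file as text."""
    with open(path, "r") as f:
        return f.read()


def diff_snapshots(before, after):
    """Return unified-diff lines between two snapshots (empty list if equal)."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before-suspend",
            tofile="after-resume",
            lineterm="",
        )
    )


# Intended use (run as root, with a suspend/resume between the two reads):
#   before = read_snapshot()
#   ... suspend and resume the machine ...
#   after = read_snapshot()
#   print("\n".join(diff_snapshots(before, after)))
```

An empty diff only shows that the file contents are unchanged; it does not by itself prove that tiling is still in effect after resume.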
Comment 10 Diego Escalante Urrelo 2010-03-27 14:58:37 UTC
(In reply to comment #9)
> Any visual corruption? Can you measure performance with and without this patch,
> and after resume? Is that indicative that tiling is no more? Can you also watch
> /sys/kernel/debug/dri/0/i915_gem_fence_regs before and after resume. Are we
> using fences afterwards?
>
> Lots of room for potential mischief here. Probably deserves a new bug, though.

Seems I can't reliably reproduce, perhaps it's something else. I'll keep an eye open and try to file a new one when I isolate the problem.