System Environment: ---------------------------------------------- Platform: Ironlake Libdrm: (master)libdrm-2.4.46-2-gfea5408098c3c3057958e85ea9d7146f0b08749e Mesa: (master)cd90ebefd4f63d00eb65af6069fb078e454c43f1 Xf86_video_intel:(master)2.21.12-36-g2737aaab77cc6233826077b90d001fe45efc376c Kernel: (drm-intel-nightly)79e7b6f2cd7a75ff57207f763abf46cfe709e8de Bug detailed description: ---------------------------------------------- x11perf –aa10text performance reduced by ~37% enabling SNA on Ironlake. It’s xf86_video_intel regression, bisecting show the first bad commit is that: commit 16fb786bd0791986fcac8b11a7e182090c5b6249 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sun Jun 23 09:00:17 2013 +0100 sna/gen5: Force a write flush when changing blend modes Performance(x11perf –aa10text): ------------------------------------------ SNA disable SNA git-9eabe28 9650000 1520000 git-16fb786 15500000 1520000 Reproduce steps: --------------------------------------------- 1. xinit& 2. vblank_mode=0 ./x11perf_x86_64 -aa10text
There isn't much I can do since that flush is required to workaround a hardware issue. On my mobile systems (i3-330), the degradation is closer to 10%, 7.9Mglyphs/s to 7.3Mglyphs/s, which is not that worrying. Can you please describe how you avoid the IPS penalty on Ironlake for benchmarking?
I reworked the flush and was happy that I couldn't see the corruption return and that x11perf improved here. commit 7b1a5024df96070bab70744ad7e7d78a41fb0f72 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jul 22 10:51:05 2013 +0100 sna/gen5: Rework the flush after blend state changes QA complained that the full flush to memory was too much of a performance hit, so let's try a lighter weight non-stalling pipe flush without. References: https://bugs.freedesktop.org/show_bug.cgi?id=67157 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Can you please remeasure and see what the current perf hit is?
(In reply to comment #2) The above performance data is on Piketon. I test on Calpella(mobile), the degradation is about ~16%. I retest the 7b1a5024df96070b and make a summary: Piketon Calpella git-9eabe28: 15500000 7760000 git-16fb786: 9650000(37%↓) 6470000(16.6%↓) git-7b1a502: 13700000(11.6%↓) 6820000(12%↓) Calpella: Core(TM) i5 CPU M 520 @ 2.40GHz Piketon : Core(TM) i5 CPU 650 @ 3.20GHz hi Chris, what does "IPS" mean?
IPS is "Intelligent Power Sharing". It adjusts the performance of the GPU by ~2x in order to split power between the CPU/GPU based on demand, and takes a good couple of minutes to respond. That is it takes several runs of a benchmark on Ironlake in close succession for the results to stablise. Hmm, it's strange that after 7b1a502 the degradation on my i3-330 was <1% (~7.9 to ~7.8 MGlyphs/s). But a 10% perf hit for a flushing correctness fix here is not too bad - though I don't see why we would need to inject a flush between glyph runs, so I will take another look.
This should eliminate more of the flushes: commit 74dd10d252df2d826b6db5da7e53bd7c5d76dec0 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Jul 23 09:37:15 2013 +0100 sna/gen5: The cached value of the pipelined pointers key requires 32-bits Storing only the low 16-bits of the key for the pipelined state meant that we forced an update with every new drawing op - with the side effect of flushing the render cache. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67157 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Can you please remeasure?
(In reply to comment #5) With the commit, x11perf –aa10text on Piketon is 15200000. The problem is fixed.
Thanks for testing. Hopefully, the render error in facebook.com (and friends) doesn't reappear...
Verified it.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.