| Summary: | [i915 bisected] x11perf has a regression | ||||||
|---|---|---|---|---|---|---|---|
| Product: | xorg | Reporter: | zhao jian <jian.j.zhao> | ||||
| Component: | Driver/intel | Assignee: | Chris Wilson <chris> | ||||
| Status: | VERIFIED FIXED | QA Contact: | Xorg Project Team <xorg-team> | ||||
| Severity: | normal | ||||||
| Priority: | low | CC: | chris | ||||
| Version: | unspecified | ||||||
| Hardware: | All | ||||||
| OS: | Linux (All) | ||||||
| Whiteboard: | |||||||
| i915 platform: | i915 features: | ||||||
| Attachments: |
|
||||||
|
Description
zhao jian
2009-11-12 22:33:40 UTC
Interesting, I was expecting a boost since the shader is simpler and we're transferring fewer bytes. Can you report the scale of the regression?

Since centre-point sampling in combination with 1x1R textures is buggy, I can't simply revert this change (and the centre-point sampling is required to prevent off-by-one rendering errors, i.e the occasional black rectangle around images).

On my i945, prior to this commit I get 378k/s, and afterwards 370k/s.

Alternates that I have tried so far:
    using per-vertex colors: 361k/s
    using shader constants instead of defaults: 359k/s

On 945GM ia32, it drops 10% from 602000.0 to 536000.0 with rgb10text, and drops 11% from 812000.0 to 716000.0 with aa10text. On 945GME ia32, its rgb10text drops 19% from 330000.0 to 266000.0 but its aa10text only drops a little from 347000.0 to 338000.0.

Dropping priority as we seem to be hitting a gpu bottleneck on a path that I believe is required for correct rendering elsewhere (with similar setup).

Ok, I think I've found the cause of the damage here.

i915:

Baseline:
    3600000 trep @ 0.0071 msec (142000.0/sec)
    3200000 trep @ 0.0082 msec (122000.0/sec)

Adjusting libXft to use SolidFills:
    4000000 trep @ 0.0066 msec (150000.0/sec)
    3600000 trep @ 0.0076 msec (132000.0/sec)

Improving the driver to avoid reading back (from system memory) a pixel to determine color for SolidFills:
    4000000 trep @ 0.0067 msec (149000.0/sec)
    3600000 trep @ 0.0068 msec (147000.0/sec)

And for good measure reverting the change to libXft i.e. back to using a solid pixmap (this should be close to the original code & performance):
    8000000 trep @ 0.0062 msec (162000.0/sec)
    4000000 trep @ 0.0069 msec (145000.0/sec)

PineView:

Baseline:
    8000000 trep @ 0.0040 msec (248000.0/sec)
    8000000 trep @ 0.0043 msec (232000.0/sec)

Updating libXft:
    8000000 trep @ 0.0040 msec (249000.0/sec)
    8000000 trep @ 0.0043 msec (231000.0/sec)

Improved driver:
    8000000 trep @ 0.0038 msec (262000.0/sec)
    8000000 trep @ 0.0040 msec (251000.0/sec)

And reverting the change to libXft...:
    8000000 trep @ 0.0039 msec (259000.0/sec)
    8000000 trep @ 0.0041 msec (244000.0/sec)

So, the cause would appear to be the readback of the single pixel. The remaining question is whether to use a pixmap or a diffuse color.

(Still a mystery 3x performance hit, but the improvement is consistent and the code should be pretty close to the original, so...)

commit 21c1c3c7f6eb2b5070d2153b15a8fb1afe938bbb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon May 10 10:19:28 2010 +0100

    i915: Use 1x1R pixmap for solid drawables

    x11perf has a regression
    https://bugs.freedesktop.org/show_bug.cgi?id=25068
    caused by

    commit e581ceb7381e29ecc1a172597d258824f6a1d2d3
    i915: Use the color channels to pass along solid sources and masks.

    Do not convert 1x1R pixmaps into a solid color as the readback from the bo negates all the performances advantages of using a smaller vertex buffer and fewer samplers.

    Before (PineView): aa=66800 glyph/s, rgb=28800 glyphs/s
    Now: aa=96800 glyphs/s, rgb=48500 glyphs/s

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Rechecked on my i945GME:

Before:
    12000000 trep @ 0.0025 msec (404000.0/sec): Char in 80-char aa line (Charter 10)
    12000000 trep @ 0.0026 msec (380000.0/sec): Char in 80-char rgb line (Charter 10)

After:
    12000000 trep @ 0.0024 msec (417000.0/sec): Char in 80-char aa line (Charter 10)
    12000000 trep @ 0.0025 msec (399000.0/sec): Char in 80-char rgb line (Charter 10)

which seems consistent with the original regression.
x11perf has improved a lot on PineView (i915) recently, so marking this as verified. |
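
For anyone wanting to redo the before/after comparisons in this report, here is a minimal sketch of parsing x11perf result lines and computing the relative change in rate. The line format is assumed from the results quoted in this thread only; `parse_result` and `percent_change` are hypothetical helper names, not part of x11perf or of any companion tool.

```python
import re

# Assumed format, taken from the result lines quoted in this report, e.g.:
#   12000000 trep @ 0.0025 msec (404000.0/sec): Char in 80-char aa line (Charter 10)
# The trailing ": label" part is optional.
LINE_RE = re.compile(
    r"(?P<reps>\d+)\s+trep\s+@\s+(?P<msec>[\d.]+)\s+msec\s+"
    r"\((?P<rate>[\d.]+)/sec\)(?::\s*(?P<label>.*))?"
)


def parse_result(line):
    """Parse one result line into reps, msec per op, ops/sec and label."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"not a recognised result line: {line!r}")
    return {
        "reps": int(m["reps"]),
        "msec": float(m["msec"]),
        "rate": float(m["rate"]),
        "label": (m["label"] or "").strip(),
    }


def percent_change(before, after):
    """Relative change from before to after, in percent (negative = slower)."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    before = parse_result(
        "12000000 trep @ 0.0026 msec (380000.0/sec): "
        "Char in 80-char rgb line (Charter 10)"
    )
    after = parse_result(
        "12000000 trep @ 0.0025 msec (399000.0/sec): "
        "Char in 80-char rgb line (Charter 10)"
    )
    change = percent_change(before["rate"], after["rate"])
    print(f"{before['label']}: {change:+.1f}%")
```

The comparison uses the reported rate rather than the msec column, because the per-operation time is printed to only four decimal places and is too coarse to tell close results apart.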