Bug 37421 - pixman performance regression in render_bench (http://www.rasterman.com/files/render_bench.tar.gz)
Summary: pixman performance regression in render_bench (http://www.rasterman.com/files...
Alias: None
Product: pixman
Classification: Unclassified
Component: pixman (show other bugs)
Version: git master
Hardware: All All
: medium normal
Assignee: Søren Sandmann Pedersen
QA Contact: Søren Sandmann Pedersen
Depends on:
Reported: 2011-05-20 14:07 UTC by Siarhei Siamashka
Modified: 2011-09-21 16:13 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Description Siarhei Siamashka 2011-05-20 14:07:40 UTC
Tested on ARM Cortex-A9:

=== pixman-0.18.4 ===

*** ROUND 1 ***
Test: Test Xrender doing non-scaled Over blends
Time: 0.331 sec.
Test: Test Xrender (offscreen) doing non-scaled Over blends
Time: 0.277 sec.
Test: Test Imlib2 doing non-scaled Over blends
Time: 1.016 sec.

=== current git (dd449a2a8ee1381fdc5297257917bc0786bf0ac4) ===

*** ROUND 1 ***
Test: Test Xrender doing non-scaled Over blends
Time: 1.620 sec.
Test: Test Xrender (offscreen) doing non-scaled Over blends
Time: 0.544 sec.
Test: Test Imlib2 doing non-scaled Over blends
Time: 1.018 sec.
Comment 1 Siarhei Siamashka 2011-05-20 14:09:49 UTC
c97881fe3c3a0af78cf5953d2c135654440b0269 is the first bad commit
commit c97881fe3c3a0af78cf5953d2c135654440b0269
Author: Søren Sandmann Pedersen <ssp@redhat.com>
Date:   Thu Sep 16 08:35:05 2010 -0400

    Move some of the FAST_PATH_COVERS_CLIP computation to pixman-image.c
    When an image is solid or repeating, the FAST_PATH_COVERS_CLIP flag
    can be set in compute_image_info().
    Also the code that turned this flag off in pixman.c was not correct;
    it didn't take transformations into account. With this patch, pixman.c
    doesn't set the flag by default, but instead relies on the call to
    compute_samples_extents() to set it when possible.
Comment 2 Siarhei Siamashka 2011-05-21 04:23:35 UTC
I have just realized that it's the same problem as discussed on #cairo two months ago:

Mar 16 18:09:16 <ssvb>  ssp: regarding the use of bilinear filter, we also have the following problem: http://cgit.freedesktop.org/~siamashka/pixman/commit/?h=unscaled-bilinear-fix&id=b8a2943696ad98b6588ee9b39335ac78fed0ab6a
Mar 16 18:12:16 <ssvb>  ssp: that's somewhat similar to the old problem taking general path with no transform and PAD repeat
Mar 16 18:16:09 <ssp>   ssvb: Can we simply set FAST_PATH_NEAREST if it the transformation is ID?
Mar 16 18:18:39 <ssvb>  yes, that's probably one of the options, but it may become a bit tricky if the flag bits diverge from real settings
Mar 16 18:18:53 <ssvb>  I mean tricky for caching flags
Mar 16 18:20:28 <ssp>   I guess the flags could be interpreted "it's safe to treat the image as if this were true"
Mar 16 18:21:39 <ssp>   Maybe _both_ the nearest and bilinear flags could be set in that case ...
Mar 16 18:23:25 <ssvb>  it may cause problems if some general path or fast path gets activated by BILINEAR_FILTER and SAMPLES_COVER_CLIP flag
Mar 16 18:23:42 <ssvb>  that's exactly what is worrying me
Mar 16 18:24:50 <ranma42>       jrmuizel: the patch which fixes quartz glyphs advances is here: http://cgit.freedesktop.org/~ranma42/cairo/commit/?h=wip/quartz-text&id=10da05216dcc99b28a896f6fb2f2e21309e74136
Mar 16 18:24:50 <ssvb>  and if this happens, we may get a rare and difficult to reproduce off-by-one read problem
Mar 16 18:25:48 <ranma42>       the same branch contains a test for that, but it still needs some cleanup (and decent commit messages). tomorrow I will be travelling, so I should have some time to improve that
Mar 16 18:27:04 <ssvb>  ssp: practically either way should be safe now, but some innocent or unintentional change would have a potential chance to introduce problems
Mar 16 18:30:57 <ssvb>  ssp: so one of the choices is to change BILINEAR flag to NEAREST in 'compute_image_info' for identity transform?
Mar 16 18:31:23 <ssp>   ssvb: How about something like this:   "if (flags have (BILINEAR | COVER_CLIP | ID_TRANSFORM)) { flags |= NEAREST;  }"  added after analyze_extents() has run?
Mar 16 18:32:53 <ssvb>  ssp: there will be no COVER_CLIP flag set if the composite operation covers the whole source image including the last pixel in the scanline
Mar 16 18:34:37 <ssp>   ssvb: Is it a common case where both the (a) clip rectangle precisely matches the image, and (b) the filter is bilinear, and (c) the transformation is id?
Mar 16 18:34:48 <ssp>   I guess I can see how applications might generate that
Mar 16 18:35:19 <ssvb>  ssp: yes, if filter is set to bilinear just in case, and then doing nonscaled blits of the whole image
Mar 16 18:37:44 <ranma42>       ssvb, ssp: cairo tries to always use NEAREST when the transform is an identity: http://cgit.freedesktop.org/cairo/tree/src/cairo-image-surface.c#n1493
Mar 16 18:37:56 <ranma42>       but I guess that other X application will not do this
Mar 16 18:38:44 <ssvb>  ranma42: yes, I found this issue in some artificial pixman tests done with pixman
Mar 16 18:38:46 <ranma42>       (dafaulting to BILINEAR looks a reasonable choice, so I would expect apps to do that)
Mar 16 18:40:44 <ssvb>  ranma42: applications have accumulated lots of ugly hacks and workarounds in order to avoid taking performance hit, so it may be not a big real problem
Mar 16 18:40:55 <ssvb>  but still solving it in pixman would be nice
Mar 16 18:44:36 <ranma42>       sure
Mar 16 18:45:55 <ranma42>       I can see why nearest + identity should be the same as bilinear + identity, but I can't understand why cover_clip is involved. what's the matter with that?
Mar 16 18:46:26 <ssvb>  ranma42: also such flipping between NEAREST and BILINEAR in cairo may be a bit slow, because I think changing filter will cause pixman to recalculate flags each time
Mar 16 18:47:27 <ssp>   ranma42: The problem is that a fast path selecting for (cover_clip + bilinear) might read edge pixels outside the image
Mar 16 18:47:53 <ssp>   ranma42: But if the actual image only supports cover clip when the filter is nearest, then those edge pixels may not exist
Mar 16 18:49:10 <ranma42>       I see. so basically cover_clip + bilinear also guarantees a "border" around the visible part?
Mar 16 18:49:40 <ssp>   Yes, cover_clip guarantees that all pixels required by the filter are there
Mar 16 18:51:27 <ranma42>       ok, got it. basically each interpolation mode has a different cover_clip condition, so it cannot be used indifferently even if the two interpolation modes should produce the same result

I stopped looking for a proper fix that time because ranma42 mentioned that this is not an issue for cairo as it changes the filter to NEAREST automatically in such cases.

Too bad that this problem prevents us from revisiting the old render_bench benchmark to show that software rendering backend in X server is not so bad as it used to be years ago, just like Carl Worth did it for hardware acceleration with EXA: http://cworth.org/exa/i965/render_bench/ :)
Comment 3 Siarhei Siamashka 2011-05-22 14:18:55 UTC
Proposed fix sent to http://lists.freedesktop.org/archives/pixman/2011-May/001259.html
Comment 4 Søren Sandmann Pedersen 2011-09-05 18:35:35 UTC
Another proposed fix:


(The reason the numbers are much higher there is that I changed the number of repetitions in render_bench from 4096 to 65536).
Comment 5 Søren Sandmann Pedersen 2011-09-21 16:13:24 UTC
Fixed in master.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.