Hi,

Compared to cairo-1.8, I get quite a serious performance regression when running the stupid micro-benchmark attached.

time ./cbench_cairo1_10
real 0m34.819s
user 0m32.556s
sys 0m1.567s

time ./cbench_cairo1_8
real 0m18.965s
user 0m17.765s
sys 0m0.938s

I guess it's caused by the scan rasterizer.

- Clemens
I'm holding my breath waiting for the benchmark... I think I can guess which path is underperforming... pixman_shader_t, ftw... ;-)
Created attachment 35570: synthetic micro-benchmark
Sorry, somehow I managed to forget the attachment ;) I modified the benchmark a bit; now I get:
cairo-1.8 0m8.448s
cairo-1.9.6 0m24.570s
It seems the new approach has problems with complex paths, or perhaps with many intersections.
@Chris: By the way, do you see any chance of extending XRender with RLE encoded masks? I guess it should give better performance than the trapezoid approach even there.
Oddly enough, with lots of intersections like that, the Tor scan rasteriser should be much faster than Bentley-Ottmann. Looks like something has gone very wrong.

Clemens, can I use that benchmark under a liberal licence like MIT? Then I can include it in the synthetic tests.

Passing RLE masks to RENDER is definitely a task to be done, but enhancing RENDER itself is a very low priority compared with making direct rendering (i.e. mesa) fast (and just getting the 2D drivers to work would be a good start!).
sure, MIT licensing is fine for me. strange, glad the report contains some value after all ;)
Joonas reminded me that the bigger change between 1.8 and 1.10 affecting this benchmark is self-intersection removal. Whilst stroking in 1.8, we would generate a trap for each segment, one at a time, causing incorrect results on overlapping segments and joints. In 1.10, we generate the mask for all the edges in a single pass. That is much slower, as the algorithms scale with the number of edges n and intersections k as O((n+k) log n) [best case; we suspect that our implementation scales nowhere near that well!], but it is visually much more pleasing.

Short answer: we might not be able to fix the regression, because we have chosen correctness over performance here. Longer answer: changing the stroker has a significant impact on the number of edges and intersections fed into the rasteriser, and may recover the lost performance...
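To make the correctness argument above concrete, here is a toy model of a single pixel, not cairo code. It assumes (this is an assumption, not something stated in this report) that the per-segment approach adds each segment's coverage into the mask and clamps, whereas the single-pass approach takes the coverage of the union of all segments. The 16-sample pixel and all names are hypothetical.

```c
#include <stdint.h>

#define SAMPLES 16 /* hypothetical: 16 sub-samples per pixel, one bit each */

/* Count the sub-samples set in a per-pixel sample mask. */
static int popcount16(uint16_t v)
{
    int n = 0;
    while (v) {
        n += v & 1;
        v >>= 1;
    }
    return n;
}

/* Per-segment model: each segment's coverage is added on its own,
 * then clamped, so samples covered by two segments are counted twice. */
int coverage_per_segment(const uint16_t *seg, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += popcount16(seg[i]);
    return sum > SAMPLES ? SAMPLES : sum;
}

/* Single-pass model: all segments are merged first, so every
 * sub-sample is counted at most once. */
int coverage_single_pass(const uint16_t *seg, int n)
{
    uint16_t covered = 0;
    for (int i = 0; i < n; i++)
        covered |= seg[i];
    return popcount16(covered);
}
```

With two segments that both cover the same half of the pixel (`0x00FF` twice), the per-segment model reports full coverage while the single-pass model reports half, which is the kind of over-darkening one would expect at overlapping joins under this assumption.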
Replacing the insertion sort with mergesort alleviates this problem (in particular it guarantees that times scale about linearly when increasing the lines in the path to be stroked), but doesn't catch up with 1.8. See http://cgit.freedesktop.org/cairo/commit/?id=56ea51fdcc273531b5e86b921aad19237a1c9415
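For readers unfamiliar with the technique, the following is a minimal sketch of a merge sort over a singly linked list of edges. It is not the code from the commit linked above; `struct edge`, its `x` key and the function names are hypothetical and only illustrate why the sort cost stays at O(n log n) comparisons, where insertion sort degrades to O(n^2) on badly ordered input.

```c
#include <stddef.h>

/* Hypothetical edge record: only the sort key and the list link. */
struct edge {
    int x;              /* sort key, e.g. x position on the scanline */
    struct edge *next;
};

/* Merge two already-sorted lists into one sorted list (stable). */
struct edge *merge_sorted(struct edge *a, struct edge *b)
{
    struct edge head, *tail = &head;

    head.next = NULL;
    while (a != NULL && b != NULL) {
        if (b->x < a->x) {
            tail->next = b;
            b = b->next;
        } else {
            tail->next = a;
            a = a->next;
        }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;
    return head.next;
}

/* Top-down merge sort: split at the midpoint, sort both halves, merge. */
struct edge *sort_edges(struct edge *list)
{
    struct edge *slow, *fast, *second;

    if (list == NULL || list->next == NULL)
        return list;

    /* Find the midpoint: slow advances one node per two of fast. */
    slow = list;
    fast = list->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    second = slow->next;
    slow->next = NULL;

    return merge_sorted(sort_edges(list), sort_edges(second));
}
```

Calling `sort_edges(head)` returns the new head of the list ordered by ascending `x`; the nodes are relinked in place and nothing is allocated.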
*** Bug 31589 has been marked as a duplicate of this bug. ***
(In reply to comment #8)
> Replacing the insertion sort with mergesort alleviates this problem (in
> particular it guarantees that times scale about linearly when increasing the
> lines in the path to be stroked), but doesn't catch up with 1.8.
> See
> http://cgit.freedesktop.org/cairo/commit/?id=56ea51fdcc273531b5e86b921aad19237a1c9415

Andrea, can you port that mergesort to 1.12 so we can see how it performs on the traces?
cairo-1.8: 4.98user 0.00system 0:04.99elapsed 99%CPU
cairo-1.10: 11.48user 0.00system 0:11.50elapsed 99%CPU
cairo-1.12: 9.91user 0.00system 0:09.92elapsed 99%CPU
Andrea committed his patch 18 months ago, no wonder I had a strange conflict when trying to apply it!
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/cairo/cairo/issues/261.