Bug 31589

Summary: Very high cpu usage of _cairo_bentley_ottmann_tessellate_polygon() in transmission-gtk
Product: cairo
Component: general
Version: 1.10.0
Hardware: x86 (IA32)
OS: Linux (All)
Status: RESOLVED DUPLICATE
Severity: normal
Priority: medium
Reporter: veldt <temp.20.nurkle>
Assignee: Carl Worth <cworth>
QA Contact: cairo-bugs mailing list <cairo-bugs>
CC: temp.20.nurkle

Description veldt 2010-11-12 20:50:11 UTC
A code path involving
  pango_cairo_layout_path (cr, layout);
  murrine_set_color_rgba (cr, &temp, 0.5);
  cairo_stroke (cr);

in Ambiance, Ubuntu's default theme, in the application transmission-gtk, leads to an amazing amount of CPU usage, primarily in _cairo_bentley_ottmann_tessellate_polygon().

Bug https://bugs.launchpad.net/ubuntu/+source/transmission/+bug/655024 has lots of detail.

(Non-murrine-based themes are still fine; the other murrine themes Radiance and MurrinaChrome have the same effect.)

The possibility that it is not cairo but incorrect invocation of it has not been eliminated. Pango is the least-researched area in this regard.
Comment 1 M Joonas Pihlaja 2010-11-13 09:19:12 UTC
This is likely due to a bugfix in the stroker which fixes the antialiasing of self-intersecting strokes.  The worst hit backends are those which use trapezoids as their geometric primitive, such as the xlib backend, and we can't quickly deal with possible intersections using a fast path; for example for rectilinear paths we could defer to the rectilinear stroker+tessellator combo (but I'm not sure we do that currently).

From the code path snippet provided, I expect what's happening is that the outline path involves lots of curves, and unfortunately those cannot be fastpathed to bypass the stroker's call to tessellate the stroked outline when using the xlib surface.

In any case, the image backend uses a different rasterisation method in 1.10 so the regression doesn't occur as much, so a possible workaround might be to use a temporary image surface to render the strokes.
Comment 2 veldt 2010-11-13 22:07:03 UTC
(In reply to comment #1)
> In any case, the image backend uses a different rasterisation method in 1.10 so
> the regression doesn't occur as much, so a possible workaround might be to use
> a temporary image surface to render the strokes.

Indeed! The image backend uses only about 13% of the CPU time xcb or xlib does. (re: cairo-perf-trace at https://bugs.launchpad.net/ubuntu/+source/transmission/+bug/655024/comments/31)

I initially considered this a text-shadows bug, but am now thinking it may be a text-shading issue. Speaking as an amateur: is there a reason black text "ghosted" to 0.5 alpha "grey" would overwork the tessellator?
Comment 3 M Joonas Pihlaja 2010-11-14 02:49:26 UTC
If you want ghosted text rendered at 0.5 alpha, then just render the text at 0.5 alpha.  This avoids all use of the stroker. :)  There are likely other methods to get shadows or ghosted text suitable for you, not involving the stroker, so if using a temporary image surface isn't a viable option, then perhaps investigating those in the context of your app might be a way forward?  Please drop by the #cairo IRC channel on freenode or post to the cairo-l mailing list if you feel like it.

Regarding overuse of the tessellator, here's approximately what happens when you stroke a path using the xlib backend:

1) The outline of the path is computed as a polygon.
2) The stroker notices it cannot prove that there are no self-intersections in the resulting polygon, so it calls on the tessellator to remove the intersections and return a list of non-overlapping trapezoids.  In fact, due to how the outline's polygon is constructed, it's nearly always the case that there are self-intersections in the polygon.
3) The trapezoids are composited to the target surface.

The main change in 1.10 is in step two where the self-intersections are removed.  Before, the stroker just didn't care so it would emit trapezoids directly, disregarding possible overlaps.  The end result when composited is an annoying "sparkling" at the edges where the outline polygon crosses itself.  To mitigate the slowness we've implemented a fast path for rectilinear strokes, but unfortunately it doesn't kick in if the path to stroke has curves or the line join- and cap-styles aren't suitable.
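
To illustrate why overlapping trapezoids show up as "sparkling", here is a toy model in plain C (it is not cairo code). It assumes, purely for illustration, that the antialiased coverage of each trapezoid is summed into a pixel and clamped to 1.0; the thread does not say how the backend actually accumulates coverage, and both function names are hypothetical.

```c
#include <assert.h>

/* ASSUMPTION: coverage from successive trapezoids is summed and clamped
 * to 1.0. This is an illustrative model, not cairo's implementation. */
double accumulate_coverage(double pixel, double coverage)
{
    assert(coverage >= 0.0 && coverage <= 1.0);
    double sum = pixel + coverage;
    return sum > 1.0 ? 1.0 : sum;
}

/* Final alpha of one edge pixel that the true outline covers by
 * `coverage`, when `n_overlapping` trapezoids each contribute that
 * same coverage to the pixel. */
double edge_pixel_alpha(double coverage, int n_overlapping)
{
    double alpha = 0.0;
    for (int i = 0; i < n_overlapping; i++)
        alpha = accumulate_coverage(alpha, coverage);
    return alpha;
}
```

Under this model a half-covered edge pixel ends up at 0.5 when one trapezoid touches it, but at 1.0 when two overlapping trapezoids touch it, so it is drawn darker than its neighbours along the same edge. Removing the overlaps before compositing avoids the double counting, at the cost of the tessellation pass.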

For the image backend the story is slightly different in that the rasterization method (new in 1.10) used by the image backend can deal with self-intersecting polygons just fine, so there's no need for tessellating the outline polygon.

I'm really sorry your stroking code was impacted so badly.  If you could send us a cairo-trace of your application for the cairo-traces collection it could help us find a solution for stroking in the future that addresses the self-intersection bugs yet isn't slow as molasses on trapezoid backends.  Traces of real world performance and regressions like this are key in directing the development focus.
Comment 4 veldt 2010-11-14 19:32:10 UTC
Thanks! Would you consider the first pseudo-code snippet in this link an applicable illustration of temporary surface use?


On another note, again speaking as an amateur, is there a common reason for already-rendered items (even if that rendering is slow) to continue to consume massive CPU, with no user activity at all, as in the current case? That is, is some level of code re-rendering things unnecessarily?
Comment 5 M Joonas Pihlaja 2010-11-15 18:27:17 UTC
Thank you for the cairo trace of the application!  It confirms that the slow part is stroking glyph outlines to create a drop-shadow effect around text.  A faster way of creating shadows is to render the text once normally to a temporary surface, and then masking and painting that surface to the target at different offsets.
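
The mask-and-offset idea can be sketched in plain C as below. This is a simplified model on 8-bit greyscale buffers, not cairo code: the `image8_t` type, both function names, the one-pixel offset and the grey levels are all assumptions made for the example. In cairo itself the compositing step would presumably map onto something like cairo_mask_surface(), with the text rendered once into the temporary surface beforehand.

```c
#include <assert.h>
#include <stdint.h>

/* An 8-bit image: either grey levels or coverage, width*height bytes. */
typedef struct {
    int width, height;
    uint8_t *pixels;
} image8_t;

/* Paint a solid grey level onto target through mask, with the mask's
 * top-left corner placed at (dx, dy). Mask values are coverage 0..255.
 * Parts of the mask that fall outside the target are skipped. */
void paint_through_mask(image8_t *target, const image8_t *mask,
                        int dx, int dy, uint8_t grey)
{
    assert(target && mask);
    for (int y = 0; y < mask->height; y++) {
        int ty = y + dy;
        if (ty < 0 || ty >= target->height)
            continue;
        for (int x = 0; x < mask->width; x++) {
            int tx = x + dx;
            if (tx < 0 || tx >= target->width)
                continue;
            unsigned a = mask->pixels[y * mask->width + x];
            uint8_t *dst = &target->pixels[ty * target->width + tx];
            /* dst = grey*a + dst*(1-a), rounded, in 0..255 fixed point */
            *dst = (uint8_t)((grey * a + *dst * (255u - a) + 127u) / 255u);
        }
    }
}

/* The text mask is rendered once elsewhere; here it is only reused:
 * first as a grey shadow at an offset, then as black text on top. */
void draw_text_with_shadow(image8_t *target, const image8_t *text_mask,
                           int x, int y)
{
    paint_through_mask(target, text_mask, x + 1, y + 1, 128); /* shadow */
    paint_through_mask(target, text_mask, x, y, 0);           /* text */
}
```

The point of the design is that the expensive step (rasterising the glyphs) happens once, and each shadow copy is a cheap per-pixel composite, so no stroker or tessellator is involved.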

Re: the pseudo-code snippet linked to in comment #4, it's not a very good model for this use case, as it is actually invoking the stroker to draw an edge around a filled path.  It also uses cairo_push_group() to create the temporary surface which, while usually a good thing and what you want to do since it creates a "similar" surface optimised for the actual target surface, would not be a good thing in this case as stroking onto that surface is exactly as slow as onto the target surface.

As to already-rendered items continuing to consume CPU (after a surface flush), the only reason why that would happen is if the application itself is actually rerendering the items, say due to a new expose event from the window manager.  There are currently no secret elves in cairo to burn your CPU just because they can when you're not looking. :)

*** This bug has been marked as a duplicate of bug 28067 ***