Bug 88203

Summary: Replaying recording surfaces with OVER has far worse performance than when using the SOURCE operator
Product: cairo Reporter: Emanuele Aina <emanuele.aina>
Component: generalAssignee: Chris Wilson <chris>
Status: RESOLVED MOVED QA Contact: cairo-bugs mailing list <cairo-bugs>
Severity: normal    
Priority: medium CC: marco.barisione
Version: 1.12.16   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: spans: Directly replay recording surfaces for OVER to avoid a copy

Description Emanuele Aina 2015-01-08 16:29:36 UTC
I have some code that collects some partial redrawing of a surface in a recording surface and then later replays it on the target surface.

I've noticed that when using the OVER operator the replay is surprisingly slow, ie. the overhead over painting directly on the target is noticeable.

When using the SOURCE operator things get back to my expectations and the overhead is negligible, but unfortunately the SOURCE operators clears the target surface, destroying all the areas that don't get redrawn and thus making it unfit for my purposes.

I'd like to have a way to replay the recording surface without any clearing and without any intermediate surface.

Shouldn't the OVER operator do that? In what cases replaying directly on the target would differ from replaying to an intermediate surface and composite it with OVER?
Comment 1 Emanuele Aina 2015-01-08 16:49:03 UTC
Note that I'm really not knowledgeable enough to understand how the different compositing operators should work with recording surfaces, but I was surprised that replaying a recording surface is actually replaying it on a clear surface and then compositing it, rather than replaying it directly.

Citing the documentation:
> If you want to replay a surface so that the results in target will be 
> identical to the results that would have been obtained if the original 
> operations applied to the recording surface had instead been applied to the 
> target surface, you can use code like this: [snip]

With the extra copy this does not seem the case: if rendering directly would give different results then the documentation is misleading, but if rendering directly would give the same result then the intermediate surface would not be needed.

To be honest I'm not sure what should be the meaning of using operators other than SOURCE and OVER to replay a recording surface. If the user actually meant to use an intermediate surface, why bother using a recording surface then?
Comment 2 Emanuele Aina 2015-01-08 16:58:29 UTC
Created attachment 111964 [details] [review]
spans: Directly replay recording surfaces for OVER to avoid a copy

This is a tentative patch, I don't have clear all the implications of using OVER to mean "replay directly without clearing", but this is what I would have expected by reading the documentation.
Comment 3 Chris Wilson 2015-01-08 20:50:37 UTC
Right, an OVER operator with no mask need no incur the replay onto an intermediate surface.

I think it would be simpler as

diff --git a/src/cairo-spans-compositor.c b/src/cairo-spans-compositor.c
index efbae25..34b11ae 100644
--- a/src/cairo-spans-compositor.c
+++ b/src/cairo-spans-compositor.c
@@ -579,7 +579,7 @@ composite_aligned_boxes (const cairo_spans_compositor_t     
     }
 
     /* Are we just copying a recording surface? */
-    if (inplace &&
+    if ((inplace || (op == CAIRO_OPERATOR_OVER && no_mask)) &&
        recording_pattern_contains_sample (&extents->source_pattern.base,
                                           &extents->source_sample_area))
     {
@@ -591,7 +591,7 @@ composite_aligned_boxes (const cairo_spans_compositor_t     
        /* XXX could also do tiling repeat modes... */
 
        /* first clear the area about to be overwritten */
-       if (! dst->is_clear) {
+       if (! dst->is_clear && op_is_source) {
            status = compositor->fill_boxes (dst,
                                             CAIRO_OPERATOR_CLEAR,
                                             CAIRO_COLOR_TRANSPARENT,

or even replace that op_is_source with inplace for consistency. Can you please check that the recording_surface_over test hits the new path and check that it missed the optimisation before.
Comment 4 Emanuele Aina 2015-01-08 22:41:26 UTC
> Right, an OVER operator with no mask need no incur the replay onto an
> intermediate surface.

\o/
 
>      /* Are we just copying a recording surface? */
> -    if (inplace &&
> +    if ((inplace || (op == CAIRO_OPERATOR_OVER && no_mask)) &&

Is it safe to ignore need_clip_mask when the operator is OVER?

>         /* first clear the area about to be overwritten */
> -       if (! dst->is_clear) {
> +       if (! dst->is_clear && op_is_source) {
>             status = compositor->fill_boxes (dst,
>                                              CAIRO_OPERATOR_CLEAR,
>                                              CAIRO_COLOR_TRANSPARENT,


If I got the code right, using op_is_source instead of checking for CAIRO_OPERATOR_SOURCE here would produce some redundant clears in case we're using CAIRO_OPERATOR_OVER or CAIRO_OPERATOR_ADD with an opaque pattern.

> or even replace that op_is_source with inplace for consistency.

I'm not sure what you meant here, sorry. :(

> Can you please check that the recording_surface_over test hits the new path
> and check that it missed the optimisation before.

I'm having some issue with the testsuite.

Even when using `Xvfb -screen 0 1680x1024x24 -ac -nolisten tcp :2` and `DISPLAY=:2 CAIRO_TEST_TARGET=xlib ./cairo-test-suite recording-surface-over` I only get xlib.argb32 and xlib-render-0_0.rgb24 to PASS, while xlib.rgb24, xlib-window.rgb24 and xlib-fallback.rgb24 FAIL (with or without the patch applied).

The two tests using CAIRO_TEST_TARGET=image PASS cleanly, without limiting CAIRO_TEST_TARGET I get other failures.

But I also get lots of failures when I run the full testsuite, so I'm quite at a loss here. :(
Comment 5 Chris Wilson 2015-01-08 22:54:09 UTC
(In reply to Emanuele Aina from comment #4)
> > Right, an OVER operator with no mask need no incur the replay onto an
> > intermediate surface.
> 
> \o/
>  
> >      /* Are we just copying a recording surface? */
> > -    if (inplace &&
> > +    if ((inplace || (op == CAIRO_OPERATOR_OVER && no_mask)) &&
> 
> Is it safe to ignore need_clip_mask when the operator is OVER?

Indeed, the clip_mask needs to be considered as well. My main point is that we don't try to overload inplace with the secondary meaning of being able to replay over top.
 
> >         /* first clear the area about to be overwritten */
> > -       if (! dst->is_clear) {
> > +       if (! dst->is_clear && op_is_source) {
> >             status = compositor->fill_boxes (dst,
> >                                              CAIRO_OPERATOR_CLEAR,
> >                                              CAIRO_COLOR_TRANSPARENT,
> 
> 
> If I got the code right, using op_is_source instead of checking for
> CAIRO_OPERATOR_SOURCE here would produce some redundant clears in case we're
> using CAIRO_OPERATOR_OVER or CAIRO_OPERATOR_ADD with an opaque pattern.
> 
> > or even replace that op_is_source with inplace for consistency.
> 
> I'm not sure what you meant here, sorry. :(

Hmm, this could with a big comment. It indeed only does need to be op == SOURCE, because in the cases where op_is_source/inplace cause us to take this path with OVER/ADD dst->is_clear by definition. Tricky, but you were right.


> > Can you please check that the recording_surface_over test hits the new path
> > and check that it missed the optimisation before.
> 
> I'm having some issue with the testsuite.
> 
> Even when using `Xvfb -screen 0 1680x1024x24 -ac -nolisten tcp :2` and
> `DISPLAY=:2 CAIRO_TEST_TARGET=xlib ./cairo-test-suite
> recording-surface-over` I only get xlib.argb32 and xlib-render-0_0.rgb24 to
> PASS, while xlib.rgb24, xlib-window.rgb24 and xlib-fallback.rgb24 FAIL (with
> or without the patch applied).

It's just aliasing noise in the edge rendering. The references need to be regenerated, however Xvfb doesn't render correctly (that's another can of worms), and actually shouldn't be hitting this code path anyway.

> The two tests using CAIRO_TEST_TARGET=image PASS cleanly, without limiting
> CAIRO_TEST_TARGET I get other failures.
> 
> But I also get lots of failures when I run the full testsuite, so I'm quite
> at a loss here. :(

For testing this change, verifying image is sufficient. The testcase will need to modified slightly (well a new one to be written) as the current OVER replay is designed to hit the old optimised path. We will want another OVER replay that uses a mask so that it is never optimised away, and another OVER replay that will be optimised by your new code but would not be by the old (i.e. it should replay over existing content).
Comment 6 Chris Wilson 2015-01-08 23:19:33 UTC
Ah! Clearly I haven't been thinking enough about Cairo recently.

There is a problem with using OVER as a shortcut here over existing content. The problem is that the command is to blend the final result of the replay with the existing content, but the inplace replay with apply its own operators with the existing content (when the intent of the replay is to apply those only to itself).

Sorry, the requirement is that the dst->is_clear for OVER to be able to replay in place. You could analyse the replay and decide that it doesn't use any operators that would mess up inplace though.
Comment 7 Emanuele Aina 2015-01-08 23:31:43 UTC
> There is a problem with using OVER as a shortcut here over existing content.
> The problem is that the command is to blend the final result of the replay
> with the existing content, but the inplace replay with apply its own
> operators with the existing content (when the intent of the replay is to
> apply those only to itself).

Shame, that's what I feared. This means that the documentation is somewhat misleading: as I read it, I interpreted the recording surfaces as a mechanism to get delayed execution, for which I would have expected that the replay operators *would* mess with the target surface.

My understanding is that an alternative way to get the current behaviour is to always explicitly replay the recorded operations to a temporary surface and then composite it, but there's no way to get the delayed execution which is somewhat described in the documentation.

If the point of recording surfaces is not delayed execution, what's their real use-case?

> Sorry, the requirement is that the dst->is_clear for OVER to be able to
> replay in place. You could analyse the replay and decide that it doesn't use
> any operators that would mess up inplace though.

This is definitely well above my current understanding of cairo's code, sorry. :(

I can probably hack my application to use the SOURCE operator and some smart clipping to avoid the performance hit, but I honestly find the current behaviour quite surprising.
Comment 8 GitLab Migration User 2018-08-25 13:33:36 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/cairo/cairo/issues/66.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.