Bug 12996 - Xlib source surface fast-paths do not use IncludeInferiors, while slow paths do
Xlib source surface fast-paths do not use IncludeInferiors, while slow paths do
Product: cairo
Classification: Unclassified
Component: xlib backend
x86-64 (AMD64) All
: medium normal
Assigned To: Carl Worth
cairo-bugs mailing list
Depends on:
  Show dependency treegraph
Reported: 2007-10-29 20:18 UTC by Nathaniel Smith
Modified: 2009-12-07 13:14 UTC (History)
1 user (show)

See Also:

test case -- all squares should blink red/yellow (3.48 KB, text/x-python)
2007-10-29 20:21 UTC, Nathaniel Smith
xlib-IncludeInferiors.c -- cairo test suite test -- untested (9.06 KB, text/x-csrc)
2007-11-13 00:22 UTC, Nathaniel Smith

Note You need to log in before you can comment on or make changes to this bug.
Description Nathaniel Smith 2007-10-29 20:18:51 UTC
Attached is a PyGtk test program.  It has a simple widget that simply blinks every 500 milliseconds, and then this widget is placed into various Composite-ful situations to see how they work.

The general strategy for this use of Composite involves at least 4 windows.  First, there is the blinker itself, which has manual compositing enabled on it.  This is then nested within a window owned by an 'ExposeForwarder' widget -- the purpose of this widget is to receive Damage notifications from the composited window, and forward them on to the widget that will render the composited window.    This, in turn is nested within a window owned by the widget "BlinkWatcher", which is the one that actually will draw the composited blinker.  However, it cannot draw it on its own window, because cairo does not support IncludeInferiors on the target surface, and its drawing would be clipped by the 'ExposeForwarder' window.  Therefore, it also creates a child window named '_other_window', and does its actual drawing onto that.

The test program creates 5 blinkers and packs them into a vbox.  From top to bottom, they are:
  1) a raw blinker, for comparison
  2) a blinker that is composited using paint_with_alpha(0.99)
  3) a blinker that is composited using paint_with_alpha(1)
  4) a blinker that is packed inside a secondary window, and then that secondary window is composited using paint_with_alpha(0.99)
  5) like (4), but using paint_with_alpha(1).

(1)-(4) all display perfectly; (5) simply shows up as a black square.  From the fact that triggering this bug requires both alpha=1 *and* putting the window that is being drawn on as a child of the window being composited, I suspect that Cairo's non-Render pathway is not setting IncludeInferiors correctly.

Some further evidence for this is that running through xtrace, the relevant GC being passed to CopyArea is created as
  CreateGC cid=0x03e0003c drawable=0x03e00003 values={graphics-exposures=false(0x00)}
and has its clip mask tweaked from time to time, but I see no mention of IncludeInferiors.  The calls to RenderCreatePicture, on the other hand, do mention IncludeInferiors.
Comment 1 Nathaniel Smith 2007-10-29 20:21:39 UTC
Created attachment 12252 [details]
test case -- all squares should blink red/yellow

Here's the test case.  Just run with 'python test_cairo_composite.py'.

(Forgot to mention before -- all tests were run with debian's libcairo2 version 1.4.10-1+b2.)
Comment 2 Nathaniel Smith 2007-11-12 22:13:39 UTC
Updating title to make bug clearer.  If one looks at the cairo source, one will notice that there are two render contexts, one used for sources and one for destinations.  The source context has IncludeInferiors set, the destination context does not.  So everything works as expected when using Render.

However, some cases get optimized down to XCopyArea, and there is only one GC that is used for both source and destination surfaces, and this GC does not have IncludeInferiors set.  Therefore, whether IncludeInferiors is used depends entirely on how much optimization cairo thinks it can get away with.  Obviously, this is broken.

One fix would be to disable the XCopyArea fallback.  I don't know what the speed  implications of this would be.  I'd *hope* the server performed identity-transform blits equally fast no matter whether they were requested via Render or XCopyArea, and Render certainly does *have* code to notice this case and optimize it, but I don't actually *know* that it is equally fast.  

The other obvious fix would be to split up cairo's GCs just like the Render contexts are split up.  I had a try at this, and it got ugly; there is lots of code that touches the gc and is shared between the paths for source and dest surfaces, so you end up starting to make two copies of lots of other functions, bleh.  Maybe there is a simpler way for someone with more local knowledge of the codebase, who would have some confidence to refactor instead of just blindly copying.

The former approach would certainly be simpler, and involve removing code rather than adding it; inasmuch as Correctness > Speed, maybe it should simply be adopted.

Also to note, a workaround is to use push_group/pop_group_to_surface; compositing to the temporary surface works (who knows why), and it's handy for double-buffering anyway.
Comment 3 Nathaniel Smith 2007-11-13 00:22:51 UTC
Created attachment 12498 [details]
xlib-IncludeInferiors.c -- cairo test suite test -- untested

Well, here's a potential test case -- I'm not actually sure how to run or even build this, so... I haven't.  I have not even proven that it works.
Comment 4 Chris Wilson 2009-08-26 12:48:51 UTC
commit 40aefac5d714bf7536632ed38c7a8ee05049f30b
Author: Benjamin Otte <otte@gnome.org>
Date:   Wed Aug 26 21:22:07 2009 +0200

    [xlib] DO_XCOPYAREA and DO_XTILE optimizations break with Window source

    Cairo should include the contents of subwindows when using a Window as a
    source but will clip to subwindows when using a Window as a destination.
    This can be set using the GC's subwindow_mode.

    XCopyArea and XFillRectangle can however only use one GC for both source
    and destination. Cairo's mode is set to (the default) ClipByChildren.
    This means that copying from a Window is broken, so we only allow the
    optimization when we know that the source is a Pixmap.

    The performance impact of this change has not been tested. It should be
    small, as the code will use XRender otherwise.

    If it turns out to be a bigger impact, the optimizations could be
    improved by doing a two-step copy process:
    1) Copy to an intermediate Pixmap with IncludeInferiors
    2) Copy to the destination with ClipByChildren
    (potentially omitting one one of the steps if source or destination are
    known to be Pixmaps).

    commit 0c5d28a4e5ce5e4dd72c0f416ce5e960e92b808b

Likely places where it will break are old and slow machines, which were the motivation for this fast path... Let's see how many of those still exist.
Comment 5 Eric Piel 2009-12-07 06:53:39 UTC
It seems this commit breaks firefox (crashes when closing tabs), as mentioned in:
Comment 6 Chris Wilson 2009-12-07 13:14:00 UTC
(In reply to comment #5)
> It seems this commit breaks firefox (crashes when closing tabs), as mentioned
> in:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=551570

If you xtrace the firefox crash, you will see that it is a mozilla bug.