Bug 43397 - EXTEND_NONE is used instead of EXTEND_PAD when src sample area is entirely within the surface
Summary: EXTEND_NONE is used instead of EXTEND_PAD when src sample area is entirely wi...
Status: RESOLVED MOVED
Alias: None
Product: cairo
Classification: Unclassified
Component: general (show other bugs)
Version: 1.10.3
Hardware: x86 (IA32) Linux (All)
: medium normal
Assignee: Carl Worth
QA Contact: cairo-bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 35579
  Show dependency treegraph
 
Reported: 2011-11-30 15:47 UTC by Konstantin Svist
Modified: 2018-08-25 13:57 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Work-in-progress patch for X Server (4.03 KB, patch)
2012-03-23 01:41 UTC, Andrea Canciani
Details | Splinter Review

Description Konstantin Svist 2011-11-30 15:47:39 UTC
Please see bug at https://bugzilla.mozilla.org/show_bug.cgi?id=679257, especially comment 5
Comment 1 Chris Wilson 2011-12-01 04:58:36 UTC
Fix the EXA/driver as EXTEND_NONE is the default surface extend mode and so you will run across this bug anyway. The driver is choosing to fallback on an operation it can handle..
Comment 2 Konstantin Svist 2011-12-01 09:44:39 UTC
Chris, please comment on bug 35579, last I checked, they were convinced it's not a driver bug
Comment 3 Konstantin Svist 2011-12-05 20:42:39 UTC
Reopening until someone accepts the bug. The browser behavior is wrong - a.k.a. a bug. If you believe this isn't Cairo bug, please convince firefox or driver guys that it's theirs. It can't be nobody's bug!
Comment 4 Chris Wilson 2011-12-06 01:32:21 UTC
It is a driver bug, pure and simple. Bug 35579 concurs with that assessment, it just requires a motivated individual to first fix EXA to pass sufficient information to the driver for it to be able to detect that it can handle this efficiently, or to fix EXA to provide an alternate "fall-forward" path to transform the source into something the driver can handle.
Comment 5 Michel Dänzer 2011-12-06 06:47:29 UTC
That makes little sense to me. Cairo incurs overhead to disobey the application's intent (why?), and the X server is supposed to incur even more overhead to undo that again?

Changing RepeatNone to RepeatPad when possible might be a nice optimization in EXA for apps which don't know to use the latter, but there seems little doubt that this is a bug in Cairo. If you're refusing to fix it, the appropriate resolution would be WONTFIX.
Comment 6 Benjamin Otte 2011-12-06 07:11:43 UTC
I understand the view that it's not a Cairo bug, but I think it's possible to optimize it in Cairo in a way that makes more people happy. But I also think that it's EXA's or the driver's job to fix it - potentially in the same way.

The way I see it, the problem goes like this:
1) Do an operation where the EXTEND doesn't matter
2) Select any EXTEND (most likely randomly)
3) Performance depends on what you pick in (2)
The expected result should be:
3) Performance is identical (and optimal), no matter what EXTEND you pick.

Cairo has always taken the view that it's not the application author's job to know about performance implications and it's Cairo's job to always pick the fastest way to do something.
In fact, I would consider it preferable if all EXTEND modes were equally slow to NONE being fast and PAD being slow, because then applications would get into the business of adding extra code to compute a "proper" EXTEND when they shouldn't.

The reason why I assume Chris closed this bug is for 2 reasons:
1) Cairo overloads EXTEND_NONE with "EXTEND doesn't matter".
2) Historically, EXTEND_NONE was the fastest extend with XRender.

I think problem (1) is something that should be fixed inside Cairo (and might be already, I haven't looked at recent changes), either by introducing an internal EXTEND_ANY value that the backends transform to their fastest extend mode on demand. Or by having code in the backends that can check if extend matters for a certain information and then can pick the per-backend/operation fastest extend mode.

However, problem (2) is not something Cairo can fix. Because if Cairo solved problem (1), it would pick EXTEND_NONE, because that's the default extend to use with XRender. So in that sense it is a bug in X. And I'm not sure if the correct fix is to change the X specification to say "the fastest mode is PAD" (because then you will have to guarantee that in the future), to add an EXTEND_ANY to the XRender specification or to make X detect that the mode doesn't matter. I would clearly vote for the detection, but I'll leave that for you to figure out in your bug.

But considering that we do 2 things correctly:
1) Have the same performance characteristics for any EXTEND when it doesn't matter
2) Pass the desired mode for EXTEND_ANY
I don't think it's a bug in Cairo.
Comment 7 Karl Tomlinson 2011-12-06 14:18:35 UTC
It looks like the relevant code was added in the commit
http://cgit.freedesktop.org/cairo/commit/?id=c0af8c70635d641fc5770afc0cd6e9285122fd72
but it is not clear to me from the comment exactly which situations this would optimize.

It seems that RepeatNone is the default because early versions (before RENDER_MINOR 10) didn't have RepeatPad.  Perhaps this is necessary for backward compatibility reasons, but I can't think of a reason for anyone choosing to use RepeatNone.
Comment 8 Konstantin Svist 2011-12-09 11:38:58 UTC
please keep discussing
Comment 9 Alex Deucher 2011-12-09 12:13:15 UTC
Most 3D hardware has to jump through a lot more hoops to implement render semantics for REPEAT_NONE compared to REPEAT_PAD.
Comment 10 Benjamin Otte 2011-12-09 12:27:32 UTC
There might be 3D hardware that has to jump through a lot more hoops to implement render semantics for REPEAT_PAD compared to REPEAT_NONE.
I already pointed out that it's the job of render to figure out a feature that makes Cairo say EXTEND_ANY or fix it their way. Which is the crux for why this is not a bug in Cairo. Cairo cannot know which is the extend mode it should use if the extend mode doesn't matter.

And the Xrender spec is very clear that the default value is None. See http://www.x.org/releases/current/doc/renderproto/renderproto.txt for details.

I can understand that it makes your life as an X server or driver developer more complicated, but afaics there is nothing we can do on the Cairo side without potentially breaking things in competing implementations.

So this is not a bug in Cairo, Cairo is doing the right thing. It's a bug either in the X server failing to implement an obvious optimization or a bug in the render protocol.

Or do you really expect us to ignore the X specifications so that we make the life of a single X server easy that is incapable of optimizing a simple case? Especially because there are libraries competing with Cairo that will run into a similar issue...

So I'm going to again close this as NOTOURBUG. Feel free to reopen it when you assign it to the X server or somewhere else.
Or feel free to propose a solution inside Cairo that does not break backwards compatibility with systems where NONE is faster than PAD.
Comment 11 Michel Dänzer 2011-12-13 08:59:18 UTC
(In reply to comment #10)
> There might be 3D hardware that has to jump through a lot more hoops to
> implement render semantics for REPEAT_PAD compared to REPEAT_NONE.

Do you have any evidence to substantiate that?


> I already pointed out that it's the job of render to figure out a feature that
> makes Cairo say EXTEND_ANY or fix it their way. Which is the crux for why this
> is not a bug in Cairo. Cairo cannot know which is the extend mode it should use
> if the extend mode doesn't matter.

But this bug is isn't about EXTEND_ANY. It's about the app asking for EXTEND_PAD but Cairo using something else.


> And the Xrender spec is very clear that the default value is None. See
> http://www.x.org/releases/current/doc/renderproto/renderproto.txt for details.

Yes, this is one unfortunate result of the RENDER extension being designed with little consideration for how 3D hardware works.


> I can understand that it makes your life as an X server or driver developer
> more complicated,

As I pointed out in comment #5, it also incurs unnecessary overhead, both in Cairo and in the X server.


> Or feel free to propose a solution inside Cairo that does not break backwards
> compatibility with systems where NONE is faster than PAD.

Such systems must be using old / buggy X servers / drivers, so it should be possible to only enable this overhead when it's really necessary.
Comment 12 Benjamin Otte 2011-12-13 10:43:14 UTC
(In reply to comment #11)
> But this bug is isn't about EXTEND_ANY. It's about the app asking for
> EXTEND_PAD but Cairo using something else.
> 
That is not true. This bug is about the extend mode not mattering and Cairo selecting one that you don't like.

We disobey applications a lot if we know that what they want to do is stupid and we can make it better.
Comment 13 Karl Tomlinson 2011-12-13 11:19:40 UTC
(In reply to comment #12)
> We disobey applications a lot if we know that what they want to do is stupid
> and we can make it better.

I think the point is that it seems that cairo is trying to make things better but in fact making things much worse.  If it can't be sure of making things better, it would be better to stay out of the way.
Comment 14 Benjamin Otte 2011-12-13 11:38:49 UTC
No, Cairo is not making things worse (I pointed that out in detail in comment 10). Cairo is doing exactly what you'd expect from an optimized software.

The X server is making things worse by actually not adhering to the spec.
Comment 15 Bill Spitzak 2011-12-13 12:45:11 UTC
Nonsense, Cairo is definitely doing the wrong thing. It is changing EXTEND_PAD to EXTEND_NONE and getting the slower result on the backend.

If the backend is expected to detect that the extend does not matter (so it can use EXTEND_PAD even if EXTEND_NONE is specified) then there is still no reason for Cairo to change it, because the backend can do this detection no matter what extend mode is asked for!

The addition of EXTEND_ANY, and fixing Cairo to use this instead of EXTEND_NONE for "the extend does not matter", would fix this bug. But that is not in there now so Cairo certainly does have a bug. You can't claim that something will fix a problem and then claim there is no problem.

Furthermore if you think any graphics card made in the last 10 years does EXTEND_NONE faster than EXTEND_PAD then you are incredibly ignorant of how graphics works. EXTEND_NONE is *useless* for texture mapping on 3D and therefore is not implemented in hardware in most cases. It is useless because if the texture's edge lines up with the polygon edge then it would produce double-premultiplied results that produce black borders on the objects.

In face EXTEND_NONE is so useless that I would very much like to see Cairo fixed so it instead means "draw as though EXTEND_PAD (or reflections) is on, but temporarily intersect the mask with a quadrilateral that is the boundary of the source surface". This is in fact what users of Cairo expect and want and are currently doing by turning on EXTEND_PAD and manually masking. This would mean that all use of EXTEND_NONE by the backend would be removed and backends would never have to implement it at all.

Please don't let a mistake in the design of XRender propagate forever as a serious defect in Cairo.
Comment 16 Benjamin Otte 2011-12-13 15:37:40 UTC
(From comment #10)
> And the Xrender spec is very clear that the default value is None. See
> http://www.x.org/releases/current/doc/renderproto/renderproto.txt for details.
> 
Again: Cairo does the right thing by using NONE. The library it is using explicitly says "if extend doesn't matter, use NONE".
Comment 17 Konstantin Svist 2011-12-13 15:43:19 UTC
Can you add one that would be WHATEVER_IS_FASTEST?
Comment 18 Bill Spitzak 2011-12-13 18:14:57 UTC
I believe the proposed EXTEND_ANY is the same as "WHATEVER_IS_FASTEST".

Like several others, I do believe Cairo is doing the wrong thing. You are assuming that EXTEND_NONE is fastest, which is false for everything except XRender.

This means that whatever work Cairo did to figure out that "extend does not matter" is wasted, because the backend has to re-run these tests whenever it gets EXTEND_NONE in order to realize it can switch to a faster one.

Adding EXTEND_ANY will fix this, because the information that the front end figured out that extend did not matter is preserved.

But it might make a lot more sense to leave extend unchanged and have the backends figure out if it matters. In many cases (especially when EXTEND_PAD is used on modern 3D hardware) it is actually pointless to do the test, because it cannot be made faster by switching in any way, so the time spent doing the test is wasted. Only a backend can know this answer.
Comment 19 Konstantin Svist 2011-12-20 14:18:20 UTC
So, what's the verdict?
Comment 20 Konstantin Svist 2012-01-11 17:10:55 UTC
*ping*
Comment 21 Andrea Canciani 2012-01-12 01:31:27 UTC
Both Cairo and Xlib use NONE as the default extend mode.
This might be considered unfortunate if current hardware is unable to accelerate it, but it's unlikely to change, as it would break a lot of existing software.

On the good side, Cairo normalizes repeats and guarantees that if you get REPEAT_PAD, REPEAT_REFLECT or REPEAT_NORMAL, you will need to do that.
Conversely, when a Cairo extend mode results in Xlib's REPEAT_NONE, it can actually mean:
 - REPEAT_NONE is actually needed
 - REPEAT_ANY cannot be represented in Xlib, hence use the default (REPEAT_NONE)

My opinion is that it would be a good idea if drivers were able to automatically detect if they can or cannot accelerate the operation. This would be beneficial both for Cairo and for other clients that use Xlib directly the default mode.

How about adding a is_repeat_any() utility function for drivers to call so that they can detect if the repeat mode affects the result (and if not they just use whatever repeat mode is the fastest one)?
I cannot see major cons and it seems like it could be beneficial to most driver clients.
Comment 22 Andrea Canciani 2012-01-12 01:54:55 UTC
While having a look at the ati drivers I found that they should already be optimizing it for the simple case where transform is an identity matrix.
AFAICT (recent versions of) Cairo should be providing an identity transform whenever possible.
Can you xtrace the RENDER requests and check this?
Comment 23 Karl Tomlinson 2012-01-12 11:22:29 UTC
Cairo wouldn't be able to provide an identity transformation in this situation.  The source surfaces are scaled and rotated.
Comment 24 Bill Spitzak 2012-01-12 13:07:20 UTC
It seems wasteful to have another call to distinguish the "any" case from "none", as Cairo must have already done the work to figure this out. Unless cairo caches this result it will have to run the calculation again.

Would prefer that a new symbol be added that indicates "any" repeat mode is ok, since it is apparent that Cairo figured this out just so it could send "none".

So rather than a new call I would vastly prefer just an extra enumeration value.
Comment 25 Andrea Canciani 2012-01-13 00:15:41 UTC
(In reply to comment #24)
> It seems wasteful to have another call to distinguish the "any" case from
> "none", as Cairo must have already done the work to figure this out. Unless
> cairo caches this result it will have to run the calculation again.
> 
> Would prefer that a new symbol be added that indicates "any" repeat mode is ok,
> since it is apparent that Cairo figured this out just so it could send "none".
> 
> So rather than a new call I would vastly prefer just an extra enumeration
> value.

That "new enumeration value" would not be within Cairo scope.
It would be a new RENDER protocol value *and* a new value for video drivers.

A much simpler change would be to leave the RENDER protocol intact and only add it internally to Xorg and the drivers (the computation is quite cheap, you just have to transform a rectangle and check if it's inside another one).
This would still require changes to all the drivers that make use of it (obviously) and it would probably require something like an EXA version bump or something like new caps values (to avoid breaking older driver which do not support the new enum value), too.

I believe that if the developers of the ati driver think that this change can provide major performance boost, it might be an interesting suggestion, yet this would  would better be discussed in xorg-devel.
Are they following this discussion or should they be pinged about it?
Comment 26 Bill Spitzak 2012-01-13 13:42:30 UTC
I like the proposed simpler change. However I think this should mean that Cairo should *never* change the extend mode, it should send exactly the same one selected by the cairo commands to the back end. Putting it in the backend only will make cairo faster. No matter how inexpensive the test is, saving one or two runs of it is still going to be faster!

1. No time is wasted by Cairo running the test, since the backend has to do it anyway (for the vast majority of back ends where NONE is in fact the slowest mode).

2. The backend can skip the test if the current extend mode is already the one it would use.

Preserving the extend mode can also be used by software to get around bugs in the backends by making sure the extend is one that works.
Comment 27 Benjamin Otte 2012-01-13 13:55:00 UTC
(In reply to comment #26)
> Preserving the extend mode can also be used by software to get around bugs in
> the backends by making sure the extend is one that works.
>
And this is the biggest reason why Cairo should change the extend mode. But I've pointed that out repeatedly before, but you either fail to understand what I'm saying or ignore it.
Comment 28 Konstantin Svist 2012-01-20 16:41:11 UTC
So, just to make sure I understand:
Cairo is explicitly disobeying the call's EXTEND_PAD (as per Comment 15) and replaces it with EXTEND_NONE in this piece of code:

if (attr->extend != CAIRO_EXTEND_REPEAT) {
 ...
} else {
 if (sampled_area.x >= extents.x &&
 sampled_area.y >= extents.y &&
 sampled_area.x + (int) sampled_area.width <= extents.x + (int) extents.width &&
 sampled_area.y + (int) sampled_area.height <= extents.y + (int) extents.height)
 {
 /* source is wholly contained within extents, drop the REPEAT */
 extents = sampled_area;
 attr->extend = CAIRO_EXTEND_NONE;
 }
...


Wouldn't it make more sense to simply check for EXTEND_PAD in this code and skip conversion to EXTEND_NONE? Or maybe check whether or not XRENDER is being used and if not, don't touch the extend flag?
Comment 29 Andrea Canciani 2012-01-21 03:21:56 UTC
(In reply to comment #28)
> So, just to make sure I understand:

I'm afraid you missed a couple of points.

> Cairo is explicitly disobeying the call's EXTEND_PAD (as per Comment 15) and
> replaces it with EXTEND_NONE in this piece of code:

Cairo is not disobeying anything. It accepted a CAIRO_EXTEND_PAD and decided to internally replace with CAIRO_EXTEND_NONE, because they are equivalent and CAIRO_EXTEND_NONE has some desirable properties.
Please notice that this change is *NOT* observable from outside Cairo in any way.
Cairo could have legitimately replaced it with CAIRO_PRIVATE_EXTEND_ANY and everything would still be ok from the Cairo point of view (i.e. the result is correct, the backends have the information needed to perform any appropriate optimization).
It happens that (the non-existing) CAIRO_PRIVATE_EXTEND_ANY and CAIRO_EXTEND_NONE both map to RepeatNone in the xlib backend.
In the first case this happens because, since any of Repeat{None,Pad,Reflect,Normal} would produce the same result, Cairo goes for the default one, hoping/expecting that defaults are optimized as much as possible in the underlying layers.
In the second case Cairo *must* use RepeatNone, because it believes that otherwise the result will be different.

> 
> if (attr->extend != CAIRO_EXTEND_REPEAT) {
>  ...
> } else {
>  if (sampled_area.x >= extents.x &&
>  sampled_area.y >= extents.y &&
>  sampled_area.x + (int) sampled_area.width <= extents.x + (int) extents.width
> &&
>  sampled_area.y + (int) sampled_area.height <= extents.y + (int)
> extents.height)
>  {
>  /* source is wholly contained within extents, drop the REPEAT */
>  extents = sampled_area;
>  attr->extend = CAIRO_EXTEND_NONE;
>  }
> ...
> 
> 
> Wouldn't it make more sense to simply check for EXTEND_PAD in this code and
> skip conversion to EXTEND_NONE?

No, that is not the issue. Cairo knows that EXTEND_PAD, EXTEND_NONE, anything have the same result. It is just guessing that the default will be fast.
Unfortunately this is not the case for that specific driver, but might very well be in many other cases.
Can you guarantee that no driver will have regressions if it gets PAD instead of NONE?
Optimizing the default case looks quite important, so I would definitely try to do that, instead of trying to avoid the default.

> Or maybe check whether or not XRENDER is being used and if not, don't touch the extend flag?

In this case XRENDER is being used, so I guess you would have to suggest the opposite to change Cairo behavior, i.e.:
Or maybe check whether or not XRENDER is being used and if *it is*, don't touch the extend flag?

I believe that most of this misunderstanding is due to the similarity in names (and ease of mapping) between Cairo extend modes and Xlib repeat modes.
Please remember that they are not the same thing.
(This is not that crazy, it might really happen. Some time ago I remember that there was some discussion about allowing different vertical/horizontal extend modes)

Shall I ping the xorg-devel list/bug 35579/anybody asking if detecting the "ANY" extend mode in X is acceptable?
Comment 30 Bill Spitzak 2012-01-22 09:09:06 UTC
I, and I believe everybody else arguing with you, FULLY understand exactly what you are saying.

> No, that is not the issue. Cairo knows that EXTEND_PAD, EXTEND_NONE, anything have the same result. It is just guessing that the default will be fast. Unfortunately this is not the case for that specific driver, but might very well be in many other cases.

I can guarantee that EXTEND_NONE is the *SLOWEST* for any driver that uses a modern GPU to sample the image, since there is no incentive to accelerate that mode on the GPU hardware (for the simple reason that it is *useless* for graphics due to double-multiplying of the edges of texture-mapped shapes).

In C code I have found that EXTEND_PAD is much easier to implement than EXTEND_NONE, especially for correct antialiasing around the edges. Nuke actually implements EXTEND_NONE by enlarging the image to add a zero pixel all around the edge and then using EXTEND_PAD on this. Though this is due to Nuke using more complex larger sampling filters that make EXTEND_NONE very difficult to implement in other ways, since Cairo is limiting itself to very small filters it may have other methods.

Thus I believe EXTEND_NONE is the slowest for *EVERY* driver other than the XRender one!

> Can you guarantee that no driver will have regressions if it gets PAD instead
of NONE?

Yes. The driver must be able to correctly clip an EXTEND_PAD image that is *partially* clipped, otherwise the driver is broken with current Cairo anyway. I am willing to bet there is no driver that correctly implements partial-clip of an EXTEND_PAD image that does not correctly render an image where all of the extend is clipped.

> Optimizing the default case looks quite important, so I would definitely try to do that, instead of trying to avoid the default.

Yes, and as has been stated several times, there are three methods:

1. What we have now: Cairo wastes time on a clipping test to turn the mode to EXTEND_NONE. The driver is then forced to *REPEAT* that test in order to detect if this was the reason it got EXTEND_NONE, and change the mode to the fastest one in this case.

2. (what everybody expected): Cairo leaves the extend mode alone and sends it to the driver. If (and only if) the driver knows it could run faster with another mode, the driver then runs the clipping test and changes to that other mode. It can skip this test if the mode is already the fastest.

3. (the most common proposal here): Add EXTEND_ANY to the modes, modifying the XRender API to accept it as well. Cairo still runs it's clipping test and changes the mode to EXTEND_ANY. The driver can then change it to the fastest mode without re-running the clipping test. This is not as fast as #2 because the test is run even when the mode is already the fastest. However it is faster than #1 because the test is only run once. It also means the driver writers do not have to implement the test.
Comment 31 Benjamin Otte 2012-01-22 10:07:32 UTC
(In reply to comment #30)
> Thus I believe EXTEND_NONE is the slowest for *EVERY* driver other than the
> XRender one!
>
There is no other driver we are talking about here. So it looks like we're doing things right.

And you obviously are aware of things like http://cgit.freedesktop.org/cairo/commit/?id=a1d0a06b6275cac3974be84919993e187394fe43 and know that this means that either you or the commit message have no idea what they're saying.
 
> 1. What we have now: Cairo wastes time on a clipping test to turn the mode to
> EXTEND_NONE. The driver is then forced to *REPEAT* that test in order to detect
> if this was the reason it got EXTEND_NONE, and change the mode to the fastest
> one in this case.
> 
This is of course the only solution that works. But you either do not know that or are ignoring the comments from other people on purpose. (I stated the why in comment 6, that was 24 comments ago).

> 2. (what everybody expected): Cairo leaves the extend mode alone and sends it
> to the driver. If (and only if) the driver knows it could run faster with
> another mode, the driver then runs the clipping test and changes to that other
> mode. It can skip this test if the mode is already the fastest.
>
Just in case you didn't learn it 24 comments back, I'll quote for you:

> 1) Have the same performance characteristics for any EXTEND when it doesn't
> matter
 
> 3. (the most common proposal here): Add EXTEND_ANY to the modes, modifying the
> XRender API to accept it as well. Cairo still runs it's clipping test and
> changes the mode to EXTEND_ANY. The driver can then change it to the fastest
> mode without re-running the clipping test. This is not as fast as #2 because
> the test is run even when the mode is already the fastest. However it is faster
> than #1 because the test is only run once. It also means the driver writers do
> not have to implement the test.

This violates the same point as (2).
Comment 32 Bill Spitzak 2012-01-22 19:52:26 UTC
I don't know if you are being deliberately misleading, but comment #6 explicitly mentions the need for an "EXTEND_ANY" to fix this. I will quote it:

"I think problem (1) is something that should be fixed inside Cairo (and might
be already, I haven't looked at recent changes), either by introducing an
internal EXTEND_ANY value that the backends transform to their fastest extend
mode on demand. Or by having code in the backends that can check if extend
matters for a certain information and then can pick the per-backend/operation
fastest extend mode."

The first statement is exactly what I am proposing. The second statement is a repeat of the exact same problem everybody is complaining about: the complete useless test in Cairo when the driver has to repeat the test!

I do not believe this bug should be closed unless one of these happens:

1. Cairo stops testing for clipping and sends the clip unchanged to the backend (unlikely)

2. A new enumeration called EXTEND_ANY is added to the api between Cairo and the backend so that the backend can distinguish between EXTEND_NONE and "extend does not matter"

3. The backend can inform Cairo which mode is fastest and it chooses that instead of EXTEND_NONE when running the clipping test. I don't like this because the fastest mode may depend a lot on the current graphics state.

Any solution that requires the backend to run a clipping test that Cairo already ran is unacceptable.
Comment 33 Benjamin Otte 2012-01-22 22:28:56 UTC
(In reply to comment #32)
> I don't know if you are being deliberately misleading, but comment #6
> explicitly mentions the need for an "EXTEND_ANY" to fix this. I will quote it:
> 
You could have read all of the comment and would have found this gem:

> But considering that we do 2 things correctly:
> 1) Have the same performance characteristics for any EXTEND when it doesn't
> matter
> 2) Pass the desired mode for EXTEND_ANY
> I don't think it's a bug in Cairo.
>
We do the correct thing. We pass the extend mode that is fastest to Render.
And I even pointed that out conveniently in comment 10:

> And the Xrender spec is very clear that the default value is None. See
> http://www.x.org/releases/current/doc/renderproto/renderproto.txt for details.

So again, this is a bug in Render or the driver, not in Cairo. Cairo does absolutely the right thing with the information that it's given. It couldn't behave any better.
Comment 34 Andrea Canciani 2012-01-23 04:04:50 UTC
Sorry for not replying to every comment, but I've been busy trying to figure out what goes on within EXA.

Fun fact: the X server is already trying to detect if the repeat modes matters:

commit ff258ac2783203ed2a7698894d951391d1aecebc
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Thu Oct 6 23:45:29 2005 +0000

    Clients tend to set picture->repeat when not necessary. Most HW cannot
        accelerate repeat NPOT thus triggering software fallback (this is the
        case with gnome desktop for example). This adds a simple optimisation
        to exa that removes "repeat" when it's obviously useless, that is, the
        single picture instance covers the entire rectangle beeing used

If it does not matter, it sets the "repeat" field to 0, which implies that the repeatType is RepeatNone.
(To me, this looks like a further reason to believe that the issue is not in Cairo)

The test could be extended to also allow for non-identity transforms, which is the case we're tackling.
I am now working on a patch which detects if "any" repeat mode is ok and in this case it tries to perform the compositing with every available repeat mode until one succeeds.

For this, I would be happy to have a backtrace showing the stack when the driver decides to fallback.
The reason for this is that I'm trying to only fix exaComposite, leaving exaCompositeRects alone.
So far exaComposite has been optimized to avoid fallbacks as much as possible, while exaCompositeRects has been kept simple, but if the performance issue you're getting comes from exaCompositeRects, fixing exaCompoosite won't be sufficient.
Comment 35 Michel Dänzer 2012-01-23 06:24:29 UTC
(In reply to comment #29)
> Please notice that this change is *NOT* observable from outside Cairo in any
> way.

Other than being orders of magnitude slower in some cases. ;)

> In the first case this happens because, since any of
> Repeat{None,Pad,Reflect,Normal} would produce the same result, Cairo goes for
> the default one, hoping/expecting that defaults are optimized as much as
> possible in the underlying layers.

Well, again, that's flawed in the case of RepeatNone. Its being default is an accident at best.


(In reply to comment #34)
> commit ff258ac2783203ed2a7698894d951391d1aecebc
> Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date:   Thu Oct 6 23:45:29 2005 +0000
> 
>     Clients tend to set picture->repeat when not necessary. Most HW cannot
>         accelerate repeat NPOT thus triggering software fallback (this is the
>         case with gnome desktop for example). This adds a simple optimisation
>         to exa that removes "repeat" when it's obviously useless, that is, the
>         single picture instance covers the entire rectangle beeing used
> 
> If it does not matter, it sets the "repeat" field to 0, which implies that the
> repeatType is RepeatNone.

Hmm, that code was intended to avoid RepeatNormal, which older hardware cannot handle with non-power-of-two sources. It should also try RepeatPad before RepeatNone.

It looks like there's still some code in exa_render.c that treats pSrc->repeat as RepeatNormal without checking pSrc->repeatType...


> I am now working on a patch which detects if "any" repeat mode is ok and in
> this case it tries to perform the compositing with every available repeat mode
> until one succeeds.

With transformed (and arguably also untransformed) sources, this test should only be performed if the driver seems unable to handle the specified repeat mode in the first place, to avoid adding more of the same overhead madness this bug report is about.


> The reason for this is that I'm trying to only fix exaComposite, leaving
> exaCompositeRects alone.

exaCompositeRects is only called from exaGlyphs, I don't think it's relevant for this.
Comment 36 Konstantin Svist 2012-03-22 13:35:46 UTC
No progress on this, sounds like?
Comment 37 Andrea Canciani 2012-03-23 01:41:06 UTC
Created attachment 58904 [details] [review]
Work-in-progress patch for X Server

Sorry, I started working on this, but I got stuck with many other things.
The attachment contains a patch that makes the X server try every possible combination when the repeat mode is detected as "any".

Please notice that the X server is already detecting if the repeat mode actually matters or not and normalizing it to RepeatNone even without my patch.

The code should be extended to detect more cases when the repeat mode does not matter and tested (especially with radeon drivers).
Comment 38 Siarhei Siamashka 2012-04-05 06:39:53 UTC
It is also difficult to SIMD optimize REPEAT_NONE in pixman (the backend for Cairo’s software rasteriser) in the cases when REPEAT_NONE and REPEAT_PAD are not equivalent. Checking whether src sample area is entirely within the surface (this is also done in pixman) is not very effective when BILINEAR scaling is used, because sampling a bit of pixels outside the source image is rather common. IMHO it makes sense to advise against using RepeatNone whenever it is possible.
Comment 39 Karl Tomlinson 2012-12-05 23:58:30 UTC
(In reply to comment #37)
> Please notice that the X server is already detecting if the repeat mode
> actually matters or not and normalizing it to RepeatNone even without my
> patch.

The case it currently detects is optimized by drivers (because there is no transform).
 
> The code should be extended to detect more cases when the repeat mode does
> not matter and tested (especially with radeon drivers).

Whether or not the X server / driver interface should know when it can turn a
slow operation into a fast operation, there is always the risk that the
server's detection algorithm won't cover all the cases in cairo's "doesn't matter" detection for its mis-optimization.
Comment 40 GitLab Migration User 2018-08-25 13:57:05 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/cairo/cairo/issues/284.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.