16820 – [EXA] Composition result in black for areas outside of source-surface bounds

Bug 16820 - [EXA] Composition result in black for areas outside of source-surface bounds

Summary: [EXA] Composition result in black for areas outside of source-surface bounds

Status:	RESOLVED FIXED

Alias:	None

Product:	xorg
Classification:	Unclassified
Component:	Driver/intel (show other bugs)
Version:	git
Hardware:	Other All

Importance:	medium normal
Assignee:	Carl Worth
QA Contact:	Xorg Project Team

URL:
Whiteboard:
Keywords:

Duplicates (1):	16843 (view as bug list)
Depends on:
Blocks:	intel-2.5
	Show dependency tree / graph

Reported:	2008-07-23 14:56 UTC by Clemens Eisserer
Modified:	2008-10-06 15:27 UTC (History)
CC List:	1 user (show)

See Also:
i915 platform:
i915 features:

Attachments
Sheard image with black pixels outside src-bounds (8.37 KB, image/png) 2008-07-23 14:56 UTC, Clemens Eisserer	no flags	Details
View All

Description Clemens Eisserer 2008-07-23 14:56:15 UTC

Created attachment 17843 [details]
Sheard image with black pixels outside src-bounds

Composition with RepeatNone on src and an opaque dest results in black for areas which fall outside the source-bounds, instead of leaving those areas untouched.

For non-shear transformations thats no problem at all, because the x/y/w/h parameters XRenderComposite takes act as simple clip, however for transformations with non-rectangular output its quite a problem.
The render-spec is a bit weird and inconsistent about this, however "RepeatNone" would in my opinion be most useful when treated as "all pixels outside src aren't touched".

For now its not really possible to draw a single (RepeatNone) shear-transformed image with XRender to non-black background.
In a discussion on the xorg-mailing-list the only idea mentioned to avoid this was to set a clip, however this also does not really work because its implementation and filter specific which pixels are touched or not.
Either the clip clips too much or leaves a thin black line.

Comment 1 Michel Dänzer 2008-07-24 00:26:09 UTC

Actually, I thought the conclusion of that discussion was that the problem was due to using the Src operator. Are you using a different operator now? If so, this could be a bug anywhere from the driver you're using to the pixman library, please start by attaching the full xorg.conf and Xorg.0.log files.

Comment 2 Clemens Eisserer 2008-07-24 05:46:46 UTC

Sorry I forgot to mention that I was using PictOpOver.
Indeed it seems to be a bug in the Intel-driver, when using XAA I get the result I expected.

By the way is it possible to tell XRender not to interpolate the edges of the source when a billinear filter is set, like it is done here: http://picasaweb.google.com/linuxhippy/Transformations/photo#5224020903104218962
The problem is that e.g. scaling a 3x10 picture with a 10x scale leaves 10 pixes left and right that are partially transparent.

Comment 3 Michel Dänzer 2008-07-24 07:11:26 UTC

(In reply to comment #2)
> Indeed it seems to be a bug in the Intel-driver, when using XAA I get the
> result I expected.

Reassigning.

Out of curiosity, does it also work correctly with EXA and Option "ExaNoComposite" though?

Comment 4 WuNian 2008-07-25 01:06:09 UTC

*** Bug 16843 has been marked as a duplicate of this bug. ***

Comment 5 Gordon Jin 2008-07-29 18:19:59 UTC

the test case is on bug#16843

Comment 6 Clemens Eisserer 2008-07-30 02:49:46 UTC

Yes, works also correctly with EXA and "ExaNoComposite".

Comment 7 Carl Worth 2008-08-12 17:42:47 UTC

I do see the undesired result with EXA, and I can believe that XAA
(or EXA with ExaNoComposite) might be different.

But are you sure that it's the EXA result that is buggy? I can understand
why you would want the effect of "transparent" source pixels outside the
source extents, but there's no way to represent those with an RGB24
source. It could definitely be argued that it's the XAA result that is
buggy here.

One workaround for the test case to get the desired result is to use
PictStandardARGB32 instead of PictStandardRGB24, (and of course, create
the pixmap with a depth of 32 instead of 24).

Looking forward to more comments,

-Carl

Comment 8 Michel Dänzer 2008-08-13 00:42:07 UTC

(In reply to comment #7)
> But are you sure that it's the EXA result that is buggy? I can understand
> why you would want the effect of "transparent" source pixels outside the
> source extents, but there's no way to represent those with an RGB24
> source.

Unfortunately, that doesn't seem to matter. See bug 15334 for quite extensive discussion related to this.

Comment 9 Carl Worth 2008-08-13 09:45:28 UTC

(In reply to comment #8)
> (In reply to comment #7)
> > But are you sure that it's the EXA result that is buggy? I can understand
> > why you would want the effect of "transparent" source pixels outside the
> > source extents, but there's no way to represent those with an RGB24
> > source.
> 
> Unfortunately, that doesn't seem to matter. See bug 15334 for quite extensive
> discussion related to this.

The bug says that REPEAT_NONE should result in alpha=0 for samples outside the surface.

I agree with that for a source picture that has an alpha channel.

But what does REPEAT_NONE mean for RGB-only surfaces? Generally, an RGB-only surface is interpreted as having an implicit alpha channel with alpha=1.0 everywhere.

Or maybe what you're saying is that an RGB-only surface should have an implicit alpha channel with alpha=1.0 for the samples within the surface and alpha=0.0 for samples outside.

My point is that I don't think the Render specification is at all clear about this case. And unless we have something definitive to point at somewhere, it's hard to get pixman and all the drivers to agree on a single behavior. We can end up having different people insist that different behavior is correct, and it's not easy to say which behavior is buggy.

-Carl

Comment 10 Clemens Eisserer 2008-08-13 12:16:57 UTC

Isn't it usually the goal of driver developers to make their drivers match pixman's behaviour?

In this case I think two facts speak for the xaa result:

- pixman does it this way in recent versions, old versions do it the way the intel driver does it now. So some time ago this was "fixed" in pixman.
- pixman's result should be in general more useful. I know this is a weak argument, but after all, the driver isn't there to fullfill a spec but to run real software on it.

After all, the render spec does not specify many things which are implemented anyway.

Comment 11 Carl Worth 2008-08-13 14:25:09 UTC

On Wed, 2008-08-13 at 12:16 -0700, bugzilla-daemon@freedesktop.org
wrote:
> Isn't it usually the goal of driver developers to make their drivers match
> pixman's behaviour?

We need one set of behaviors that drivers and pixman can agree on, yes.

Whether that's what pixman actually does at any give point in time isn't
really the issue, I don't think.

> In this case I think two facts speak for the xaa result:
> 
> - pixman does it this way in recent versions, old versions do it the way the
> intel driver does it now. So some time ago this was "fixed" in pixman.

This is actually the problem though. Just because one implementation
changed doesn't mean the others should follow. That should also be
backed up with a consensus decision that the change is intended, and
that decision needs to be *written* down so that other implementors can
know to follow it.

> - pixman's result should be in general more useful. I know this is a weak
> argument, but after all, the driver isn't there to fullfill a spec but to run
> real software on it.

The "more useful" thing doesn't work. There's lots of crappy behavior
codified in the Render extension and things like cairo do a lot of work
to work around those. Now, it's useful to understand those, and add
*new* things to Render to address those, but it's not an excuse to break
established behavior.

And, no, the driver isn't there to "fulfill a spec" in and of itself.
But the specification *is* there to give us a single place to write down
agreed-upon decisions for how things behave.

> After all, the render spec does not specify many things which are implemented
> anyway.

There has not yet developed good process and precedent around getting
the Render specification updated. That's a process bug, and not an
argument for maintaining the status quo. But the xorg@ mailing list is
the right forum for discussing Render specification process issues---not
here in bugzilla where there are just two or three people talking.

Getting back to REPEAT_NONE and RGB24 images, what I would like to see
is clear text in the specification stating what the behavior should be.
If we can get that, and the Intel driver doesn't match it, then I'll
commit to trying to fix the driver.

Is that reasonable?

-Carl

Comment 12 Michel Dänzer 2008-08-14 00:11:41 UTC

(In reply to comment #9)
> My point is that I don't think the Render specification is at all clear about
> this case.

Owen Taylor didn't seem to have any doubts about the spec in the bug I referenced. If you do, you should probably bring it up on the xorg mailing list?

Comment 13 Clemens Eisserer 2008-09-09 09:15:57 UTC

Any update on this?

To be honest I don't feel qualified enough to bring it up on the xorg mailing list, so could you do that, Carl?

Comment 14 Carl Worth 2008-09-25 09:08:15 UTC

(In reply to comment #13)
> Any update on this?
> 
> To be honest I don't feel qualified enough to bring it up on the xorg mailing
> list, so could you do that, Carl?

Hi Clemens,

I've run into this same problem from a couple new angles since, so I'm now convinced that you are right about the desired behavior. I'll still want to follow up to make the Render specification more precise about this, but that's really a separate issue.

I'll dig into what it will take to fix this bug.

-Carl

Comment 15 Carl Worth 2008-09-30 16:50:57 UTC

(In reply to comment #14)
> I'll dig into what it will take to fix this bug.

Here's what I have so far:

For EXTEND_NONE the driver is setting the wrap_mode to BRW_TEXCOORDMODE_CLAMP_BORDER and setting the border_color,
(driver code calls it default_color), to RGBA (0, 0, 0, 0).
So this really should be giving the desired behavior as far
as I understand. I'll dig deeper to see if that alpha value
is being ignored for some reason.

-Carl

Comment 16 Carl Worth 2008-10-02 11:43:36 UTC

(In reply to comment #15)
> For EXTEND_NONE the driver is setting the wrap_mode to
> BRW_TEXCOORDMODE_CLAMP_BORDER and setting the border_color,
> (driver code calls it default_color), to RGBA (0, 0, 0, 0).
> So this really should be giving the desired behavior as far
> as I understand. I'll dig deeper to see if that alpha value
> is being ignored for some reason.

Indeed it is. Here's what the documentation says:

    For surface formats with one or more channels missing, the           
    value from the border color is not used for the missing              
    channels, resulting in these channels resulting in the               
    overall default value (0 for colors and 1 for alpha)                 
    regardless of whether border color is chosen.                        
                                                                              
    [Intel 965 PRM; Volume 4; Section 4.7.4 SAMPLER_BORDER_COLOR_STATE]

That's icky.

I guess we'll have to fall back to software rendering for EXTEND_NONE with an RGB-only source and a not-integer-translation-only matrix for now. And maybe later we can find out if there's some way we can actually get the hardware to give us the results we want.

A patch to come shortly...

-Carl

PS. The Render specification actually *does* specify that transparent is what we want here, and not black. It's in section 9. (I missed it originally because it uses terms of Transparent and Tile there instead of None and Repeat like it uses elswhere.)

Comment 17 Clemens Eisserer 2008-10-02 15:25:42 UTC

Thanks a lot for the investigation, good to know the spec is specific in this case.

Of course a fallback would be quite sad for such a frequent operation, maybe allocating a temporary argb32 surface could help here? 
Maybe with some re-using, so that allocation overhead is not too large for very small operations.

I had a few ideas for workarrounds but all other turned out to work only in certain cases (e.g. no filter, ...).

Thanks, Clemens

Comment 18 Carl Worth 2008-10-02 17:14:53 UTC

(In reply to comment #17)
> Of course a fallback would be quite sad for such a frequent operation, maybe
> allocating a temporary argb32 surface could help here?

That's probably most easily done at the application level, (though also, the application level is the place least likely to "know" of this device-specific performance problem).

> Maybe with some re-using, so that allocation overhead is not too large for very
> small operations.

And that kind of cacheing is probably more than we want to do in the driver.

The hardware's behavior isn't what anybody wants, (including OpenGL), so we'll look into various other drivers, (Mesa, etc.), for this hardware to see if someone has a high-performance approach here.

-Carl

PS. I said we'd have to fallback for any non-identity transformation, but it will actually be easy enough to check if the transformation simply doesn't require reading from any sample locations "outside" the original picture. That still won't help your case of the big shear, but might help in some other cases, (multiple-of-90-degree rotations, for example).

Comment 19 Carl Worth 2008-10-02 20:48:42 UTC

I've pushed out a fix for this now:

http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=76c9ece36e6400fd10f364ee330faea470e2da64

(In reply to comment #18) 
> PS. I said we'd have to fallback for any non-identity transformation, but it
> will actually be easy enough to check if the transformation simply doesn't
> require reading from any sample locations "outside" the original picture. That
> still won't help your case of the big shear, but might help in some other
> cases, (multiple-of-90-degree rotations, for example).

This part isn't done yet. We don't have the necessary information in the driver to do this. Instead, we can do this in the ReduceCompositeOp function up in the X server, (in render/picture.c). The current code there reduces Over to Source if the source picture has no alpha and there is an identity transformation. I'll fix that to also do the reduction if the affected rectangle when transformed lies entirely within the source picture.

-Carl

Comment 20 Carl Worth 2008-10-06 13:26:33 UTC

(In reply to comment #17)
> Of course a fallback would be quite sad for such a frequent operation, maybe
> allocating a temporary argb32 surface could help here? 

Hi Clemens,

I was able to learn that the hardware does indeed support the mode we want,
and it was as simple as setting the right bit in the right register.

So I've eliminated the fallback and set that bit now:

http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=260cbcfe61868175ba3e649ce07d43f57601f9be

Which should be much more pleasant for everybody.

-Carl

Comment 21 Clemens Eisserer 2008-10-06 14:11:54 UTC

Wow, very impressive, thanks a lot :)
Is this only supported on 965+, or also on i915 (or even earlier) hardware?

Comment 22 Clemens Eisserer 2008-10-06 14:16:19 UTC

in the case its not supported on older hardware, I've another idea for a work-arround:

A temporary, transparent mask could be used with the same transformation as set on source used together with RepeatPad. Maybe the scale applied to the mask could be modified, so this way the mask could be much smaller - maybe even 1x1 and allocated permanently.

Comment 23 Carl Worth 2008-10-06 14:21:29 UTC

(In reply to comment #21)
> Wow, very impressive, thanks a lot :)
> Is this only supported on 965+, or also on i915 (or even earlier) hardware?

The fix I made today is specific to 965 (and later) hardware.

But I also think the original bug was specific to the same hardware.

If not, and you're seeing a similar problem on 915, please let me know, (ideally with a new bug entry).

Thanks,

-Carl

Comment 24 Clemens Eisserer 2008-10-06 15:27:41 UTC

I should have mentioned that I see the problem on an i945GM when I filed the bug, this was really my fault, sorry.

I created a new bug-report:
https://bugs.freedesktop.org/show_bug.cgi?id=17933

I don't have an i830-class chip here to test on, would be great if you could check it out there too.

Thanks, Clemens

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.