25871 – nearest neighbour samples wrong texel (precision/rounding problem)


Status:	RESOLVED NOTABUG

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/Gallium/r600
Version:	git
Hardware:	Other All

Importance:	medium normal
Assignee:	Default DRI bug account

Reported:	2010-01-03 05:01 UTC by Pierre Ossman
Modified:	2014-04-16 15:15 UTC
CC List:	1 user

Attachments
fp-tri.c (10.04 KB, text/plain) 2010-01-04 13:48 UTC, Pierre Ossman
fragment program (67 bytes, application/octet-stream) 2010-01-04 13:49 UTC, Pierre Ossman
possible fix (3.39 KB, patch) 2011-02-28 18:56 UTC, Alex Deucher
captured shader (59.30 KB, image/png) 2012-09-11 17:12 UTC, Andreas Boll

Description Pierre Ossman 2010-01-03 05:01:35 UTC

The following fragment program does the wrong thing on r600/r700 hardware:

TEX result.color, 0.498046860099, texture[0], 2D;

(the texture coord is as far as I got before I got bored. I'm not sure the precision goes much further anyway)

The circumstances are that I have a 4x4 texture with nearest neighbour interpolation. The above code should be sampling texel 1,1 but is instead sampling texel 2,2. This causes problems for fragment programs that try to do things based on texel coords (like bicubic interpolation).

This seems to be a hw bug as I've tried it with fglrx (albeit with a slightly different card) and it exhibits the same bug. As such, I'm not sure it can be solved but hopefully there is a workaround.
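
For reference, the texel the report expects follows from the usual nearest-filtering rule, i = floor(u * size). The sketch below is illustrative only: `nearest_texel` is a hypothetical helper, not code from fp-tri.c or Mesa, and it ignores wrap modes.

```c
#include <assert.h>

/* Texel index that nearest filtering is expected to select for a
 * normalized coordinate u on a texture `size` texels wide:
 * i = floor(u * size), clamped to [0, size - 1].
 * Hypothetical helper for illustration; wrap modes are ignored. */
static int nearest_texel(double u, int size)
{
    assert(size > 0);
    if (u < 0.0)
        return 0;                 /* clamp below the first texel */
    int i = (int)(u * size);      /* truncation equals floor for u >= 0 */
    if (i > size - 1)
        i = size - 1;             /* clamp above the last texel */
    return i;
}
```

With `size = 4`, `nearest_texel(0.498046860099, 4)` gives 1, because 0.498046860099 * 4 is about 1.992. Texel 2 is only reached once `u >= 0.5`.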

Comment 1 Pierre Ossman 2010-01-04 13:48:51 UTC

Created attachment 32446
fp-tri.c

Test program to provoke the bug. This is a hacked version of fp-tri.c from mesa/progs/fp that I was using to debug a problem with a bicubic filtering fragment program, so it can be a bit messy. If you have trouble sorting it out, then I can try to clean it up and remove non-essential bits.

Comment 2 Pierre Ossman 2010-01-04 13:49:22 UTC

Created attachment 32447
fragment program

The fragment program in its full form.

Comment 3 Alex Deucher 2011-02-28 18:27:18 UTC

For point sampled textures, SQ_TEX_SAMPLER_WORD2_0.MC_COORD_TRUNCATE needs to be set to 1.  The default behavior for texture addressing is to round unless that bit is set.

Comment 4 Alex Deucher 2011-02-28 18:56:24 UTC

Created attachment 43951
possible fix

Does this patch help?

Comment 5 Andreas Boll 2012-09-11 11:32:01 UTC

(In reply to comment #4)
> Created attachment 43951
> possible fix
> 
> Does this patch help?

This patch has been committed as

commit 1dc204d145dc8c0b19473a7814c201a8954b6274
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Mon Feb 28 21:52:19 2011 -0500

    r600g: truncate point sampled texture coordinates
    
    By default the hardware rounds texcoords.  However,
    for point sampled textures, the expected behavior is
    to truncate.  When we have point sampled textures,
    set the truncate bit in the sampler.
    
    Should fix:
    https://bugs.freedesktop.org/show_bug.cgi?id=25871
    
    Signed-off-by: Alex Deucher <alexdeucher@gmail.com>

and reverted in

commit 72c6a748b9ffdaa893f82faf911f22a241a5e4f5
Author: Marek Olšák <maraeo@gmail.com>
Date:   Mon May 2 01:10:19 2011 +0200

    Revert "r600g: truncate point sampled texture coordinates"
    
    This reverts commit 1dc204d145dc8c0b19473a7814c201a8954b6274.
    
    MC_COORD_TRUNCATE is for MPEG and produces quite an interesting behavior
    on regular textures. Anyway that commit broke filtering in demos/cubemap.


Reassigning to r600g.

Is this still an issue with a newer Mesa (e.g. 8.0.4 or git master)?

Comment 6 Pierre Ossman 2012-09-11 13:51:20 UTC

I don't have any updated systems right now, so I can't really test. Test program should still be valid though.

Comment 7 Andreas Boll 2012-09-11 17:12:41 UTC

Created attachment 66984
captured shader

I have tested your modified fp-tri.c from attachment 32446
with r600g on my rv770.
I got the following output with mesa git master e81ee67b51651e99e7e8e52c1ccafc66835d57cd
and mesa 8.0.4:

$ ./fp-tri -fps foo2.arb
!!ARBfp1.0

TEX result.color, 0.498046860099, texture[0], 2D;

END

GL_RENDERER = Gallium 0.4 on AMD RV770
glGetError = 0x0
12850 frames in 5.0 seconds = 2570.000 FPS

Additionally, I have attached a screenshot of the shader output.

What is the expected behavior?

Comment 8 Alex Deucher 2012-09-11 17:23:43 UTC

might be fixed with this commit:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=f44bda17f515c411071ca8744ebd96039d9c583b

Comment 9 Andreas Boll 2012-09-12 12:34:21 UTC

(In reply to comment #8)
> might be fixed with this commit:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=f44bda17f515c411071ca8744ebd96039d9c583b


I can't see any difference between mesa git master and mesa 8.0.4.

What is the expected behavior?

Comment 10 Pierre Ossman 2012-09-12 13:03:19 UTC

(In reply to comment #7)
> 
> What is the expected behavior?

It's so long ago I don't remember the exact details. You can easily figure it out though. The test program creates a 4x4 checker board texture and then uses a single point on it to draw the triangles. The range [0.25,0.5[ should all be the same color. The bug was that as you approached 0.5, it started sampling the next pixel prematurely.

So change the fragment program to use coordinate 0.4 or something like that. That colour should be the same as 0.4999... . Looking at the code, white seems to be the correct colour.

Comment 11 Andreas Boll 2012-09-12 13:56:46 UTC

(In reply to comment #10)
> 
> It's so long ago I don't remember the exact details. You can easily figure it
> out though. The test program creates a 4x4 checker board texture and then uses
> a single point on it to draw the triangles. The range [0.25,0.5[ should all be
> the same color. The bug was that as you approached 0.5, it started sampling the
> next pixel prematurely.
> 
> So change the fragment program to use coordinate 0.4 or something like that.
> That colour should be the same as 0.4999... . Looking at the code, white seems
> to be the correct colour.

Ok I've got it.

With coordinate 0.4 I get a white color and with 0.4999 I get black.

Additionally I checked the other end of the range:
Within the range [0.2480468676,0.498046860099] the color is white.
If I change the fragment program to use coordinate 0.2480468675 then the color is black. With 0.4980468601 I also get black.

So the issue persists in git mesa master 9.1-devel (git-e81ee67)
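
The switch points reported above can be compared with the ideal texel boundaries at 0.25 and 0.5. The sketch below only does arithmetic on the numbers from this comment. The helper names are hypothetical, and it says nothing about what causes the shift in hardware.

```c
#include <assert.h>

/* How far below the ideal texel boundary an observed switch point lies,
 * in normalized texture coordinates. Hypothetical helper. */
static double boundary_offset(double ideal_boundary, double observed_switch)
{
    assert(observed_switch <= ideal_boundary);
    return ideal_boundary - observed_switch;
}

/* The same offset expressed in texels for a texture `size` texels wide. */
static double offset_in_texels(double offset, int size)
{
    assert(size > 0);
    return offset * size;
}
```

Both `boundary_offset(0.25, 0.2480468676)` and `boundary_offset(0.5, 0.498046860099)` come out at about 0.001953, which is close to 1/512 in normalized coordinates. For the 4-texel-wide test texture, `offset_in_texels(1.0 / 512, 4)` is 1/128 of a texel.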

Comment 12 Marek Olšák 2014-04-13 10:46:53 UTC

I think this is normal behavior and it's not a precision issue. The coordinates are rounded to 0.5, because the filter is NEAREST (and 0.5 is the nearest pixel). Closing as NOTABUG.

Comment 13 Pierre Ossman 2014-04-13 20:40:32 UTC

Hardly NOTABUG. There is no pixel at 0.5. The pixels are at 0.125, 0.375, 0.625 and 0.875. 0.49999 should get rounded to the 0.375 pixel. That's the closest, not the 0.625 one.
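
The centre positions quoted above follow from centre(i) = (i + 0.5) / size with size = 4. A minimal sketch of that arithmetic, using hypothetical helper names that are not taken from Mesa or the test program:

```c
#include <assert.h>

/* Centre of texel i in normalized coordinates: (i + 0.5) / size. */
static double texel_centre(int i, int size)
{
    assert(size > 0 && i >= 0 && i < size);
    return (i + 0.5) / size;
}

/* Index of the texel whose centre is closest to u (linear search).
 * Ties go to the higher index, which matches floor(u * size). */
static int closest_centre(double u, int size)
{
    int best = 0;
    double best_dist = -1.0;      /* negative means "no candidate yet" */
    for (int i = 0; i < size; i++) {
        double d = u - texel_centre(i, size);
        if (d < 0.0)
            d = -d;               /* absolute distance to this centre */
        if (best_dist < 0.0 || d <= best_dist) {
            best = i;
            best_dist = d;
        }
    }
    return best;
}
```

For a 4-texel-wide texture the centres are 0.125, 0.375, 0.625 and 0.875, and `closest_centre(0.49999, 4)` returns 1, the texel centred at 0.375.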

Comment 14 Marek Olšák 2014-04-13 22:39:15 UTC

Thanks for the feedback. So it is a precision issue after all. If fglrx exhibits the same behavior, there is nothing I can do. I don't know of any hardware state which controls precision of texture addressing.

Comment 15 Erik Faye-Lund 2014-04-16 08:13:54 UTC

(In reply to comment #14)
> Thanks for the feedback. So it is a precision issue after all. If fglrx
> exhibits the same behavior, there is nothing I can do. I don't know of any
> hardware state which controls precision of texture addressing.

AMD's Windows drivers, at least, seem to consistently perform nearest filtering with the round-off point off by a 512th of a texel. So I think this is somehow intended (even if really unfortunate for some non-trivial use cases).

Comment 16 Pierre Ossman 2014-04-16 15:15:08 UTC

(In reply to comment #14)
> Thanks for the feedback. So it is a precision issue after all. If fglrx
> exhibits the same behavior, there is nothing I can do. I don't know of any
> hardware state which controls precision of texture addressing.

I would guess it needs input from the AMD folks. So I guess getting their attention is one thing that could be done. Maybe a microcode update could solve it.

I also haven't tested this in ages. Might not be an issue for modern chips...
