Bug 78914 - [llvmpipe] Front/Backfaces do not get the same depth values when interpolated
Summary: [llvmpipe] Front/Backfaces do not get the same depth values when interpolated
Status: RESOLVED WONTFIX
Alias: None
Product: Mesa
Classification: Unclassified
Component: Other
Version: 10.1
Hardware: Other All
Importance: medium normal
Assignee: mesa-dev
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-05-19 14:47 UTC by Florian Link
Modified: 2015-01-08 21:31 UTC
0 users

See Also:
i915 platform:
i915 features:


Attachments
some screenshots showing the problem (52.57 KB, text/plain)
2014-05-20 08:09 UTC, Florian Link
apitrace of example (236.79 KB, text/plain)
2014-05-20 12:26 UTC, Florian Link
new api trace of example (237.41 KB, text/plain)
2014-05-26 08:20 UTC, Florian Link
Frame 8 screenshot done with glretrace (4.50 KB, text/plain)
2014-05-26 08:21 UTC, Florian Link
Frame 8 with glretrace (correct mime type) (4.50 KB, image/png)
2014-05-26 08:23 UTC, Florian Link

Description Florian Link 2014-05-19 14:47:30 UTC
When trying to run my GLSL raycaster with Mesa/llvmpipe, I noticed artifacts caused by wrong ray start/end positions. I render the ray start/end positions as boxes, where the front face gives the start position and the back face gives the end position. The problem seems to be that the front/back faces do not cover the exact same pixels in the framebuffer, even when they share the same edges.
On NVidia/ATI cards with native driver, the back faces cover the same pixels as the front faces.

This can be reproduced by rendering a triangle with culling turned off and blending turned on. When rotating the triangle one can see the artifacts on the borders of the triangle.
Comment 1 Roland Scheidegger 2014-05-19 22:39:09 UTC
So, in order to get front and backface tris, you draw essentially the same tri twice, but once you draw index 0,1,2 and once you draw 0,2,1? I could see this getting different results for interpolated attributes (in fact I know it will happen...). I am not actually sure it's guaranteed to get the same results, this is very tricky to get right (the reason is the interpolation / interpolation setup is not quite symmetric wrt all triangle corners, the float math can give different results). Though this should only affect interpolated attribute values, not rasterization itself (which happens with fixed point math). If it actually rasterizes different pixels this is a bug. Hence if you could provide some minimal test case that would be great.
This only affects llvmpipe right?
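
To illustrate the asymmetry mentioned above, here is a minimal sketch in C. It is not llvmpipe's actual setup code: the `Vert` struct and the `interp_depth` helper are hypothetical names, and the plane equation is simply anchored at whichever vertex is passed first. Passing the same three vertices in a different order changes the reference vertex and the rounding of the intermediate results, so two mathematically identical calls may disagree in the last bits.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vert;

/* Interpolate depth at (px, py) from the plane through a, b and c,
   using `a` as the reference vertex. All math is single precision. */
static float interp_depth(Vert a, Vert b, Vert c, float px, float py)
{
    float abx = b.x - a.x, aby = b.y - a.y, abz = b.z - a.z;
    float acx = c.x - a.x, acy = c.y - a.y, acz = c.z - a.z;

    float det = abx * acy - acx * aby;   /* twice the signed area */
    assert(det != 0.0f);                 /* degenerate triangles are not handled */

    float dzdx = (abz * acy - acz * aby) / det;   /* depth slope in x */
    float dzdy = (acz * abx - abz * acx) / det;   /* depth slope in y */

    return a.z + dzdx * (px - a.x) + dzdy * (py - a.y);
}
```

For example, `interp_depth(v0, v1, v2, px, py)` and `interp_depth(v2, v1, v0, px, py)` describe the same plane, but they need not return bit-identical floats, which is enough for an exact depth comparison between two passes to go either way.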
Comment 2 Florian Link 2014-05-20 08:09:30 UTC
Created attachment 99394 [details]
some screenshots showing the problem

I rendered the same cube two times, first culling the front faces, then culling the back faces. The backside has a red light, so that backside pixels are drawn red. Since the front side is painted over it, no red pixels should ever be seen. As you can see on the screenshots, there are red pixels on the edges, depending on the angle.
Comment 3 Florian Link 2014-05-20 08:15:47 UTC
So regarding your comment, back/front faces with shared edges seem to rasterize to different pixels (at least some of them). So it is not about the interpolated vertex attributes, but about the screen pixels.

I tested this with LLVM pipe and with softpipe and only LLVM pipe shows the artifacts (softpipe does not show any red pixels in my example setup).

The example scene was done using MeVisLab (www.mevislab.de) and the Mesa opengl32.dll with LLVM pipe, I can send you that MeVisLab scene (MeVisLab is free to install).


Or do you need a C++ example to reproduce this?
Comment 4 Roland Scheidegger 2014-05-20 10:37:00 UTC
I agree this looks like a rasterization problem. I'll look into it when I find some time. Right now I have no idea why that happens - there's the code which would rotate the vertices but this all happens with fixed point thus it should still produce the same fragments afaict. But there might be a bug in this logic somewhere.
The easier it is to reproduce the better. If you could get an apitrace that would be helpful.
Comment 5 Florian Link 2014-05-20 12:26:19 UTC
Created attachment 99406 [details]
apitrace of example

The rendering is in frame 6, 7 and 9. I don't know how to strip the MeVisLab network rendering from the other frames...
Comment 6 Roland Scheidegger 2014-05-20 17:00:09 UTC
Actually I think I know what's happening here. Our rasterizer can only handle counterclockwise triangles (so the sign of the edge function indicates whether a fragment is inside or outside a plane and ultimately the tri). To make this work, if backface culling is disabled we rotate a clockwise triangle to make it counterclockwise. So far so good. But for correct fill rules (to get cases right where fragment centers are on edges) we also fix up the plane coefficients, and we do this based on the sign of the steepness of the plane function. With front face culling instead of back face culling we exchange two vertices (so the actual winding of the tri did not really change, we just treat it as if it had), and because of that we get the fixup wrong.
Though this requires some further analysis, it is quite possible I made some mistake here, this is tricky to get right!
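
As a rough illustration of edge functions and fill rules in fixed point, here is a minimal sketch in C. It is not llvmpipe's implementation: the `FixedPt` type, the helper names, the 8 fractional bits, the y-down coordinate convention and the top-left rule are all assumptions made for the example. Because everything is integer math, the coverage decision is exact, and swapping two vertices to normalize the winding cannot change which pixels are hit.

```c
#include <stdint.h>

typedef struct { int32_t x, y; } FixedPt;   /* 24.8 fixed point, y pointing down */

/* Edge function: positive on the interior side of a->b for the winding used below. */
static int64_t edge(FixedPt a, FixedPt b, FixedPt p)
{
    return (int64_t)(b.x - a.x) * (p.y - a.y) - (int64_t)(b.y - a.y) * (p.x - a.x);
}

/* Top-left rule: an edge owns the samples lying exactly on it only if it is
   a left edge (going up) or a top edge (horizontal, going right). */
static int is_top_left(FixedPt a, FixedPt b)
{
    int32_t dx = b.x - a.x, dy = b.y - a.y;
    return dy < 0 || (dy == 0 && dx > 0);
}

static int edge_accepts(FixedPt a, FixedPt b, FixedPt p)
{
    int64_t e = edge(a, b, p);
    return e > 0 || (e == 0 && is_top_left(a, b));
}

/* Returns 1 if the center of pixel (px, py) is covered by triangle a, b, c. */
static int covers(FixedPt a, FixedPt b, FixedPt c, int px, int py)
{
    FixedPt p = { px * 256 + 128, py * 256 + 128 };   /* pixel center */
    int64_t area = edge(a, b, c);
    if (area == 0)
        return 0;                                     /* degenerate triangle */
    if (area < 0) {                                   /* normalize the winding */
        FixedPt t = b; b = c; c = t;
    }
    return edge_accepts(a, b, p) && edge_accepts(b, c, p) && edge_accepts(c, a, p);
}
```

With this scheme a quad split into two triangles covers every pixel center on the shared diagonal exactly once, and `covers(a, b, c, x, y)` equals `covers(c, b, a, x, y)` for every pixel, whichever winding is submitted.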
Comment 7 Roland Scheidegger 2014-05-20 19:50:51 UTC
Hmm actually that theory wasn't right. Must be something else.
Comment 8 Florian Link 2014-05-21 07:14:20 UTC
I noticed that the same problem happens if I always use GL_CULL_BACK and change the winding so that the back faces are front facing when I need the back faces. 

So it seems to depend on the winding of the triangles (which probably is related to how GL_CULL_BACK/FRONT are implemented...).
Comment 9 Roland Scheidegger 2014-05-24 01:57:05 UTC
When I play back the trace I don't see any such errors (doing snapshots 2 images were captured with cubes though both frames were identical).
Comment 10 Florian Link 2014-05-26 08:19:38 UTC
Sorry, seems that the trace I attached did not have the right angle.

I did another trace and used the 64bit glretrace with Mesa LLVMPipe (10.2 RC2) opengl32.dll (64Bit) copied to the apitrace-msvc\x64\bin. I then get two red pixels on the top left corner of the black cube, which are from the red backface.
I tried both release and debug Mesa dll, both show the problem.

I also did the test with an older Mesa LLVMPipe (10.0 dev), and it showed 3 red pixels instead of 2. Running it with softpipe shows no errors.

Will attach the trace in a sec.
Comment 11 Florian Link 2014-05-26 08:20:30 UTC
Created attachment 99841 [details]
new api trace of example

New trace with different camera angle.
Comment 12 Florian Link 2014-05-26 08:21:59 UTC
Created attachment 99842 [details]
Frame 8 screenshot done with glretrace
Comment 13 Florian Link 2014-05-26 08:23:42 UTC
Created attachment 99843 [details]
Frame 8 with glretrace (correct mime type)
Comment 14 Roland Scheidegger 2014-05-26 19:57:18 UTC
Ok I see the error now. Too bad the trace is a bit complex and trimming didn't work unfortunately (I bet that's got something to do with the multiple contexts and windows). I'll try to extract the necessary calls by hand...
Comment 15 Roland Scheidegger 2014-05-27 00:19:26 UTC
Ok turns out this is indeed an interpolation problem. Depth test is enabled, and for these failing fragments the depth test fails, probably due to interpolation precision issues. At least if I nuke depth testing it no longer shows this. This may or may not be a valid bug (not entirely sure if this is a legitimate case of floating point rounding errors, we could however almost certainly do a better job with interpolation), but in any case I'm afraid I can't fix this for now.
Rasterization itself is just fine.
Comment 16 Florian Link 2014-05-27 07:38:18 UTC
This is strange, since in my renderer I render all back faces to one FBO and all front faces to another FBO, so the front/back faces do not fight in the Z-buffer.

I really experience missing pixels on faces that share edges, in one FBO the front triangle edge has different pixels than the back triangle in the other FBO... 

But maybe your code does something different when depth testing is on?

I will try to adapt my example to show the problem without depth test.
Comment 17 Florian Link 2014-05-27 08:02:29 UTC
Ok, you are right, it only happens with depth test enabled.

The strange thing is that it creates these artifacts in my ray caster,
where I get exactly these holes but both front/back faces have the same
interpolated positions, so depth rejection should not create holes because another triangle should have been rendered to that pixel first.

Anyway, I think the depth test should still not generate these rejection pixels, since it will create problems in e.g. depth peeling and other algorithms as well.
Comment 18 Florian Link 2014-05-27 08:37:34 UTC
Ok, I can confirm that it is a depth fighting problem and found a fix for my ray caster. Thank you for your effort!

Still it would be good if LLVM pipe would do the same quality depth test as softpipe and NVidia/ATI do.
Comment 19 Roland Scheidegger 2014-05-27 11:38:05 UTC
Depth test as such is as accurate as it could be. Doing interpolation with as much precision as possible is not all that easy due to properties of floating point arithmetic. In particular, the order of vertices matters for the math. Reordering would be possible, though it still does not guarantee the same results for fragments along a shared edge (unless the tri shares all vertices, that is it's really the same tri with reordered edges).
But I agree doing better would be nice, I'm just not entirely sure what clever tricks need to be done to achieve this.
There's also a slight bug in the implementation I believe, the interpolation should be done with snapped (fixed point) coordinates, however we do the interpolation setup with float coordinates. I'm not sure though this would help here, but at least in contrast to other interpolation issues this one wouldn't be all that difficult to fix. Another issue is that if you have some attributes with large gradients on a somewhat small triangle, you can get huge errors the further the triangle is away from the framebuffer origin. So, interpolation is definitely not perfect.
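
The growth of the error with distance from the framebuffer origin can be sketched in C as follows. This is only an illustration under the assumption that the attribute is evaluated from a plane equation whose intercept is taken at the origin. The `Plane` struct and both helpers are hypothetical and do not mirror llvmpipe's code. With a large gradient and a vertex far from the origin, the intercept becomes a large number and the small attribute value is lost in its rounding.

```c
typedef struct {
    float x0, y0, z0;    /* reference vertex position and attribute value */
    float dzdx, dzdy;    /* attribute gradients per pixel */
} Plane;

/* Intercept computed at the framebuffer origin, then evaluated at (px, py). */
static float eval_origin_based(Plane p, float px, float py)
{
    float a0 = p.z0 - p.dzdx * p.x0 - p.dzdy * p.y0;   /* can be huge */
    return a0 + p.dzdx * px + p.dzdy * py;
}

/* Same plane, evaluated relative to the reference vertex. */
static float eval_vertex_relative(Plane p, float px, float py)
{
    return p.z0 + p.dzdx * (px - p.x0) + p.dzdy * (py - p.y0);
}
```

With the made-up values x0 = 4000.5, z0 = 0.3 and dzdx = 1000, evaluating at the vertex itself returns exactly 0.3 from the vertex-relative form. The origin-based form has to store an intercept near -4000499.7, where single-precision floats are spaced 0.25 apart, so under strict IEEE 754 single-precision evaluation it lands on 0.25 instead.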
Comment 20 Jose Fonseca 2014-11-24 20:30:08 UTC
> depth test as such is as accurate as it could be. Doing interpolation with as much precision as possible is not all that easy due to properties of floating point arithmetic. In particular for the math the order of vertices matter. Reordering would be possible, though still does not guarantee the same results for fragments along a shared edge (unless the tri shares all vertices, that is it's really the same tri with reordered edges).

I agree.  It might be possible to do better (barycentric interpolation might help), but it is complicated to ensure we get exactly the same depth values in all circumstances, particularly because llvmpipe wants to strike a good balance of performance/accuracy.

OpenGL application developers should make use of glPolygonOffset to prevent depth-fighting in a portable way.
Comment 21 Florian Link 2014-11-25 08:25:56 UTC
It is ok for me if you close this bug as WONTFIX, since I worked around the depth fighting and I agree that it is a hard problem to do this with high precision and speed at the same time.
Comment 22 Jose Fonseca 2014-11-25 16:48:11 UTC
OK!
Comment 23 Kurzemnieks 2015-01-08 21:02:40 UTC
I think I have run into the same problem.
Here is a screenshot with a simple double-sided cube whose backface seems to be drawn incorrectly on the edge. https://api1-ams2.monosnap.com/static/0kwtTWiNWJrcIXex8BcxpgoWCkxHGx.png
So what is the best workaround for this if it won't be fixed?
Comment 24 Jose Fonseca 2015-01-08 21:31:18 UTC
(In reply to Kurzemnieks from comment #23)
> I think I have run in the same problem. 
> Here is a screenshot with a simple double-sided cube whose backface seems to
> be drawn incorrectly on the edge.
> https://api1-ams2.monosnap.com/static/0kwtTWiNWJrcIXex8BcxpgoWCkxHGx.png

Link is dead.

Maybe try using Bugzilla's attachment feature -- might be good for future reference too.

> So what is the best workaround for this if it won't be fixed?

The "workaround" (although I'd merely call it "best practice") is to make adequate use of glPolygonOffset.

See for example http://www.zeuscmd.com/tutorials/opengl/15-PolygonOffset.php


In short, the whole idea is this: instead of assuming that two draw calls will produce _exactly_ the same depth values, allow a tolerance to accommodate small differences. glPolygonOffset lets you do that.
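
A minimal sketch in C of what that tolerance does to the depth comparison. It simulates the offset on the CPU rather than calling into GL, so it runs without a context. The helper names are hypothetical, and the value of `r` (the smallest resolvable depth difference) is implementation dependent. The formula itself, o = m * factor + r * units, is the one specified for glPolygonOffset.

```c
#include <assert.h>

/* Offset as defined for glPolygonOffset: o = m * factor + r * units,
   where m is the polygon's maximum depth slope and r is the smallest
   resolvable depth difference of the depth buffer. */
static float polygon_offset(float max_slope, float factor, float units, float r)
{
    assert(r > 0.0f);
    return max_slope * factor + r * units;
}

/* Depth comparison equivalent to glDepthFunc(GL_LEQUAL). */
static int depth_test_lequal(float incoming, float stored)
{
    return incoming <= stored;
}

/* In a real renderer the second pass would be wrapped roughly like this:
 *
 *     glEnable(GL_POLYGON_OFFSET_FILL);
 *     glPolygonOffset(-1.0f, -1.0f);   // values usually need tuning
 *     ... draw the coincident geometry ...
 *     glDisable(GL_POLYGON_OFFSET_FILL);
 */
```

Assuming r = 2^-24, a fragment whose interpolated depth ends up one step above the stored value fails `depth_test_lequal`, while the same fragment passes once an offset of `polygon_offset(0.0f, -1.0f, -2.0f, r)` is added to it.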

