Summary: | dEQP failures on llvmpipe | ||
---|---|---|---|
Product: | Mesa | Reporter: | Ilia Mirkin <imirkin> |
Component: | Drivers/Gallium/llvmpipe | Assignee: | mesa-dev |
Status: | RESOLVED MOVED | QA Contact: | mesa-dev |
Severity: | normal | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
list of llvmpipe-related deqp fails
list of llvmpipe-related deqp fails |
This was, by the way, the result of running mesa master with

LIBGL_ALWAYS_SOFTWARE=1 ./deqp-gles3 --deqp-visibility=hidden --deqp-caselist-file=<(grep -v 'mipmap_linear' ../../android/cts/master/gles3-master.txt )

You should be able to run any one of the failures directly by doing --deqp-case='foobar' instead of the caselist-file.

Oh, linear mipmap filtering should work perfectly. For performance reasons though we cheat, which is likely why it fails. The cheats can be disabled via env var, preferably all 3 of them (GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod). If the vertex texturing tests use (non-constant) explicit lod or derivatives that would explain the failures there as well.

Though I'm wondering about the blend failures, should work perfect (and gets quite a lot of test coverage from piglit). Or is this silly and complaining about single-bit rounding errors?

A random sampling of the texturing tests that failed seem to all pass with GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod

I will start a fresh run with these parameters. A random blending test makes it seem like it completely fails. I'd like to encourage you to grab a copy of deqp and run it yourself to see the details - you get expected and actual images among other things.

<TestCaseResult Version="0.3.3" CasePath="dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.src.dst_color_one_minus_src_alpha" CaseType="SelfValidate">
<Text>RGB equation = GL_FUNC_ADD</Text>
<Text>RGB src func = GL_DST_COLOR</Text>
<Text>RGB dst func = GL_ONE</Text>
<Text>Alpha equation = GL_FUNC_ADD</Text>
<Text>Alpha src func = GL_ONE_MINUS_SRC_ALPHA</Text>
<Text>Alpha dst func = GL_ONE</Text>
<Text>Blend color = (0.2, 0.4, 0.6, 0.8)</Text>
<Text>Image comparison failed: max difference = (97, 97, 97, 1), threshold = (4, 4, 4, 4)</Text>

Looking at the error mask:

[error mask image: inline base64 PNG data, truncated in this copy and omitted]

(paste that into the url bar of your browser) makes it seem like something funny is going on. Green = good, red = bad.

This is running on a SKL with LLVM 3.7.1 in case it matters.

(In reply to Ilia Mirkin from comment #3)
> I will start a fresh run with these parameters. A random blending test makes
> it seem like it completely fails. I'd like to encourage you to grab a copy
> of deqp and run it yourself to see the details - you get expected and actual
> images among other things.
>
> <TestCaseResult Version="0.3.3"
> CasePath="dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.src.dst_color_one_minus_src_alpha"
> CaseType="SelfValidate">
> <Text>RGB equation = GL_FUNC_ADD</Text>
> <Text>RGB src func = GL_DST_COLOR</Text>
> <Text>RGB dst func = GL_ONE</Text>
> <Text>Alpha equation = GL_FUNC_ADD</Text>
> <Text>Alpha src func = GL_ONE_MINUS_SRC_ALPHA</Text>
> <Text>Alpha dst func = GL_ONE</Text>
> <Text>Blend color = (0.2, 0.4, 0.6, 0.8)</Text>
> <Text>Image comparison failed: max difference = (97, 97, 97, 1), threshold
> = (4, 4, 4, 4)</Text>

Yes, that looks quite wrong indeed - and not precision related of course. I'll give it a try...

> (paste that into the url bar of your browser) makes it seem like something
> funny is going on. Green = good, red = bad. This is running on a SKL with
> LLVM 3.7.1 in case it matters.

Ideally it should of course not, albeit different bugs with and without AVX are possible, as quite different code paths may be used due to 8x32 vs. 4x32 vectors (if you use LP_NATIVE_VECTOR_WIDTH=128 it will disable avx). Of course it's also possible llvm miscompiles things in which case the llvm version would matter, but that should be rare.

Created attachment 123061 [details]
list of llvmpipe-related deqp fails
Updated dEQP fail list attached (again, filtered for msaa-related fails, but left linear mipmaps in this time). This was run with a copy of mesa which includes the LLVM 3.7 workaround for broken vector selects.
Also GALLIVM_DEBUG=no_rho_approx,no_brilinear,no_quad_lod was used in the environment.
LLVM 3.7.1, Core i7-6700 (SKL)
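For anyone wanting to re-run a single failure with the settings described in this comment, the environment and command line can be put together roughly as below. This is only a sketch: the environment variables and dEQP flags are the ones quoted in this thread, while the helper names `deqp_invocation` and `run_case`, the default binary path and the example case are placeholders, not anything dEQP or Mesa ships.

```python
import os
import subprocess

# Value quoted in this thread; the helpers below are hypothetical.
GALLIVM_NO_CHEATS = "no_rho_approx,no_brilinear,no_quad_lod"


def deqp_invocation(case, binary="./deqp-gles3", disable_avx=False):
    """Return (env, argv) for running one dEQP case on llvmpipe."""
    env = dict(os.environ)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"         # force the software rasterizer
    env["GALLIVM_DEBUG"] = GALLIVM_NO_CHEATS   # turn off the filtering shortcuts
    if disable_avx:
        env["LP_NATIVE_VECTOR_WIDTH"] = "128"  # 4x32 vectors instead of 8x32
    argv = [binary, "--deqp-visibility=hidden", "--deqp-case=" + case]
    return env, argv


def run_case(case, **kwargs):
    """Run the case and return the exit code (needs an actual dEQP build)."""
    env, argv = deqp_invocation(case, **kwargs)
    return subprocess.run(argv, env=env).returncode


if __name__ == "__main__":
    env, argv = deqp_invocation(
        "dEQP-GLES3.functional.fbo.color.tex2d.rgba32i", disable_avx=True)
    print("GALLIVM_DEBUG=" + env["GALLIVM_DEBUG"], " ".join(argv))
```

Keeping the command construction separate from `run_case` makes it possible to inspect or log the exact invocation without having a dEQP binary at hand.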
I wonder what deqp doesn't like about our nearest_mipmap_linear implementation (all filtering errors use that).

Also, I'm wondering if the test is overly picky about pow. The spec says right there the error is derived as pow(x,y) = exp2(log2(y) * x) (note there is a spec bug, x and y are swapped), which is exactly as we implement it. Therefore, if our results are good enough for passing exp2 and log2, we should pass pow as well.

The problems with 32bit integer formats look a bit odd as well (since there can be no filtering or blending or whatever, the values should remain mostly untouched), not sure what's up with that.

(In reply to Roland Scheidegger from comment #6)
> I wonder what deqp doesn't like about our nearest_mipmap_linear
> implementation (all filtering errors use that).

Error mask for dEQP-GLES3.functional.texture.filtering.2d.formats.rgba16f_nearest_mipmap_linear:

[error mask image: inline base64 PNG data, truncated in this copy and omitted]

<Text>Texture coordinates: (-1, -2.7) -> (2.03143, 4.76426)</Text>
<Text>ERROR: Result verification failed, got 3072 invalid pixels!</Text>

> Also, I'm wondering if the test is overly picky about pow. The spec says
> right there the error is derived as pow(x,y) = exp2(log2(y) * x) (note there
> is a spec bug, x and y are swapped), which is exactly as we implement it.
> Therefore, if our results are good enough for passing exp2 and log2, we
> should pass pow as well.

pow() fails for inf ^ x == inf. I glanced at the gallivm code, and this appears to be on purpose (i.e. you generate faster code that doesn't handle inf).

> The problems with 32bit integer formats look a bit odd as well (since there
> can be no filtering or blending or whatever, the values should remain mostly
> untouched), not sure what's up with that.

Error mask for dEQP-GLES3.functional.fbo.color.tex2d.rgba32i:

[error mask image: inline base64 PNG data, truncated in this copy and omitted]

(In reply to Ilia Mirkin from comment #7)
> > Also, I'm wondering if the test is overly picky about pow. The spec says
> > right there the error is derived as pow(x,y) = exp2(log2(y) * x) (note there
> > is a spec bug, x and y are swapped), which is exactly as we implement it.
> > Therefore, if our results are good enough for passing exp2 and log2, we
> > should pass pow as well.
>
> pow() fails for inf ^ x == inf. I glanced at the gallivm code, and this
> appears to be on purpose (i.e. you generate faster code that doesn't handle
> inf).

Ahh right forgot about that - we hook up the safe log2 version for LG2 tgsi opcode, but use the unsafe version for pow. I think we did the lg2 safe version for d3d10 initially, since in gl it traditionally didn't really matter. And pow doesn't exist in d3d10. I suppose we could switch that if it's really worth it (too bad the special values require 3 comparisons, 3 selects).

So, the r32i/ui failures are actually due to an overflow. One example I've seen samples a rgb8 unorm texture, scales to int range, converts to int and outputs this. The problem is that rescaling to 2^31 - 1 really ends up with 2^31 due to imprecise float math, which causes an overflow when converted to an int. I'm nearly certain this is undefined behavior by the glsl spec, albeit the spec doesn't explicitly say so (but should probably follow from ieee754 math). d3d10 would require clamping, making it work. So, I'm inclined to say that's just a test bug. But even if it's undefined but all gpus clamp anyway we might want to fix it nonetheless...

I am also seeing this same issue on Mesa 17.3.6. I wanted to know if there is an update /patch available for this issue. Thanks.

(In reply to msdhedhi007 from comment #10)
> I am also seeing this same issue on Mesa 17.3.6. I wanted to know if there
> is an update /patch available for this issue.

There is no goal as such to pass dEQP. As mentioned, some bugs are due to performance optimizations (which can be disabled, albeit only on debug builds). Some might not even be real bugs (also as mentioned, where I think dEQP relies on behavior not guaranteed by the spec). For both of these types, there's no interest in addressing these (albeit I suppose if you're talking about making it possible to disable performance hacks on release builds, that could be done). As for the rest, patches welcome, but personally I've got little interest and definitely no time to specifically look into dEQP failures.

The below commit allows us to disable the perf. optimisations (for release builds), and thus fixing the functional.texture tests.

Should we close this bug, or keep it open as all the failing tests have a solution/workaround?

commit 8f77156c268356baf9df8490c52cc5d8475b9db8
Author: Gert Wollny <gert.wollny@collabora.com>
Date: Fri Oct 5 15:08:51 2018 +0200

gallivm: Make it possible to disable some optimization shortcuts in release builds

(In reply to Emil Velikov from comment #12)
> The below commit allows us to disable the perf. optimisations (for release
> builds), and thus fixing the functional.texture tests.
>
> Should we close this bug, or keep it open as all the failing tests have a
> solution/workaround?

If there's still failures with the perf optimizations disabled I think we should keep it open.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/239.
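The r32i/ui overflow described in this thread (an rgb8 unorm texel scaled to the int range, then converted to int) can be reproduced on the CPU. The sketch below emulates single-precision rounding with Python's `struct` module. The names `to_f32`, `rescale_unorm8` and `to_int32_clamped` are illustrative only, and the clamping conversion is merely one possible fix in the spirit of the d3d10 behaviour mentioned in the thread, not what llvmpipe actually does.

```python
import struct

INT32_MAX = 2**31 - 1
INT32_MIN = -2**31


def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def rescale_unorm8(byte):
    """Scale an 8-bit unorm texel to the int range using float32 math."""
    unorm = to_f32(byte / 255.0)
    scale = to_f32(float(INT32_MAX))  # 2^31 - 1 is not representable in float32
    return to_f32(unorm * scale)


def to_int32_clamped(value):
    """Float to int32 conversion that clamps instead of overflowing."""
    if value != value:                # NaN; mapping it to 0 is an assumption here
        return 0
    return int(max(float(INT32_MIN), min(float(INT32_MAX), value)))


if __name__ == "__main__":
    full = rescale_unorm8(255)
    print(full, full > INT32_MAX, to_int32_clamped(full))
```

With these definitions `rescale_unorm8(255)` comes out as 2^31, one past the largest int32, so an unclamped conversion has no valid value to land on, while the clamped one yields 2^31 - 1.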
Created attachment 122980 [details]
list of llvmpipe-related deqp fails

There are a number of dEQP tests that fail on llvmpipe; it's up to the llvmpipe maintainers whether they care or not. I'm guessing for a lot of these, it will be "not". I've manually removed a number of failures that are msaa-related, or are otherwise not llvmpipe's fault.

Also, due to dEQP testsuite bugs, I avoided running anything with mipmap_linear in the name, as it leads to super-long-running tests that fail anyway. However, it would seem that mipmap_linear filtering is non-functional. Perhaps that's known.

I'm thinking that as these issues are triaged, dependent bugs will be created.
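On the pow() point raised in the comments (a safe log2 for the LG2 opcode, an unsafe one for pow), the sketch below shows why a log2 built from the float bit pattern mishandles special inputs, and what three compare-and-select fixups could look like. The approximation is a deliberately crude stand-in, not gallivm's actual code, and every function name here is hypothetical.

```python
import math
import struct


def fast_log2(x):
    """Crude log2 from the float32 bit pattern; ignores sign, zero, inf and NaN."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = ((bits >> 23) & 0xFF) - 127
    fraction = (bits & 0x7FFFFF) / float(1 << 23)
    return exponent + fraction        # linear stand-in for log2(1 + fraction)


def safe_log2(x):
    """fast_log2 followed by three compare/select fixups for special inputs."""
    result = fast_log2(x)
    result = -math.inf if x == 0.0 else result          # log2(0) = -inf
    result = math.inf if x == math.inf else result      # log2(inf) = inf
    result = math.nan if not (x >= 0.0) else result     # negative or NaN input
    return result


def pow_via_log2(x, y, log2=fast_log2):
    """pow(x, y) computed as exp2(y * log2(x))."""
    return 2.0 ** (y * log2(x))


if __name__ == "__main__":
    print(pow_via_log2(math.inf, 0.5))                  # finite: inf was not handled
    print(pow_via_log2(math.inf, 0.5, log2=safe_log2))  # inf
```

In this toy version the unsafe path treats inf as an ordinary number with exponent 128, so `pow(inf, 0.5)` comes out finite; routing pow through the safe log2 restores the infinite result at the cost of the extra compares and selects.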