Created attachment 133247 [details] apitrace of test that crashes Since commit 6839d33699 (st/mesa: fix handling of NumSamples=1 (v2), 2017-08-01) many of VTK's tests crash with Mesa with an assertion failure. I've attached an apitrace of one of the tests. The test aborts with an assertion failure: ``` state_tracker/st_atom_framebuffer.c:62: update_framebuffer_size: Assertion `surface' failed. ``` The backtrace in gdb is ``` #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x00007ffff7a6e3fa in __GI_abort () at abort.c:89 #2 0x00007ffff7a65e37 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x7fffec7516a4 "surface", file=file@entry=0x7fffec751680 "state_tracker/st_atom_framebuffer.c", line=line@entry=62, function=function@entry=0x7fffec7517a0 <__PRETTY_FUNCTION__.39021> "update_framebuffer_size") at assert.c:92 #3 0x00007ffff7a65ee2 in __GI___assert_fail ( assertion=0x7fffec7516a4 "surface", file=0x7fffec751680 "state_tracker/st_atom_framebuffer.c", line=62, function=0x7fffec7517a0 <__PRETTY_FUNCTION__.39021> "update_framebuffer_size") at assert.c:101 #4 0x00007fffec336bad in update_framebuffer_size (framebuffer=0x7fffffffce70, surface=0x0) at state_tracker/st_atom_framebuffer.c:62 #5 0x00007fffec336f3e in st_update_framebuffer_state (st=0x555555ae3b90) at state_tracker/st_atom_framebuffer.c:181 #6 0x00007fffec33538c in st_validate_state (st=0x555555ae3b90, pipeline=ST_PIPELINE_CLEAR) at state_tracker/st_atom.c:251 #7 0x00007fffec3428ea in st_Clear (ctx=0x555555abeb30, mask=18) at state_tracker/st_cb_clear.c:411 #8 0x00007fffec0a2b4a in clear (no_error=false, mask=16640, ctx=0x555555abeb30) at main/clear.c:221 #9 _mesa_Clear (mask=16640) at main/clear.c:242 ...(vtk code)... ``` For reference, I'm building Mesa as follows: ./autogen.sh \ --prefix="$prefix" \ --enable-debug \ --disable-dri \ --disable-egl \ --disable-gbm \ --disable-gles1 \ --disable-gles2 \ --disable-shared-glapi \ --with-platforms=x11 \ --enable-glx=gallium-xlib \ --enable-gallium-osmesa \ --with-gallium-drivers=swrast \ --enable-gallium-llvm=yes \ LLVM_CONFIG=/usr/bin/llvm-config-3.8 \ --enable-llvm-shared-libs \ --with-gl-lib-name=MesaGL \ --with-osmesa-lib-name=MesaOSMesa
Sorry for the breakage, Brad. I'll try to investigate ASAP, but I'm about to leave town for a week. I can't repro with llmvpipe, fwiw. I suspect it may be an issue in the swr driver. I'm building it now, but may run out of time.
Hmm, swr isn't working at all for me. It's hanging in a swr_fence_finish() call with everything I've tried. Even glxinfo hangs (but elsewhere). Maybe one of the swr developers can take a look.
OK, with a patch from Bruce, swr is working for me, but I can't repro this assertion. Are you still seeing it, Brad?
I was able to reproduce, but not through the trace. It runs fine. And, only on llvmpipe, the swr driver doesn't fail. Running VTK test "ctest -R TestGL2PSBillboardTextActor3D" I get a backtrace like Brad's. I'll dig a little deeper.
Looks like a problem in llvmpipe_is_format_supported. If should be returning false for all "sample_count > 0", not 1. That works for me, at least. I'm running all VTK tests against it right now.
(In reply to Bruce Cherniak from comment #5) > Looks like a problem in llvmpipe_is_format_supported. If should be > returning false for all "sample_count > 0", not 1. No, not really. In gallium there's no distinction between sample count 0 and 1 - there are no surfaces with no samples, after all. (Ordinary surfaces should actually always have 1 sample, that 0 is allowed to is more of a historical accident.) Maybe VTK is trying to use msaa surfaces with just one sample, despite that the max sample count supported we announce is 0. Albeit I'm thinking we should have refused that configuration earlier somewhere, if the sample_count gets upgraded later...
I understand that this isn't a solution, but it points to where things go wrong. I'll keep digging for the correct fix.
Roland, you are correct, VTK is trying to use msaa surfaces with just one sample. Here's what's happening: The rendertarget color buffer is being created by: st_api_make_current st_framebuffer_validate xmesa_st_frambuffer_validate_textures llvmpipe_resource_create with nr_samples = 1 Then later in st_framebuffer_validate "changed" is set when the surface is created. This causes the following: _mesa_resize_framebuffer (loops until it gets to the depth attachment) st_renderbuffer_alloc_storage until we get to: 160├──> if (rb->NumSamples > 0) { 161│ unsigned i; 162│ 163│ for (i = MAX2(2, rb->NumSamples); i <= ctx->Const.MaxSamples; i++) { 164│ format = st_choose_renderbuffer_format(st, internalFormat, i); 165│ 166│ if (format != PIPE_FORMAT_NONE) { 167│ rb->NumSamples = i; 168│ break; 169│ } 170│ } 171│ } else { 172│ format = st_choose_renderbuffer_format(st, internalFormat, 0); 173│ } 174│ 175│ /* Not setting gl_renderbuffer::Format here will cause 176│ * FRAMEBUFFER_UNSUPPORTED and ValidateFramebuffer will not be called. 177│ */ 178| if (format == PIPE_FORMAT_NONE) { 179│ return GL_TRUE; 180│ } rb->NumSamples = 1, but ctx->Const.MaxSamples = 0 so the loop @ 163 is skipped and format remains PIPE_FORMAT_NONE. Because to this, we don't allocate a depth buffer and boom. But, this works: - if (rb->NumSamples > 0) { + if (rb->NumSamples > 1) { unsigned i; for (i = MAX2(2, rb->NumSamples); i <= ctx->Const.MaxSamples; i++) { @@ -169,7 +170,7 @@ st_renderbuffer_alloc_storage(struct gl_context * ctx, } } } else { - format = st_choose_renderbuffer_format(st, internalFormat, 0); + format = st_choose_renderbuffer_format(st, internalFormat, rb->NumSamples); } I'm not sure if it honors the original intent. But, in addition to fixing the VTK tests ext_framebuffer_multisample-blit-mismatched-formats now passes.
Created attachment 133654 [details] [review] proposed patch Bruce, can you try the attached patch?
For llvmpipe, this fails in the same way as before. Unless llvmpipe_is_format_supported is changed in the same way that you modified svga_is_format_supported: - if (sample_count > 1) + if (sample_count > 0) However, this is the change that Roland has issue with.
Created attachment 133661 [details] simple-msaa.c Hi Brian, This simple msaa sample fails in the same way when run with "sample-msaa 1".
Created attachment 133687 [details] [review] new patch to try. Hi Bruce, here's another patch to try.
Hi Brian, This appears to fix VTK tests on llvmpipe. However, it breaks SWR. Please let me see what's going on there before submitting this patch for review.
Created attachment 133726 [details] [review] Patch to fix swr after "new patch to try" Hi Brian, I found the reason that swr starts failing with your patch. I was treating any num_samples > 0 as msaa and turning off the fake_msaa cap. The attached (in conjunction with you patch) works correctly. If this makes sense to you, will you please include it with your patch so that we don't end up with a regression?
Comment on attachment 133726 [details] [review] Patch to fix swr after "new patch to try" Nevermind that patch, I found a bug in validating MSAA override value. Updating with a new version.
Created attachment 133732 [details] [review] Updated swr patch to fix msaa once new_patch_to_try has been applied. Here's a patch that fixes the MSAA override validation. "new-msaa-swr.patch" should be included along with the "new_patch_to_try".
After applying the two patches I can confirm that the VTK test I used to produce the apitrace now passes again. Thanks!
The state tracker patch has been committed. I'll leave the swr driver patch to Bruce.
The swr driver patch has been committed, as well.
The VTK test suite passes again since these two patches were merged. Thanks!
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.