Bug 98840

Summary: nir clone test fails
Product: Mesa Reporter: Mark Janes <mark.a.janes>
Component: Drivers/DRI/i965Assignee: Timothy Arceri <t_arceri>
Status: RESOLVED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium CC: jason
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:

Description Mark Janes 2016-11-24 05:52:16 UTC
Intel's Mesa CI periodically runs test suites with NIR_TEST_CLONE=true, at the request of developers.

This execution mode began failing at some point between the following commits:

Good: 
0187f88	Revert "configure.ac: honour LLVM_LIBDIR when linking against LLVM"

Bad:
92ec47a	glsl: define __STDC_FORMAT_MACROS to get PRIx64 macro
Comment 1 Jason Ekstrand 2016-11-29 05:30:15 UTC
This is really Tim's bug.  The problem occurred when he moved us to using nir_gather_info.  When we start off, the nir_shader::info field points to gl_program::info.  However, when the shader gets cloned, we ralloc a copy of the shader_info and copy the data into that and nir_shader::info no longer points to gl_program::info but points to a copy of the shader_info.  With NIR_TEST_CLONE=1, nir_shader_gather_info gets run after some cloning has been done and so the copied shader_info is what gets filled out and gl_program::info gets left alone.  One hack solution is to just do "prog->info = *nir->info" after we run nir_shader_gather_info, but I don't think this is a long-term solution.

After going around in circles a bit, I think the answer is probably to make nir_shader::info an embedded struct again and not a pointer.  Now that we use the same shader_info struct in gl_program as we do in nir_shader, we can easily pre-populate the one in nir_shader and, after we're done with our early NIR optimization and we've run nir_shader_gather_info, we copy it back.
Comment 2 Timothy Arceri 2016-11-29 06:04:33 UTC
(In reply to Jason Ekstrand from comment #1)
> This is really Tim's bug.  The problem occurred when he moved us to using
> nir_gather_info.  When we start off, the nir_shader::info field points to
> gl_program::info.  However, when the shader gets cloned, we ralloc a copy of
> the shader_info and copy the data into that and nir_shader::info no longer
> points to gl_program::info but points to a copy of the shader_info.  With
> NIR_TEST_CLONE=1, nir_shader_gather_info gets run after some cloning has
> been done and so the copied shader_info is what gets filled out and
> gl_program::info gets left alone.  One hack solution is to just do
> "prog->info = *nir->info" after we run nir_shader_gather_info, but I don't
> think this is a long-term solution.
> 
> After going around in circles a bit, I think the answer is probably to make
> nir_shader::info an embedded struct again and not a pointer.  Now that we
> use the same shader_info struct in gl_program as we do in nir_shader, we can
> easily pre-populate the one in nir_shader and, after we're done with our
> early NIR optimization and we've run nir_shader_gather_info, we copy it back.

Wouldn't that just mean we would end up with:

"prog->info = nir->info"

rather than

"prog->info = *nir->info"

How is that any better?

I'll need to take a look at what NIR_TEST_CLONE=true is meant to do this is the first time I've heard of it.
Comment 3 Timothy Arceri 2016-12-20 04:52:09 UTC
Should be fixed by:

commit f562b13bc7d0b4fd954d285a9325e94167b16bf5
Author: Timothy Arceri <timothy.arceri@collabora.com>
Date:   Thu Dec 1 13:37:38 2016 +1100

    i965: keep gl_program shader info in sync after gather info
    
    It's possible that nir_shader was cloned and it no longer contains
    a pointer to the shader_info in gl_program. So we need to copy
    shader_info back to gl_program if that is the case.
    
    Fixes a regression with NIR_TEST_CLONE=true
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98840
Comment 4 Mark Janes 2016-12-20 16:16:11 UTC
Tim's patch fixed most of the errors, but we still have some failures:

    piglit.spec.glsl-1_10.execution.vs-nested-return-sibling-loop
    piglit.spec.glsl-1_10.execution.vs-nested-return-sibling-loop2

/tmp/build_root/m64/lib/piglit/bin/shader_runner /tmp/build_root/m64/lib/piglit/tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop.shader_test -auto
Probe color at (0,0)
  Expected: 255 255 0 255
  Observed: 0 255 0 255
Test failure on line 46

This may be some other NIR_TEST_CLONE regression in mesa since the one that Timothy fixed.  The test target has not been functional in the intervening weeks, so Timothy's fix will need to be rebased back in the history for a bisection.
Comment 5 Mark Janes 2016-12-20 17:09:28 UTC
My mistake:  these tests were recently added to piglit and fails on master.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.