Created attachment 100628 [details]
apitrace that reproduces issue

The application used as a test case can be found here: https://github.com/jp9000/obs-studio

To reproduce:
1) Start the application.
2) Create a scene named anything (lower left list box of the application, the "+" button underneath).
3) With the scene highlighted, create a source (lower middle list box of the application, the "+" button underneath).
4) Select "X11 Shared Memory Screen Input".

What is displayed is a distorted output of your desktop: upside down, about one-fourth the size it should be, in the upper right corner. There are additional stretched textures outside this area. The lower left corner generally contains nothing.

Provided as an attachment is an apitrace that reproduces the same results on Mesa drivers, and produces correct results on other vendors' drivers.
apitrace dump-images shows that misrendering happens on the very first draw call (#55762 in the trace). The misrendering looks like a texture coordinate problem: the correct image (in the upper right quadrant) is upside-down, and the edge pixels are extended infinitely (due to GL_CLAMP_TO_EDGE). INTEL_DEBUG=vs shows that the vertex shader outputs are being qualified as "flat", even though the "flat" keyword never appears in the shader source code. I believe this is a bug in our separate shader objects (SSO) implementation. I've observed that link_varyings.cpp:809 is kicking in, marking outputs as flat because they aren't "being consumed by the fragment shader". Indeed, consumer_var is NULL in all cases. CC'ing Ian.
Is there any chance to get this fixed soon?
I have the same issue in my application (PCSX2). The code in link_varyings.cpp (varying_matches::record) is potentially wrong, but it isn't the main issue: I tried commenting out that code and the issue is still there. Nevertheless, the code feels wrong, since with SSO you don't know whether a consumer exists, so the flat optimization is likely invalid.

Anyway, I also tried rendering the texture coordinates to the screen. Normally they must be interpolated over [0;1]; however, the interpolation is done over [-1;1] (potentially with a different sign). Indeed, rescaling the coordinate in the FS with (t + 1.0) / 2.0 seems to make my draw call render correctly. It seems [-1;1] is a default value of the raster unit.

The behavior is the same if the texture coordinate is not written in the vertex shader, so maybe the output was optimized away in the VS. Unfortunately I don't know if there is any possibility to dump VS asm code with Nouveau.
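For reference, the workaround above is just an affine remap of a coordinate that was interpolated over [-1;1] back into [0;1]. A minimal sketch of the arithmetic (plain Python, not the actual shader code):

```python
def remap_to_texture_space(t):
    """Map a value interpolated over [-1, 1] into [0, 1], i.e. the
    (t + 1.0) / 2.0 rescaling applied in the fragment shader."""
    return (t + 1.0) / 2.0

# The endpoints and midpoint land where a texture coordinate expects them:
# -1.0 -> 0.0, 0.0 -> 0.5, 1.0 -> 1.0
```

This only papers over the symptom (the input being fed position-like values); the real bug is the eliminated varying discussed below in the thread.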
(In reply to gregory.hainaut from comment #3)
> Unfortunately I don't know if there is any
> possibility to dump VS asm code with Nouveau.

NV50_PROG_DEBUG=1 (assuming you've built mesa with --enable-debug) should dump the TGSI, nv50 ir (post optimizations and, separately, post-RA), and the actual instruction stream.

NV50_PROG_OPTIMIZE=0 will disable all of the nv50 ir optimizations, =1 will enable some of them, =2 will enable most of them (and =3 should be everything).
(In reply to Ilia Mirkin from comment #4)
> NV50_PROG_DEBUG=1 (assuming you've built mesa with --enable-debug) should
> dump the TGSI, nv50 ir (post optimizations and, separately, post-RA), and
> the actual instruction stream.
>
> NV50_PROG_OPTIMIZE=0 will disable all of the nv50 ir optimizations, =1 will
> enable some of them, =2 will enable most of them (and =3 should be
> everything).

Thanks a lot for the info, it is very useful. I think the issue is in the fragment shader input optimization. My FS stretches a texture:

-----
in SHADER
{
    vec4 p;
    vec2 t;
} PSin;

out = texture(TextureSampler, PSin.t);
-----

Here is the dump of the ASM (TGSI?), generated with the hack to disable the flat optimization:

FRAG
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL SVIEW[0], 2D, FLOAT
DCL CONST[1][0]
DCL TEMP[0..1], LOCAL
IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000}
  0: MOV TEMP[0].xy, IN[0].xyyy
  1: TEX TEMP[0], TEMP[0], SAMP[0], 2D
  2: MOV TEMP[1].xyz, TEMP[0].xyzx
  3: MUL TEMP[0].x, TEMP[0].wwww, IMM[0].xxxx
  4: MOV TEMP[1].w, TEMP[0].xxxx
  5: MOV OUT[0], TEMP[1]
  6: END

As you can see, it uses only IN[0], whereas I have 2 inputs in the FS. IN[0] is likely my position, which is in [-1;1]. That confirms the strange behavior from my previous message. My shader doesn't use the parameter p, so I guess it was wrongly removed.
Hum, it could be related to item 4 (assign_varying_locations).

   /* Operate in a total of four passes.
    *
    * 1. Sort inputs / outputs into a canonical order.  This is necessary so
    *    that inputs / outputs of separable shaders will be assigned
    *    predictable locations regardless of the order in which declarations
    *    appeared in the shader source.
    *
    * 2. Assign locations for any matching inputs and outputs.
    *
    * 3. Mark output variables in the producer that do not have locations as
    *    not being outputs.  This lets the optimizer eliminate them.
    *
    * 4. Mark input variables in the consumer that do not have locations as
    *    not being inputs.  This lets the optimizer eliminate them.
    */

But the shader uses a basic interface block without a location (as far as I understand):

in SHADER
{
    vec4 p;
    vec2 t;
} PSin;
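To make the four passes concrete, here is a rough sketch in plain Python (this is illustrative pseudologic, not Mesa's actual C++ implementation; the function name, the dict layout, and the name-based matching are all invented for the example):

```python
# Sketch of the four-pass varying assignment quoted above.
# Each varying is a dict: {"name": ..., "mode": "in"/"out"}.

def assign_varying_locations(producer_outs, consumer_ins):
    # Pass 1: sort into a canonical order (here: by name) so that
    # separable shaders get predictable locations regardless of the
    # declaration order in the shader source.
    producer_outs.sort(key=lambda v: v["name"])
    consumer_ins.sort(key=lambda v: v["name"])

    # Pass 2: assign locations to matching input/output pairs.
    consumer_by_name = {v["name"]: v for v in consumer_ins}
    location = 0
    for out in producer_outs:
        match = consumer_by_name.get(out["name"])
        if match is not None:
            out["location"] = match["location"] = location
            location += 1

    # Pass 3: mark unmatched producer outputs as non-outputs,
    # so the optimizer can eliminate them.
    for out in producer_outs:
        if "location" not in out:
            out["mode"] = "temporary"

    # Pass 4: likewise for unmatched consumer inputs.
    for v in consumer_ins:
        if "location" not in v:
            v["mode"] = "temporary"
```

The trouble in the SSO case is that the other side of the interface is simply absent at link time, so passes 3 and 4 see every varying as "unmatched" and demote varyings (such as the block member p above) that a compatible program on the other side still expects.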
After further investigation, the issue comes from the dead-code optimization of the input (in the IR tree). Just disabling the optimization fixes my basic test case (see a quick patch below). Two open questions remain:

1/ I think support for locations on interface blocks was added in ARB_enhanced_layouts. If I'm correct, it isn't legal to partially optimize an interface block. But I'm not sure.

2/ As far as I understand, the separate state isn't known at compilation (only at the link phase). So we potentially lose the varying optimization even for standard programs (potentially it is done at link time).

diff --git a/src/glsl/opt_dead_code.cpp b/src/glsl/opt_dead_code.cpp
index f45bf5d..ad0403c 100644
--- a/src/glsl/opt_dead_code.cpp
+++ b/src/glsl/opt_dead_code.cpp
@@ -125,6 +125,10 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned)
          }
       }
 
+      if (entry->var->data.mode == ir_var_shader_in) {
+         continue;
+      }
+
       entry->var->remove();
       progress = true;
(In reply to gregory.hainaut from comment #5)
> Here the dump of the ASM (TGSI?). (Generated with the hack to disable flat
> optimization)

TGSI (which is DX9-ish assembly-type code, but extended well beyond that).

> FRAG
> DCL IN[0], GENERIC[0], PERSPECTIVE
> DCL OUT[0], COLOR
> DCL SAMP[0]
> DCL SVIEW[0], 2D, FLOAT
> DCL CONST[1][0]
> DCL TEMP[0..1], LOCAL
> IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000}
>   0: MOV TEMP[0].xy, IN[0].xyyy
>   1: TEX TEMP[0], TEMP[0], SAMP[0], 2D
>   2: MOV TEMP[1].xyz, TEMP[0].xyzx
>   3: MUL TEMP[0].x, TEMP[0].wwww, IMM[0].xxxx
>   4: MOV TEMP[1].w, TEMP[0].xxxx
>   5: MOV OUT[0], TEMP[1]
>   6: END
>
> As you can it use the IN[0] whereas I have 2 inputs in the FS. IN[0] is
> likely my position which is [-1;1]. It confirms the strange behavior of my
> previous message.
> My shader doesn't use the parameter p so I guess it was wrongly removed.

You should double-check that it wasn't also eliminated from the relevant vertex shader. GENERIC[0] should map to whatever the GENERIC[0] output in the vertex shader is.

By the way, you may also be interested in MESA_GLSL=dump, which dumps out the mesa glsl ir at various stages.
(In reply to Ilia Mirkin from comment #8)
> You should double-check that it wasn't also eliminated from the relevant
> vertex shader. GENERIC[0] should map to whatever GENERIC[0] output in the
> vertex shader.

VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[0]
DCL OUT[2], GENERIC[1]
DCL TEMP[0..2], LOCAL
IMM[0] FLT32 { 0.5000, 1.0000, 0.0000, 0.0000}
  0: MOV TEMP[0].zw, IMM[0].yyxy
  1: MOV TEMP[0].xy, IN[0].xyxx
  2: MOV TEMP[1].zw, IMM[0].yyxy
  3: MOV TEMP[1].xy, IN[0].xyxx
  4: MOV TEMP[2].xy, IN[1].xyxx
  5: MOV OUT[1], TEMP[0]
  6: MOV OUT[2], TEMP[2]
  7: MOV OUT[0], TEMP[1]
  8: END

The vertex shader outputs 2 variables: the position is copied into GENERIC[0], whereas the texture coordinate is copied into GENERIC[1]. In a separate shader program, the frag shader must work with any vert shader (or it could be a geom shader).

> By the way, you may also be interested in MESA_GLSL=dump which dumps out the
> mesa glsl ir in various stages.

Note: I renamed the interface variable to p_deadcode to ease grepping.
Before the above patch:

(declare (location=24 shader_in ) vec4 packed:t)
(declare (location=0 uniform ) ivec4 ScalingFactor)
(declare (location=4 shader_out ) vec4 SV_Target0)
(declare (temporary ) vec4 SV_Target0)
(declare (location=0 uniform ) sampler2D TextureSampler)

If I enable the logging of opt_dead_code.cpp:

p_deadcode@0x93dbca8: 0 refs, 0 assigns, declared in our scope
Removed declaration of p_deadcode@0x93dbca8

After the above patch:

(declare (location=0 shader_in ) vec4 gl_FragCoord)
(declare (location=17 shader_in ) (array vec4 1) gl_ClipDistanceMESA)
(declare (location=19 shader_in flat) int gl_PrimitiveID)
(declare (location=22 shader_in ) bool gl_FrontFacing)
(declare (location=23 shader_in ) vec2 gl_PointCoord)
(declare (location=24 shader_in ) vec4 p_deadcode) <= the missing one
(declare (location=25 shader_in ) vec4 packed:t)
(declare (location=0 uniform ) ivec4 ScalingFactor)
(declare (location=4 shader_out ) vec4 SV_Target0)
(declare (temporary ) vec4 SV_Target0)
(declare (location=0 uniform ) sampler2D TextureSampler)

Hum, it might be a better idea to check the interface block type and only disable the optimization in that case. I tried to use is_interface_instance, but I guess it only works for uniforms.
Extract of the GL 4.5 spec:

-------------------------------------------
Chapter 7.4.1

With separable program objects, interfaces between shader stages may involve the outputs from one program object and the inputs from a second program object. For such interfaces, it is not possible to detect mismatches at link time, because the programs are linked separately. When each such program is linked, all inputs or outputs interfacing with another program stage are treated as active. The linker will generate an executable that assumes the presence of a compatible program on the other side of the interface. If a mismatch between programs occurs, no GL error is generated, but some or all of the inputs on the interface will be undefined.
-------------------------------------------

So inputs/outputs can't be removed by optimization, as they must remain active.
I ran some piglit tests with a new patch (see below) that only removes dead variables that don't have a location. Unfortunately it triggers various regressions in piglit.

1/ Test spec@glsl-1.10@execution@variable-indexing@fs-temp-array-mat2-col-rd

Failed to link: error: fragment shader varying color not written by vertex shader

An unused variable still exists on the program, but technically the variable isn't read, so maybe the check can be improved. The error appears for glsl <= 120, or as a warning in GLES. Maybe we could use the ir_variable_data.used bit, I don't know.

2/ Test spec@glsl-1.10@api@getactiveattrib 110

Failing shader:

attribute vec4 not_used;

void main()
{
	gl_Position = gl_Vertex;
}

Attribute `not_used' should not be active but is.

I think we could still optimize vertex shader inputs, as the vertex shader is the first stage of the pipeline (until someone adds a new stage). However, I don't know how to check the current stage from the IR (or whether it is actually possible). Or maybe there is a special bit to detect attribute inputs.

diff --git a/src/glsl/opt_dead_code.cpp b/src/glsl/opt_dead_code.cpp
index f45bf5d..29cbe52 100644
--- a/src/glsl/opt_dead_code.cpp
+++ b/src/glsl/opt_dead_code.cpp
@@ -125,6 +125,21 @@ do_dead_code(exec_list *instructions, bool uniform_locations_assigned)
          }
       }
 
+      if (entry->var->data.mode == ir_var_shader_in &&
+          !entry->var->data.explicit_location)
+         continue;
+
       entry->var->remove();
       progress = true;
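Putting the pieces together, the rule being converged on could be sketched as follows (plain Python pseudologic, not Mesa's C++; the dict layout and the explicit stage parameter are invented for illustration — in particular, the vertex-stage exception is the hypothetical fix for the getactiveattrib regression above):

```python
# Illustrative sketch (not Mesa code): when is it safe for dead-code
# elimination to remove an unused variable declaration?

def can_remove_dead_var(var, stage):
    """var: dict with 'mode' and 'explicit_location' keys (invented layout)."""
    if var["mode"] != "shader_in":
        return True   # outputs and temporaries: original behavior unchanged
    if stage == "vertex":
        return True   # VS inputs are vertex attributes; no earlier stage can
                      # feed them, so removal can't break an SSO interface
    if var["explicit_location"]:
        return True   # location is pinned, so interface matching does not
                      # depend on this declaration surviving
    return False      # with SSO the consumer/producer is unknown at link
                      # time: the input must stay active (GL 4.5, ch. 7.4.1)
```

This keeps `p_deadcode` alive in the fragment shader while still allowing the unused `not_used` attribute in the piglit vertex shader to be deactivated.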
Normally the SSO issues will be fixed by these patches:
http://lists.freedesktop.org/archives/mesa-dev/2015-November/101393.html
As of commit 8117f46f496fb31339, this should be fixed.
*** Bug 94275 has been marked as a duplicate of this bug. ***