Created attachment 29268 [details] phong vertex shader

I have a basic Phong GLSL shader (attached) with a for loop in its pixel shader. The loop runs up to MAX_LIGHTS, which in my case on Intel hardware is #defined to be 0, i.e. the loop should be skipped entirely. It is not, however. During compilation the driver reports "Note: 'for (i ... )' body is too large/complex to unroll", and it then seems to run the loop body at least once anyway: all my lighting is wrong (which makes sense, as I didn't set any parameters for the OpenGL lights) and performance plummets.

If I remove the entire loop (#if 0/#endif), the lighting is fine and the FPS is much higher. Obviously it is simple to put #if MAX_LIGHTS > 0 around the loop as a workaround, but I guess this might be a problem for other applications, so I thought I would report it anyway.

I am using Mesa from git (my latest pull may be a few days old at the time of writing) with an Intel® GM45 Express integrated GPU.
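The workaround mentioned above can be sketched as follows. This is a minimal illustration written in C rather than GLSL, on the assumption that the `#if MAX_LIGHTS > 0` guard behaves the same way in both preprocessors; `shade` and `light_contribution` are hypothetical names and are not taken from the attached shaders.

```c
/* Hypothetical stand-in for the shader's light limit.
 * 0 mirrors the value the reporter describes on Intel hardware. */
#define MAX_LIGHTS 0

/* Hypothetical per-light term; the real shader would read OpenGL light state. */
float light_contribution(int i)
{
    return 0.25f * (float)(i + 1);
}

/* Accumulates lighting on top of an ambient term. */
float shade(float ambient)
{
    float color = ambient;
#if MAX_LIGHTS > 0
    /* Compiled only when at least one light is configured, so a
     * zero-light build never contains the loop at all. */
    for (int i = 0; i < MAX_LIGHTS; i++)
        color += light_contribution(i);
#endif
    return color;
}
```

With `MAX_LIGHTS` defined as 0, the preprocessor drops the loop before the compiler sees it, so `shade(0.5f)` simply returns the ambient value and no loop optimization is required.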
Created attachment 29269 [details] phong pixel shader
I added the piglit test glsl-fs-loop-zero-iter to reproduce this behavior. The test produces the correct result, but, as can be seen in the INTEL_DEBUG=wm output below, the loop is not optimized away.

brw_wm_glsl_emit:
pre-fp:
# Fragment Program/Shader 3
  0: MOV OUTPUT[1], CONST[0];
  1: MOV TEMP[0].x, CONST[0].xxxx;
  2: BGNLOOP; # (end at 10)
  3: SGE.C TEMP[1].x, TEMP[0].xxxx, CONST[0].xxxx;
  4: IF (NE.xxxx); # (if false, goto 6);
  5: BRK (TR.xxxx); # (goto 10);
  6: ENDIF;
  7: MOV OUTPUT[1], CONST[0].yxzw;
  8: ADD TEMP[1].x, TEMP[0].xxxx, CONST[0].yyyy;
  9: MOV TEMP[0].x, TEMP[1].xxxx;
 10: ENDLOOP; # (goto 2)
 11: END

pass_fp:
  0: MOV OUTPUT[1], CONST[0];
  1: MOV TEMP[0].x, CONST[0].xxxx;
  2: BGNLOOP; # (end at 10)
  3: SGE.C TEMP[1].x, TEMP[0].xxxx, CONST[0].xxxx;
  4: IF (NE.xxxx); # (if false, goto 6);
  5: BRK (TR.xxxx); # (goto 10);
  6: ENDIF;
  7: MOV OUTPUT[1], CONST[0].yxzw;
  8: ADD TEMP[1].x, TEMP[0].xxxx, CONST[0].yyyy;
  9: MOV TEMP[0].x, TEMP[1].xxxx;
 10: ENDLOOP; # (goto 2)
 11: FB_WRITE ???, OUTPUT[1], FILE14[30], OUTPUT[0];

wm-native:
mov(1) a0<1>UW 0x00e0UW { align1 };
mov(8) g9<1>F 0F { align1 };
mov(8) g10<1>F 1F { align1 };
mov(8) g11<1>F 0F { align1 };
mov(8) g12<1>F 0F { align1 };
mov(8) g13<1>F 0F { align1 };
do(8) { align1 };
cmp.ge(8) null g13<8,8,1>F 0F { align1 };
mov(8) g14<1>F 0F { align1 };
(+f0) mov(8) g14<1>F 1F { align1 };
(+f0) iff(8) ip ip 3D { align1 switch };
break(8) ip 9D { align1 };
endif(8) g0<4,4,1>UD 65536D { align1 switch };
mov(8) g9<1>F 1F { align1 };
mov(8) g10<1>F 0F { align1 };
mov(8) g11<1>F 0F { align1 };
mov(8) g12<1>F 0F { align1 };
add(8) g14<1>F g13<8,8,1>F 1F { align1 };
mov(8) g13<1>F g14<8,8,1>F { align1 };
while(8) ip 65524D { align1 };
mov(8) m2<1>F g9<8,8,1>F { align1 };
mov(8) m3<1>F g10<8,8,1>F { align1 };
mov(8) m4<1>F g11<8,8,1>F { align1 };
mov(8) m5<1>F g12<8,8,1>F { align1 };
mov(8) m1<1>F g1<8,8,1>F { align1 nomask };
send(8) 0 null g0<8,8,1>UW write (0, 12, 4, 0) mlen 6 rlen 0 { align1 EOT };
brw_wm_glsl_emit done:
commit 7850ce0a9990c7f752e43a1dd88c204a7cf090aa
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Fri Aug 27 11:26:08 2010 -0700

    glsl2: Eliminate zero-iteration loops
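The commit itself is not shown in this report, so the following is only a rough sketch of the idea its title suggests, not Mesa's actual implementation. It assumes a loop of the form `for (i = init; i CMP limit; ...)` with compile-time constant `init` and `limit`; the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical comparison kinds for a loop condition "i CMP limit". */
enum loop_cmp { CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/* Returns true when the loop condition is already false for the
 * initial counter value, meaning the body can never execute and the
 * whole loop can be dropped. */
bool loop_has_zero_iterations(int init, enum loop_cmp cmp, int limit)
{
    switch (cmp) {
    case CMP_LT: return !(init < limit);
    case CMP_LE: return !(init <= limit);
    case CMP_GT: return !(init > limit);
    case CMP_GE: return !(init >= limit);
    }
    /* Unknown comparison: conservatively keep the loop. */
    return false;
}
```

For the case reported here, a loop starting at 0 and running while `i < MAX_LIGHTS` with `MAX_LIGHTS` equal to 0 gives `loop_has_zero_iterations(0, CMP_LT, 0) == true`, so the loop would be a candidate for removal.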