r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[0]
  0: MOV OUT[0], IN[0]
  1: MOV OUT[1], IN[1]
  2: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Final vertex program code:
0: op: 0x00f00003 dst: 0t op: VE_ADD
  src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W
  src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
  src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
1: op: 0x00f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
2: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
3: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: DRM version: 2.24.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2
r300: GART size: 509 MB, VRAM size: 256 MB
r300: AA compression RAM: YES, Z compression RAM: YES, HiZ RAM: YES
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[0]
  0: MOV OUT[0], IN[0]
  1: MOV OUT[1], IN[1]
  2: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Final vertex program code:
0: op: 0x00f00003 dst: 0t op: VE_ADD
  src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W
  src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
  src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
1: op: 0x00f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
2: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
3: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: DRM version: 2.24.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2
r300: GART size: 509 MB, VRAM size: 256 MB
r300: AA compression RAM: YES, Z compression RAM: YES, HiZ RAM: YES
r300: Initial fragment program
FRAG
DCL IN[0], GENERIC[0], LINEAR
DCL OUT[0], COLOR
DCL SAMP[0]
  0: TEX OUT[0], IN[0], SAMP[0], 2D
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: src0.xyz = temp[1], src0.w = temp[1]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[0], 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[0], 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[2], input[0], 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[2], src0.w = temp[2] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE
  1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED
  2:TEX_ADDR: 0xe402e400: src: 0 R/G/B/A dst: 2 R/G/B/A
  3:TEX_DXDY: 0x00000000
1
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL CONST[0..3]
DCL TEMP[0]
  0: MUL TEMP[0], IN[0].xxxx, CONST[0]
  1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0]
  2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0]
  3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0]
  4: MOV_SAT OUT[1], IN[1]
  5: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Final vertex program code:
0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY
  src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X
  src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W
  src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y
  src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z
  src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
3: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD
  src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W
  src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
4: op: 0x01f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
5: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
6: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: Initial fragment program
FRAG
DCL IN[0], GENERIC[0], LINEAR
DCL OUT[0], COLOR
  0: MOV OUT[0], IN[0]
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL CONST[0..3]
DCL TEMP[0]
  0: MUL TEMP[0], IN[0].xxxx, CONST[0]
  1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0]
  2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0]
  3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0]
  4: MOV_SAT OUT[1], IN[1]
  5: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Final vertex program code:
0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY
  src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X
  src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W
  src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y
  src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z
  src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
3: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD
  src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W
  src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
4: op: 0x01f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
5: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
6: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], COLOR, COLOR
DCL OUT[0], COLOR
  0: MOV_SAT OUT[0], IN[0]
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV_SAT output[0], input[0];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[0]
  0: MOV OUT[0], IN[0]
  1: MOV OUT[1], IN[1]
  2: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Final vertex program code:
0: op: 0x00f00003 dst: 0t op: VE_ADD
  src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W
  src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
  src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
1: op: 0x00f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
2: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
3: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: DRM version: 2.24.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2
r300: GART size: 509 MB, VRAM size: 256 MB
r300: AA compression RAM: YES, Z compression RAM: YES, HiZ RAM: YES
r300: Initial fragment program
FRAG
DCL IN[0], GENERIC[0], LINEAR
DCL OUT[0], COLOR
DCL SAMP[0]
  0: TEX OUT[0], IN[0], SAMP[0], 2D
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: src0.xyz = temp[1], src0.w = temp[1]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[0], 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[0], 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[2], input[0], 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[2], src0.w = temp[2] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE
  1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED
  2:TEX_ADDR: 0xe402e400: src: 0 R/G/B/A dst: 2 R/G/B/A
  3:TEX_DXDY: 0x00000000
1
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial fragment program
FRAG
DCL IN[0], GENERIC[0], LINEAR
DCL OUT[0], COLOR
DCL SAMP[0]
  0: TEX OUT[0], IN[0], SAMP[0], 3D
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: TEX output[0], input[0], 3D[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: TEX output[0], input[0], 3D[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: TEX output[0], input[0], 3D[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: TEX output[0], input[0], 3D[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 3D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 3D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 3D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 3D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 3D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 3D[0];
  1: src0.xyz = temp[1], src0.w = temp[1]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[0], 3D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[0], 3D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[2], input[0], 3D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = temp[2], src0.w = temp[2] SEM_WAIT
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE
  1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED
  2:TEX_ADDR: 0xe402e400: src: 0 R/G/B/A dst: 2 R/G/B/A
  3:TEX_DXDY: 0x00000000
1
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL CONST[0..3]
DCL TEMP[0]
  0: MUL TEMP[0], IN[0].xxxx, CONST[0]
  1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0]
  2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0]
  3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0]
  4: MOV_SAT OUT[1], IN[1]
  5: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Final vertex program code:
0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY
  src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X
  src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W
  src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y
  src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z
  src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
3: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD
  src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W
  src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
4: op: 0x01f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
5: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
6: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: Initial fragment program
radeon: Acquired access to Hyper-Z.
FRAG
DCL IN[0], GENERIC[0], LINEAR
DCL OUT[0], COLOR
  0: MOV OUT[0], IN[0]
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
     MAD color[0].xyz, src0.xyz, src0.111, src0.000
     MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial fragment program
FRAG
DCL IN[0], GENERIC[0], LINEAR
  0: END
Fragment Program: before compilation
# Radeon Compiler Program
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
Fragment Program: after 'transform IF'
# Radeon Compiler Program
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
Fragment Program: after 'dead constants'
# Radeon Compiler Program
Fragment Program: after 'pair translate'
# Radeon Compiler Program
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
Fragment Program: after 'dead sources'
# Radeon Compiler Program
Fragment Program: after 'register allocation'
# Radeon Compiler Program
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00000005:OUT TEX_WAIT wmask: NONE omask: NONE
  1:RGB_ADDR 0x00000000:Addr0: 0t, Addr1: 0t, Addr2: 0t, srcp:0
  2:ALPHA_ADDR 0x00000000:Addr0: 0t, Addr1: 0t, Addr2: 0t, srcp:0
  3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0
  4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0
  5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 2: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 3: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: DRM version: 2.24.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2 r300: GART size: 509 MB, VRAM size: 256 MB r300: AA 
compression RAM: YES, Z compression RAM: YES, HiZ RAM: YES r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 2: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 3: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: DRM 
version: 2.24.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2 r300: GART size: 509 MB, VRAM size: 256 MB r300: AA compression RAM: YES, Z compression RAM: YES, HiZ RAM: YES Mesa: User error: GL_INVALID_VALUE in glPointSize 8562: glDebugOutputCallback: High severity API error 1, GL_INVALID_VALUE in glPointSize 0 8562 glPointSize(size = 0) 8562: warning: glGetError(glPointSize) = GL_INVALID_VALUE Mesa: User error: GL_INVALID_VALUE in glPointSize 8766: glDebugOutputCallback: High severity API error 1, GL_INVALID_VALUE in glPointSize 0 8766 glPointSize(size = 0) 8766: warning: glGetError(glPointSize) = GL_INVALID_VALUE r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], COLOR[1] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV_SAT OUT[1], IN[1] 5: MOV_SAT OUT[2], IN[2] 6: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, 
const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 
0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x01f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 6: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 7: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], COLOR, COLOR DCL IN[1], COLOR[1], COLOR DCL OUT[0], COLOR DCL CONST[0] DCL TEMP[0], ARRAY(1), LOCAL 0: MAD TEMP[0], IN[1], CONST[0], IN[0] 1: MOV_SAT OUT[0], TEMP[0] 2: END Fragment Program: before compilation # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MAD temp[0], input[1], 
const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MAD temp[0], input[1], const[0], input[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0], src2.xyz = input[0], src2.w = input[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[0].w, src0.w, src1.w, src2.w 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0], src2.xyz = input[0], src2.w = input[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[0].w, src0.w, src1.w, src2.w 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0], src2.xyz = input[0], src2.w = input[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[0].w, src0.w, src1.w, src2.w 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0], src2.xyz = input[0], src2.w = input[0] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 1: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00040001:Addr0: 1t, Addr1: 0c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 
0x00040001:Addr0: 1t, Addr1: 0c, Addr2: 0t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c222020:MAD dest:2 rgb_C_src:2 R/G/B 0 alp_C_src:2 A 0 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], COLOR[1] DCL OUT[3], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV_SAT OUT[1], IN[1] 5: MOV_SAT OUT[2], IN[2] 6: MOV OUT[3], IN[3] 7: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler 
Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 
0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x01f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 6: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10061 reg: 3i swiz: X/ Y/ Z/ W src1: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0 src2: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0 7: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 8: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], COLOR, COLOR DCL IN[1], COLOR[1], COLOR DCL IN[2], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL 0: TEX TEMP[0], IN[2].xyyy, SAMP[0], 2D 1: MOV_SAT TEMP[0], TEMP[0] 2: MUL TEMP[1], IN[1], CONST[0] 3: MAD TEMP[2], TEMP[0], IN[0], TEMP[1] 4: MOV_SAT OUT[0], TEMP[2] 5: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT 
output[0], temp[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: MOV_SAT output[0], temp[2]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 2: src0.xyz = input[1], 
src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 4: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 
MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[3], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[4].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[4].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[3], src0.w = temp[3] SEM_WAIT MAD_SAT temp[3].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[3].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[4], src2.w = temp[4] MAD temp[5].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[5].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4035402: src: 2 R/G/G/G dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c040:MAD dest:4 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 2 0:CMN_INST 0x00187804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00400003:Addr0: 3t, Addr1: 0t, Addr2: 4t, srcp:0 2:ALPHA_ADDR 
0x00400003:Addr0: 3t, Addr1: 0t, Addr2: 4t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c050:MAD dest:5 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c222050:MAD dest:5 rgb_C_src:2 R/G/B 0 alp_C_src:2 A 0 4 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 context mis-match in pipe_sampler_view_release() r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL CONST[0..6] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL DCL TEMP[3..4], LOCAL IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[0].yyyy, CONST[2] 1: MAD TEMP[0], CONST[1], IN[0].xxxx, TEMP[0] 2: MAD TEMP[0], CONST[3], IN[0].zzzz, TEMP[0] 3: ADD TEMP[0], TEMP[0], CONST[4] 4: ADD TEMP[1].x, -TEMP[0].zzzz, CONST[6].yyyy 5: MUL TEMP[1].x, TEMP[1].xxxx, CONST[6].wwww 6: MOV TEMP[2].xy, IN[1].xyxx 7: MOV TEMP[3].xw, TEMP[0].xxzw 8: MOV_SAT TEMP[1].x, TEMP[1].xxxx 9: MUL TEMP[4].x, TEMP[0].yyyy, CONST[0].yyyy 10: MOV TEMP[3].y, TEMP[4].xxxx 11: MAD TEMP[3].xy, CONST[0].zwww, TEMP[0].wwww, TEMP[3].xyyy 12: MAD TEMP[0].x, TEMP[0].zzzz, IMM[0].xxxx, -TEMP[0].wwww 13: MOV TEMP[3].z, TEMP[0].xxxx 14: MOV OUT[1], TEMP[1].xxxx 15: MOV OUT[0], TEMP[3] 16: MOV OUT[2], TEMP[2] 17: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV 
temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV 
output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; CONST[7] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: 
MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 4: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x1e924000 reg: 0t swiz: -Z/-Z/-Z/-Z src1: 0x004920c2 reg: 6c swiz: Y/ Y/ Y/ Y src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 5: op: 0x00102002 dst: 1t op: VE_MULTIPLY src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x00db60c2 reg: 6c swiz: W/ W/ W/ W src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 6: op: 0x00304003 dst: 2t op: VE_ADD src0: 0x00010021 reg: 1i swiz: X/ Y/ X/ X src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 7: op: 0x00906003 dst: 3t op: VE_ADD src0: 0x00d00000 reg: 0t swiz: X/ X/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 
reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x01102003 dst: 1t op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00108002 dst: 4t op: VE_MULTIPLY src0: 0x00492000 reg: 0t swiz: Y/ Y/ Y/ Y src1: 0x00492002 reg: 0c swiz: Y/ Y/ Y/ Y src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 10: op: 0x00206003 dst: 3t op: VE_ADD src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 11: op: 0x00306004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00db4002 reg: 0c swiz: Z/ W/ W/ W src1: 0x00db6000 reg: 0t swiz: W/ W/ W/ W src2: 0x00490060 reg: 3t swiz: X/ Y/ Y/ Y 12: op: 0x00100004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924000 reg: 0t swiz: Z/ Z/ Z/ Z src1: 0x000000e2 reg: 7c swiz: X/ X/ X/ X src2: 0x1edb6000 reg: 0t swiz: -W/-W/-W/-W 13: op: 0x00406003 dst: 3t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 14: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 15: op: 0x00f0a003 dst: 5t op: VE_ADD src0: 0x00d10060 reg: 3t swiz: X/ Y/ Z/ W src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 16: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 17: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 18: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR 
DCL SAMP[0] DCL CONST[1..2] DCL TEMP[0], LOCAL DCL TEMP[1], ARRAY(1), LOCAL DCL TEMP[2], LOCAL 0: TEX TEMP[0], IN[1].xyyy, SAMP[0], 2D 1: MOV TEMP[1], TEMP[0] 2: ADD TEMP[0].x, CONST[2].zzzz, -IN[0].xxxx 3: ADD TEMP[2].x, CONST[2].zzzz, -CONST[2].yyyy 4: RCP TEMP[2].x, TEMP[2].xxxx 5: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[2].xxxx 6: LRP TEMP[1].xyz, TEMP[0].xxxx, TEMP[1].xyzz, CONST[1].xyzz 7: MOV_SAT OUT[0], TEMP[1] 8: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, 
const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: ADD temp[3].xyz, temp[1].xyzz, -const[1].xyzz; 7: MAD temp[1].xyz, temp[0].xxxx, temp[3], const[1].xyzz; 8: MOV_SAT output[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: ADD temp[3].xyz, temp[1].xyzz, -const[1].xyzz; 7: MAD temp[1].xyz, temp[0].xxxx, temp[3], const[1].xyzz; 8: MOV_SAT output[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, 
temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: ADD temp[3].xyz, temp[1].xyzz, -const[1].xyzz; 7: MAD temp[1].xyz, temp[0].xxxx, temp[3], const[1].xyzz; 8: MOV_SAT output[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 2: src0.xyz = const[2], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 3: src0.xyz = const[2] MAD temp[2].x, src0.zzz, src0.111, -src0.yyy 4: src0.xyz = temp[2] REPL_ALPHA temp[2].x RCP, src0.x 5: src0.xyz = temp[0], src1.xyz = temp[2] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 6: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 7: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[1] MAD temp[1].xyz, src0.xxx, src1.xyz, src2.xyz 8: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[2] MAD temp[2].x, src0.zzz, src0.111, -src0.yyy 3: src0.xyz = temp[2] REPL_ALPHA temp[2].x RCP, src0.x 4: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 5: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[2], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = temp[0], src1.xyz = temp[2] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[1] MAD temp[1].xyz, src0.xxx, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 
'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[2] MAD temp[2].x, src0.zzz, src0.111, -src0.yyy 3: src0.xyz = temp[2] REPL_ALPHA temp[2].x RCP, src0.x 4: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 5: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[2], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = temp[0], src1.xyz = temp[2] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[1] MAD temp[1].xyz, src0.xxx, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[2], input[0].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[2] MAD temp[4].x, src0.zzz, src0.111, -src0.yyy 3: src0.xyz = temp[4] REPL_ALPHA temp[4].x RCP, src0.x 4: src0.xyz = temp[2], src0.w = temp[2] SEM_WAIT MAD temp[3].xyz, src0.xyz, src0.111, src0.000 MAD temp[3].w, src0.w, src0.1, src0.0 5: src0.xyz = temp[3], src1.xyz = const[1] MAD temp[5].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[2], src1.xyz = input[1] MAD temp[2].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = temp[2], src1.xyz = temp[4] MAD_SAT temp[2].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[2], src1.xyz = temp[5], src2.xyz = const[1] MAD temp[3].xyz, src0.xxx, src1.xyz, src2.xyz 9: src0.xyz = temp[3], src0.w = temp[3] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4025400: src: 0 R/G/G/G dst: 2 R/G/B/A 
3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020102:Addr0: 2c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924040:MAD dest:4 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 2 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000004a:SOP dest:4 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040403:Addr0: 3t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21050:MAD dest:5 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000502:Addr0: 2c, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 
0x00801020:MAD dest:2 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001002:Addr0: 2t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10101402:Addr0: 2t, Addr1: 5t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222030:MAD dest:3 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 8 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL OUT[5], GENERIC[3] DCL CONST[0..17] DCL TEMP[0..3], LOCAL DCL TEMP[4..7], ARRAY(1), LOCAL IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[0].yyyy, CONST[7] 1: MAD TEMP[0], CONST[6], IN[0].xxxx, TEMP[0] 2: MAD TEMP[0], CONST[8], IN[0].zzzz, TEMP[0] 3: ADD TEMP[1], TEMP[0], CONST[9] 4: MAD TEMP[2].x, TEMP[1].xxxx, CONST[1].xxxx, CONST[16].wwww 5: MAD TEMP[2].x, TEMP[2].xxxx, CONST[1].yyyy, CONST[1].zzzz 6: FRC TEMP[2].x, TEMP[2].xxxx 7: MAD TEMP[2].x, TEMP[2].xxxx, CONST[2].xxxx, CONST[2].yyyy 8: MUL TEMP[2].x, TEMP[2].xxxx, 
TEMP[2].xxxx 9: MAD TEMP[3].x, TEMP[2].xxxx, CONST[2].zzzz, CONST[2].wwww 10: MAD TEMP[3].x, TEMP[2].xxxx, TEMP[3].xxxx, CONST[1].wwww 11: MAD TEMP[3].x, TEMP[2].xxxx, TEMP[3].xxxx, CONST[0].xxxx 12: MAD TEMP[3].x, TEMP[2].xxxx, TEMP[3].xxxx, CONST[0].yyyy 13: MAD TEMP[3].x, TEMP[2].xxxx, TEMP[3].xxxx, CONST[0].zzzz 14: MAD TEMP[3].x, TEMP[3].xxxx, CONST[0].wwww, TEMP[1].yyyy 15: ADD TEMP[3].x, TEMP[3].xxxx, CONST[3].xxxx 16: MUL TEMP[0], TEMP[3].xxxx, CONST[11] 17: MAD TEMP[0], CONST[10], TEMP[1].xxxx, TEMP[0] 18: MUL TEMP[2], CONST[17].yyzy, CONST[3].yzzy 19: MAD TEMP[0], CONST[12], TEMP[1].zzzz, TEMP[0] 20: MAD TEMP[3].xy, TEMP[1].zxxx, CONST[3].wwww, TEMP[2].xyyy 21: MAD TEMP[0], CONST[13], TEMP[1].wwww, TEMP[0] 22: MAD TEMP[1].xy, TEMP[1].zxxx, CONST[4].xxxx, TEMP[2].zwww 23: ADD TEMP[2].x, -TEMP[0].zzzz, CONST[15].yyyy 24: MUL TEMP[2].x, TEMP[2].xxxx, CONST[15].wwww 25: MOV TEMP[4], TEMP[0] 26: MOV TEMP[5].xy, IN[1].xyxx 27: MOV TEMP[6].xy, TEMP[3].xyxx 28: MOV TEMP[7].xy, TEMP[1].xyxx 29: MOV TEMP[1].xw, TEMP[0].xxzw 30: MOV_SAT TEMP[2].x, TEMP[2].xxxx 31: MUL TEMP[3].x, TEMP[0].yyyy, CONST[5].yyyy 32: MOV TEMP[1].y, TEMP[3].xxxx 33: MAD TEMP[1].xy, CONST[5].zwww, TEMP[0].wwww, TEMP[1].xyyy 34: MAD TEMP[0].x, TEMP[0].zzzz, IMM[0].xxxx, -TEMP[0].wwww 35: MOV TEMP[1].z, TEMP[0].xxxx 36: MOV OUT[1], TEMP[2].xxxx 37: MOV OUT[0], TEMP[1] 38: MOV OUT[2], TEMP[4] 39: MOV OUT[3], TEMP[5] 40: MOV OUT[4], TEMP[6] 41: MOV OUT[5], TEMP[7] 42: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[7]; 1: MAD temp[0], const[6], input[0].xxxx, temp[0]; 2: MAD temp[0], const[8], input[0].zzzz, temp[0]; 3: ADD temp[1], temp[0], const[9]; 4: MAD temp[2].x, temp[1].xxxx, const[1].xxxx, const[16].wwww; 5: MAD temp[2].x, temp[2].xxxx, const[1].yyyy, const[1].zzzz; 6: FRC temp[2].x, temp[2].xxxx; 7: MAD temp[2].x, temp[2].xxxx, const[2].xxxx, const[2].yyyy; 8: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 9: MAD temp[3].x, temp[2].xxxx, 
const[2].zzzz, const[2].wwww; 10: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[1].wwww; 11: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].xxxx; 12: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].yyyy; 13: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].zzzz; 14: MAD temp[3].x, temp[3].xxxx, const[0].wwww, temp[1].yyyy; 15: ADD temp[3].x, temp[3].xxxx, const[3].xxxx; 16: MUL temp[0], temp[3].xxxx, const[11]; 17: MAD temp[0], const[10], temp[1].xxxx, temp[0]; 18: MUL temp[2], const[17].yyzy, const[3].yzzy; 19: MAD temp[0], const[12], temp[1].zzzz, temp[0]; 20: MAD temp[3].xy, temp[1].zxxx, const[3].wwww, temp[2].xyyy; 21: MAD temp[0], const[13], temp[1].wwww, temp[0]; 22: MAD temp[1].xy, temp[1].zxxx, const[4].xxxx, temp[2].zwww; 23: ADD temp[2].x, -temp[0].zzzz, const[15].yyyy; 24: MUL temp[2].x, temp[2].xxxx, const[15].wwww; 25: MOV temp[4], temp[0]; 26: MOV temp[5].xy, input[1].xyxx; 27: MOV temp[6].xy, temp[3].xyxx; 28: MOV temp[7].xy, temp[1].xyxx; 29: MOV temp[1].xw, temp[0].xxzw; 30: MOV_SAT temp[2].x, temp[2].xxxx; 31: MUL temp[3].x, temp[0].yyyy, const[5].yyyy; 32: MOV temp[1].y, temp[3].xxxx; 33: MAD temp[1].xy, const[5].zwww, temp[0].wwww, temp[1].xyyy; 34: MAD temp[0].x, temp[0].zzzz, const[18].xxxx, -temp[0].wwww; 35: MOV temp[1].z, temp[0].xxxx; 36: MOV output[1], temp[2].xxxx; 37: MOV temp[8], temp[1]; 38: MOV output[2], temp[4]; 39: MOV output[3], temp[5]; 40: MOV output[4], temp[6]; 41: MOV output[5], temp[7]; 42: MOV output[0], temp[8]; 43: MOV output[6], temp[8]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[7]; 1: MAD temp[0], const[6], input[0].xxxx, temp[0]; 2: MAD temp[0], const[8], input[0].zzzz, temp[0]; 3: ADD temp[1], temp[0], const[9]; 4: MAD temp[2].x, temp[1].xxxx, const[1].xxxx, const[16].wwww; 5: MAD temp[2].x, temp[2].xxxx, const[1].yyyy, const[1].zzzz; 6: FRC temp[2].x, temp[2].xxxx; 7: MAD temp[2].x, temp[2].xxxx, const[2].xxxx, const[2].yyyy; 8: 
MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 9: MAD temp[3].x, temp[2].xxxx, const[2].zzzz, const[2].wwww; 10: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[1].wwww; 11: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].xxxx; 12: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].yyyy; 13: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].zzzz; 14: MAD temp[3].x, temp[3].xxxx, const[0].wwww, temp[1].yyyy; 15: ADD temp[3].x, temp[3].xxxx, const[3].xxxx; 16: MUL temp[0], temp[3].xxxx, const[11]; 17: MAD temp[0], const[10], temp[1].xxxx, temp[0]; 18: MUL temp[2], const[17].yyzy, const[3].yzzy; 19: MAD temp[0], const[12], temp[1].zzzz, temp[0]; 20: MAD temp[3].xy, temp[1].zxxx, const[3].wwww, temp[2].xyyy; 21: MAD temp[0], const[13], temp[1].wwww, temp[0]; 22: MAD temp[1].xy, temp[1].zxxx, const[4].xxxx, temp[2].zwww; 23: ADD temp[2].x, -temp[0].zzzz, const[15].yyyy; 24: MUL temp[2].x, temp[2].xxxx, const[15].wwww; 25: MOV temp[4], temp[0]; 26: MOV temp[5].xy, input[1].xyxx; 27: MOV temp[6].xy, temp[3].xyxx; 28: MOV temp[7].xy, temp[1].xyxx; 29: MOV temp[1].xw, temp[0].xxzw; 30: MOV_SAT temp[2].x, temp[2].xxxx; 31: MUL temp[3].x, temp[0].yyyy, const[5].yyyy; 32: MOV temp[1].y, temp[3].xxxx; 33: MAD temp[1].xy, const[5].zwww, temp[0].wwww, temp[1].xyyy; 34: MAD temp[0].x, temp[0].zzzz, const[18].xxxx, -temp[0].wwww; 35: MOV temp[1].z, temp[0].xxxx; 36: MOV output[1], temp[2].xxxx; 37: MOV temp[8], temp[1]; 38: MOV output[2], temp[4]; 39: MOV output[3], temp[5]; 40: MOV output[4], temp[6]; 41: MOV output[5], temp[7]; 42: MOV output[0], temp[8]; 43: MOV output[6], temp[8]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[7]; 1: MAD temp[0], const[6], input[0].xxxx, temp[0]; 2: MAD temp[0], const[8], input[0].zzzz, temp[0]; 3: ADD temp[1], temp[0], const[9]; 4: MAD temp[2].x, temp[1].xxxx, const[1].xxxx, const[16].wwww; 5: MAD temp[2].x, temp[2].xxxx, const[1].yyyy, const[1].zzzz; 6: FRC temp[2].x, temp[2].xxxx; 7: 
MAD temp[2].x, temp[2].xxxx, const[2].xxxx, const[2].yyyy; 8: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 9: MAD temp[3].x, temp[2].xxxx, const[2].zzzz, const[2].wwww; 10: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[1].wwww; 11: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].xxxx; 12: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].yyyy; 13: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].zzzz; 14: MAD temp[3].x, temp[3].xxxx, const[0].wwww, temp[1].yyyy; 15: ADD temp[3].x, temp[3].xxxx, const[3].xxxx; 16: MUL temp[0], temp[3].xxxx, const[11]; 17: MAD temp[0], const[10], temp[1].xxxx, temp[0]; 18: MUL temp[2], const[17].yyzy, const[3].yzzy; 19: MAD temp[0], const[12], temp[1].zzzz, temp[0]; 20: MAD temp[3].xy, temp[1].zxxx, const[3].wwww, temp[2].xyyy; 21: MAD temp[0], const[13], temp[1].wwww, temp[0]; 22: MAD temp[1].xy, temp[1].zxxx, const[4].xxxx, temp[2].zwww; 23: ADD temp[2].x, -temp[0].zzzz, const[15].yyyy; 24: MUL temp[2].x, temp[2].xxxx, const[15].wwww; 25: MOV temp[4], temp[0]; 26: MOV temp[5].xy, input[1].xyxx; 27: MOV temp[6].xy, temp[3].xyxx; 28: MOV temp[7].xy, temp[1].xyxx; 29: MOV temp[1].xw, temp[0].xxzw; 30: MOV_SAT temp[2].x, temp[2].xxxx; 31: MUL temp[3].x, temp[0].yyyy, const[5].yyyy; 32: MOV temp[1].y, temp[3].xxxx; 33: MAD temp[1].xy, const[5].zwww, temp[0].wwww, temp[1].xyyy; 34: MAD temp[0].x, temp[0].zzzz, const[18].xxxx, -temp[0].wwww; 35: MOV temp[1].z, temp[0].xxxx; 36: MOV output[1], temp[2].xxxx; 37: MOV temp[8], temp[1]; 38: MOV output[2], temp[4]; 39: MOV output[3], temp[5]; 40: MOV output[4], temp[6]; 41: MOV output[5], temp[7]; 42: MOV output[0], temp[8]; 43: MOV output[6], temp[8]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[7]; 1: MAD temp[0], const[6], input[0].xxxx, temp[0]; 2: MAD temp[0], const[8], input[0].zzzz, temp[0]; 3: ADD temp[1], temp[0], const[9]; 4: MOV temp[9], const[16].wwww; 5: MAD temp[2].x, temp[1].xxxx, const[1].xxxx, temp[9]; 
6: MAD temp[2].x, temp[2].xxxx, const[1].yyyy, const[1].zzzz; 7: FRC temp[2].x, temp[2].xxxx; 8: MAD temp[2].x, temp[2].xxxx, const[2].xxxx, const[2].yyyy; 9: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 10: MAD temp[3].x, temp[2].xxxx, const[2].zzzz, const[2].wwww; 11: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[1].wwww; 12: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].xxxx; 13: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].yyyy; 14: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].zzzz; 15: MAD temp[3].x, temp[3].xxxx, const[0].wwww, temp[1].yyyy; 16: ADD temp[3].x, temp[3].xxxx, const[3].xxxx; 17: MUL temp[0], temp[3].xxxx, const[11]; 18: MAD temp[0], const[10], temp[1].xxxx, temp[0]; 19: MOV temp[10], const[3].yzzy; 20: MUL temp[2], const[17].yyzy, temp[10]; 21: MAD temp[0], const[12], temp[1].zzzz, temp[0]; 22: MAD temp[3].xy, temp[1].zxxx, const[3].wwww, temp[2].xyyy; 23: MAD temp[0], const[13], temp[1].wwww, temp[0]; 24: MAD temp[1].xy, temp[1].zxxx, const[4].xxxx, temp[2].zwww; 25: ADD temp[2].x, -temp[0].zzzz, const[15].yyyy; 26: MUL temp[2].x, temp[2].xxxx, const[15].wwww; 27: MOV temp[4], temp[0]; 28: MOV temp[5].xy, input[1].xyxx; 29: MOV temp[6].xy, temp[3].xyxx; 30: MOV temp[7].xy, temp[1].xyxx; 31: MOV temp[1].xw, temp[0].xxzw; 32: MOV_SAT temp[2].x, temp[2].xxxx; 33: MUL temp[3].x, temp[0].yyyy, const[5].yyyy; 34: MOV temp[1].y, temp[3].xxxx; 35: MAD temp[1].xy, const[5].zwww, temp[0].wwww, temp[1].xyyy; 36: MAD temp[0].x, temp[0].zzzz, const[18].xxxx, -temp[0].wwww; 37: MOV temp[1].z, temp[0].xxxx; 38: MOV output[1], temp[2].xxxx; 39: MOV temp[8], temp[1]; 40: MOV output[2], temp[4]; 41: MOV output[3], temp[5]; 42: MOV output[4], temp[6]; 43: MOV output[5], temp[7]; 44: MOV output[0], temp[8]; 45: MOV output[6], temp[8]; CONST[18] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[7]; 1: MAD temp[0], const[6], input[0].xxxx, temp[0]; 2: MAD 
temp[0], const[8], input[0].zzzz, temp[0]; 3: ADD temp[1], temp[0], const[9]; 4: MOV temp[9], const[16].wwww; 5: MAD temp[2].x, temp[1].xxxx, const[1].xxxx, temp[9]; 6: MAD temp[2].x, temp[2].xxxx, const[1].yyyy, const[1].zzzz; 7: FRC temp[2].x, temp[2].xxxx; 8: MAD temp[2].x, temp[2].xxxx, const[2].xxxx, const[2].yyyy; 9: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 10: MAD temp[3].x, temp[2].xxxx, const[2].zzzz, const[2].wwww; 11: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[1].wwww; 12: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].xxxx; 13: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].yyyy; 14: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].zzzz; 15: MAD temp[3].x, temp[3].xxxx, const[0].wwww, temp[1].yyyy; 16: ADD temp[3].x, temp[3].xxxx, const[3].xxxx; 17: MUL temp[0], temp[3].xxxx, const[11]; 18: MAD temp[0], const[10], temp[1].xxxx, temp[0]; 19: MOV temp[10], const[3].yzzy; 20: MUL temp[2], const[17].yyzy, temp[10]; 21: MAD temp[0], const[12], temp[1].zzzz, temp[0]; 22: MAD temp[3].xy, temp[1].zxxx, const[3].wwww, temp[2].xyyy; 23: MAD temp[0], const[13], temp[1].wwww, temp[0]; 24: MAD temp[1].xy, temp[1].zxxx, const[4].xxxx, temp[2].zwww; 25: ADD temp[2].x, -temp[0].zzzz, const[15].yyyy; 26: MUL temp[2].x, temp[2].xxxx, const[15].wwww; 27: MOV temp[4], temp[0]; 28: MOV temp[5].xy, input[1].xyxx; 29: MOV temp[6].xy, temp[3].xyxx; 30: MOV temp[7].xy, temp[1].xyxx; 31: MOV temp[1].xw, temp[0].xxzw; 32: MOV_SAT temp[2].x, temp[2].xxxx; 33: MUL temp[3].x, temp[0].yyyy, const[5].yyyy; 34: MOV temp[1].y, temp[3].xxxx; 35: MAD temp[1].xy, const[5].zwww, temp[0].wwww, temp[1].xyyy; 36: MAD temp[0].x, temp[0].zzzz, const[18].xxxx, -temp[0].wwww; 37: MOV temp[1].z, temp[0].xxxx; 38: MOV output[1], temp[2].xxxx; 39: MOV temp[8], temp[1]; 40: MOV output[2], temp[4]; 41: MOV output[3], temp[5]; 42: MOV output[4], temp[6]; 43: MOV output[5], temp[7]; 44: MOV output[0], temp[8]; 45: MOV output[6], temp[8]; Vertex Program: after 'lower control flow 
opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[7]; 1: MAD temp[0], const[6], input[0].xxxx, temp[0]; 2: MAD temp[0], const[8], input[0].zzzz, temp[0]; 3: ADD temp[1], temp[0], const[9]; 4: MOV temp[9], const[16].wwww; 5: MAD temp[2].x, temp[1].xxxx, const[1].xxxx, temp[9]; 6: MAD temp[2].x, temp[2].xxxx, const[1].yyyy, const[1].zzzz; 7: FRC temp[2].x, temp[2].xxxx; 8: MAD temp[2].x, temp[2].xxxx, const[2].xxxx, const[2].yyyy; 9: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 10: MAD temp[3].x, temp[2].xxxx, const[2].zzzz, const[2].wwww; 11: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[1].wwww; 12: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].xxxx; 13: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].yyyy; 14: MAD temp[3].x, temp[2].xxxx, temp[3].xxxx, const[0].zzzz; 15: MAD temp[3].x, temp[3].xxxx, const[0].wwww, temp[1].yyyy; 16: ADD temp[3].x, temp[3].xxxx, const[3].xxxx; 17: MUL temp[0], temp[3].xxxx, const[11]; 18: MAD temp[0], const[10], temp[1].xxxx, temp[0]; 19: MOV temp[10], const[3].yzzy; 20: MUL temp[2], const[17].yyzy, temp[10]; 21: MAD temp[0], const[12], temp[1].zzzz, temp[0]; 22: MAD temp[3].xy, temp[1].zxxx, const[3].wwww, temp[2].xyyy; 23: MAD temp[0], const[13], temp[1].wwww, temp[0]; 24: MAD temp[1].xy, temp[1].zxxx, const[4].xxxx, temp[2].zwww; 25: ADD temp[2].x, -temp[0].zzzz, const[15].yyyy; 26: MUL temp[2].x, temp[2].xxxx, const[15].wwww; 27: MOV temp[4], temp[0]; 28: MOV temp[5].xy, input[1].xyxx; 29: MOV temp[6].xy, temp[3].xyxx; 30: MOV temp[7].xy, temp[1].xyxx; 31: MOV temp[1].xw, temp[0].xxzw; 32: MOV_SAT temp[2].x, temp[2].xxxx; 33: MUL temp[3].x, temp[0].yyyy, const[5].yyyy; 34: MOV temp[1].y, temp[3].xxxx; 35: MAD temp[1].xy, const[5].zwww, temp[0].wwww, temp[1].xyyy; 36: MAD temp[0].x, temp[0].zzzz, const[18].xxxx, -temp[0].wwww; 37: MOV temp[1].z, temp[0].xxxx; 38: MOV output[1], temp[2].xxxx; 39: MOV temp[8], temp[1]; 40: MOV output[2], temp[4]; 41: MOV output[3], temp[5]; 42: MOV output[4], 
temp[6]; 43: MOV output[5], temp[7]; 44: MOV output[0], temp[8]; 45: MOV output[6], temp[8]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f02003 dst: 1t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x01248122 reg: 9c swiz: 0/ 0/ 0/ 0 4: op: 0x00f12003 dst: 9t op: VE_ADD src0: 0x00db6202 reg: 16c swiz: W/ W/ W/ W src1: 0x01248202 reg: 16c swiz: 0/ 0/ 0/ 0 src2: 0x01248202 reg: 16c swiz: 0/ 0/ 0/ 0 5: op: 0x00104004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x00000022 reg: 1c swiz: X/ X/ X/ X src2: 0x00d10120 reg: 9t swiz: X/ Y/ Z/ W 6: op: 0x00104004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y src2: 0x00924022 reg: 1c swiz: Z/ Z/ Z/ Z 7: op: 0x00104006 dst: 2t op: VE_FRACTION src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 8: op: 0x00104004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00000042 reg: 2c swiz: X/ X/ X/ X src2: 0x00492042 reg: 2c swiz: Y/ Y/ Y/ Y 9: op: 0x00104002 dst: 2t op: VE_MULTIPLY src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00000040 reg: 2t swiz: X/ X/ X/ X src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 10: op: 0x00106004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00924042 reg: 2c swiz: Z/ Z/ Z/ Z src2: 0x00db6042 reg: 2c swiz: W/ W/ W/ W 11: op: 
0x00106004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00000060 reg: 3t swiz: X/ X/ X/ X src2: 0x00db6022 reg: 1c swiz: W/ W/ W/ W 12: op: 0x00106004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00000060 reg: 3t swiz: X/ X/ X/ X src2: 0x00000002 reg: 0c swiz: X/ X/ X/ X 13: op: 0x00106004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00000060 reg: 3t swiz: X/ X/ X/ X src2: 0x00492002 reg: 0c swiz: Y/ Y/ Y/ Y 14: op: 0x00106004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00000060 reg: 3t swiz: X/ X/ X/ X src2: 0x00924002 reg: 0c swiz: Z/ Z/ Z/ Z 15: op: 0x00106004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x00db6002 reg: 0c swiz: W/ W/ W/ W src2: 0x00492020 reg: 1t swiz: Y/ Y/ Y/ Y 16: op: 0x00106003 dst: 3t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x00000062 reg: 3c swiz: X/ X/ X/ X src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 17: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x00d10162 reg: 11c swiz: X/ Y/ Z/ W src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 18: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src1: 0x00000020 reg: 1t swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 19: op: 0x00f14003 dst: 10t op: VE_ADD src0: 0x00522062 reg: 3c swiz: Y/ Z/ Z/ Y src1: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 20: op: 0x00f04002 dst: 2t op: VE_MULTIPLY src0: 0x00512222 reg: 17c swiz: Y/ Y/ Z/ Y src1: 0x00d10140 reg: 10t swiz: X/ Y/ Z/ W src2: 0x01248140 reg: 10t swiz: 0/ 0/ 0/ 0 21: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10182 reg: 12c swiz: X/ Y/ Z/ W src1: 0x00924020 reg: 1t swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 22: op: 0x00306004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00004020 reg: 1t swiz: Z/ X/ X/ X src1: 0x00db6062 reg: 3c swiz: W/ 
W/ W/ W src2: 0x00490040 reg: 2t swiz: X/ Y/ Y/ Y 23: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d101a2 reg: 13c swiz: X/ Y/ Z/ W src1: 0x00db6020 reg: 1t swiz: W/ W/ W/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 24: op: 0x00302004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00004020 reg: 1t swiz: Z/ X/ X/ X src1: 0x00000082 reg: 4c swiz: X/ X/ X/ X src2: 0x00db4040 reg: 2t swiz: Z/ W/ W/ W 25: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x1e924000 reg: 0t swiz: -Z/-Z/-Z/-Z src1: 0x004921e2 reg: 15c swiz: Y/ Y/ Y/ Y src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 26: op: 0x00104002 dst: 2t op: VE_MULTIPLY src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00db61e2 reg: 15c swiz: W/ W/ W/ W src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 27: op: 0x00f08003 dst: 4t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 28: op: 0x0030a003 dst: 5t op: VE_ADD src0: 0x00010021 reg: 1i swiz: X/ Y/ X/ X src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 29: op: 0x0030c003 dst: 6t op: VE_ADD src0: 0x00010060 reg: 3t swiz: X/ Y/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 30: op: 0x0030e003 dst: 7t op: VE_ADD src0: 0x00010020 reg: 1t swiz: X/ Y/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 31: op: 0x00902003 dst: 1t op: VE_ADD src0: 0x00d00000 reg: 0t swiz: X/ X/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 32: op: 0x01104003 dst: 2t op: VE_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 33: op: 0x00106002 dst: 3t op: VE_MULTIPLY src0: 0x00492000 reg: 0t swiz: Y/ Y/ Y/ Y src1: 0x004920a2 reg: 5c swiz: Y/ Y/ Y/ Y src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 34: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 
0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 35: op: 0x00302004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00db40a2 reg: 5c swiz: Z/ W/ W/ W src1: 0x00db6000 reg: 0t swiz: W/ W/ W/ W src2: 0x00490020 reg: 1t swiz: X/ Y/ Y/ Y 36: op: 0x00100004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924000 reg: 0t swiz: Z/ Z/ Z/ Z src1: 0x00000242 reg: 18c swiz: X/ X/ X/ X src2: 0x1edb6000 reg: 0t swiz: -W/-W/-W/-W 37: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 38: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 39: op: 0x00f10003 dst: 8t op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 40: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 41: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 42: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d100c0 reg: 6t swiz: X/ Y/ Z/ W src1: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 src2: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 43: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 44: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W src1: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 45: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W src1: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], 
PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL IN[4], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL CONST[0..1] DCL CONST[5..7] DCL TEMP[0..4], LOCAL DCL TEMP[5], ARRAY(1), LOCAL IMM[0] FLT32 { 1.0000, 2.0000, -1.0000, 0.0000} 0: TEX TEMP[0].xy, IN[3].xyyy, SAMP[2], 2D 1: MAD TEMP[1].x, TEMP[0].xxxx, IMM[0].yyyy, IMM[0].zzzz 2: MAD TEMP[0].x, TEMP[0].yyyy, IMM[0].yyyy, IMM[0].zzzz 3: MOV TEMP[1].y, TEMP[0].xxxx 4: TEX TEMP[2].xy, IN[4].xyyy, SAMP[2], 2D 5: MAD TEMP[3].x, TEMP[2].xxxx, IMM[0].yyyy, IMM[0].zzzz 6: MAD TEMP[2].x, TEMP[2].yyyy, IMM[0].yyyy, IMM[0].zzzz 7: MOV TEMP[3].y, TEMP[2].xxxx 8: RCP TEMP[2].x, IN[1].wwww 9: MUL TEMP[2].xy, TEMP[2].xxxx, IN[1].xyyy 10: MAD TEMP[2].xy, TEMP[2].xyyy, CONST[1].xyyy, CONST[1].xxxx 11: MAD TEMP[4].xy, CONST[0].xxxx, TEMP[1].xyyy, CONST[0].yyyy 12: MAD TEMP[4].xy, CONST[0].xxxx, TEMP[3].xyyy, TEMP[4].xyyy 13: ADD TEMP[4].xy, TEMP[4].xyyy, CONST[0].yyyy 14: MAD TEMP[2].xy, TEMP[4].xyyy, CONST[0].wwww, TEMP[2].xyyy 15: MUL TEMP[1].x, TEMP[1].xxxx, TEMP[3].xxxx 16: MAD TEMP[0].xy, TEMP[4].xyyy, CONST[1].zzzz, IN[2].xyyy 17: TEX TEMP[2].xyz, TEMP[2].xyyy, SAMP[1], 2D 18: TEX TEMP[3], TEMP[0].xyyy, SAMP[0], 2D 19: MUL TEMP[4].x, TEMP[1].xxxx, TEMP[1].xxxx 20: MUL TEMP[1].x, TEMP[1].xxxx, TEMP[4].xxxx 21: LRP TEMP[2].xyz, TEMP[3].wwww, TEMP[3].xyzz, TEMP[2].xyzz 22: MUL TEMP[1].x, TEMP[1].xxxx, CONST[0].zzzz 23: MOV_SAT TEMP[1].x, TEMP[1].xxxx 24: MAD TEMP[0].xyz, TEMP[2].xyzz, CONST[5].xyzz, TEMP[1].xxxx 25: MOV TEMP[0].w, CONST[1].wwww 26: MOV TEMP[5], TEMP[0] 27: ADD TEMP[0].x, CONST[7].zzzz, -IN[0].xxxx 28: ADD TEMP[1].x, CONST[7].zzzz, -CONST[7].yyyy 29: RCP TEMP[1].x, TEMP[1].xxxx 30: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx 31: LRP TEMP[5].xyz, TEMP[0].xxxx, TEMP[5].xyzz, CONST[6].xyzz 32: MOV_SAT OUT[0], TEMP[5] 33: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD 
temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: LRP temp[2].xyz, temp[3].wwww, temp[3].xyzz, temp[2].xyzz; 22: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 23: MOV_SAT temp[1].x, temp[1].xxxx; 24: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 25: MOV temp[0].w, const[1].wwww; 26: MOV temp[5], temp[0]; 27: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 28: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 29: RCP temp[1].x, temp[1].xxxx; 30: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 31: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 32: MOV_SAT output[0], temp[5]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 
7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: LRP temp[2].xyz, temp[3].wwww, temp[3].xyzz, temp[2].xyzz; 22: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 23: MOV_SAT temp[1].x, temp[1].xxxx; 24: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 25: MOV temp[0].w, const[1].wwww; 26: MOV temp[5], temp[0]; 27: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 28: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 29: RCP temp[1].x, temp[1].xxxx; 30: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 31: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 32: MOV_SAT output[0], temp[5]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD 
temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: LRP temp[2].xyz, temp[3].wwww, temp[3].xyzz, temp[2].xyzz; 22: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 23: MOV_SAT temp[1].x, temp[1].xxxx; 24: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 25: MOV temp[0].w, const[1].wwww; 26: MOV temp[5], temp[0]; 27: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 28: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 29: RCP temp[1].x, temp[1].xxxx; 30: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 31: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 32: MOV_SAT output[0], temp[5]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, 
temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: LRP temp[2].xyz, temp[3].wwww, temp[3].xyzz, temp[2].xyzz; 22: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 23: MOV_SAT temp[1].x, temp[1].xxxx; 24: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 25: MOV temp[0].w, const[1].wwww; 26: MOV temp[5], temp[0]; 27: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 28: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 29: RCP temp[1].x, temp[1].xxxx; 30: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 31: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 32: MOV_SAT output[0], temp[5]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: LRP temp[2].xyz, temp[3].wwww, temp[3].xyzz, temp[2].xyzz; 22: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 23: MOV_SAT temp[1].x, temp[1].xxxx; 24: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 25: MOV temp[0].w, 
const[1].wwww; 26: MOV temp[5], temp[0]; 27: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 28: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 29: RCP temp[1].x, temp[1].xxxx; 30: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 31: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 32: MOV_SAT output[0], temp[5]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: LRP temp[2].xyz, temp[3].wwww, temp[3].xyzz, temp[2].xyzz; 22: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 23: MOV_SAT temp[1].x, temp[1].xxxx; 24: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 25: MOV temp[0].w, const[1].wwww; 26: MOV temp[5], temp[0]; 27: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 28: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 29: RCP temp[1].x, temp[1].xxxx; 30: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 31: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 32: MOV_SAT output[0], 
temp[5]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: ADD temp[6].xyz, temp[3].xyzz, -temp[2].xyzz; 22: MAD temp[2].xyz, temp[3].wwww, temp[6], temp[2].xyzz; 23: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 24: MOV_SAT temp[1].x, temp[1].xxxx; 25: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 26: MOV temp[0].w, const[1].wwww; 27: MOV temp[5], temp[0]; 28: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 29: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 30: RCP temp[1].x, temp[1].xxxx; 31: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 32: ADD temp[7].xyz, temp[5].xyzz, -const[6].xyzz; 33: MAD temp[5].xyz, temp[0].xxxx, temp[7], const[6].xyzz; 34: MOV_SAT output[0], temp[5]; CONST[8] = { 1.0000 2.0000 -1.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 
2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: ADD temp[6].xyz, temp[3].xyzz, -temp[2].xyzz; 22: MAD temp[2].xyz, temp[3].wwww, temp[6], temp[2].xyzz; 23: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 24: MOV_SAT temp[1].x, temp[1].xxxx; 25: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 26: MOV temp[0].w, const[1].wwww; 27: MOV temp[5], temp[0]; 28: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 29: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 30: RCP temp[1].x, temp[1].xxxx; 31: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 32: ADD temp[7].xyz, temp[5].xyzz, -const[6].xyzz; 33: MAD temp[5].xyz, temp[0].xxxx, temp[7], const[6].xyzz; 34: MOV_SAT output[0], temp[5]; CONST[8] = { 1.0000 2.0000 -1.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: MAD temp[1].x, temp[0].xxxx, const[8].yyyy, const[8].zzzz; 2: MAD temp[0].x, temp[0].yyyy, const[8].yyyy, const[8].zzzz; 3: MOV temp[1].y, temp[0].xxxx; 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: MAD temp[3].x, temp[2].xxxx, const[8].yyyy, 
const[8].zzzz; 6: MAD temp[2].x, temp[2].yyyy, const[8].yyyy, const[8].zzzz; 7: MOV temp[3].y, temp[2].xxxx; 8: RCP temp[2].x, input[1].wwww; 9: MUL temp[2].xy, temp[2].xxxx, input[1].xyyy; 10: MAD temp[2].xy, temp[2].xyyy, const[1].xyyy, const[1].xxxx; 11: MAD temp[4].xy, const[0].xxxx, temp[1].xyyy, const[0].yyyy; 12: MAD temp[4].xy, const[0].xxxx, temp[3].xyyy, temp[4].xyyy; 13: ADD temp[4].xy, temp[4].xyyy, const[0].yyyy; 14: MAD temp[2].xy, temp[4].xyyy, const[0].wwww, temp[2].xyyy; 15: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 16: MAD temp[0].xy, temp[4].xyyy, const[1].zzzz, input[2].xyyy; 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: MUL temp[4].x, temp[1].xxxx, temp[1].xxxx; 20: MUL temp[1].x, temp[1].xxxx, temp[4].xxxx; 21: ADD temp[6].xyz, temp[3].xyzz, -temp[2].xyzz; 22: MAD temp[2].xyz, temp[3].wwww, temp[6], temp[2].xyzz; 23: MUL temp[1].x, temp[1].xxxx, const[0].zzzz; 24: MOV_SAT temp[1].x, temp[1].xxxx; 25: MAD temp[0].xyz, temp[2].xyzz, const[5].xyzz, temp[1].xxxx; 26: MOV temp[0].w, const[1].wwww; 27: MOV temp[5], temp[0]; 28: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 29: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 30: RCP temp[1].x, temp[1].xxxx; 31: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 32: ADD temp[7].xyz, temp[5].xyzz, -const[6].xyzz; 33: MAD temp[5].xyz, temp[0].xxxx, temp[7], const[6].xyzz; 34: MOV_SAT output[0], temp[5]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0].xy, input[3].xyyy, 2D[2]; 1: src0.xyz = temp[0], src1.xyz = const[8] MAD temp[1].x, src0.xxx, src1.yyy, src1.zzz 2: src0.xyz = temp[0], src1.xyz = const[8] MAD temp[0].x, src0.yyy, src1.yyy, src1.zzz 3: src0.xyz = temp[0] MAD temp[1].y, src0.xxx, src0.111, src0.000 4: TEX temp[2].xy, input[4].xyyy, 2D[2]; 5: src0.xyz = temp[2], src1.xyz = const[8] MAD temp[3].x, src0.xxx, src1.yyy, src1.zzz 6: src0.xyz = temp[2], src1.xyz = const[8] MAD temp[2].x, src0.yyy, src1.yyy, src1.zzz 7: src0.xyz = 
temp[2] MAD temp[3].y, src0.xxx, src0.111, src0.000 8: src0.w = input[1] REPL_ALPHA temp[2].x RCP, src0.w 9: src0.xyz = temp[2], src1.xyz = input[1] MAD temp[2].xy, src0.xxx, src1.xyy, src0.000 10: src0.xyz = temp[2], src1.xyz = const[1] MAD temp[2].xy, src0.xyy, src1.xyy, src1.xxx 11: src0.xyz = const[0], src1.xyz = temp[1] MAD temp[4].xy, src0.xxx, src1.xyy, src0.yyy 12: src0.xyz = const[0], src1.xyz = temp[3], src2.xyz = temp[4] MAD temp[4].xy, src0.xxx, src1.xyy, src2.xyy 13: src0.xyz = temp[4], src1.xyz = const[0] MAD temp[4].xy, src0.xyy, src0.111, src1.yyy 14: src0.xyz = temp[4], src0.w = const[0], src1.xyz = temp[2] MAD temp[2].xy, src0.xyy, src0.www, src1.xyy 15: src0.xyz = temp[1], src1.xyz = temp[3] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 16: src0.xyz = temp[4], src1.xyz = const[1], src2.xyz = input[2] MAD temp[0].xy, src0.xyy, src1.zzz, src2.xyy 17: TEX temp[2].xyz, temp[2].xyyy, 2D[1]; 18: TEX temp[3], temp[0].xyyy, 2D[0]; 19: src0.xyz = temp[1] MAD temp[4].x, src0.xxx, src0.xxx, src0.000 20: src0.xyz = temp[1], src1.xyz = temp[4] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 21: src0.xyz = temp[3], src1.xyz = temp[2] MAD temp[6].xyz, src0.xyz, src0.111, -src1.xyz 22: src0.xyz = temp[6], src0.w = temp[3], src1.xyz = temp[2] MAD temp[2].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[1], src1.xyz = const[0] MAD temp[1].x, src0.xxx, src1.zzz, src0.000 24: src0.xyz = temp[1] MAD_SAT temp[1].x, src0.xxx, src0.111, src0.000 25: src0.xyz = temp[2], src1.xyz = const[5], src2.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src2.xxx 26: src0.w = const[1] MAD temp[0].w, src0.w, src0.1, src0.0 27: src0.xyz = temp[0], src0.w = temp[0] MAD temp[5].xyz, src0.xyz, src0.111, src0.000 MAD temp[5].w, src0.w, src0.1, src0.0 28: src0.xyz = const[7], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 29: src0.xyz = const[7] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 30: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 31: src0.xyz = temp[0], 
src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 32: src0.xyz = temp[5], src1.xyz = const[6] MAD temp[7].xyz, src0.xyz, src0.111, -src1.xyz 33: src0.xyz = temp[0], src1.xyz = temp[7], src2.xyz = const[6] MAD temp[5].xyz, src0.xxx, src1.xyz, src2.xyz 34: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.w = const[1] MAD temp[0].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TEX temp[0].xy, input[3].xyyy, 2D[2]; 3: TEX temp[2].xy, input[4].xyyy, 2D[2] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = temp[0], src1.xyz = const[8] SEM_WAIT MAD temp[1].x, src0.xxx, src1.yyy, src1.zzz 5: src0.xyz = temp[2], src1.xyz = const[8] MAD temp[3].x, src0.xxx, src1.yyy, src1.zzz 6: src0.xyz = temp[0], src1.xyz = const[8] MAD temp[0].x, src0.yyy, src1.yyy, src1.zzz 7: src0.xyz = temp[2], src1.xyz = const[8] MAD temp[2].x, src0.yyy, src1.yyy, src1.zzz 8: src0.xyz = temp[0] MAD temp[1].y, src0.xxx, src0.111, src0.000 9: src0.xyz = const[0], src1.xyz = temp[1] MAD temp[4].xy, src0.xxx, src1.xyy, src0.yyy 10: src0.xyz = temp[2] MAD temp[3].y, src0.xxx, src0.111, src0.000 11: src0.xyz = const[0], src1.xyz = temp[3], src2.xyz = temp[4] MAD temp[4].xy, src0.xxx, src1.xyy, src2.xyy 12: src0.xyz = temp[1], src1.xyz = temp[3] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 13: src0.w = input[1] REPL_ALPHA temp[2].x RCP, src0.w 14: src0.xyz = temp[4], src1.xyz = const[0] MAD temp[4].xy, src0.xyy, src0.111, src1.yyy 15: src0.xyz = temp[4], src1.xyz = const[1], src2.xyz = input[2] MAD temp[0].xy, src0.xyy, src1.zzz, src2.xyy 16: src0.xyz = temp[2], src1.xyz = input[1] MAD temp[2].xy, src0.xxx, src1.xyy, src0.000 17: src0.xyz = temp[2], src1.xyz = const[1] MAD temp[2].xy, src0.xyy, src1.xyy, src1.xxx 18: src0.xyz = temp[4], src0.w = const[0], src1.xyz = temp[2] MAD temp[2].xy, src0.xyy, src0.www, src1.xyy 19: src0.xyz = temp[1] MAD temp[4].x, 
src0.xxx, src0.xxx, src0.000 20: src0.xyz = temp[1], src1.xyz = temp[4] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 21: src0.xyz = temp[1], src1.xyz = const[0] MAD temp[1].x, src0.xxx, src1.zzz, src0.000 22: src0.xyz = temp[1] MAD_SAT temp[1].x, src0.xxx, src0.111, src0.000 23: BEGIN_TEX; 24: TEX temp[3], temp[0].xyyy, 2D[0]; 25: TEX temp[2].xyz, temp[2].xyyy, 2D[1] SEM_WAIT SEM_ACQUIRE; 26: src0.xyz = temp[3], src1.xyz = temp[2] SEM_WAIT MAD temp[6].xyz, src0.xyz, src0.111, -src1.xyz 27: src0.xyz = temp[6], src0.w = temp[3], src1.xyz = temp[2] MAD temp[2].xyz, src0.www, src0.xyz, src1.xyz 28: src0.xyz = temp[2], src1.xyz = const[5], src2.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src2.xxx 29: src0.xyz = temp[0], src0.w = temp[0] MAD temp[5].xyz, src0.xyz, src0.111, src0.000 MAD temp[5].w, src0.w, src0.1, src0.0 30: src0.xyz = temp[5], src1.xyz = const[6] MAD temp[7].xyz, src0.xyz, src0.111, -src1.xyz 31: src0.xyz = const[7] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 32: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 33: src0.xyz = const[7], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 34: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 35: src0.xyz = temp[0], src1.xyz = temp[7], src2.xyz = const[6] MAD temp[5].xyz, src0.xxx, src1.xyz, src2.xyz 36: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.w = const[1] MAD temp[0].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TEX temp[0].xy, input[3].xyyy, 2D[2]; 3: TEX temp[2].xy, input[4].xyyy, 2D[2] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = temp[0], src1.xyz = const[8] SEM_WAIT MAD temp[1].x, src0.xxx, src1.yyy, src1.zzz 5: src0.xyz = temp[2], src1.xyz = const[8] MAD temp[3].x, src0.xxx, src1.yyy, src1.zzz 6: src0.xyz = temp[0], src1.xyz = const[8] MAD temp[0].x, src0.yyy, src1.yyy, src1.zzz 7: src0.xyz = 
temp[2], src1.xyz = const[8] MAD temp[2].x, src0.yyy, src1.yyy, src1.zzz 8: src0.xyz = temp[0] MAD temp[1].y, src0.xxx, src0.111, src0.000 9: src0.xyz = const[0], src1.xyz = temp[1] MAD temp[4].xy, src0.xxx, src1.xyy, src0.yyy 10: src0.xyz = temp[2] MAD temp[3].y, src0.xxx, src0.111, src0.000 11: src0.xyz = const[0], src1.xyz = temp[3], src2.xyz = temp[4] MAD temp[4].xy, src0.xxx, src1.xyy, src2.xyy 12: src0.xyz = temp[1], src1.xyz = temp[3] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 13: src0.w = input[1] REPL_ALPHA temp[2].x RCP, src0.w 14: src0.xyz = temp[4], src1.xyz = const[0] MAD temp[4].xy, src0.xyy, src0.111, src1.yyy 15: src0.xyz = temp[4], src1.xyz = const[1], src2.xyz = input[2] MAD temp[0].xy, src0.xyy, src1.zzz, src2.xyy 16: src0.xyz = temp[2], src1.xyz = input[1] MAD temp[2].xy, src0.xxx, src1.xyy, src0.000 17: src0.xyz = temp[2], src1.xyz = const[1] MAD temp[2].xy, src0.xyy, src1.xyy, src1.xxx 18: src0.xyz = temp[4], src0.w = const[0], src1.xyz = temp[2] MAD temp[2].xy, src0.xyy, src0.www, src1.xyy 19: src0.xyz = temp[1] MAD temp[4].x, src0.xxx, src0.xxx, src0.000 20: src0.xyz = temp[1], src1.xyz = temp[4] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 21: src0.xyz = temp[1], src1.xyz = const[0] MAD temp[1].x, src0.xxx, src1.zzz, src0.000 22: src0.xyz = temp[1] MAD_SAT temp[1].x, src0.xxx, src0.111, src0.000 23: BEGIN_TEX; 24: TEX temp[3], temp[0].xyyy, 2D[0]; 25: TEX temp[2].xyz, temp[2].xyyy, 2D[1] SEM_WAIT SEM_ACQUIRE; 26: src0.xyz = temp[3], src1.xyz = temp[2] SEM_WAIT MAD temp[6].xyz, src0.xyz, src0.111, -src1.xyz 27: src0.xyz = temp[6], src0.w = temp[3], src1.xyz = temp[2] MAD temp[2].xyz, src0.www, src0.xyz, src1.xyz 28: src0.xyz = temp[2], src1.xyz = const[5], src2.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src2.xxx 29: src0.xyz = temp[0], src0.w = temp[0] MAD temp[5].xyz, src0.xyz, src0.111, src0.000 MAD temp[5].w, src0.w, src0.1, src0.0 30: src0.xyz = temp[5], src1.xyz = const[6] MAD temp[7].xyz, src0.xyz, src0.111, -src1.xyz 31: 
src0.xyz = const[7] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 32: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 33: src0.xyz = const[7], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 34: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 35: src0.xyz = temp[0], src1.xyz = temp[7], src2.xyz = const[6] MAD temp[5].xyz, src0.xxx, src1.xyz, src2.xyz 36: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.w = const[1] MAD temp[5].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TEX temp[5].xy, input[2].xyyy, 2D[2]; 3: TEX temp[7].xy, input[3].xyyy, 2D[2] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = temp[5], src1.xyz = const[8] SEM_WAIT MAD temp[6].x, src0.xxx, src1.yyy, src1.zzz 5: src0.xyz = temp[7], src1.xyz = const[8] MAD temp[8].x, src0.xxx, src1.yyy, src1.zzz 6: src0.xyz = temp[5], src1.xyz = const[8] MAD temp[5].x, src0.yyy, src1.yyy, src1.zzz 7: src0.xyz = temp[7], src1.xyz = const[8] MAD temp[7].x, src0.yyy, src1.yyy, src1.zzz 8: src0.xyz = temp[5] MAD temp[6].y, src0.xxx, src0.111, src0.000 9: src0.xyz = const[0], src1.xyz = temp[6] MAD temp[9].xy, src0.xxx, src1.xyy, src0.yyy 10: src0.xyz = temp[7] MAD temp[8].y, src0.xxx, src0.111, src0.000 11: src0.xyz = const[0], src1.xyz = temp[8], src2.xyz = temp[9] MAD temp[9].xy, src0.xxx, src1.xyy, src2.xyy 12: src0.xyz = temp[6], src1.xyz = temp[8] MAD temp[6].x, src0.xxx, src1.xxx, src0.000 13: src0.w = input[0] REPL_ALPHA temp[7].x RCP, src0.w 14: src0.xyz = temp[9], src1.xyz = const[0] MAD temp[9].xy, src0.xyy, src0.111, src1.yyy 15: src0.xyz = temp[9], src1.xyz = const[1], src2.xyz = input[1] MAD temp[5].xy, src0.xyy, src1.zzz, src2.xyy 16: src0.xyz = temp[7], src1.xyz = input[0] MAD temp[7].xy, src0.xxx, src1.xyy, src0.000 17: src0.xyz = temp[7], src1.xyz = const[1] MAD temp[7].xy, src0.xyy, src1.xyy, 
src1.xxx 18: src0.xyz = temp[9], src0.w = const[0], src1.xyz = temp[7] MAD temp[7].xy, src0.xyy, src0.www, src1.xyy 19: src0.xyz = temp[6] MAD temp[9].x, src0.xxx, src0.xxx, src0.000 20: src0.xyz = temp[6], src1.xyz = temp[9] MAD temp[6].x, src0.xxx, src1.xxx, src0.000 21: src0.xyz = temp[6], src1.xyz = const[0] MAD temp[6].x, src0.xxx, src1.zzz, src0.000 22: src0.xyz = temp[6] MAD_SAT temp[6].x, src0.xxx, src0.111, src0.000 23: BEGIN_TEX; 24: TEX temp[8], temp[5].xyyy, 2D[0]; 25: TEX temp[7].xyz, temp[7].xyyy, 2D[1] SEM_WAIT SEM_ACQUIRE; 26: src0.xyz = temp[8], src1.xyz = temp[7] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.111, -src1.xyz 27: src0.xyz = temp[11], src0.w = temp[8], src1.xyz = temp[7] MAD temp[7].xyz, src0.www, src0.xyz, src1.xyz 28: src0.xyz = temp[7], src1.xyz = const[5], src2.xyz = temp[6] MAD temp[5].xyz, src0.xyz, src1.xyz, src2.xxx 29: src0.xyz = temp[5], src0.w = temp[5] MAD temp[10].xyz, src0.xyz, src0.111, src0.000 MAD temp[10].w, src0.w, src0.1, src0.0 30: src0.xyz = temp[10], src1.xyz = const[6] MAD temp[12].xyz, src0.xyz, src0.111, -src1.xyz 31: src0.xyz = const[7] MAD temp[6].x, src0.zzz, src0.111, -src0.yyy 32: src0.xyz = temp[6] REPL_ALPHA temp[6].x RCP, src0.x 33: src0.xyz = const[7], src1.xyz = input[4] MAD temp[5].x, src0.zzz, src0.111, -src1.xxx 34: src0.xyz = temp[5], src1.xyz = temp[6] MAD_SAT temp[5].x, src0.xxx, src1.xxx, src0.000 35: src0.xyz = temp[5], src1.xyz = temp[12], src2.xyz = const[6] MAD temp[10].xyz, src0.xxx, src1.xyz, src2.xyz 36: src0.xyz = temp[10], src0.w = temp[10] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020101:Addr0: 1c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c050:MAD dest:5 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 
w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 1 0:CMN_INST 0x00001803:TEX wmask: RG omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe4055402: src: 2 R/G/G/G dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00001807:TEX TEX_WAIT wmask: RG omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4075403: src: 3 R/G/G/G dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08042005:Addr0: 5t, Addr1: 8c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00249060:MAD dest:6 rgb_C_src:1 B/B/B 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08042007:Addr0: 7t, Addr1: 8c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00249080:MAD dest:8 rgb_C_src:1 B/B/B 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08042005:Addr0: 5t, Addr1: 8c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a124:rgb_A_src:0 G/G/G 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00249050:MAD dest:5 rgb_C_src:1 B/B/B 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08042007:Addr0: 7t, Addr1: 8c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a124:rgb_A_src:0 G/G/G 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 
0x00249070:MAD dest:7 rgb_C_src:1 B/B/B 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00001000:ALU wmask: G omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08001900:Addr0: 0c, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00242000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00124090:MAD dest:9 rgb_C_src:0 G/G/G 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00001000:ALU wmask: G omask: NONE 1:RGB_ADDR 0x08020007:Addr0: 7t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490080:MAD dest:8 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x00902100:Addr0: 0c, Addr1: 8t, Addr2: 9t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00242000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00122090:MAD dest:9 rgb_C_src:2 R/G/G 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08002006:Addr0: 6t, Addr1: 8t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 
targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c00a:RCP dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000007a:SOP dest:7 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040009:Addr0: 9t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0120:rgb_A_src:0 R/G/G 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00125090:MAD dest:9 rgb_C_src:1 G/G/G 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x00140409:Addr0: 9t, Addr1: 1c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492120:rgb_A_src:0 R/G/G 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00122050:MAD dest:5 rgb_C_src:2 R/G/G 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08000007:Addr0: 7t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00242000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040407:Addr0: 7t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00242120:rgb_A_src:0 R/G/G 0 rgb_B_src:1 R/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 
alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00001070:MAD dest:7 rgb_C_src:1 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08001c09:Addr0: 9t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8120:rgb_A_src:0 R/G/G 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00121070:MAD dest:7 rgb_C_src:1 R/G/G 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08002406:Addr0: 6t, Addr1: 9t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08040006:Addr0: 6t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 1/1/1 0 targ: 0 4 
ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe4085405: src: 5 R/G/G/G dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 23 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4075407: src: 7 R/G/G/G dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 24 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08001c08:Addr0: 8t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a210b0:MAD dest:11 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 25 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08001c0b:Addr0: 11t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221070:MAD dest:7 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 26 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x00641407:Addr0: 7t, Addr1: 5c, Addr2: 6t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00002050:MAD dest:5 rgb_C_src:2 R/R/R 0 alp_C_src:0 R 0 27 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 
ALPHA_INST:0x00c0c0a0:MAD dest:10 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x204900a0:MAD dest:10 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 28 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x0804180a:Addr0: 10t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a210c0:MAD dest:12 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 29 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020107:Addr0: 7c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924060:MAD dest:6 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 30 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000006a:SOP dest:6 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 31 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001107:Addr0: 7c, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801050:MAD dest:5 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 32 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 
rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 33 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10603005:Addr0: 5t, Addr1: 12t, Addr2: 6c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x002220a0:MAD dest:10 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 34 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x0802000a:Addr0: 10t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0802000a:Addr0: 10t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL CONST[0..15] DCL TEMP[0..5], LOCAL DCL TEMP[6], ARRAY(1), LOCAL IMM[0] FLT32 { 0.0000, 128.0000, -128.0000, 1.0000} IMM[1] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: ADD TEMP[0].xyz, IN[3].xyzz, -IN[0].xyzz 1: MAD TEMP[1].xyz, TEMP[0].xyzz, CONST[14].xxxx, IN[0].xyzz 2: MUL TEMP[0], TEMP[1].yyyy, CONST[8] 3: MAD TEMP[0], CONST[7], TEMP[1].xxxx, TEMP[0] 4: MAD TEMP[0], CONST[9], TEMP[1].zzzz, TEMP[0] 5: MOV TEMP[1].x, CONST[11].xxxx 6: MOV TEMP[1].y, CONST[12].xxxx 7: MOV TEMP[1].z, CONST[13].xxxx 8: MOV TEMP[2].x, CONST[11].yyyy 9: MOV TEMP[2].y, CONST[12].yyyy 10: MOV TEMP[2].z, CONST[13].yyyy 11: MOV TEMP[3].x, CONST[11].zzzz 12: MOV TEMP[3].y, CONST[12].zzzz 13: MOV TEMP[3].z, CONST[13].zzzz 14: ADD TEMP[4].xyz, IN[4].xyzz, -IN[1].xyzz 15: MAD TEMP[4].xyz, TEMP[4].xyzz, CONST[14].xxxx, IN[1].xyzz 16: DP3 TEMP[5].x, 
TEMP[4].xyzz, TEMP[4].xyzz 17: ABS TEMP[5].x, TEMP[5].xxxx 18: RSQ TEMP[5].x, TEMP[5].xxxx 19: MUL TEMP[4].xyz, TEMP[4].xyzz, TEMP[5].xxxx 20: DP3 TEMP[1].x, TEMP[4].xyzz, TEMP[1].xyzz 21: DP3 TEMP[5].x, TEMP[4].xyzz, TEMP[2].xyzz 22: MOV TEMP[1].y, TEMP[5].xxxx 23: DP3 TEMP[3].x, TEMP[4].xyzz, TEMP[3].xyzz 24: MOV TEMP[1].z, TEMP[3].xxxx 25: DP3 TEMP[4].x, TEMP[1].xyzz, TEMP[1].xyzz 26: ABS TEMP[3].x, TEMP[4].xxxx 27: RSQ TEMP[3].x, TEMP[3].xxxx 28: MUL TEMP[2].xyz, TEMP[1].xyzz, TEMP[3].xxxx 29: ADD TEMP[1], TEMP[0], CONST[10] 30: DP3 TEMP[0].x, TEMP[2].xyzz, -CONST[5].xyzz 31: ADD TEMP[2].x, -TEMP[1].zzzz, CONST[15].yyyy 32: MAX TEMP[0].x, TEMP[0].xxxx, IMM[0].xxxx 33: SEQ TEMP[3].x, CONST[0].xxxx, IMM[0].xxxx 34: IF TEMP[3].xxxx :0 35: ELSE :0 36: ENDIF 37: MOV TEMP[3], TEMP[1] 38: MUL TEMP[1].xyz, TEMP[0].xxxx, CONST[3].xyzz 39: MUL TEMP[0].x, TEMP[2].xxxx, CONST[15].wwww 40: ADD TEMP[2].xyz, CONST[2].xyzz, TEMP[1].xyzz 41: MOV TEMP[2].w, CONST[0].xxxx 42: MOV TEMP[6].xy, IN[2].xyxx 43: MOV TEMP[4].xw, TEMP[3].xxzw 44: MOV_SAT TEMP[0].x, TEMP[0].xxxx 45: MUL TEMP[5].x, TEMP[3].yyyy, CONST[1].yyyy 46: MOV TEMP[4].y, TEMP[5].xxxx 47: MAD TEMP[4].xy, CONST[1].zwww, TEMP[1].wwww, TEMP[4].xyyy 48: MAD TEMP[1].x, TEMP[3].zzzz, IMM[1].xxxx, -TEMP[1].wwww 49: MOV TEMP[4].z, TEMP[1].xxxx 50: MOV OUT[2], TEMP[0].xxxx 51: MOV OUT[0], TEMP[4] 52: MOV_SAT OUT[1], TEMP[2] 53: MOV OUT[3], TEMP[6] 54: END Vertex Program: before compilation # Radeon Compiler Program 0: ADD temp[0].xyz, input[3].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[8]; 3: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 5: MOV temp[1].x, const[11].xxxx; 6: MOV temp[1].y, const[12].xxxx; 7: MOV temp[1].z, const[13].xxxx; 8: MOV temp[2].x, const[11].yyyy; 9: MOV temp[2].y, const[12].yyyy; 10: MOV temp[2].z, const[13].yyyy; 11: MOV temp[3].x, const[11].zzzz; 12: MOV temp[3].y, 
const[12].zzzz; 13: MOV temp[3].z, const[13].zzzz; 14: ADD temp[4].xyz, input[4].xyzz, -input[1].xyzz; 15: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 16: DP3 temp[5].x, temp[4].xyzz, temp[4].xyzz; 17: ABS temp[5].x, temp[5].xxxx; 18: RSQ temp[5].x, temp[5].xxxx; 19: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 20: DP3 temp[1].x, temp[4].xyzz, temp[1].xyzz; 21: DP3 temp[5].x, temp[4].xyzz, temp[2].xyzz; 22: MOV temp[1].y, temp[5].xxxx; 23: DP3 temp[3].x, temp[4].xyzz, temp[3].xyzz; 24: MOV temp[1].z, temp[3].xxxx; 25: DP3 temp[4].x, temp[1].xyzz, temp[1].xyzz; 26: ABS temp[3].x, temp[4].xxxx; 27: RSQ temp[3].x, temp[3].xxxx; 28: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 29: ADD temp[1], temp[0], const[10]; 30: DP3 temp[0].x, temp[2].xyzz, -const[5].xyzz; 31: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 32: MAX temp[0].x, temp[0].xxxx, const[16].xxxx; 33: SEQ temp[3].x, const[0].xxxx, const[16].xxxx; 34: IF temp[3].xxxx; 35: ELSE; 36: ENDIF; 37: MOV temp[3], temp[1]; 38: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 39: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 40: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 41: MOV temp[2].w, const[0].xxxx; 42: MOV temp[6].xy, input[2].xyxx; 43: MOV temp[4].xw, temp[3].xxzw; 44: MOV_SAT temp[0].x, temp[0].xxxx; 45: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 46: MOV temp[4].y, temp[5].xxxx; 47: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 48: MAD temp[1].x, temp[3].zzzz, const[17].xxxx, -temp[1].wwww; 49: MOV temp[4].z, temp[1].xxxx; 50: MOV output[2], temp[0].xxxx; 51: MOV temp[7], temp[4]; 52: MOV_SAT output[1], temp[2]; 53: MOV output[3], temp[6]; 54: MOV output[0], temp[7]; 55: MOV output[4], temp[7]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: ADD temp[0].xyz, input[3].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[8]; 3: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 4: MAD 
temp[0], const[9], temp[1].zzzz, temp[0]; 5: MOV temp[1].x, const[11].xxxx; 6: MOV temp[1].y, const[12].xxxx; 7: MOV temp[1].z, const[13].xxxx; 8: MOV temp[2].x, const[11].yyyy; 9: MOV temp[2].y, const[12].yyyy; 10: MOV temp[2].z, const[13].yyyy; 11: MOV temp[3].x, const[11].zzzz; 12: MOV temp[3].y, const[12].zzzz; 13: MOV temp[3].z, const[13].zzzz; 14: ADD temp[4].xyz, input[4].xyzz, -input[1].xyzz; 15: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 16: DP3 temp[5].x, temp[4].xyzz, temp[4].xyzz; 17: ABS temp[5].x, temp[5].xxxx; 18: RSQ temp[5].x, temp[5].xxxx; 19: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 20: DP3 temp[1].x, temp[4].xyzz, temp[1].xyzz; 21: DP3 temp[5].x, temp[4].xyzz, temp[2].xyzz; 22: MOV temp[1].y, temp[5].xxxx; 23: DP3 temp[3].x, temp[4].xyzz, temp[3].xyzz; 24: MOV temp[1].z, temp[3].xxxx; 25: DP3 temp[4].x, temp[1].xyzz, temp[1].xyzz; 26: ABS temp[3].x, temp[4].xxxx; 27: RSQ temp[3].x, temp[3].xxxx; 28: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 29: ADD temp[1], temp[0], const[10]; 30: DP3 temp[0].x, temp[2].xyzz, -const[5].xyzz; 31: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 32: MAX temp[0].x, temp[0].xxxx, const[16].xxxx; 33: SEQ temp[3].x, const[0].xxxx, const[16].xxxx; 34: IF temp[3].xxxx; 35: ELSE; 36: ENDIF; 37: MOV temp[3], temp[1]; 38: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 39: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 40: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 41: MOV temp[2].w, const[0].xxxx; 42: MOV temp[6].xy, input[2].xyxx; 43: MOV temp[4].xw, temp[3].xxzw; 44: MOV_SAT temp[0].x, temp[0].xxxx; 45: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 46: MOV temp[4].y, temp[5].xxxx; 47: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 48: MAD temp[1].x, temp[3].zzzz, const[17].xxxx, -temp[1].wwww; 49: MOV temp[4].z, temp[1].xxxx; 50: MOV output[2], temp[0].xxxx; 51: MOV temp[7], temp[4]; 52: MOV_SAT output[1], temp[2]; 53: MOV output[3], temp[6]; 54: MOV output[0], temp[7]; 55: MOV output[4], 
temp[7]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: ADD temp[0].xyz, input[3].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[8]; 3: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 5: MOV temp[1].x, const[11].xxxx; 6: MOV temp[1].y, const[12].xxxx; 7: MOV temp[1].z, const[13].xxxx; 8: MOV temp[2].x, const[11].yyyy; 9: MOV temp[2].y, const[12].yyyy; 10: MOV temp[2].z, const[13].yyyy; 11: MOV temp[3].x, const[11].zzzz; 12: MOV temp[3].y, const[12].zzzz; 13: MOV temp[3].z, const[13].zzzz; 14: ADD temp[4].xyz, input[4].xyzz, -input[1].xyzz; 15: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 16: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 17: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 18: RSQ temp[5].x, temp[5].xxxx; 19: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 20: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 21: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 22: MOV temp[1].y, temp[5].xxxx; 23: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 24: MOV temp[1].z, temp[3].xxxx; 25: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 26: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 27: RSQ temp[3].x, temp[3].xxxx; 28: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 29: ADD temp[1], temp[0], const[10]; 30: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 31: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 32: MAX temp[0].x, temp[0].xxxx, const[16].xxxx; 33: SEQ temp[3].x, const[0].xxxx, const[16].xxxx; 34: IF temp[3].xxxx; 35: ELSE; 36: ENDIF; 37: MOV temp[3], temp[1]; 38: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 39: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 40: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 41: MOV temp[2].w, const[0].xxxx; 42: MOV temp[6].xy, input[2].xyxx; 43: MOV temp[4].xw, temp[3].xxzw; 44: MOV_SAT temp[0].x, temp[0].xxxx; 45: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 46: MOV temp[4].y, temp[5].xxxx; 47: MAD 
temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 48: MAD temp[1].x, temp[3].zzzz, const[17].xxxx, -temp[1].wwww; 49: MOV temp[4].z, temp[1].xxxx; 50: MOV output[2], temp[0].xxxx; 51: MOV temp[7], temp[4]; 52: MOV_SAT output[1], temp[2]; 53: MOV output[3], temp[6]; 54: MOV output[0], temp[7]; 55: MOV output[4], temp[7]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV temp[8], -input[0].xyzz; 1: ADD temp[0].xyz, input[3].xyzz, temp[8]; 2: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[8]; 4: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 6: MOV temp[1].x, const[11].xxxx; 7: MOV temp[1].y, const[12].xxxx; 8: MOV temp[1].z, const[13].xxxx; 9: MOV temp[2].x, const[11].yyyy; 10: MOV temp[2].y, const[12].yyyy; 11: MOV temp[2].z, const[13].yyyy; 12: MOV temp[3].x, const[11].zzzz; 13: MOV temp[3].y, const[12].zzzz; 14: MOV temp[3].z, const[13].zzzz; 15: MOV temp[9], -input[1].xyzz; 16: ADD temp[4].xyz, input[4].xyzz, temp[9]; 17: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 18: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 19: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 20: RSQ temp[5].x, temp[5].xxxx; 21: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 22: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 23: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 24: MOV temp[1].y, temp[5].xxxx; 25: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 26: MOV temp[1].z, temp[3].xxxx; 27: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 28: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 29: RSQ temp[3].x, temp[3].xxxx; 30: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 31: ADD temp[1], temp[0], const[10]; 32: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 33: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 34: MAX temp[0].x, temp[0].xxxx, const[16].xxxx; 35: MOV temp[10], const[16].xxxx; 36: SEQ temp[3].x, const[0].xxxx, temp[10]; 37: IF temp[3].xxxx; 38: ELSE; 39: 
ENDIF; 40: MOV temp[3], temp[1]; 41: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 42: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 43: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 44: MOV temp[2].w, const[0].xxxx; 45: MOV temp[6].xy, input[2].xyxx; 46: MOV temp[4].xw, temp[3].xxzw; 47: MOV_SAT temp[0].x, temp[0].xxxx; 48: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 49: MOV temp[4].y, temp[5].xxxx; 50: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 51: MAD temp[1].x, temp[3].zzzz, const[17].xxxx, -temp[1].wwww; 52: MOV temp[4].z, temp[1].xxxx; 53: MOV output[2], temp[0].xxxx; 54: MOV temp[7], temp[4]; 55: MOV_SAT output[1], temp[2]; 56: MOV output[3], temp[6]; 57: MOV output[0], temp[7]; 58: MOV output[4], temp[7]; CONST[16] = { 0.0000 128.0000 -128.0000 1.0000 } CONST[17] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[8], -input[0].xyzz; 1: ADD temp[0].xyz, input[3].xyzz, temp[8]; 2: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[8]; 4: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 6: MOV temp[1].x, const[11].xxxx; 7: MOV temp[1].y, const[12].xxxx; 8: MOV temp[1].z, const[13].xxxx; 9: MOV temp[2].x, const[11].yyyy; 10: MOV temp[2].y, const[12].yyyy; 11: MOV temp[2].z, const[13].yyyy; 12: MOV temp[3].x, const[11].zzzz; 13: MOV temp[3].y, const[12].zzzz; 14: MOV temp[3].z, const[13].zzzz; 15: MOV temp[9], -input[1].xyzz; 16: ADD temp[4].xyz, input[4].xyzz, temp[9]; 17: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 18: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 19: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 20: RSQ temp[5].x, temp[5].xxxx; 21: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 22: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 23: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 24: MOV temp[1].y, temp[5].xxxx; 25: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 26: MOV temp[1].z, 
temp[3].xxxx; 27: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 28: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 29: RSQ temp[3].x, temp[3].xxxx; 30: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 31: ADD temp[1], temp[0], const[10]; 32: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 33: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 34: MAX temp[0].x, temp[0].xxxx, const[16].xxxx; 35: MOV temp[10], const[16].xxxx; 36: SEQ temp[3].x, const[0].xxxx, temp[10]; 37: IF temp[3].xxxx; 38: ELSE; 39: ENDIF; 40: MOV temp[3], temp[1]; 41: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 42: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 43: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 44: MOV temp[2].w, const[0].xxxx; 45: MOV temp[6].xy, input[2].xyxx; 46: MOV temp[4].xw, temp[3].xxzw; 47: MOV_SAT temp[0].x, temp[0].xxxx; 48: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 49: MOV temp[4].y, temp[5].xxxx; 50: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 51: MAD temp[1].x, temp[3].zzzz, const[17].xxxx, -temp[1].wwww; 52: MOV temp[4].z, temp[1].xxxx; 53: MOV output[2], temp[0].xxxx; 54: MOV temp[7], temp[4]; 55: MOV_SAT output[1], temp[2]; 56: MOV output[3], temp[6]; 57: MOV output[0], temp[7]; 58: MOV output[4], temp[7]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV temp[8], -input[0].xyzz; 1: ADD temp[0].xyz, input[3].xyzz, temp[8]; 2: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[8]; 4: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 6: MOV temp[1].x, const[11].xxxx; 7: MOV temp[1].y, const[12].xxxx; 8: MOV temp[1].z, const[13].xxxx; 9: MOV temp[2].x, const[11].yyyy; 10: MOV temp[2].y, const[12].yyyy; 11: MOV temp[2].z, const[13].yyyy; 12: MOV temp[3].x, const[11].zzzz; 13: MOV temp[3].y, const[12].zzzz; 14: MOV temp[3].z, const[13].zzzz; 15: MOV temp[9], -input[1].xyzz; 16: ADD temp[4].xyz, input[4].xyzz, temp[9]; 17: MAD temp[4].xyz, 
temp[4].xyzz, const[14].xxxx, input[1].xyzz; 18: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 19: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 20: RSQ temp[5].x, temp[5].xxxx; 21: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 22: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 23: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 24: MOV temp[1].y, temp[5].xxxx; 25: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 26: MOV temp[1].z, temp[3].xxxx; 27: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 28: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 29: RSQ temp[3].x, temp[3].xxxx; 30: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 31: ADD temp[1], temp[0], const[10]; 32: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 33: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 34: MAX temp[0].x, temp[0].xxxx, const[16].xxxx; 35: MOV temp[10], const[16].xxxx; 36: SEQ temp[3].x, const[0].xxxx, temp[10]; 37: ME_PRED_SNEQ temp[11].w, temp[3].xxxx; 38: ME_PRED_SET_INV temp[11].w, temp[11].___w; 39: ME_PRED_SET_POP temp[11].w, temp[11].___w; 40: MOV temp[3], temp[1]; 41: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 42: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 43: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 44: MOV temp[2].w, const[0].xxxx; 45: MOV temp[6].xy, input[2].xyxx; 46: MOV temp[4].xw, temp[3].xxzw; 47: MOV_SAT temp[0].x, temp[0].xxxx; 48: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 49: MOV temp[4].y, temp[5].xxxx; 50: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 51: MAD temp[1].x, temp[3].zzzz, const[17].xxxx, -temp[1].wwww; 52: MOV temp[4].z, temp[1].xxxx; 53: MOV output[2], temp[0].xxxx; 54: MOV temp[7], temp[4]; 55: MOV_SAT output[1], temp[2]; 56: MOV output[3], temp[6]; 57: MOV output[0], temp[7]; 58: MOV output[4], temp[7]; Final vertex program code: 0: op: 0x00f10003 dst: 8t op: VE_ADD src0: 0x1e910001 reg: 0i swiz: -X/-Y/-Z/-Z src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00700003 dst: 0t op: VE_ADD src0: 0x00910061 reg: 3i swiz: 
X/ Y/ Z/ Z src1: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 2: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00910000 reg: 0t swiz: X/ Y/ Z/ Z src1: 0x000001c2 reg: 14c swiz: X/ X/ X/ X src2: 0x00910001 reg: 0i swiz: X/ Y/ Z/ Z 3: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492020 reg: 1t swiz: Y/ Y/ Y/ Y src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00000020 reg: 1t swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00924020 reg: 1t swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 6: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x00000162 reg: 11c swiz: X/ X/ X/ X src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 7: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x00000182 reg: 12c swiz: X/ X/ X/ X src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 8: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x000001a2 reg: 13c swiz: X/ X/ X/ X src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 9: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x00492162 reg: 11c swiz: Y/ Y/ Y/ Y src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 10: op: 0x00204003 dst: 2t op: VE_ADD src0: 0x00492182 reg: 12c swiz: Y/ Y/ Y/ Y src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 11: op: 0x00404003 dst: 2t op: VE_ADD src0: 0x004921a2 reg: 13c swiz: Y/ Y/ Y/ Y src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 12: op: 0x00106003 dst: 3t op: VE_ADD src0: 0x00924162 reg: 11c swiz: Z/ Z/ Z/ Z src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 13: op: 0x00206003 dst: 3t op: VE_ADD src0: 0x00924182 
reg: 12c swiz: Z/ Z/ Z/ Z src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 14: op: 0x00406003 dst: 3t op: VE_ADD src0: 0x009241a2 reg: 13c swiz: Z/ Z/ Z/ Z src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 15: op: 0x00f12003 dst: 9t op: VE_ADD src0: 0x1e910021 reg: 1i swiz: -X/-Y/-Z/-Z src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00708003 dst: 4t op: VE_ADD src0: 0x00910081 reg: 4i swiz: X/ Y/ Z/ Z src1: 0x00d10120 reg: 9t swiz: X/ Y/ Z/ W src2: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 17: op: 0x00708004 dst: 4t op: VE_MULTIPLY_ADD src0: 0x00910080 reg: 4t swiz: X/ Y/ Z/ Z src1: 0x000001c2 reg: 14c swiz: X/ X/ X/ X src2: 0x00910021 reg: 1i swiz: X/ Y/ Z/ Z 18: op: 0x0010a001 dst: 5t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 19: op: 0x0010a007 dst: 5t op: VE_MAXIMUM src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x1e0000a0 reg: 5t swiz: -X/-X/-X/-X src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 20: op: 0x0010a048 dst: 5t op: ME_RECIP_SQRT_DX src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 21: op: 0x00708002 dst: 4t op: VE_MULTIPLY src0: 0x00910080 reg: 4t swiz: X/ Y/ Z/ Z src1: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 22: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 23: op: 0x0010a001 dst: 5t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 24: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 25: op: 0x00106001 dst: 
3t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 26: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 27: op: 0x00108001 dst: 4t op: VE_DOT_PRODUCT src0: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 28: op: 0x00106007 dst: 3t op: VE_MAXIMUM src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x1e000080 reg: 4t swiz: -X/-X/-X/-X src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 29: op: 0x00106048 dst: 3t op: ME_RECIP_SQRT_DX src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 30: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z src1: 0x00000060 reg: 3t swiz: X/ X/ X/ X src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 31: op: 0x00f02003 dst: 1t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src2: 0x01248142 reg: 10c swiz: 0/ 0/ 0/ 0 32: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src1: 0x0f1100a2 reg: 5c swiz: -X/-Y/-Z/ 0 src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 33: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x1e924020 reg: 1t swiz: -Z/-Z/-Z/-Z src1: 0x004921e2 reg: 15c swiz: Y/ Y/ Y/ Y src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 34: op: 0x00100007 dst: 0t op: VE_MAXIMUM src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x00000202 reg: 16c swiz: X/ X/ X/ X src2: 0x01248202 reg: 16c swiz: 0/ 0/ 0/ 0 35: op: 0x00f14003 dst: 10t op: VE_ADD src0: 0x00000202 reg: 16c swiz: X/ X/ X/ X src1: 0x01248202 reg: 16c swiz: 0/ 0/ 0/ 0 src2: 0x01248202 reg: 16c swiz: 0/ 0/ 0/ 0 36: op: 0x0010601b dst: 3t op: VE_SET_EQUAL src0: 0x00000002 reg: 0c swiz: X/ X/ X/ X src1: 0x00d10140 reg: 10t swiz: X/ Y/ Z/ W src2: 0x01248140 reg: 10t 
swiz: 0/ 0/ 0/ 0 37: op: 0x00816058 dst: 11t op: ME_PRED_SET_NEQ src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 38: op: 0x0081605a dst: 11t op: ME_PRED_SET_INV src0: 0x00db6160 reg: 11t swiz: W/ W/ W/ W src1: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 src2: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 39: op: 0x0081605b dst: 11t op: ME_PRED_SET_POP src0: 0x00db6160 reg: 11t swiz: W/ W/ W/ W src1: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 src2: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 40: op: 0x00f06003 dst: 3t op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 41: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x00910062 reg: 3c swiz: X/ Y/ Z/ Z src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 42: op: 0x00100002 dst: 0t op: VE_MULTIPLY src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00db61e2 reg: 15c swiz: W/ W/ W/ W src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 43: op: 0x00704003 dst: 2t op: VE_ADD src0: 0x00910042 reg: 2c swiz: X/ Y/ Z/ Z src1: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 44: op: 0x00804003 dst: 2t op: VE_ADD src0: 0x00000002 reg: 0c swiz: X/ X/ X/ X src1: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 45: op: 0x0030c003 dst: 6t op: VE_ADD src0: 0x00010041 reg: 2i swiz: X/ Y/ X/ X src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 46: op: 0x00908003 dst: 4t op: VE_ADD src0: 0x00d00060 reg: 3t swiz: X/ X/ Z/ W src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 47: op: 0x01100003 dst: 0t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 48: op: 0x0010a002 dst: 5t op: VE_MULTIPLY src0: 0x00492060 reg: 3t swiz: Y/ Y/ Y/ Y src1: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y 
src2: 0x01248022 reg: 1c swiz: 0/ 0/ 0/ 0 49: op: 0x00208003 dst: 4t op: VE_ADD src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 50: op: 0x00308004 dst: 4t op: VE_MULTIPLY_ADD src0: 0x00db4022 reg: 1c swiz: Z/ W/ W/ W src1: 0x00db6020 reg: 1t swiz: W/ W/ W/ W src2: 0x00490080 reg: 4t swiz: X/ Y/ Y/ Y 51: op: 0x00102004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00924060 reg: 3t swiz: Z/ Z/ Z/ Z src1: 0x00000222 reg: 17c swiz: X/ X/ X/ X src2: 0x1edb6020 reg: 1t swiz: -W/-W/-W/-W 52: op: 0x00408003 dst: 4t op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 53: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 54: op: 0x00f0e003 dst: 7t op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 55: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 56: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d100c0 reg: 6t swiz: X/ Y/ Z/ W src1: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 src2: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 57: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 58: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], COLOR, COLOR DCL IN[1], COLOR[1], COLOR DCL IN[2], FOG, PERSPECTIVE DCL IN[3], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL CONST[2..3] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL 0: TEX 
TEMP[0], IN[3].xyyy, SAMP[0], 2D 1: MOV_SAT TEMP[0], TEMP[0] 2: MUL TEMP[1], IN[1], CONST[0] 3: MAD TEMP[2], TEMP[0], IN[0], TEMP[1] 4: ADD TEMP[0].x, CONST[3].zzzz, -IN[2].xxxx 5: ADD TEMP[1].x, CONST[3].zzzz, -CONST[3].yyyy 6: RCP TEMP[1].x, TEMP[1].xxxx 7: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx 8: LRP TEMP[2].xyz, TEMP[0].xxxx, TEMP[2].xyzz, CONST[2].xyzz 9: MOV_SAT OUT[0], TEMP[2] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 
2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 
2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 4: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 5: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 6: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 7: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 9: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD 
temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 10: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[3].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 9: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[3].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz 
= temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 9: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[5].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[5].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[4], src0.w = temp[4] SEM_WAIT MAD_SAT temp[4].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[4].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[4], src0.w = temp[4], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[5], src2.w = temp[5] MAD temp[6].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[6].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[6], src1.xyz = const[2] MAD temp[7].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[3] MAD temp[4].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[5].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[5] REPL_ALPHA temp[5].x RCP, src0.x 9: src0.xyz = temp[4], src1.xyz = temp[5] MAD_SAT temp[4].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[4], src1.xyz = temp[7], src2.xyz = const[2] MAD temp[6].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[6], src0.w = temp[6] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 
MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4045402: src: 2 R/G/G/G dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c050:MAD dest:5 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 2 0:CMN_INST 0x00187804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c040:MAD dest:4 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00500004:Addr0: 4t, Addr1: 0t, Addr2: 5t, srcp:0 2:ALPHA_ADDR 0x00500004:Addr0: 4t, Addr1: 0t, Addr2: 5t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c060:MAD dest:6 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c222060:MAD dest:6 rgb_C_src:2 R/G/B 0 alp_C_src:2 A 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040806:Addr0: 6t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21070:MAD dest:7 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000d03:Addr0: 3c, Addr1: 3t, Addr2: 128t, srcp:0 
2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801040:MAD dest:4 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924050:MAD dest:5 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000005a:SOP dest:5 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001404:Addr0: 4t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10201c04:Addr0: 4t, Addr1: 7t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222060:MAD dest:6 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 10 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020006:Addr0: 
6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL CONST[0..14] DCL TEMP[0..5], LOCAL DCL TEMP[6], ARRAY(1), LOCAL IMM[0] FLT32 { 0.0000, 128.0000, -128.0000, 1.0000} IMM[1] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[0].yyyy, CONST[8] 1: MAD TEMP[0], CONST[7], IN[0].xxxx, TEMP[0] 2: MOV TEMP[1].x, CONST[11].xxxx 3: MOV TEMP[1].y, CONST[12].xxxx 4: MOV TEMP[1].z, CONST[13].xxxx 5: MAD TEMP[0], CONST[9], IN[0].zzzz, TEMP[0] 6: MOV TEMP[2].x, CONST[11].yyyy 7: MOV TEMP[2].y, CONST[12].yyyy 8: MOV TEMP[2].z, CONST[13].yyyy 9: MOV TEMP[3].x, CONST[11].zzzz 10: MOV TEMP[3].y, CONST[12].zzzz 11: MOV TEMP[3].z, CONST[13].zzzz 12: DP3 TEMP[4].x, IN[1].xyzz, IN[1].xyzz 13: ABS TEMP[5].x, TEMP[4].xxxx 14: RSQ TEMP[5].x, TEMP[5].xxxx 15: MUL TEMP[4].xyz, TEMP[5].xxxx, IN[1].xyzz 16: DP3 TEMP[1].x, TEMP[4].xyzz, TEMP[1].xyzz 17: DP3 TEMP[5].x, TEMP[4].xyzz, TEMP[2].xyzz 18: MOV TEMP[1].y, TEMP[5].xxxx 19: DP3 TEMP[3].x, TEMP[4].xyzz, TEMP[3].xyzz 20: MOV TEMP[1].z, TEMP[3].xxxx 21: DP3 TEMP[4].x, TEMP[1].xyzz, TEMP[1].xyzz 22: ABS TEMP[3].x, TEMP[4].xxxx 23: RSQ TEMP[3].x, TEMP[3].xxxx 24: MUL TEMP[2].xyz, TEMP[1].xyzz, TEMP[3].xxxx 25: ADD TEMP[1], TEMP[0], CONST[10] 26: DP3 TEMP[0].x, TEMP[2].xyzz, -CONST[5].xyzz 27: ADD TEMP[2].x, -TEMP[1].zzzz, CONST[14].yyyy 28: MAX TEMP[0].x, TEMP[0].xxxx, IMM[0].xxxx 29: SEQ TEMP[3].x, CONST[0].xxxx, IMM[0].xxxx 30: IF TEMP[3].xxxx :0 31: ELSE :0 32: ENDIF 33: MOV TEMP[3], TEMP[1] 34: MUL TEMP[1].xyz, TEMP[0].xxxx, CONST[3].xyzz 35: MUL TEMP[0].x, TEMP[2].xxxx, CONST[14].wwww 36: ADD TEMP[2].xyz, 
CONST[2].xyzz, TEMP[1].xyzz 37: MOV TEMP[2].w, CONST[0].xxxx 38: MOV TEMP[6].xy, IN[2].xyxx 39: MOV TEMP[4].xw, TEMP[3].xxzw 40: MOV_SAT TEMP[0].x, TEMP[0].xxxx 41: MUL TEMP[5].x, TEMP[3].yyyy, CONST[1].yyyy 42: MOV TEMP[4].y, TEMP[5].xxxx 43: MAD TEMP[4].xy, CONST[1].zwww, TEMP[1].wwww, TEMP[4].xyyy 44: MAD TEMP[1].x, TEMP[3].zzzz, IMM[1].xxxx, -TEMP[1].wwww 45: MOV TEMP[4].z, TEMP[1].xxxx 46: MOV OUT[2], TEMP[0].xxxx 47: MOV OUT[0], TEMP[4] 48: MOV_SAT OUT[1], TEMP[2] 49: MOV OUT[3], TEMP[6] 50: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[8]; 1: MAD temp[0], const[7], input[0].xxxx, temp[0]; 2: MOV temp[1].x, const[11].xxxx; 3: MOV temp[1].y, const[12].xxxx; 4: MOV temp[1].z, const[13].xxxx; 5: MAD temp[0], const[9], input[0].zzzz, temp[0]; 6: MOV temp[2].x, const[11].yyyy; 7: MOV temp[2].y, const[12].yyyy; 8: MOV temp[2].z, const[13].yyyy; 9: MOV temp[3].x, const[11].zzzz; 10: MOV temp[3].y, const[12].zzzz; 11: MOV temp[3].z, const[13].zzzz; 12: DP3 temp[4].x, input[1].xyzz, input[1].xyzz; 13: ABS temp[5].x, temp[4].xxxx; 14: RSQ temp[5].x, temp[5].xxxx; 15: MUL temp[4].xyz, temp[5].xxxx, input[1].xyzz; 16: DP3 temp[1].x, temp[4].xyzz, temp[1].xyzz; 17: DP3 temp[5].x, temp[4].xyzz, temp[2].xyzz; 18: MOV temp[1].y, temp[5].xxxx; 19: DP3 temp[3].x, temp[4].xyzz, temp[3].xyzz; 20: MOV temp[1].z, temp[3].xxxx; 21: DP3 temp[4].x, temp[1].xyzz, temp[1].xyzz; 22: ABS temp[3].x, temp[4].xxxx; 23: RSQ temp[3].x, temp[3].xxxx; 24: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 25: ADD temp[1], temp[0], const[10]; 26: DP3 temp[0].x, temp[2].xyzz, -const[5].xyzz; 27: ADD temp[2].x, -temp[1].zzzz, const[14].yyyy; 28: MAX temp[0].x, temp[0].xxxx, const[15].xxxx; 29: SEQ temp[3].x, const[0].xxxx, const[15].xxxx; 30: IF temp[3].xxxx; 31: ELSE; 32: ENDIF; 33: MOV temp[3], temp[1]; 34: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 35: MUL temp[0].x, temp[2].xxxx, const[14].wwww; 36: ADD temp[2].xyz, const[2].xyzz, 
temp[1].xyzz; 37: MOV temp[2].w, const[0].xxxx; 38: MOV temp[6].xy, input[2].xyxx; 39: MOV temp[4].xw, temp[3].xxzw; 40: MOV_SAT temp[0].x, temp[0].xxxx; 41: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 42: MOV temp[4].y, temp[5].xxxx; 43: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 44: MAD temp[1].x, temp[3].zzzz, const[16].xxxx, -temp[1].wwww; 45: MOV temp[4].z, temp[1].xxxx; 46: MOV output[2], temp[0].xxxx; 47: MOV temp[7], temp[4]; 48: MOV_SAT output[1], temp[2]; 49: MOV output[3], temp[6]; 50: MOV output[0], temp[7]; 51: MOV output[4], temp[7]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[8]; 1: MAD temp[0], const[7], input[0].xxxx, temp[0]; 2: MOV temp[1].x, const[11].xxxx; 3: MOV temp[1].y, const[12].xxxx; 4: MOV temp[1].z, const[13].xxxx; 5: MAD temp[0], const[9], input[0].zzzz, temp[0]; 6: MOV temp[2].x, const[11].yyyy; 7: MOV temp[2].y, const[12].yyyy; 8: MOV temp[2].z, const[13].yyyy; 9: MOV temp[3].x, const[11].zzzz; 10: MOV temp[3].y, const[12].zzzz; 11: MOV temp[3].z, const[13].zzzz; 12: DP3 temp[4].x, input[1].xyzz, input[1].xyzz; 13: ABS temp[5].x, temp[4].xxxx; 14: RSQ temp[5].x, temp[5].xxxx; 15: MUL temp[4].xyz, temp[5].xxxx, input[1].xyzz; 16: DP3 temp[1].x, temp[4].xyzz, temp[1].xyzz; 17: DP3 temp[5].x, temp[4].xyzz, temp[2].xyzz; 18: MOV temp[1].y, temp[5].xxxx; 19: DP3 temp[3].x, temp[4].xyzz, temp[3].xyzz; 20: MOV temp[1].z, temp[3].xxxx; 21: DP3 temp[4].x, temp[1].xyzz, temp[1].xyzz; 22: ABS temp[3].x, temp[4].xxxx; 23: RSQ temp[3].x, temp[3].xxxx; 24: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 25: ADD temp[1], temp[0], const[10]; 26: DP3 temp[0].x, temp[2].xyzz, -const[5].xyzz; 27: ADD temp[2].x, -temp[1].zzzz, const[14].yyyy; 28: MAX temp[0].x, temp[0].xxxx, const[15].xxxx; 29: SEQ temp[3].x, const[0].xxxx, const[15].xxxx; 30: IF temp[3].xxxx; 31: ELSE; 32: ENDIF; 33: MOV temp[3], temp[1]; 34: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 35: MUL 
temp[0].x, temp[2].xxxx, const[14].wwww; 36: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 37: MOV temp[2].w, const[0].xxxx; 38: MOV temp[6].xy, input[2].xyxx; 39: MOV temp[4].xw, temp[3].xxzw; 40: MOV_SAT temp[0].x, temp[0].xxxx; 41: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 42: MOV temp[4].y, temp[5].xxxx; 43: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 44: MAD temp[1].x, temp[3].zzzz, const[16].xxxx, -temp[1].wwww; 45: MOV temp[4].z, temp[1].xxxx; 46: MOV output[2], temp[0].xxxx; 47: MOV temp[7], temp[4]; 48: MOV_SAT output[1], temp[2]; 49: MOV output[3], temp[6]; 50: MOV output[0], temp[7]; 51: MOV output[4], temp[7]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[8]; 1: MAD temp[0], const[7], input[0].xxxx, temp[0]; 2: MOV temp[1].x, const[11].xxxx; 3: MOV temp[1].y, const[12].xxxx; 4: MOV temp[1].z, const[13].xxxx; 5: MAD temp[0], const[9], input[0].zzzz, temp[0]; 6: MOV temp[2].x, const[11].yyyy; 7: MOV temp[2].y, const[12].yyyy; 8: MOV temp[2].z, const[13].yyyy; 9: MOV temp[3].x, const[11].zzzz; 10: MOV temp[3].y, const[12].zzzz; 11: MOV temp[3].z, const[13].zzzz; 12: DP4 temp[4].x, input[1].xyz0, input[1].xyz0; 13: MAX temp[5].x, temp[4].xxxx, -temp[4].xxxx; 14: RSQ temp[5].x, temp[5].xxxx; 15: MUL temp[4].xyz, temp[5].xxxx, input[1].xyzz; 16: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 17: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 18: MOV temp[1].y, temp[5].xxxx; 19: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 20: MOV temp[1].z, temp[3].xxxx; 21: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 22: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 23: RSQ temp[3].x, temp[3].xxxx; 24: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 25: ADD temp[1], temp[0], const[10]; 26: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 27: ADD temp[2].x, -temp[1].zzzz, const[14].yyyy; 28: MAX temp[0].x, temp[0].xxxx, const[15].xxxx; 29: SEQ temp[3].x, const[0].xxxx, const[15].xxxx; 30: IF temp[3].xxxx; 31: ELSE; 32: 
ENDIF; 33: MOV temp[3], temp[1]; 34: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 35: MUL temp[0].x, temp[2].xxxx, const[14].wwww; 36: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 37: MOV temp[2].w, const[0].xxxx; 38: MOV temp[6].xy, input[2].xyxx; 39: MOV temp[4].xw, temp[3].xxzw; 40: MOV_SAT temp[0].x, temp[0].xxxx; 41: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 42: MOV temp[4].y, temp[5].xxxx; 43: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 44: MAD temp[1].x, temp[3].zzzz, const[16].xxxx, -temp[1].wwww; 45: MOV temp[4].z, temp[1].xxxx; 46: MOV output[2], temp[0].xxxx; 47: MOV temp[7], temp[4]; 48: MOV_SAT output[1], temp[2]; 49: MOV output[3], temp[6]; 50: MOV output[0], temp[7]; 51: MOV output[4], temp[7]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[8]; 1: MAD temp[0], const[7], input[0].xxxx, temp[0]; 2: MOV temp[1].x, const[11].xxxx; 3: MOV temp[1].y, const[12].xxxx; 4: MOV temp[1].z, const[13].xxxx; 5: MAD temp[0], const[9], input[0].zzzz, temp[0]; 6: MOV temp[2].x, const[11].yyyy; 7: MOV temp[2].y, const[12].yyyy; 8: MOV temp[2].z, const[13].yyyy; 9: MOV temp[3].x, const[11].zzzz; 10: MOV temp[3].y, const[12].zzzz; 11: MOV temp[3].z, const[13].zzzz; 12: DP4 temp[4].x, input[1].xyz0, input[1].xyz0; 13: MAX temp[5].x, temp[4].xxxx, -temp[4].xxxx; 14: RSQ temp[5].x, temp[5].xxxx; 15: MUL temp[4].xyz, temp[5].xxxx, input[1].xyzz; 16: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 17: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 18: MOV temp[1].y, temp[5].xxxx; 19: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 20: MOV temp[1].z, temp[3].xxxx; 21: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 22: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 23: RSQ temp[3].x, temp[3].xxxx; 24: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 25: ADD temp[1], temp[0], const[10]; 26: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 27: ADD temp[2].x, -temp[1].zzzz, const[14].yyyy; 28: MAX temp[0].x, temp[0].xxxx, 
const[15].xxxx; 29: MOV temp[8], const[15].xxxx; 30: SEQ temp[3].x, const[0].xxxx, temp[8]; 31: IF temp[3].xxxx; 32: ELSE; 33: ENDIF; 34: MOV temp[3], temp[1]; 35: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 36: MUL temp[0].x, temp[2].xxxx, const[14].wwww; 37: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 38: MOV temp[2].w, const[0].xxxx; 39: MOV temp[6].xy, input[2].xyxx; 40: MOV temp[4].xw, temp[3].xxzw; 41: MOV_SAT temp[0].x, temp[0].xxxx; 42: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 43: MOV temp[4].y, temp[5].xxxx; 44: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 45: MAD temp[1].x, temp[3].zzzz, const[16].xxxx, -temp[1].wwww; 46: MOV temp[4].z, temp[1].xxxx; 47: MOV output[2], temp[0].xxxx; 48: MOV temp[7], temp[4]; 49: MOV_SAT output[1], temp[2]; 50: MOV output[3], temp[6]; 51: MOV output[0], temp[7]; 52: MOV output[4], temp[7]; CONST[15] = { 0.0000 128.0000 -128.0000 1.0000 } CONST[16] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[8]; 1: MAD temp[0], const[7], input[0].xxxx, temp[0]; 2: MOV temp[1].x, const[11].xxxx; 3: MOV temp[1].y, const[12].xxxx; 4: MOV temp[1].z, const[13].xxxx; 5: MAD temp[0], const[9], input[0].zzzz, temp[0]; 6: MOV temp[2].x, const[11].yyyy; 7: MOV temp[2].y, const[12].yyyy; 8: MOV temp[2].z, const[13].yyyy; 9: MOV temp[3].x, const[11].zzzz; 10: MOV temp[3].y, const[12].zzzz; 11: MOV temp[3].z, const[13].zzzz; 12: DP4 temp[4].x, input[1].xyz0, input[1].xyz0; 13: MAX temp[5].x, temp[4].xxxx, -temp[4].xxxx; 14: RSQ temp[5].x, temp[5].xxxx; 15: MUL temp[4].xyz, temp[5].xxxx, input[1].xyzz; 16: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 17: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 18: MOV temp[1].y, temp[5].xxxx; 19: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 20: MOV temp[1].z, temp[3].xxxx; 21: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 22: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 23: RSQ temp[3].x, temp[3].xxxx; 24: MUL 
temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 25: ADD temp[1], temp[0], const[10]; 26: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 27: ADD temp[2].x, -temp[1].zzzz, const[14].yyyy; 28: MAX temp[0].x, temp[0].xxxx, const[15].xxxx; 29: MOV temp[8], const[15].xxxx; 30: SEQ temp[3].x, const[0].xxxx, temp[8]; 31: IF temp[3].xxxx; 32: ELSE; 33: ENDIF; 34: MOV temp[3], temp[1]; 35: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 36: MUL temp[0].x, temp[2].xxxx, const[14].wwww; 37: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 38: MOV temp[2].w, const[0].xxxx; 39: MOV temp[6].xy, input[2].xyxx; 40: MOV temp[4].xw, temp[3].xxzw; 41: MOV_SAT temp[0].x, temp[0].xxxx; 42: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 43: MOV temp[4].y, temp[5].xxxx; 44: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 45: MAD temp[1].x, temp[3].zzzz, const[16].xxxx, -temp[1].wwww; 46: MOV temp[4].z, temp[1].xxxx; 47: MOV output[2], temp[0].xxxx; 48: MOV temp[7], temp[4]; 49: MOV_SAT output[1], temp[2]; 50: MOV output[3], temp[6]; 51: MOV output[0], temp[7]; 52: MOV output[4], temp[7]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[8]; 1: MAD temp[0], const[7], input[0].xxxx, temp[0]; 2: MOV temp[1].x, const[11].xxxx; 3: MOV temp[1].y, const[12].xxxx; 4: MOV temp[1].z, const[13].xxxx; 5: MAD temp[0], const[9], input[0].zzzz, temp[0]; 6: MOV temp[2].x, const[11].yyyy; 7: MOV temp[2].y, const[12].yyyy; 8: MOV temp[2].z, const[13].yyyy; 9: MOV temp[3].x, const[11].zzzz; 10: MOV temp[3].y, const[12].zzzz; 11: MOV temp[3].z, const[13].zzzz; 12: DP4 temp[4].x, input[1].xyz0, input[1].xyz0; 13: MAX temp[5].x, temp[4].xxxx, -temp[4].xxxx; 14: RSQ temp[5].x, temp[5].xxxx; 15: MUL temp[4].xyz, temp[5].xxxx, input[1].xyzz; 16: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 17: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 18: MOV temp[1].y, temp[5].xxxx; 19: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 20: MOV temp[1].z, temp[3].xxxx; 21: DP4 
temp[4].x, temp[1].xyz0, temp[1].xyz0; 22: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 23: RSQ temp[3].x, temp[3].xxxx; 24: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 25: ADD temp[1], temp[0], const[10]; 26: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 27: ADD temp[2].x, -temp[1].zzzz, const[14].yyyy; 28: MAX temp[0].x, temp[0].xxxx, const[15].xxxx; 29: MOV temp[8], const[15].xxxx; 30: SEQ temp[3].x, const[0].xxxx, temp[8]; 31: ME_PRED_SNEQ temp[9].w, temp[3].xxxx; 32: ME_PRED_SET_INV temp[9].w, temp[9].___w; 33: ME_PRED_SET_POP temp[9].w, temp[9].___w; 34: MOV temp[3], temp[1]; 35: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 36: MUL temp[0].x, temp[2].xxxx, const[14].wwww; 37: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 38: MOV temp[2].w, const[0].xxxx; 39: MOV temp[6].xy, input[2].xyxx; 40: MOV temp[4].xw, temp[3].xxzw; 41: MOV_SAT temp[0].x, temp[0].xxxx; 42: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 43: MOV temp[4].y, temp[5].xxxx; 44: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 45: MAD temp[1].x, temp[3].zzzz, const[16].xxxx, -temp[1].wwww; 46: MOV temp[4].z, temp[1].xxxx; 47: MOV output[2], temp[0].xxxx; 48: MOV temp[7], temp[4]; 49: MOV_SAT output[1], temp[2]; 50: MOV output[3], temp[6]; 51: MOV output[0], temp[7]; 52: MOV output[4], temp[7]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x00000162 reg: 11c swiz: X/ X/ X/ X src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 3: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x00000182 reg: 12c swiz: X/ X/ X/ X src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 4: 
op: 0x00402003 dst: 1t op: VE_ADD src0: 0x000001a2 reg: 13c swiz: X/ X/ X/ X src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 5: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 6: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x00492162 reg: 11c swiz: Y/ Y/ Y/ Y src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 7: op: 0x00204003 dst: 2t op: VE_ADD src0: 0x00492182 reg: 12c swiz: Y/ Y/ Y/ Y src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 8: op: 0x00404003 dst: 2t op: VE_ADD src0: 0x004921a2 reg: 13c swiz: Y/ Y/ Y/ Y src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 9: op: 0x00106003 dst: 3t op: VE_ADD src0: 0x00924162 reg: 11c swiz: Z/ Z/ Z/ Z src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 10: op: 0x00206003 dst: 3t op: VE_ADD src0: 0x00924182 reg: 12c swiz: Z/ Z/ Z/ Z src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 11: op: 0x00406003 dst: 3t op: VE_ADD src0: 0x009241a2 reg: 13c swiz: Z/ Z/ Z/ Z src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 12: op: 0x00108001 dst: 4t op: VE_DOT_PRODUCT src0: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 13: op: 0x0010a007 dst: 5t op: VE_MAXIMUM src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x1e000080 reg: 4t swiz: -X/-X/-X/-X src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 14: op: 0x0010a048 dst: 5t op: ME_RECIP_SQRT_DX src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 15: op: 0x00708002 dst: 4t op: VE_MULTIPLY src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x00910021 reg: 1i swiz: X/ Y/ Z/ Z src2: 0x01248021 reg: 1i 
swiz: 0/ 0/ 0/ 0 16: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 17: op: 0x0010a001 dst: 5t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 18: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 19: op: 0x00106001 dst: 3t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 20: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 21: op: 0x00108001 dst: 4t op: VE_DOT_PRODUCT src0: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 22: op: 0x00106007 dst: 3t op: VE_MAXIMUM src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x1e000080 reg: 4t swiz: -X/-X/-X/-X src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 23: op: 0x00106048 dst: 3t op: ME_RECIP_SQRT_DX src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 24: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z src1: 0x00000060 reg: 3t swiz: X/ X/ X/ X src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 25: op: 0x00f02003 dst: 1t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src2: 0x01248142 reg: 10c swiz: 0/ 0/ 0/ 0 26: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src1: 0x0f1100a2 reg: 5c swiz: -X/-Y/-Z/ 0 src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 27: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x1e924020 reg: 1t swiz: -Z/-Z/-Z/-Z src1: 0x004921c2 reg: 14c swiz: Y/ 
Y/ Y/ Y src2: 0x012481c2 reg: 14c swiz: 0/ 0/ 0/ 0 28: op: 0x00100007 dst: 0t op: VE_MAXIMUM src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x000001e2 reg: 15c swiz: X/ X/ X/ X src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 29: op: 0x00f10003 dst: 8t op: VE_ADD src0: 0x000001e2 reg: 15c swiz: X/ X/ X/ X src1: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 30: op: 0x0010601b dst: 3t op: VE_SET_EQUAL src0: 0x00000002 reg: 0c swiz: X/ X/ X/ X src1: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 31: op: 0x00812058 dst: 9t op: ME_PRED_SET_NEQ src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 32: op: 0x0081205a dst: 9t op: ME_PRED_SET_INV src0: 0x00db6120 reg: 9t swiz: W/ W/ W/ W src1: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 src2: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 33: op: 0x0081205b dst: 9t op: ME_PRED_SET_POP src0: 0x00db6120 reg: 9t swiz: W/ W/ W/ W src1: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 src2: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 34: op: 0x00f06003 dst: 3t op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 35: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x00910062 reg: 3c swiz: X/ Y/ Z/ Z src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 36: op: 0x00100002 dst: 0t op: VE_MULTIPLY src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00db61c2 reg: 14c swiz: W/ W/ W/ W src2: 0x012481c2 reg: 14c swiz: 0/ 0/ 0/ 0 37: op: 0x00704003 dst: 2t op: VE_ADD src0: 0x00910042 reg: 2c swiz: X/ Y/ Z/ Z src1: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 38: op: 0x00804003 dst: 2t op: VE_ADD src0: 0x00000002 reg: 0c swiz: X/ X/ X/ X src1: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 39: op: 0x0030c003 dst: 6t op: VE_ADD src0: 0x00010041 reg: 2i swiz: X/ Y/ X/ X src1: 
0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 40: op: 0x00908003 dst: 4t op: VE_ADD src0: 0x00d00060 reg: 3t swiz: X/ X/ Z/ W src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 41: op: 0x01100003 dst: 0t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 42: op: 0x0010a002 dst: 5t op: VE_MULTIPLY src0: 0x00492060 reg: 3t swiz: Y/ Y/ Y/ Y src1: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y src2: 0x01248022 reg: 1c swiz: 0/ 0/ 0/ 0 43: op: 0x00208003 dst: 4t op: VE_ADD src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 44: op: 0x00308004 dst: 4t op: VE_MULTIPLY_ADD src0: 0x00db4022 reg: 1c swiz: Z/ W/ W/ W src1: 0x00db6020 reg: 1t swiz: W/ W/ W/ W src2: 0x00490080 reg: 4t swiz: X/ Y/ Y/ Y 45: op: 0x00102004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00924060 reg: 3t swiz: Z/ Z/ Z/ Z src1: 0x00000202 reg: 16c swiz: X/ X/ X/ X src2: 0x1edb6020 reg: 1t swiz: -W/-W/-W/-W 46: op: 0x00408003 dst: 4t op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 47: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 48: op: 0x00f0e003 dst: 7t op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 49: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 50: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d100c0 reg: 6t swiz: X/ Y/ Z/ W src1: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 src2: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 51: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 
0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 52: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], COLOR, COLOR DCL IN[1], COLOR[1], COLOR DCL IN[2], FOG, PERSPECTIVE DCL IN[3], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL CONST[2..3] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL 0: TEX TEMP[0], IN[3].xyyy, SAMP[0], 2D 1: MOV_SAT TEMP[0], TEMP[0] 2: MUL TEMP[1], IN[1], CONST[0] 3: MAD TEMP[2], TEMP[0], IN[0], TEMP[1] 4: ADD TEMP[0].x, CONST[3].zzzz, -IN[2].xxxx 5: ADD TEMP[1].x, CONST[3].zzzz, -CONST[3].yyyy 6: RCP TEMP[1].x, TEMP[1].xxxx 7: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx 8: LRP TEMP[2].xyz, TEMP[0].xxxx, TEMP[2].xyzz, CONST[2].xyzz 9: MOV_SAT OUT[0], TEMP[2] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], 
input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 
2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD 
temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 4: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 5: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 6: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 7: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 9: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 10: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[3].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 9: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, 
src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[3].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 9: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[5].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[5].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[4], src0.w = temp[4] SEM_WAIT MAD_SAT temp[4].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[4].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[4], src0.w = temp[4], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[5], src2.w = temp[5] MAD temp[6].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[6].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[6], src1.xyz = const[2] MAD 
temp[7].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[3] MAD temp[4].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[5].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[5] REPL_ALPHA temp[5].x RCP, src0.x 9: src0.xyz = temp[4], src1.xyz = temp[5] MAD_SAT temp[4].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[4], src1.xyz = temp[7], src2.xyz = const[2] MAD temp[6].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[6], src0.w = temp[6] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4045402: src: 2 R/G/G/G dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c050:MAD dest:5 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 2 0:CMN_INST 0x00187804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c040:MAD dest:4 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00500004:Addr0: 4t, Addr1: 0t, Addr2: 5t, srcp:0 2:ALPHA_ADDR 0x00500004:Addr0: 4t, Addr1: 0t, Addr2: 5t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c060:MAD dest:6 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c222060:MAD dest:6 rgb_C_src:2 
R/G/B 0 alp_C_src:2 A 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040806:Addr0: 6t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21070:MAD dest:7 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000d03:Addr0: 3c, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801040:MAD dest:4 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924050:MAD dest:5 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000005a:SOP dest:5 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001404:Addr0: 4t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 
0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10201c04:Addr0: 4t, Addr1: 7t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222060:MAD dest:6 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 10 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL CONST[0..5] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL DCL TEMP[3], LOCAL IMM[0] FLT32 { 0.0000, 2.0000, 0.0000, 0.0000} 0: ADD TEMP[0].xyz, IN[2].xyzz, -IN[0].xyzz 1: MAD TEMP[1].xyz, TEMP[0].xyzz, CONST[5].xxxx, IN[0].xyzz 2: MUL TEMP[0], TEMP[1].yyyy, CONST[2] 3: MAD TEMP[0], CONST[1], TEMP[1].xxxx, TEMP[0] 4: MAD TEMP[0], CONST[3], TEMP[1].zzzz, TEMP[0] 5: ADD TEMP[0], TEMP[0], CONST[4] 6: MOV TEMP[2].xy, IN[1].xyxx 7: MOV TEMP[1].xw, TEMP[0].xxzw 8: MUL TEMP[3].x, TEMP[0].yyyy, CONST[0].yyyy 9: MOV TEMP[1].y, TEMP[3].xxxx 10: MAD TEMP[1].xy, CONST[0].zwww, TEMP[0].wwww, TEMP[1].xyyy 11: MAD TEMP[0].x, TEMP[0].zzzz, IMM[0].yyyy, -TEMP[0].wwww 12: MOV TEMP[1].z, TEMP[0].xxxx 13: MOV OUT[1], IMM[0].xxxx 14: MOV OUT[0], TEMP[1] 15: MOV OUT[2], TEMP[2] 16: END Vertex Program: before compilation # Radeon Compiler Program 0: ADD temp[0].xyz, input[2].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[5].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[2]; 
3: MAD temp[0], const[1], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[3], temp[1].zzzz, temp[0]; 5: ADD temp[0], temp[0], const[4]; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[1].xw, temp[0].xxzw; 8: MUL temp[3].x, temp[0].yyyy, const[0].yyyy; 9: MOV temp[1].y, temp[3].xxxx; 10: MAD temp[1].xy, const[0].zwww, temp[0].wwww, temp[1].xyyy; 11: MAD temp[0].x, temp[0].zzzz, const[6].yyyy, -temp[0].wwww; 12: MOV temp[1].z, temp[0].xxxx; 13: MOV output[1], const[6].xxxx; 14: MOV temp[4], temp[1]; 15: MOV output[2], temp[2]; 16: MOV output[0], temp[4]; 17: MOV output[3], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: ADD temp[0].xyz, input[2].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[5].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[2]; 3: MAD temp[0], const[1], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[3], temp[1].zzzz, temp[0]; 5: ADD temp[0], temp[0], const[4]; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[1].xw, temp[0].xxzw; 8: MUL temp[3].x, temp[0].yyyy, const[0].yyyy; 9: MOV temp[1].y, temp[3].xxxx; 10: MAD temp[1].xy, const[0].zwww, temp[0].wwww, temp[1].xyyy; 11: MAD temp[0].x, temp[0].zzzz, const[6].yyyy, -temp[0].wwww; 12: MOV temp[1].z, temp[0].xxxx; 13: MOV output[1], const[6].xxxx; 14: MOV temp[4], temp[1]; 15: MOV output[2], temp[2]; 16: MOV output[0], temp[4]; 17: MOV output[3], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: ADD temp[0].xyz, input[2].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[5].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[2]; 3: MAD temp[0], const[1], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[3], temp[1].zzzz, temp[0]; 5: ADD temp[0], temp[0], const[4]; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[1].xw, temp[0].xxzw; 8: MUL temp[3].x, temp[0].yyyy, const[0].yyyy; 9: MOV temp[1].y, temp[3].xxxx; 10: MAD temp[1].xy, const[0].zwww, temp[0].wwww, temp[1].xyyy; 11: MAD temp[0].x, temp[0].zzzz, 
const[6].yyyy, -temp[0].wwww; 12: MOV temp[1].z, temp[0].xxxx; 13: MOV output[1], const[6].xxxx; 14: MOV temp[4], temp[1]; 15: MOV output[2], temp[2]; 16: MOV output[0], temp[4]; 17: MOV output[3], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV temp[5], -input[0].xyzz; 1: ADD temp[0].xyz, input[2].xyzz, temp[5]; 2: MAD temp[1].xyz, temp[0].xyzz, const[5].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[2]; 4: MAD temp[0], const[1], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[3], temp[1].zzzz, temp[0]; 6: ADD temp[0], temp[0], const[4]; 7: MOV temp[2].xy, input[1].xyxx; 8: MOV temp[1].xw, temp[0].xxzw; 9: MUL temp[3].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[1].y, temp[3].xxxx; 11: MAD temp[1].xy, const[0].zwww, temp[0].wwww, temp[1].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[6].yyyy, -temp[0].wwww; 13: MOV temp[1].z, temp[0].xxxx; 14: MOV output[1], const[6].xxxx; 15: MOV temp[4], temp[1]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[4]; 18: MOV output[3], temp[4]; CONST[6] = { 0.0000 2.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[5], -input[0].xyzz; 1: ADD temp[0].xyz, input[2].xyzz, temp[5]; 2: MAD temp[1].xyz, temp[0].xyzz, const[5].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[2]; 4: MAD temp[0], const[1], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[3], temp[1].zzzz, temp[0]; 6: ADD temp[0], temp[0], const[4]; 7: MOV temp[2].xy, input[1].xyxx; 8: MOV temp[1].xw, temp[0].xxzw; 9: MUL temp[3].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[1].y, temp[3].xxxx; 11: MAD temp[1].xy, const[0].zwww, temp[0].wwww, temp[1].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[6].yyyy, -temp[0].wwww; 13: MOV temp[1].z, temp[0].xxxx; 14: MOV output[1], const[6].xxxx; 15: MOV temp[4], temp[1]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[4]; 18: MOV output[3], temp[4]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV 
temp[5], -input[0].xyzz; 1: ADD temp[0].xyz, input[2].xyzz, temp[5]; 2: MAD temp[1].xyz, temp[0].xyzz, const[5].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[2]; 4: MAD temp[0], const[1], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[3], temp[1].zzzz, temp[0]; 6: ADD temp[0], temp[0], const[4]; 7: MOV temp[2].xy, input[1].xyxx; 8: MOV temp[1].xw, temp[0].xxzw; 9: MUL temp[3].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[1].y, temp[3].xxxx; 11: MAD temp[1].xy, const[0].zwww, temp[0].wwww, temp[1].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[6].yyyy, -temp[0].wwww; 13: MOV temp[1].z, temp[0].xxxx; 14: MOV output[1], const[6].xxxx; 15: MOV temp[4], temp[1]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[4]; 18: MOV output[3], temp[4]; Final vertex program code: 0: op: 0x00f0a003 dst: 5t op: VE_ADD src0: 0x1e910001 reg: 0i swiz: -X/-Y/-Z/-Z src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00700003 dst: 0t op: VE_ADD src0: 0x00910041 reg: 2i swiz: X/ Y/ Z/ Z src1: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 2: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00910000 reg: 0t swiz: X/ Y/ Z/ Z src1: 0x000000a2 reg: 5c swiz: X/ X/ X/ X src2: 0x00910001 reg: 0i swiz: X/ Y/ Z/ Z 3: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492020 reg: 1t swiz: Y/ Y/ Y/ Y src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00000020 reg: 1t swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00924020 reg: 1t swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 6: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 7: op: 0x00304003 dst: 2t op: VE_ADD src0: 
0x00010021 reg: 1i swiz: X/ Y/ X/ X src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 8: op: 0x00902003 dst: 1t op: VE_ADD src0: 0x00d00000 reg: 0t swiz: X/ X/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 9: op: 0x00106002 dst: 3t op: VE_MULTIPLY src0: 0x00492000 reg: 0t swiz: Y/ Y/ Y/ Y src1: 0x00492002 reg: 0c swiz: Y/ Y/ Y/ Y src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 10: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 11: op: 0x00302004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00db4002 reg: 0c swiz: Z/ W/ W/ W src1: 0x00db6000 reg: 0t swiz: W/ W/ W/ W src2: 0x00490020 reg: 1t swiz: X/ Y/ Y/ Y 12: op: 0x00100004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924000 reg: 0t swiz: Z/ Z/ Z/ Z src1: 0x004920c2 reg: 6c swiz: Y/ Y/ Y/ Y src2: 0x1edb6000 reg: 0t swiz: -W/-W/-W/-W 13: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 14: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x000000c2 reg: 6c swiz: X/ X/ X/ X src1: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 15: op: 0x00f08003 dst: 4t op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 16: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 17: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 18: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment 
program
FRAG
DCL IN[0], COLOR[1], COLOR
DCL IN[1], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL CONST[0..1]
DCL TEMP[0..1], LOCAL
DCL TEMP[2], ARRAY(1), LOCAL
  0: MOV TEMP[0].xyz, CONST[1].xyzx
  1: TEX TEMP[1], IN[1].xyyy, SAMP[0], 2D
  2: MOV_SAT TEMP[1], TEMP[1]
  3: MOV TEMP[0].w, TEMP[1].wwww
  4: MAD TEMP[2], IN[0], CONST[0], TEMP[0]
  5: MOV_SAT OUT[0], TEMP[2]
  6: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV temp[0].xyz, const[1].xyzx;
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: MOV_SAT temp[1], temp[1];
  3: MOV temp[0].w, temp[1].wwww;
  4: MAD temp[2], input[0], const[0], temp[0];
  5: MOV_SAT output[0], temp[2];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = const[1]
     MAD temp[0].xyz, src0.xyz, src0.111, src0.000
  1: TEX temp[1], input[1].xyyy, 2D[0];
  2: src0.xyz = temp[1], src0.w = temp[1]
     MAD_SAT temp[1].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT temp[1].w, src0.w, src0.1, src0.0
  3: src0.w = temp[1]
     MAD temp[0].w, src0.w, src0.1, src0.0
  4: src0.xyz = input[0], src0.w = input[0], src1.xyz = const[0], src1.w = const[0], src2.xyz = temp[0], src2.w = temp[0]
     MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz
     MAD temp[2].w, src0.w, src1.w, src2.w
  5: src0.xyz = temp[2], src0.w = temp[2]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = const[1]
     MAD temp[0].xyz, src0.xyz, src0.111, src0.000
  3: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD_SAT temp[1].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT temp[1].w, src0.w, src0.1, src0.0
  4: src0.w = temp[1]
     MAD temp[0].w, src0.w, src0.1, src0.0
  5: src0.xyz = input[0], src0.w = input[0], src1.xyz = const[0], src1.w = const[0], src2.xyz = temp[0], src2.w = temp[0]
     MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz
     MAD temp[2].w, src0.w, src1.w, src2.w
  6: src0.xyz = temp[2], src0.w = temp[2]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[1], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = const[1]
     MAD temp[0].xyz, src0.xyz, src0.111, src0.000
  3: src0.xyz = temp[1], src0.w = temp[1] SEM_WAIT
     MAD_SAT temp[1].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT temp[1].w, src0.w, src0.1, src0.0
  4: src0.w = temp[1]
     MAD temp[0].w, src0.w, src0.1, src0.0
  5: src0.xyz = input[0], src0.w = input[0], src1.xyz = const[0], src1.w = const[0], src2.xyz = temp[0], src2.w = temp[0]
     MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz
     MAD temp[2].w, src0.w, src1.w, src2.w
  6: src0.xyz = temp[2], src0.w = temp[2]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: BEGIN_TEX;
  1: TEX temp[3], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE;
  2: src0.xyz = const[1]
     MAD temp[2].xyz, src0.xyz, src0.111, src0.000
  3: src0.xyz = temp[3], src0.w = temp[3] SEM_WAIT
     MAD_SAT temp[3].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT temp[3].w, src0.w, src0.1, src0.0
  4: src0.w = temp[3]
     MAD temp[2].w, src0.w, src0.1, src0.0
  5: src0.xyz = input[0], src0.w = input[0], src1.xyz = const[0], src1.w = const[0], src2.xyz = temp[2], src2.w = temp[2]
     MAD temp[4].xyz, src0.xyz, src1.xyz, src2.xyz
     MAD temp[4].w, src0.w, src1.w, src2.w
  6: src0.xyz = temp[4], src0.w = temp[4]
     MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
     MAD_SAT color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE
  1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED
  2:TEX_ADDR: 0xe4035401: src: 1 R/G/G/G
dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020101:Addr0: 1c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00187804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00240000:Addr0: 0t, Addr1: 0c, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x00240000:Addr0: 0t, Addr1: 0c, Addr2: 2t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c040:MAD dest:4 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c222040:MAD dest:4 rgb_C_src:2 R/G/B 0 alp_C_src:2 A 0 5 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 
targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL CONST[0..6] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL DCL TEMP[3..4], LOCAL IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[0].yyyy, CONST[2] 1: MAD TEMP[0], CONST[1], IN[0].xxxx, TEMP[0] 2: MAD TEMP[0], CONST[3], IN[0].zzzz, TEMP[0] 3: ADD TEMP[0], TEMP[0], CONST[4] 4: ADD TEMP[1].x, -TEMP[0].zzzz, CONST[6].yyyy 5: MUL TEMP[1].x, TEMP[1].xxxx, CONST[6].wwww 6: MOV TEMP[2].xy, IN[1].xyxx 7: MOV TEMP[3].xw, TEMP[0].xxzw 8: MOV_SAT TEMP[1].x, TEMP[1].xxxx 9: MUL TEMP[4].x, TEMP[0].yyyy, CONST[0].yyyy 10: MOV TEMP[3].y, TEMP[4].xxxx 11: MAD TEMP[3].xy, CONST[0].zwww, TEMP[0].wwww, TEMP[3].xyyy 12: MAD TEMP[0].x, TEMP[0].zzzz, IMM[0].xxxx, -TEMP[0].wwww 13: MOV TEMP[3].z, TEMP[0].xxxx 14: MOV OUT[1], TEMP[1].xxxx 15: MOV OUT[0], TEMP[3] 16: MOV OUT[2], TEMP[2] 17: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD 
temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, 
temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; CONST[7] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV output[3], temp[5]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[2]; 1: MAD temp[0], const[1], input[0].xxxx, temp[0]; 2: MAD temp[0], const[3], input[0].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[4]; 4: ADD temp[1].x, -temp[0].zzzz, const[6].yyyy; 5: MUL temp[1].x, temp[1].xxxx, const[6].wwww; 6: MOV temp[2].xy, input[1].xyxx; 7: MOV temp[3].xw, temp[0].xxzw; 8: MOV_SAT temp[1].x, temp[1].xxxx; 9: MUL temp[4].x, temp[0].yyyy, const[0].yyyy; 10: MOV temp[3].y, temp[4].xxxx; 11: MAD temp[3].xy, const[0].zwww, temp[0].wwww, temp[3].xyyy; 12: MAD temp[0].x, temp[0].zzzz, const[7].xxxx, -temp[0].wwww; 13: MOV temp[3].z, temp[0].xxxx; 14: MOV output[1], temp[1].xxxx; 15: MOV temp[5], temp[3]; 16: MOV output[2], temp[2]; 17: MOV output[0], temp[5]; 18: MOV 
output[3], temp[5]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 4: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x1e924000 reg: 0t swiz: -Z/-Z/-Z/-Z src1: 0x004920c2 reg: 6c swiz: Y/ Y/ Y/ Y src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 5: op: 0x00102002 dst: 1t op: VE_MULTIPLY src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x00db60c2 reg: 6c swiz: W/ W/ W/ W src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 6: op: 0x00304003 dst: 2t op: VE_ADD src0: 0x00010021 reg: 1i swiz: X/ Y/ X/ X src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 7: op: 0x00906003 dst: 3t op: VE_ADD src0: 0x00d00000 reg: 0t swiz: X/ X/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x01102003 dst: 1t op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00108002 dst: 4t op: VE_MULTIPLY src0: 0x00492000 reg: 0t swiz: Y/ Y/ Y/ Y src1: 0x00492002 reg: 0c swiz: Y/ Y/ Y/ Y src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 10: op: 0x00206003 dst: 3t op: VE_ADD src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 11: op: 0x00306004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00db4002 reg: 0c swiz: Z/ W/ W/ W src1: 0x00db6000 reg: 0t swiz: W/ 
W/ W/ W src2: 0x00490060 reg: 3t swiz: X/ Y/ Y/ Y 12: op: 0x00100004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924000 reg: 0t swiz: Z/ Z/ Z/ Z src1: 0x000000e2 reg: 7c swiz: X/ X/ X/ X src2: 0x1edb6000 reg: 0t swiz: -W/-W/-W/-W 13: op: 0x00406003 dst: 3t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 14: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 15: op: 0x00f0a003 dst: 5t op: VE_ADD src0: 0x00d10060 reg: 3t swiz: X/ Y/ Z/ W src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 16: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 17: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 18: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[1..2] DCL TEMP[0], LOCAL DCL TEMP[1], ARRAY(1), LOCAL DCL TEMP[2], LOCAL 0: TEX TEMP[0], IN[1].xyyy, SAMP[0], 2D 1: MOV TEMP[1], TEMP[0] 2: ADD TEMP[0].x, CONST[2].zzzz, -IN[0].xxxx 3: ADD TEMP[2].x, CONST[2].zzzz, -CONST[2].yyyy 4: RCP TEMP[2].x, TEMP[2].xxxx 5: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[2].xxxx 6: LRP TEMP[1].xyz, TEMP[0].xxxx, TEMP[1].xyzz, CONST[1].xyzz 7: MOV_SAT OUT[0], TEMP[1] 8: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, 
-const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD 
temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: LRP temp[1].xyz, temp[0].xxxx, temp[1].xyzz, const[1].xyzz; 7: MOV_SAT output[0], temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: ADD temp[3].xyz, temp[1].xyzz, -const[1].xyzz; 7: MAD temp[1].xyz, temp[0].xxxx, temp[3], const[1].xyzz; 8: MOV_SAT output[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: ADD temp[3].xyz, temp[1].xyzz, -const[1].xyzz; 7: MAD temp[1].xyz, temp[0].xxxx, temp[3], const[1].xyzz; 8: MOV_SAT output[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: MOV temp[1], temp[0]; 2: ADD temp[0].x, const[2].zzzz, -input[0].xxxx; 3: ADD temp[2].x, const[2].zzzz, -const[2].yyyy; 4: RCP temp[2].x, temp[2].xxxx; 5: MUL_SAT temp[0].x, temp[0].xxxx, temp[2].xxxx; 6: ADD temp[3].xyz, temp[1].xyzz, -const[1].xyzz; 7: MAD temp[1].xyz, temp[0].xxxx, temp[3], const[1].xyzz; 8: MOV_SAT output[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 2: src0.xyz = const[2], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 3: src0.xyz = const[2] MAD temp[2].x, src0.zzz, src0.111, -src0.yyy 4: src0.xyz = temp[2] REPL_ALPHA temp[2].x 
RCP, src0.x 5: src0.xyz = temp[0], src1.xyz = temp[2] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 6: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 7: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[1] MAD temp[1].xyz, src0.xxx, src1.xyz, src2.xyz 8: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[2] MAD temp[2].x, src0.zzz, src0.111, -src0.yyy 3: src0.xyz = temp[2] REPL_ALPHA temp[2].x RCP, src0.x 4: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 5: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[2], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = temp[0], src1.xyz = temp[2] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[1] MAD temp[1].xyz, src0.xxx, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[2] MAD temp[2].x, src0.zzz, src0.111, -src0.yyy 3: src0.xyz = temp[2] REPL_ALPHA temp[2].x RCP, src0.x 4: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 5: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[2], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = temp[0], src1.xyz = temp[2] MAD_SAT temp[0].x, src0.xxx, src1.xxx, 
src0.000 8: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[1] MAD temp[1].xyz, src0.xxx, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[2], input[0].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[2] MAD temp[4].x, src0.zzz, src0.111, -src0.yyy 3: src0.xyz = temp[4] REPL_ALPHA temp[4].x RCP, src0.x 4: src0.xyz = temp[2], src0.w = temp[2] SEM_WAIT MAD temp[3].xyz, src0.xyz, src0.111, src0.000 MAD temp[3].w, src0.w, src0.1, src0.0 5: src0.xyz = temp[3], src1.xyz = const[1] MAD temp[5].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[2], src1.xyz = input[1] MAD temp[2].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = temp[2], src1.xyz = temp[4] MAD_SAT temp[2].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[2], src1.xyz = temp[5], src2.xyz = const[1] MAD temp[3].xyz, src0.xxx, src1.xyz, src2.xyz 9: src0.xyz = temp[3], src0.w = temp[3] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4025400: src: 0 R/G/G/G dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020102:Addr0: 2c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924040:MAD dest:4 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 2 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000004a:SOP dest:4 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040403:Addr0: 3t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21050:MAD dest:5 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000502:Addr0: 2c, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801020:MAD dest:2 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001002:Addr0: 2t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10101402:Addr0: 2t, Addr1: 5t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, 
Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222030:MAD dest:3 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 8 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0..10] DCL TEMP[0..4], LOCAL DCL TEMP[5..7], ARRAY(1), LOCAL IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0].xyz, IN[0].yyyy, CONST[6].xyzz 1: MUL TEMP[1].xyz, IN[1].yyyy, CONST[6].xyzz 2: MAD TEMP[0].xyz, CONST[5].xyzz, IN[0].xxxx, TEMP[0].xyzz 3: MAD TEMP[1].xyz, CONST[5].xyzz, IN[1].xxxx, TEMP[1].xyzz 4: MAD TEMP[2].xyz, CONST[7].xyzz, IN[0].zzzz, TEMP[0].xyzz 5: MAD TEMP[0].xyz, CONST[7].xyzz, IN[1].zzzz, TEMP[1].xyzz 6: MUL TEMP[1], TEMP[2].yyyy, CONST[2] 7: ADD TEMP[3].xyz, TEMP[2].xyzz, -CONST[10].xyzz 8: MAD TEMP[1], CONST[1], TEMP[2].xxxx, TEMP[1] 9: DP3 TEMP[4].x, TEMP[3].xyzz, TEMP[0].xyzz 10: MAD TEMP[1], CONST[3], TEMP[2].zzzz, TEMP[1] 11: ADD TEMP[2].x, TEMP[4].xxxx, TEMP[4].xxxx 12: ADD TEMP[1], TEMP[1], CONST[4] 13: MAD TEMP[0].xyz, TEMP[0].xyzz, -TEMP[2].xxxx, TEMP[3].xyzz 14: ADD TEMP[2].x, -TEMP[1].zzzz, CONST[9].yyyy 15: MUL TEMP[2].x, TEMP[2].xxxx, CONST[9].wwww 16: MOV TEMP[5].xy, IN[2].xyxx 17: MOV TEMP[6].xy, IN[2].xyxx 18: MOV TEMP[7].xyz, TEMP[0].xyzx 19: MOV TEMP[0].xw, TEMP[1].xxzw 20: MOV_SAT TEMP[2].x, TEMP[2].xxxx 21: MUL TEMP[3].x, TEMP[1].yyyy, CONST[0].yyyy 22: MOV TEMP[0].y, TEMP[3].xxxx 23: MAD 
TEMP[0].xy, CONST[0].zwww, TEMP[1].wwww, TEMP[0].xyyy 24: MAD TEMP[1].x, TEMP[1].zzzz, IMM[0].xxxx, -TEMP[1].wwww 25: MOV TEMP[0].z, TEMP[1].xxxx 26: MOV OUT[1], TEMP[2].xxxx 27: MOV OUT[0], TEMP[0] 28: MOV OUT[2], TEMP[5] 29: MOV OUT[3], TEMP[6] 30: MOV OUT[4], TEMP[7] 31: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0].xyz, input[0].yyyy, const[6].xyzz; 1: MUL temp[1].xyz, input[1].yyyy, const[6].xyzz; 2: MAD temp[0].xyz, const[5].xyzz, input[0].xxxx, temp[0].xyzz; 3: MAD temp[1].xyz, const[5].xyzz, input[1].xxxx, temp[1].xyzz; 4: MAD temp[2].xyz, const[7].xyzz, input[0].zzzz, temp[0].xyzz; 5: MAD temp[0].xyz, const[7].xyzz, input[1].zzzz, temp[1].xyzz; 6: MUL temp[1], temp[2].yyyy, const[2]; 7: ADD temp[3].xyz, temp[2].xyzz, -const[10].xyzz; 8: MAD temp[1], const[1], temp[2].xxxx, temp[1]; 9: DP3 temp[4].x, temp[3].xyzz, temp[0].xyzz; 10: MAD temp[1], const[3], temp[2].zzzz, temp[1]; 11: ADD temp[2].x, temp[4].xxxx, temp[4].xxxx; 12: ADD temp[1], temp[1], const[4]; 13: MAD temp[0].xyz, temp[0].xyzz, -temp[2].xxxx, temp[3].xyzz; 14: ADD temp[2].x, -temp[1].zzzz, const[9].yyyy; 15: MUL temp[2].x, temp[2].xxxx, const[9].wwww; 16: MOV temp[5].xy, input[2].xyxx; 17: MOV temp[6].xy, input[2].xyxx; 18: MOV temp[7].xyz, temp[0].xyzx; 19: MOV temp[0].xw, temp[1].xxzw; 20: MOV_SAT temp[2].x, temp[2].xxxx; 21: MUL temp[3].x, temp[1].yyyy, const[0].yyyy; 22: MOV temp[0].y, temp[3].xxxx; 23: MAD temp[0].xy, const[0].zwww, temp[1].wwww, temp[0].xyyy; 24: MAD temp[1].x, temp[1].zzzz, const[11].xxxx, -temp[1].wwww; 25: MOV temp[0].z, temp[1].xxxx; 26: MOV output[1], temp[2].xxxx; 27: MOV temp[8], temp[0]; 28: MOV output[2], temp[5]; 29: MOV output[3], temp[6]; 30: MOV output[4], temp[7]; 31: MOV output[0], temp[8]; 32: MOV output[5], temp[8]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0].xyz, input[0].yyyy, const[6].xyzz; 1: MUL temp[1].xyz, input[1].yyyy, const[6].xyzz; 2: MAD temp[0].xyz, 
const[5].xyzz, input[0].xxxx, temp[0].xyzz; 3: MAD temp[1].xyz, const[5].xyzz, input[1].xxxx, temp[1].xyzz; 4: MAD temp[2].xyz, const[7].xyzz, input[0].zzzz, temp[0].xyzz; 5: MAD temp[0].xyz, const[7].xyzz, input[1].zzzz, temp[1].xyzz; 6: MUL temp[1], temp[2].yyyy, const[2]; 7: ADD temp[3].xyz, temp[2].xyzz, -const[10].xyzz; 8: MAD temp[1], const[1], temp[2].xxxx, temp[1]; 9: DP3 temp[4].x, temp[3].xyzz, temp[0].xyzz; 10: MAD temp[1], const[3], temp[2].zzzz, temp[1]; 11: ADD temp[2].x, temp[4].xxxx, temp[4].xxxx; 12: ADD temp[1], temp[1], const[4]; 13: MAD temp[0].xyz, temp[0].xyzz, -temp[2].xxxx, temp[3].xyzz; 14: ADD temp[2].x, -temp[1].zzzz, const[9].yyyy; 15: MUL temp[2].x, temp[2].xxxx, const[9].wwww; 16: MOV temp[5].xy, input[2].xyxx; 17: MOV temp[6].xy, input[2].xyxx; 18: MOV temp[7].xyz, temp[0].xyzx; 19: MOV temp[0].xw, temp[1].xxzw; 20: MOV_SAT temp[2].x, temp[2].xxxx; 21: MUL temp[3].x, temp[1].yyyy, const[0].yyyy; 22: MOV temp[0].y, temp[3].xxxx; 23: MAD temp[0].xy, const[0].zwww, temp[1].wwww, temp[0].xyyy; 24: MAD temp[1].x, temp[1].zzzz, const[11].xxxx, -temp[1].wwww; 25: MOV temp[0].z, temp[1].xxxx; 26: MOV output[1], temp[2].xxxx; 27: MOV temp[8], temp[0]; 28: MOV output[2], temp[5]; 29: MOV output[3], temp[6]; 30: MOV output[4], temp[7]; 31: MOV output[0], temp[8]; 32: MOV output[5], temp[8]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0].xyz, input[0].yyyy, const[6].xyzz; 1: MUL temp[1].xyz, input[1].yyyy, const[6].xyzz; 2: MAD temp[0].xyz, const[5].xyzz, input[0].xxxx, temp[0].xyzz; 3: MAD temp[1].xyz, const[5].xyzz, input[1].xxxx, temp[1].xyzz; 4: MAD temp[2].xyz, const[7].xyzz, input[0].zzzz, temp[0].xyzz; 5: MAD temp[0].xyz, const[7].xyzz, input[1].zzzz, temp[1].xyzz; 6: MUL temp[1], temp[2].yyyy, const[2]; 7: ADD temp[3].xyz, temp[2].xyzz, -const[10].xyzz; 8: MAD temp[1], const[1], temp[2].xxxx, temp[1]; 9: DP4 temp[4].x, temp[3].xyz0, temp[0].xyz0; 10: MAD temp[1], const[3], temp[2].zzzz, temp[1]; 11: ADD 
temp[2].x, temp[4].xxxx, temp[4].xxxx; 12: ADD temp[1], temp[1], const[4]; 13: MAD temp[0].xyz, temp[0].xyzz, -temp[2].xxxx, temp[3].xyzz; 14: ADD temp[2].x, -temp[1].zzzz, const[9].yyyy; 15: MUL temp[2].x, temp[2].xxxx, const[9].wwww; 16: MOV temp[5].xy, input[2].xyxx; 17: MOV temp[6].xy, input[2].xyxx; 18: MOV temp[7].xyz, temp[0].xyzx; 19: MOV temp[0].xw, temp[1].xxzw; 20: MOV_SAT temp[2].x, temp[2].xxxx; 21: MUL temp[3].x, temp[1].yyyy, const[0].yyyy; 22: MOV temp[0].y, temp[3].xxxx; 23: MAD temp[0].xy, const[0].zwww, temp[1].wwww, temp[0].xyyy; 24: MAD temp[1].x, temp[1].zzzz, const[11].xxxx, -temp[1].wwww; 25: MOV temp[0].z, temp[1].xxxx; 26: MOV output[1], temp[2].xxxx; 27: MOV temp[8], temp[0]; 28: MOV output[2], temp[5]; 29: MOV output[3], temp[6]; 30: MOV output[4], temp[7]; 31: MOV output[0], temp[8]; 32: MOV output[5], temp[8]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0].xyz, input[0].yyyy, const[6].xyzz; 1: MUL temp[1].xyz, input[1].yyyy, const[6].xyzz; 2: MAD temp[0].xyz, const[5].xyzz, input[0].xxxx, temp[0].xyzz; 3: MAD temp[1].xyz, const[5].xyzz, input[1].xxxx, temp[1].xyzz; 4: MAD temp[2].xyz, const[7].xyzz, input[0].zzzz, temp[0].xyzz; 5: MAD temp[0].xyz, const[7].xyzz, input[1].zzzz, temp[1].xyzz; 6: MUL temp[1], temp[2].yyyy, const[2]; 7: ADD temp[3].xyz, temp[2].xyzz, -const[10].xyzz; 8: MAD temp[1], const[1], temp[2].xxxx, temp[1]; 9: DP4 temp[4].x, temp[3].xyz0, temp[0].xyz0; 10: MAD temp[1], const[3], temp[2].zzzz, temp[1]; 11: ADD temp[2].x, temp[4].xxxx, temp[4].xxxx; 12: ADD temp[1], temp[1], const[4]; 13: MAD temp[0].xyz, temp[0].xyzz, -temp[2].xxxx, temp[3].xyzz; 14: ADD temp[2].x, -temp[1].zzzz, const[9].yyyy; 15: MUL temp[2].x, temp[2].xxxx, const[9].wwww; 16: MOV temp[5].xy, input[2].xyxx; 17: MOV temp[6].xy, input[2].xyxx; 18: MOV temp[7].xyz, temp[0].xyzx; 19: MOV temp[0].xw, temp[1].xxzw; 20: MOV_SAT temp[2].x, temp[2].xxxx; 21: MUL temp[3].x, temp[1].yyyy, const[0].yyyy; 22: MOV 
temp[0].y, temp[3].xxxx; 23: MAD temp[0].xy, const[0].zwww, temp[1].wwww, temp[0].xyyy; 24: MAD temp[1].x, temp[1].zzzz, const[11].xxxx, -temp[1].wwww; 25: MOV temp[0].z, temp[1].xxxx; 26: MOV output[1], temp[2].xxxx; 27: MOV temp[8], temp[0]; 28: MOV output[2], temp[5]; 29: MOV output[3], temp[6]; 30: MOV output[4], temp[7]; 31: MOV output[0], temp[8]; 32: MOV output[5], temp[8]; CONST[11] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0].xyz, input[0].yyyy, const[6].xyzz; 1: MUL temp[1].xyz, input[1].yyyy, const[6].xyzz; 2: MAD temp[0].xyz, const[5].xyzz, input[0].xxxx, temp[0].xyzz; 3: MAD temp[1].xyz, const[5].xyzz, input[1].xxxx, temp[1].xyzz; 4: MAD temp[2].xyz, const[7].xyzz, input[0].zzzz, temp[0].xyzz; 5: MAD temp[0].xyz, const[7].xyzz, input[1].zzzz, temp[1].xyzz; 6: MUL temp[1], temp[2].yyyy, const[2]; 7: ADD temp[3].xyz, temp[2].xyzz, -const[10].xyzz; 8: MAD temp[1], const[1], temp[2].xxxx, temp[1]; 9: DP4 temp[4].x, temp[3].xyz0, temp[0].xyz0; 10: MAD temp[1], const[3], temp[2].zzzz, temp[1]; 11: ADD temp[2].x, temp[4].xxxx, temp[4].xxxx; 12: ADD temp[1], temp[1], const[4]; 13: MAD temp[0].xyz, temp[0].xyzz, -temp[2].xxxx, temp[3].xyzz; 14: ADD temp[2].x, -temp[1].zzzz, const[9].yyyy; 15: MUL temp[2].x, temp[2].xxxx, const[9].wwww; 16: MOV temp[5].xy, input[2].xyxx; 17: MOV temp[6].xy, input[2].xyxx; 18: MOV temp[7].xyz, temp[0].xyzx; 19: MOV temp[0].xw, temp[1].xxzw; 20: MOV_SAT temp[2].x, temp[2].xxxx; 21: MUL temp[3].x, temp[1].yyyy, const[0].yyyy; 22: MOV temp[0].y, temp[3].xxxx; 23: MAD temp[0].xy, const[0].zwww, temp[1].wwww, temp[0].xyyy; 24: MAD temp[1].x, temp[1].zzzz, const[11].xxxx, -temp[1].wwww; 25: MOV temp[0].z, temp[1].xxxx; 26: MOV output[1], temp[2].xxxx; 27: MOV temp[8], temp[0]; 28: MOV output[2], temp[5]; 29: MOV output[3], temp[6]; 30: MOV output[4], temp[7]; 31: MOV output[0], temp[8]; 32: MOV output[5], temp[8]; Vertex Program: after 'lower control flow opcodes' # 
Radeon Compiler Program 0: MUL temp[0].xyz, input[0].yyyy, const[6].xyzz; 1: MUL temp[1].xyz, input[1].yyyy, const[6].xyzz; 2: MAD temp[0].xyz, const[5].xyzz, input[0].xxxx, temp[0].xyzz; 3: MAD temp[1].xyz, const[5].xyzz, input[1].xxxx, temp[1].xyzz; 4: MAD temp[2].xyz, const[7].xyzz, input[0].zzzz, temp[0].xyzz; 5: MAD temp[0].xyz, const[7].xyzz, input[1].zzzz, temp[1].xyzz; 6: MUL temp[1], temp[2].yyyy, const[2]; 7: ADD temp[3].xyz, temp[2].xyzz, -const[10].xyzz; 8: MAD temp[1], const[1], temp[2].xxxx, temp[1]; 9: DP4 temp[4].x, temp[3].xyz0, temp[0].xyz0; 10: MAD temp[1], const[3], temp[2].zzzz, temp[1]; 11: ADD temp[2].x, temp[4].xxxx, temp[4].xxxx; 12: ADD temp[1], temp[1], const[4]; 13: MAD temp[0].xyz, temp[0].xyzz, -temp[2].xxxx, temp[3].xyzz; 14: ADD temp[2].x, -temp[1].zzzz, const[9].yyyy; 15: MUL temp[2].x, temp[2].xxxx, const[9].wwww; 16: MOV temp[5].xy, input[2].xyxx; 17: MOV temp[6].xy, input[2].xyxx; 18: MOV temp[7].xyz, temp[0].xyzx; 19: MOV temp[0].xw, temp[1].xxzw; 20: MOV_SAT temp[2].x, temp[2].xxxx; 21: MUL temp[3].x, temp[1].yyyy, const[0].yyyy; 22: MOV temp[0].y, temp[3].xxxx; 23: MAD temp[0].xy, const[0].zwww, temp[1].wwww, temp[0].xyyy; 24: MAD temp[1].x, temp[1].zzzz, const[11].xxxx, -temp[1].wwww; 25: MOV temp[0].z, temp[1].xxxx; 26: MOV output[1], temp[2].xxxx; 27: MOV temp[8], temp[0]; 28: MOV output[2], temp[5]; 29: MOV output[3], temp[6]; 30: MOV output[4], temp[7]; 31: MOV output[0], temp[8]; 32: MOV output[5], temp[8]; Final vertex program code: 0: op: 0x00700002 dst: 0t op: VE_MULTIPLY src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x009100c2 reg: 6c swiz: X/ Y/ Z/ Z src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 1: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x00492021 reg: 1i swiz: Y/ Y/ Y/ Y src1: 0x009100c2 reg: 6c swiz: X/ Y/ Z/ Z src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 2: op: 0x00700004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x009100a2 reg: 5c swiz: X/ Y/ Z/ Z src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x00910000 reg: 0t swiz: 
X/ Y/ Z/ Z 3: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x009100a2 reg: 5c swiz: X/ Y/ Z/ Z src1: 0x00000021 reg: 1i swiz: X/ X/ X/ X src2: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z 4: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x009100e2 reg: 7c swiz: X/ Y/ Z/ Z src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00910000 reg: 0t swiz: X/ Y/ Z/ Z 5: op: 0x00700004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x009100e2 reg: 7c swiz: X/ Y/ Z/ Z src1: 0x00924021 reg: 1i swiz: Z/ Z/ Z/ Z src2: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z 6: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00492040 reg: 2t swiz: Y/ Y/ Y/ Y src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 7: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x00910040 reg: 2t swiz: X/ Y/ Z/ Z src1: 0x1e910142 reg: 10c swiz: -X/-Y/-Z/-Z src2: 0x01248142 reg: 10c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00000040 reg: 2t swiz: X/ X/ X/ X src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 9: op: 0x00108001 dst: 4t op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 10: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00924040 reg: 2t swiz: Z/ Z/ Z/ Z src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 11: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x00000080 reg: 4t swiz: X/ X/ X/ X src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 12: op: 0x00f02003 dst: 1t op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 13: op: 0x00700080 dst: 0t op: PVS_MACRO_OP_2CLK_MADD src0: 0x00910000 reg: 0t swiz: X/ Y/ Z/ Z src1: 0x1e000040 reg: 2t swiz: -X/-X/-X/-X src2: 0x00910060 reg: 3t swiz: X/ Y/ Z/ Z 14: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x1e924020 reg: 1t swiz: -Z/-Z/-Z/-Z src1: 0x00492122 reg: 9c swiz: 
Y/ Y/ Y/ Y src2: 0x01248122 reg: 9c swiz: 0/ 0/ 0/ 0 15: op: 0x00104002 dst: 2t op: VE_MULTIPLY src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00db6122 reg: 9c swiz: W/ W/ W/ W src2: 0x01248122 reg: 9c swiz: 0/ 0/ 0/ 0 16: op: 0x0030a003 dst: 5t op: VE_ADD src0: 0x00010041 reg: 2i swiz: X/ Y/ X/ X src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 17: op: 0x0030c003 dst: 6t op: VE_ADD src0: 0x00010041 reg: 2i swiz: X/ Y/ X/ X src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 18: op: 0x0070e003 dst: 7t op: VE_ADD src0: 0x00110000 reg: 0t swiz: X/ Y/ Z/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 19: op: 0x00900003 dst: 0t op: VE_ADD src0: 0x00d00020 reg: 1t swiz: X/ X/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 20: op: 0x01104003 dst: 2t op: VE_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 21: op: 0x00106002 dst: 3t op: VE_MULTIPLY src0: 0x00492020 reg: 1t swiz: Y/ Y/ Y/ Y src1: 0x00492002 reg: 0c swiz: Y/ Y/ Y/ Y src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 22: op: 0x00200003 dst: 0t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 23: op: 0x00300004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db4002 reg: 0c swiz: Z/ W/ W/ W src1: 0x00db6020 reg: 1t swiz: W/ W/ W/ W src2: 0x00490000 reg: 0t swiz: X/ Y/ Y/ Y 24: op: 0x00102004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00924020 reg: 1t swiz: Z/ Z/ Z/ Z src1: 0x00000162 reg: 11c swiz: X/ X/ X/ X src2: 0x1edb6020 reg: 1t swiz: -W/-W/-W/-W 25: op: 0x00400003 dst: 0t op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 26: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x01248040 reg: 2t swiz: 
0/ 0/ 0/ 0
  src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0
27: op: 0x00f10003 dst: 8t op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
28: op: 0x00f02203 dst: 1o op: VE_ADD
  src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W
  src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0
  src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0
29: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d100c0 reg: 6t swiz: X/ Y/ Z/ W
  src1: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0
  src2: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0
30: op: 0x00f06203 dst: 3o op: VE_ADD
  src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W
  src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0
  src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0
31: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W
  src1: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0
  src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0
32: op: 0x00f0a203 dst: 5o op: VE_ADD
  src0: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W
  src1: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0
  src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: Initial fragment program
FRAG
DCL IN[0], FOG, PERSPECTIVE
DCL IN[1], GENERIC[0], PERSPECTIVE
DCL IN[2], GENERIC[1], PERSPECTIVE
DCL IN[3], GENERIC[2], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL SAMP[1]
DCL SAMP[2]
DCL CONST[0..1]
DCL CONST[5..6]
DCL TEMP[0..5], LOCAL
DCL TEMP[6], ARRAY(1), LOCAL
IMM[0] FLT32 { 0.0000, 0.0000, 0.0000, 0.0000}
0: TEX TEMP[0], IN[2].xyyy, SAMP[2], 2D
1: MAD TEMP[1].xy, CONST[0].xxxx, TEMP[0].zyyy, CONST[0].yyyy
2: MOV TEMP[2].yz, TEMP[1].yxyy
3: MAD TEMP[2].x, CONST[0].xxxx, TEMP[0].xxxx, CONST[0].yyyy
4: MAD TEMP[3].xyz, TEMP[2].xyzz, CONST[0].wwww, IN[3].xyzz
5: DP3 TEMP[4].x, TEMP[3].xyzz, TEMP[3].xyzz
6: SEQ TEMP[5].x, TEMP[4].xxxx, IMM[0].xxxx
7: IF TEMP[5].xxxx :0
8: MOV TEMP[5].xyz, IMM[0].xxxx
9: ELSE :0
10: RSQ TEMP[4].x, TEMP[4].xxxx
11: MUL TEMP[5].xyz, TEMP[3].xyzz, TEMP[4].xxxx
12: ENDIF
13: DP3 TEMP[1].x, TEMP[5].xyzz, CONST[1].xyzz
14: MOV_SAT TEMP[1].x,
TEMP[1].xxxx
15: MUL TEMP[3].x, TEMP[1].xxxx, TEMP[1].xxxx
16: MAD TEMP[2].xyz, TEMP[2].xyzz, CONST[0].zzzz, IN[3].xyzz
17: MUL TEMP[1].x, TEMP[1].xxxx, TEMP[3].xxxx
18: TEX TEMP[3].xyz, TEMP[2].xyzz, SAMP[0], CUBE
19: TEX TEMP[4], IN[1].xyyy, SAMP[1], 2D
20: MOV TEMP[2].w, TEMP[4].wwww
21: LRP TEMP[0].xyz, TEMP[0].wwww, TEMP[4].xyzz, TEMP[3].xyzz
22: MAD TEMP[2].xyz, TEMP[1].xxxx, CONST[1].wwww, TEMP[0].xyzz
23: MOV TEMP[6], TEMP[2]
24: ADD TEMP[0].x, CONST[6].zzzz, -IN[0].xxxx
25: ADD TEMP[1].x, CONST[6].zzzz, -CONST[6].yyyy
26: RCP TEMP[1].x, TEMP[1].xxxx
27: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx
28: LRP TEMP[6].xyz, TEMP[0].xxxx, TEMP[6].xyzz, CONST[5].xyzz
29: MOV_SAT OUT[0], TEMP[6]
30: END
Fragment Program: before compilation
# Radeon Compiler Program
0: TEX temp[0], input[2].xyyy, 2D[2];
1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy;
2: MOV temp[2].yz, temp[1].yxyy;
3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy;
4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz;
5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz;
6: SEQ temp[5].x, temp[4].xxxx, temp[0].0000;
7: IF temp[5].xxxx;
8: MOV temp[5].xyz, temp[0].0000;
9: ELSE;
10: RSQ temp[4].x, temp[4].xxxx;
11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx;
12: ENDIF;
13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz;
14: MOV_SAT temp[1].x, temp[1].xxxx;
15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx;
16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz;
17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx;
18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0];
19: TEX temp[4], input[1].xyyy, 2D[1];
20: MOV temp[2].w, temp[4].wwww;
21: LRP temp[0].xyz, temp[0].wwww, temp[4].xyzz, temp[3].xyzz;
22: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz;
23: MOV temp[6], temp[2];
24: ADD temp[0].x, const[6].zzzz, -input[0].xxxx;
25: ADD temp[1].x, const[6].zzzz, -const[6].yyyy;
26: RCP temp[1].x, temp[1].xxxx;
27: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx;
28: LRP
temp[6].xyz, temp[0].xxxx, temp[6].xyzz, const[5].xyzz; 29: MOV_SAT output[0], temp[6]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: SEQ temp[5].x, temp[4].xxxx, temp[0].0000; 7: IF temp[5].xxxx; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, temp[4].xxxx; 11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: LRP temp[0].xyz, temp[0].wwww, temp[4].xyzz, temp[3].xyzz; 22: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 23: MOV temp[6], temp[2]; 24: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 25: ADD temp[1].x, const[6].zzzz, -const[6].yyyy; 26: RCP temp[1].x, temp[1].xxxx; 27: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 28: LRP temp[6].xyz, temp[0].xxxx, temp[6].xyzz, const[5].xyzz; 29: MOV_SAT output[0], temp[6]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: SEQ temp[5].x, temp[4].xxxx, temp[0].0000; 7: IF temp[5].xxxx; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, temp[4].xxxx; 11: MUL temp[5].xyz, 
temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: LRP temp[0].xyz, temp[0].wwww, temp[4].xyzz, temp[3].xyzz; 22: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 23: MOV temp[6], temp[2]; 24: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 25: ADD temp[1].x, const[6].zzzz, -const[6].yyyy; 26: RCP temp[1].x, temp[1].xxxx; 27: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 28: LRP temp[6].xyz, temp[0].xxxx, temp[6].xyzz, const[5].xyzz; 29: MOV_SAT output[0], temp[6]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: SEQ temp[5].x, temp[4].xxxx, temp[0].0000; 7: IF temp[5].xxxx; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, temp[4].xxxx; 11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: LRP temp[0].xyz, temp[0].wwww, temp[4].xyzz, temp[3].xyzz; 22: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 23: MOV temp[6], temp[2]; 24: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 25: ADD temp[1].x, 
const[6].zzzz, -const[6].yyyy; 26: RCP temp[1].x, temp[1].xxxx; 27: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 28: LRP temp[6].xyz, temp[0].xxxx, temp[6].xyzz, const[5].xyzz; 29: MOV_SAT output[0], temp[6]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: SEQ temp[5].x, temp[4].xxxx, temp[0].0000; 7: IF temp[5].xxxx; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, temp[4].xxxx; 11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: LRP temp[0].xyz, temp[0].wwww, temp[4].xyzz, temp[3].xyzz; 22: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 23: MOV temp[6], temp[2]; 24: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 25: ADD temp[1].x, const[6].zzzz, -const[6].yyyy; 26: RCP temp[1].x, temp[1].xxxx; 27: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 28: LRP temp[6].xyz, temp[0].xxxx, temp[6].xyzz, const[5].xyzz; 29: MOV_SAT output[0], temp[6]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: SUB none., temp[4].xxxx, temp[0].0000; 
[aluresult = (x == 0)] 7: IF aluresult.x___; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, temp[4].xxxx; 11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: LRP temp[0].xyz, temp[0].wwww, temp[4].xyzz, temp[3].xyzz; 22: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 23: MOV temp[6], temp[2]; 24: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 25: ADD temp[1].x, const[6].zzzz, -const[6].yyyy; 26: RCP temp[1].x, temp[1].xxxx; 27: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 28: LRP temp[6].xyz, temp[0].xxxx, temp[6].xyzz, const[5].xyzz; 29: MOV_SAT output[0], temp[6]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: ADD none., temp[4].xxxx, -temp[0].0000; [aluresult = (x == 0)] 7: IF aluresult.x___; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, |temp[4].xxxx|; 11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: ADD temp[7].xyz, temp[4].xyzz, -temp[3].xyzz; 22: MAD 
temp[0].xyz, temp[0].wwww, temp[7], temp[3].xyzz; 23: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 24: MOV temp[6], temp[2]; 25: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 26: ADD temp[1].x, const[6].zzzz, -const[6].yyyy; 27: RCP temp[1].x, temp[1].xxxx; 28: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 29: ADD temp[8].xyz, temp[6].xyzz, -const[5].xyzz; 30: MAD temp[6].xyz, temp[0].xxxx, temp[8], const[5].xyzz; 31: MOV_SAT output[0], temp[6]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: ADD none., temp[4].xxxx, -temp[0].0000; [aluresult = (x == 0)] 7: IF aluresult.x___; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, |temp[4].xxxx|; 11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: ADD temp[7].xyz, temp[4].xyzz, -temp[3].xyzz; 22: MAD temp[0].xyz, temp[0].wwww, temp[7], temp[3].xyzz; 23: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 24: MOV temp[6], temp[2]; 25: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 26: ADD temp[1].x, const[6].zzzz, -const[6].yyyy; 27: RCP temp[1].x, temp[1].xxxx; 28: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 29: ADD temp[8].xyz, temp[6].xyzz, -const[5].xyzz; 30: MAD temp[6].xyz, temp[0].xxxx, temp[8], const[5].xyzz; 31: MOV_SAT output[0], temp[6]; Fragment Program: after 'dead constants' # 
Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: MAD temp[1].xy, const[0].xxxx, temp[0].zyyy, const[0].yyyy; 2: MOV temp[2].yz, temp[1].yxyy; 3: MAD temp[2].x, const[0].xxxx, temp[0].xxxx, const[0].yyyy; 4: MAD temp[3].xyz, temp[2].xyzz, const[0].wwww, input[3].xyzz; 5: DP3 temp[4].x, temp[3].xyzz, temp[3].xyzz; 6: ADD none., temp[4].xxxx, -temp[0].0000; [aluresult = (x == 0)] 7: IF aluresult.x___; 8: MOV temp[5].xyz, temp[0].0000; 9: ELSE; 10: RSQ temp[4].x, |temp[4].xxxx|; 11: MUL temp[5].xyz, temp[3].xyzz, temp[4].xxxx; 12: ENDIF; 13: DP3 temp[1].x, temp[5].xyzz, const[1].xyzz; 14: MOV_SAT temp[1].x, temp[1].xxxx; 15: MUL temp[3].x, temp[1].xxxx, temp[1].xxxx; 16: MAD temp[2].xyz, temp[2].xyzz, const[0].zzzz, input[3].xyzz; 17: MUL temp[1].x, temp[1].xxxx, temp[3].xxxx; 18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0]; 19: TEX temp[4], input[1].xyyy, 2D[1]; 20: MOV temp[2].w, temp[4].wwww; 21: ADD temp[7].xyz, temp[4].xyzz, -temp[3].xyzz; 22: MAD temp[0].xyz, temp[0].wwww, temp[7], temp[3].xyzz; 23: MAD temp[2].xyz, temp[1].xxxx, const[1].wwww, temp[0].xyzz; 24: MOV temp[6], temp[2]; 25: ADD temp[0].x, const[6].zzzz, -input[0].xxxx; 26: ADD temp[1].x, const[6].zzzz, -const[6].yyyy; 27: RCP temp[1].x, temp[1].xxxx; 28: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 29: ADD temp[8].xyz, temp[6].xyzz, -const[5].xyzz; 30: MAD temp[6].xyz, temp[0].xxxx, temp[8], const[5].xyzz; 31: MOV_SAT output[0], temp[6]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[2]; 1: src0.xyz = const[0], src1.xyz = temp[0] MAD temp[1].xy, src0.xxx, src1.zyy, src0.yyy 2: src0.xyz = temp[1] MAD temp[2].yz, src0.yxy, src0.111, src0.000 3: src0.xyz = const[0], src1.xyz = temp[0] MAD temp[2].x, src0.xxx, src1.xxx, src0.yyy 4: src0.xyz = temp[2], src0.w = const[0], src1.xyz = input[3] MAD temp[3].xyz, src0.xyz, src0.www, src1.xyz 5: src0.xyz = temp[3] DP3 temp[4].x, src0.xyz, src0.xyz 6: src0.xyz = temp[4] MAD aluresult, src0.xxx, 
src0.111, -src0.000 [aluresult = (result == 0)]
7: IF aluresult.x___;
8: MAD temp[5].xyz, src0.000, src0.111, src0.000
9: ELSE;
10: src0.xyz = temp[4]
      REPL_ALPHA temp[4].x RSQ, |src0.x|
11: src0.xyz = temp[3], src1.xyz = temp[4]
      MAD temp[5].xyz, src0.xyz, src1.xxx, src0.000
12: ENDIF;
13: src0.xyz = temp[5], src1.xyz = const[1]
      DP3 temp[1].x, src0.xyz, src1.xyz
14: src0.xyz = temp[1]
      MAD_SAT temp[1].x, src0.xxx, src0.111, src0.000
15: src0.xyz = temp[1]
      MAD temp[3].x, src0.xxx, src0.xxx, src0.000
16: src0.xyz = temp[2], src1.xyz = const[0], src2.xyz = input[3]
      MAD temp[2].xyz, src0.xyz, src1.zzz, src2.xyz
17: src0.xyz = temp[1], src1.xyz = temp[3]
      MAD temp[1].x, src0.xxx, src1.xxx, src0.000
18: TEX temp[3].xyz, temp[2].xyzz, CUBE[0];
19: TEX temp[4], input[1].xyyy, 2D[1];
20: src0.w = temp[4]
      MAD temp[2].w, src0.w, src0.1, src0.0
21: src0.xyz = temp[4], src1.xyz = temp[3]
      MAD temp[7].xyz, src0.xyz, src0.111, -src1.xyz
22: src0.xyz = temp[7], src0.w = temp[0], src1.xyz = temp[3]
      MAD temp[0].xyz, src0.www, src0.xyz, src1.xyz
23: src0.xyz = temp[1], src0.w = const[1], src1.xyz = temp[0]
      MAD temp[2].xyz, src0.xxx, src0.www, src1.xyz
24: src0.xyz = temp[2], src0.w = temp[2]
      MAD temp[6].xyz, src0.xyz, src0.111, src0.000
      MAD temp[6].w, src0.w, src0.1, src0.0
25: src0.xyz = const[6], src1.xyz = input[0]
      MAD temp[0].x, src0.zzz, src0.111, -src1.xxx
26: src0.xyz = const[6]
      MAD temp[1].x, src0.zzz, src0.111, -src0.yyy
27: src0.xyz = temp[1]
      REPL_ALPHA temp[1].x RCP, src0.x
28: src0.xyz = temp[0], src1.xyz = temp[1]
      MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000
29: src0.xyz = temp[6], src1.xyz = const[5]
      MAD temp[8].xyz, src0.xyz, src0.111, -src1.xyz
30: src0.xyz = temp[0], src1.xyz = temp[8], src2.xyz = const[5]
      MAD temp[6].xyz, src0.xxx, src1.xyz, src2.xyz
31: src0.xyz = temp[6], src0.w = temp[6]
      MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
      MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
0: BEGIN_TEX;
1:
TEX temp[0], input[2].xyyy, 2D[2] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[0], src1.xyz = temp[0] SEM_WAIT MAD temp[1].xy, src0.xxx, src1.zyy, src0.yyy 3: src0.xyz = temp[1] MAD temp[2].yz, src0.yxy, src0.111, src0.000 4: src0.xyz = const[0], src1.xyz = temp[0] MAD temp[2].x, src0.xxx, src1.xxx, src0.yyy 5: src0.xyz = temp[2], src0.w = const[0], src1.xyz = input[3] MAD temp[3].xyz, src0.xyz, src0.www, src1.xyz 6: src0.xyz = temp[3] DP3 temp[4].x, src0.xyz, src0.xyz 7: src0.xyz = temp[4] MAD aluresult, src0.xxx, src0.111, -src0.000 [aluresult = (result == 0)] 8: IF aluresult.x___; 9: MAD temp[5].xyz, src0.000, src0.111, src0.000 10: ELSE; 11: src0.xyz = temp[4] REPL_ALPHA temp[4].x RSQ, |src0.x| 12: src0.xyz = temp[3], src1.xyz = temp[4] MAD temp[5].xyz, src0.xyz, src1.xxx, src0.000 13: ENDIF; 14: src0.xyz = temp[5], src1.xyz = const[1] DP3 temp[1].x, src0.xyz, src1.xyz 15: src0.xyz = temp[2], src1.xyz = const[0], src2.xyz = input[3] MAD temp[2].xyz, src0.xyz, src1.zzz, src2.xyz 16: src0.xyz = temp[1] MAD_SAT temp[1].x, src0.xxx, src0.111, src0.000 17: src0.xyz = temp[1] MAD temp[3].x, src0.xxx, src0.xxx, src0.000 18: src0.xyz = temp[1], src1.xyz = temp[3] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 19: BEGIN_TEX; 20: TEX temp[4], input[1].xyyy, 2D[1]; 21: TEX temp[3].xyz, temp[2].xyzz, CUBE[0] SEM_WAIT SEM_ACQUIRE; 22: src0.xyz = temp[4], src0.w = temp[4], src1.xyz = temp[3] SEM_WAIT MAD temp[7].xyz, src0.xyz, src0.111, -src1.xyz MAD temp[2].w, src0.w, src0.1, src0.0 23: src0.xyz = temp[7], src0.w = temp[0], src1.xyz = temp[3] MAD temp[0].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[1], src0.w = const[1], src1.xyz = temp[0] MAD temp[2].xyz, src0.xxx, src0.www, src1.xyz 25: src0.xyz = temp[2], src0.w = temp[2] MAD temp[6].xyz, src0.xyz, src0.111, src0.000 MAD temp[6].w, src0.w, src0.1, src0.0 26: src0.xyz = temp[6], src1.xyz = const[5] MAD temp[8].xyz, src0.xyz, src0.111, -src1.xyz 27: src0.xyz = const[6] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 28: 
src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 29: src0.xyz = const[6], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 30: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 31: src0.xyz = temp[0], src1.xyz = temp[8], src2.xyz = const[5] MAD temp[6].xyz, src0.xxx, src1.xyz, src2.xyz 32: src0.xyz = temp[6], src0.w = temp[6] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[2].xyyy, 2D[2] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[0], src1.xyz = temp[0] SEM_WAIT MAD temp[1].xy, src0.xxx, src1.zyy, src0.yyy 3: src0.xyz = temp[1] MAD temp[2].yz, src0.yxy, src0.111, src0.000 4: src0.xyz = const[0], src1.xyz = temp[0] MAD temp[2].x, src0.xxx, src1.xxx, src0.yyy 5: src0.xyz = temp[2], src0.w = const[0], src1.xyz = input[3] MAD temp[3].xyz, src0.xyz, src0.www, src1.xyz 6: src0.xyz = temp[3] DP3 temp[4].x, src0.xyz, src0.xyz 7: src0.xyz = temp[4] MAD aluresult, src0.xxx, src0.111, -src0.000 [aluresult = (result == 0)] 8: IF aluresult.x___; 9: MAD temp[5].xyz, src0.000, src0.111, src0.000 10: ELSE; 11: src0.xyz = temp[4] REPL_ALPHA temp[4].x RSQ, |src0.x| 12: src0.xyz = temp[3], src1.xyz = temp[4] MAD temp[5].xyz, src0.xyz, src1.xxx, src0.000 13: ENDIF; 14: src0.xyz = temp[5], src1.xyz = const[1] DP3 temp[1].x, src0.xyz, src1.xyz 15: src0.xyz = temp[2], src1.xyz = const[0], src2.xyz = input[3] MAD temp[2].xyz, src0.xyz, src1.zzz, src2.xyz 16: src0.xyz = temp[1] MAD_SAT temp[1].x, src0.xxx, src0.111, src0.000 17: src0.xyz = temp[1] MAD temp[3].x, src0.xxx, src0.xxx, src0.000 18: src0.xyz = temp[1], src1.xyz = temp[3] MAD temp[1].x, src0.xxx, src1.xxx, src0.000 19: BEGIN_TEX; 20: TEX temp[4], input[1].xyyy, 2D[1]; 21: TEX temp[3].xyz, temp[2].xyzz, CUBE[0] SEM_WAIT SEM_ACQUIRE; 22: src0.xyz = temp[4], src0.w = temp[4], src1.xyz = temp[3] SEM_WAIT MAD temp[7].xyz, src0.xyz, 
src0.111, -src1.xyz MAD temp[2].w, src0.w, src0.1, src0.0 23: src0.xyz = temp[7], src0.w = temp[0], src1.xyz = temp[3] MAD temp[0].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[1], src0.w = const[1], src1.xyz = temp[0] MAD temp[2].xyz, src0.xxx, src0.www, src1.xyz 25: src0.xyz = temp[2], src0.w = temp[2] MAD temp[6].xyz, src0.xyz, src0.111, src0.000 MAD temp[6].w, src0.w, src0.1, src0.0 26: src0.xyz = temp[6], src1.xyz = const[5] MAD temp[8].xyz, src0.xyz, src0.111, -src1.xyz 27: src0.xyz = const[6] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 28: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 29: src0.xyz = const[6], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 30: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 31: src0.xyz = temp[0], src1.xyz = temp[8], src2.xyz = const[5] MAD temp[6].xyz, src0.xxx, src1.xyz, src2.xyz 32: src0.xyz = temp[6], src0.w = temp[6] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xyyy, 2D[2] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = const[0], src1.xyz = temp[4] SEM_WAIT MAD temp[5].xy, src0.xxx, src1.zyy, src0.yyy 3: src0.xyz = temp[5] MAD temp[6].yz, src0.yxy, src0.111, src0.000 4: src0.xyz = const[0], src1.xyz = temp[4] MAD temp[6].x, src0.xxx, src1.xxx, src0.yyy 5: src0.xyz = temp[6], src0.w = const[0], src1.xyz = input[2] MAD temp[7].xyz, src0.xyz, src0.www, src1.xyz 6: src0.xyz = temp[7] DP3 temp[8].x, src0.xyz, src0.xyz 7: src0.xyz = temp[8] MAD aluresult, src0.xxx, src0.111, -src0.000 [aluresult = (result == 0)] 8: IF aluresult.x___; 9: MAD temp[9].xyz, src0.000, src0.111, src0.000 10: ELSE; 11: src0.xyz = temp[8] REPL_ALPHA temp[8].x RSQ, |src0.x| 12: src0.xyz = temp[7], src1.xyz = temp[8] MAD temp[9].xyz, src0.xyz, src1.xxx, src0.000 13: ENDIF; 14: src0.xyz = temp[9], src1.xyz = const[1] DP3 temp[5].x, 
src0.xyz, src1.xyz 15: src0.xyz = temp[6], src1.xyz = const[0], src2.xyz = input[2] MAD temp[6].xyz, src0.xyz, src1.zzz, src2.xyz 16: src0.xyz = temp[5] MAD_SAT temp[5].x, src0.xxx, src0.111, src0.000 17: src0.xyz = temp[5] MAD temp[7].x, src0.xxx, src0.xxx, src0.000 18: src0.xyz = temp[5], src1.xyz = temp[7] MAD temp[5].x, src0.xxx, src1.xxx, src0.000 19: BEGIN_TEX; 20: TEX temp[8], input[0].xyyy, 2D[1]; 21: TEX temp[7].xyz, temp[6].xyzz, CUBE[0] SEM_WAIT SEM_ACQUIRE; 22: src0.xyz = temp[8], src0.w = temp[8], src1.xyz = temp[7] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.111, -src1.xyz MAD temp[6].w, src0.w, src0.1, src0.0 23: src0.xyz = temp[11], src0.w = temp[4], src1.xyz = temp[7] MAD temp[4].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[5], src0.w = const[1], src1.xyz = temp[4] MAD temp[6].xyz, src0.xxx, src0.www, src1.xyz 25: src0.xyz = temp[6], src0.w = temp[6] MAD temp[10].xyz, src0.xyz, src0.111, src0.000 MAD temp[10].w, src0.w, src0.1, src0.0 26: src0.xyz = temp[10], src1.xyz = const[5] MAD temp[12].xyz, src0.xyz, src0.111, -src1.xyz 27: src0.xyz = const[6] MAD temp[5].x, src0.zzz, src0.111, -src0.yyy 28: src0.xyz = temp[5] REPL_ALPHA temp[5].x RCP, src0.x 29: src0.xyz = const[6], src1.xyz = input[3] MAD temp[4].x, src0.zzz, src0.111, -src1.xxx 30: src0.xyz = temp[4], src1.xyz = temp[5] MAD_SAT temp[4].x, src0.xxx, src1.xxx, src0.000 31: src0.xyz = temp[4], src1.xyz = temp[12], src2.xyz = const[5] MAD temp[10].xyz, src0.xxx, src1.xyz, src2.xyz 32: src0.xyz = temp[10], src0.w = temp[10] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4045401: src: 1 R/G/G/G dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x08001100:Addr0: 0c, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, 
Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00252000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 B/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00124050:MAD dest:5 rgb_C_src:0 G/G/G 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00003000:ALU wmask: GB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0104:rgb_A_src:0 G/R/G 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001100:Addr0: 0c, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00124060:MAD dest:6 rgb_C_src:0 G/G/G 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000806:Addr0: 6t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221070:MAD dest:7 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020007:Addr0: 7t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000081:DP3 dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00000000:ALU wmask: NONE omask: NONE 1:RGB_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 
2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x80db0000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c90050:MAD dest:5 rgb_C_src:0 0/0/0 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00000402:FC ALU WAIT wmask: NONE omask: NONE 2:FC_INST 0x1a000f00:0x0f 0 JUMP NONE INCR INCR 0 0 10 IGN_UNC 3:FC_ADDR 0x000a0000:BOOL: 0x00, INT: 0x00, JUMP_ADDR: 10, JMP_GLBL: 0 8 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0490:rgb_A_src:0 0/0/0 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00000402:FC ALU WAIT wmask: NONE omask: NONE 2:FC_INST 0x04010010:0x00 0 JUMP NONE NONE DECR 1 1 13 3:FC_ADDR 0x000d0000:BOOL: 0x00, INT: 0x00, JUMP_ADDR: 13, JMP_GLBL: 0 10 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004000b:RSQ dest:0 alp_A_src:0 R 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000008a:SOP dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08002007:Addr0: 7t, Addr1: 8t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00000402:FC ALU WAIT wmask: NONE omask: NONE 2:FC_INST 0x01010020:0x00 1 JUMP NONE DECR NONE 
1 0 13 3:FC_ADDR 0x000d0000:BOOL: 0x00, INT: 0x00, JUMP_ADDR: 13, JMP_GLBL: 0 13 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08040409:Addr0: 9t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000051:DP3 dest:5 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x00240006:Addr0: 6t, Addr1: 0c, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222060:MAD dest:6 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001c05:Addr0: 5t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 
R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe4085400: src: 0 R/G/G/G dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 19 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe407a406: src: 6 R/G/B/B dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 20 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001c08:Addr0: 8t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c060:MAD dest:6 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20a210b0:MAD dest:11 rgb_C_src:1 R/G/B 1 alp_C_src:0 0 0 21 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08001c0b:Addr0: 11t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221040:MAD dest:4 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08001005:Addr0: 5t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020101:Addr0: 1c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221060:MAD dest:6 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c0a0:MAD dest:10 alp_A_src:0 A 0 
alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x204900a0:MAD dest:10 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 24 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x0804140a:Addr0: 10t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a210c0:MAD dest:12 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 25 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020106:Addr0: 6c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924050:MAD dest:5 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 26 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000005a:SOP dest:5 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 27 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000d06:Addr0: 6c, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801040:MAD dest:4 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 28 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001404:Addr0: 4t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 
ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 29 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10503004:Addr0: 4t, Addr1: 12t, Addr2: 5c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x002220a0:MAD dest:10 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 30 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x0802000a:Addr0: 10t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0802000a:Addr0: 10t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL CONST[0..16] DCL TEMP[0..5], LOCAL DCL TEMP[6], ARRAY(1), LOCAL IMM[0] FLT32 { 0.0000, 128.0000, -128.0000, 1.0000} IMM[1] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: ADD TEMP[0].xyz, IN[3].xyzz, -IN[0].xyzz 1: MAD TEMP[1].xyz, TEMP[0].xyzz, CONST[14].xxxx, IN[0].xyzz 2: MUL TEMP[0], TEMP[1].yyyy, CONST[8] 3: MAD TEMP[0], CONST[7], TEMP[1].xxxx, TEMP[0] 4: MAD TEMP[0], CONST[9], TEMP[1].zzzz, TEMP[0] 5: MOV TEMP[1].x, CONST[11].xxxx 6: MOV TEMP[1].y, CONST[12].xxxx 7: MOV TEMP[1].z, CONST[13].xxxx 8: MOV TEMP[2].x, CONST[11].yyyy 9: MOV TEMP[2].y, CONST[12].yyyy 10: MOV TEMP[2].z, CONST[13].yyyy 11: MOV TEMP[3].x, CONST[11].zzzz 12: MOV TEMP[3].y, CONST[12].zzzz 13: MOV TEMP[3].z, CONST[13].zzzz 14: ADD TEMP[4].xyz, IN[4].xyzz, -IN[1].xyzz 15: MAD TEMP[4].xyz, TEMP[4].xyzz, CONST[14].xxxx, IN[1].xyzz 16: DP3 TEMP[5].x, TEMP[4].xyzz, TEMP[4].xyzz 17: 
ABS TEMP[5].x, TEMP[5].xxxx 18: RSQ TEMP[5].x, TEMP[5].xxxx 19: MUL TEMP[4].xyz, TEMP[4].xyzz, TEMP[5].xxxx 20: DP3 TEMP[1].x, TEMP[4].xyzz, TEMP[1].xyzz 21: DP3 TEMP[5].x, TEMP[4].xyzz, TEMP[2].xyzz 22: MOV TEMP[1].y, TEMP[5].xxxx 23: DP3 TEMP[3].x, TEMP[4].xyzz, TEMP[3].xyzz 24: MOV TEMP[1].z, TEMP[3].xxxx 25: DP3 TEMP[4].x, TEMP[1].xyzz, TEMP[1].xyzz 26: ABS TEMP[3].x, TEMP[4].xxxx 27: RSQ TEMP[3].x, TEMP[3].xxxx 28: MUL TEMP[2].xyz, TEMP[1].xyzz, TEMP[3].xxxx 29: ADD TEMP[1], TEMP[0], CONST[10] 30: DP3 TEMP[0].x, TEMP[2].xyzz, -CONST[5].xyzz 31: ADD TEMP[2].x, -TEMP[1].zzzz, CONST[15].yyyy 32: MAX TEMP[0].x, TEMP[0].xxxx, IMM[0].xxxx 33: SEQ TEMP[3].x, CONST[0].xxxx, IMM[0].xxxx 34: IF TEMP[3].xxxx :0 35: ELSE :0 36: ENDIF 37: MOV TEMP[3], TEMP[1] 38: MUL TEMP[1].xyz, TEMP[0].xxxx, CONST[3].xyzz 39: MUL TEMP[0].x, TEMP[2].xxxx, CONST[15].wwww 40: ADD TEMP[2].xyz, CONST[2].xyzz, TEMP[1].xyzz 41: MOV TEMP[2].w, CONST[16].zzzz 42: MOV TEMP[6].xy, IN[2].xyxx 43: MOV TEMP[4].xw, TEMP[3].xxzw 44: MOV_SAT TEMP[0].x, TEMP[0].xxxx 45: MUL TEMP[5].x, TEMP[3].yyyy, CONST[1].yyyy 46: MOV TEMP[4].y, TEMP[5].xxxx 47: MAD TEMP[4].xy, CONST[1].zwww, TEMP[1].wwww, TEMP[4].xyyy 48: MAD TEMP[1].x, TEMP[3].zzzz, IMM[1].xxxx, -TEMP[1].wwww 49: MOV TEMP[4].z, TEMP[1].xxxx 50: MOV OUT[2], TEMP[0].xxxx 51: MOV OUT[0], TEMP[4] 52: MOV_SAT OUT[1], TEMP[2] 53: MOV OUT[3], TEMP[6] 54: END Vertex Program: before compilation # Radeon Compiler Program 0: ADD temp[0].xyz, input[3].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[8]; 3: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 5: MOV temp[1].x, const[11].xxxx; 6: MOV temp[1].y, const[12].xxxx; 7: MOV temp[1].z, const[13].xxxx; 8: MOV temp[2].x, const[11].yyyy; 9: MOV temp[2].y, const[12].yyyy; 10: MOV temp[2].z, const[13].yyyy; 11: MOV temp[3].x, const[11].zzzz; 12: MOV temp[3].y, const[12].zzzz; 13: MOV temp[3].z, 
const[13].zzzz; 14: ADD temp[4].xyz, input[4].xyzz, -input[1].xyzz; 15: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 16: DP3 temp[5].x, temp[4].xyzz, temp[4].xyzz; 17: ABS temp[5].x, temp[5].xxxx; 18: RSQ temp[5].x, temp[5].xxxx; 19: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 20: DP3 temp[1].x, temp[4].xyzz, temp[1].xyzz; 21: DP3 temp[5].x, temp[4].xyzz, temp[2].xyzz; 22: MOV temp[1].y, temp[5].xxxx; 23: DP3 temp[3].x, temp[4].xyzz, temp[3].xyzz; 24: MOV temp[1].z, temp[3].xxxx; 25: DP3 temp[4].x, temp[1].xyzz, temp[1].xyzz; 26: ABS temp[3].x, temp[4].xxxx; 27: RSQ temp[3].x, temp[3].xxxx; 28: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 29: ADD temp[1], temp[0], const[10]; 30: DP3 temp[0].x, temp[2].xyzz, -const[5].xyzz; 31: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 32: MAX temp[0].x, temp[0].xxxx, const[17].xxxx; 33: SEQ temp[3].x, const[0].xxxx, const[17].xxxx; 34: IF temp[3].xxxx; 35: ELSE; 36: ENDIF; 37: MOV temp[3], temp[1]; 38: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 39: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 40: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 41: MOV temp[2].w, const[16].zzzz; 42: MOV temp[6].xy, input[2].xyxx; 43: MOV temp[4].xw, temp[3].xxzw; 44: MOV_SAT temp[0].x, temp[0].xxxx; 45: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 46: MOV temp[4].y, temp[5].xxxx; 47: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 48: MAD temp[1].x, temp[3].zzzz, const[18].xxxx, -temp[1].wwww; 49: MOV temp[4].z, temp[1].xxxx; 50: MOV output[2], temp[0].xxxx; 51: MOV temp[7], temp[4]; 52: MOV_SAT output[1], temp[2]; 53: MOV output[3], temp[6]; 54: MOV output[0], temp[7]; 55: MOV output[4], temp[7]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: ADD temp[0].xyz, input[3].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[8]; 3: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[9], temp[1].zzzz, 
temp[0]; 5: MOV temp[1].x, const[11].xxxx; 6: MOV temp[1].y, const[12].xxxx; 7: MOV temp[1].z, const[13].xxxx; 8: MOV temp[2].x, const[11].yyyy; 9: MOV temp[2].y, const[12].yyyy; 10: MOV temp[2].z, const[13].yyyy; 11: MOV temp[3].x, const[11].zzzz; 12: MOV temp[3].y, const[12].zzzz; 13: MOV temp[3].z, const[13].zzzz; 14: ADD temp[4].xyz, input[4].xyzz, -input[1].xyzz; 15: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 16: DP3 temp[5].x, temp[4].xyzz, temp[4].xyzz; 17: ABS temp[5].x, temp[5].xxxx; 18: RSQ temp[5].x, temp[5].xxxx; 19: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 20: DP3 temp[1].x, temp[4].xyzz, temp[1].xyzz; 21: DP3 temp[5].x, temp[4].xyzz, temp[2].xyzz; 22: MOV temp[1].y, temp[5].xxxx; 23: DP3 temp[3].x, temp[4].xyzz, temp[3].xyzz; 24: MOV temp[1].z, temp[3].xxxx; 25: DP3 temp[4].x, temp[1].xyzz, temp[1].xyzz; 26: ABS temp[3].x, temp[4].xxxx; 27: RSQ temp[3].x, temp[3].xxxx; 28: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 29: ADD temp[1], temp[0], const[10]; 30: DP3 temp[0].x, temp[2].xyzz, -const[5].xyzz; 31: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 32: MAX temp[0].x, temp[0].xxxx, const[17].xxxx; 33: SEQ temp[3].x, const[0].xxxx, const[17].xxxx; 34: IF temp[3].xxxx; 35: ELSE; 36: ENDIF; 37: MOV temp[3], temp[1]; 38: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 39: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 40: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 41: MOV temp[2].w, const[16].zzzz; 42: MOV temp[6].xy, input[2].xyxx; 43: MOV temp[4].xw, temp[3].xxzw; 44: MOV_SAT temp[0].x, temp[0].xxxx; 45: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 46: MOV temp[4].y, temp[5].xxxx; 47: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 48: MAD temp[1].x, temp[3].zzzz, const[18].xxxx, -temp[1].wwww; 49: MOV temp[4].z, temp[1].xxxx; 50: MOV output[2], temp[0].xxxx; 51: MOV temp[7], temp[4]; 52: MOV_SAT output[1], temp[2]; 53: MOV output[3], temp[6]; 54: MOV output[0], temp[7]; 55: MOV output[4], temp[7]; Vertex Program: after 
'native rewrite' # Radeon Compiler Program 0: ADD temp[0].xyz, input[3].xyzz, -input[0].xyzz; 1: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 2: MUL temp[0], temp[1].yyyy, const[8]; 3: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 4: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 5: MOV temp[1].x, const[11].xxxx; 6: MOV temp[1].y, const[12].xxxx; 7: MOV temp[1].z, const[13].xxxx; 8: MOV temp[2].x, const[11].yyyy; 9: MOV temp[2].y, const[12].yyyy; 10: MOV temp[2].z, const[13].yyyy; 11: MOV temp[3].x, const[11].zzzz; 12: MOV temp[3].y, const[12].zzzz; 13: MOV temp[3].z, const[13].zzzz; 14: ADD temp[4].xyz, input[4].xyzz, -input[1].xyzz; 15: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 16: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 17: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 18: RSQ temp[5].x, temp[5].xxxx; 19: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 20: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 21: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 22: MOV temp[1].y, temp[5].xxxx; 23: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 24: MOV temp[1].z, temp[3].xxxx; 25: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 26: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 27: RSQ temp[3].x, temp[3].xxxx; 28: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 29: ADD temp[1], temp[0], const[10]; 30: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 31: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 32: MAX temp[0].x, temp[0].xxxx, const[17].xxxx; 33: SEQ temp[3].x, const[0].xxxx, const[17].xxxx; 34: IF temp[3].xxxx; 35: ELSE; 36: ENDIF; 37: MOV temp[3], temp[1]; 38: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 39: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 40: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 41: MOV temp[2].w, const[16].zzzz; 42: MOV temp[6].xy, input[2].xyxx; 43: MOV temp[4].xw, temp[3].xxzw; 44: MOV_SAT temp[0].x, temp[0].xxxx; 45: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 46: MOV temp[4].y, temp[5].xxxx; 47: MAD temp[4].xy, const[1].zwww, 
temp[1].wwww, temp[4].xyyy; 48: MAD temp[1].x, temp[3].zzzz, const[18].xxxx, -temp[1].wwww; 49: MOV temp[4].z, temp[1].xxxx; 50: MOV output[2], temp[0].xxxx; 51: MOV temp[7], temp[4]; 52: MOV_SAT output[1], temp[2]; 53: MOV output[3], temp[6]; 54: MOV output[0], temp[7]; 55: MOV output[4], temp[7]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV temp[8], -input[0].xyzz; 1: ADD temp[0].xyz, input[3].xyzz, temp[8]; 2: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[8]; 4: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 6: MOV temp[1].x, const[11].xxxx; 7: MOV temp[1].y, const[12].xxxx; 8: MOV temp[1].z, const[13].xxxx; 9: MOV temp[2].x, const[11].yyyy; 10: MOV temp[2].y, const[12].yyyy; 11: MOV temp[2].z, const[13].yyyy; 12: MOV temp[3].x, const[11].zzzz; 13: MOV temp[3].y, const[12].zzzz; 14: MOV temp[3].z, const[13].zzzz; 15: MOV temp[9], -input[1].xyzz; 16: ADD temp[4].xyz, input[4].xyzz, temp[9]; 17: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 18: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 19: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 20: RSQ temp[5].x, temp[5].xxxx; 21: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 22: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 23: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 24: MOV temp[1].y, temp[5].xxxx; 25: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 26: MOV temp[1].z, temp[3].xxxx; 27: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 28: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 29: RSQ temp[3].x, temp[3].xxxx; 30: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 31: ADD temp[1], temp[0], const[10]; 32: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 33: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 34: MAX temp[0].x, temp[0].xxxx, const[17].xxxx; 35: MOV temp[10], const[17].xxxx; 36: SEQ temp[3].x, const[0].xxxx, temp[10]; 37: IF temp[3].xxxx; 38: ELSE; 39: ENDIF; 40: MOV temp[3], 
temp[1]; 41: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 42: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 43: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 44: MOV temp[2].w, const[16].zzzz; 45: MOV temp[6].xy, input[2].xyxx; 46: MOV temp[4].xw, temp[3].xxzw; 47: MOV_SAT temp[0].x, temp[0].xxxx; 48: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 49: MOV temp[4].y, temp[5].xxxx; 50: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 51: MAD temp[1].x, temp[3].zzzz, const[18].xxxx, -temp[1].wwww; 52: MOV temp[4].z, temp[1].xxxx; 53: MOV output[2], temp[0].xxxx; 54: MOV temp[7], temp[4]; 55: MOV_SAT output[1], temp[2]; 56: MOV output[3], temp[6]; 57: MOV output[0], temp[7]; 58: MOV output[4], temp[7]; CONST[17] = { 0.0000 128.0000 -128.0000 1.0000 } CONST[18] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[8], -input[0].xyzz; 1: ADD temp[0].xyz, input[3].xyzz, temp[8]; 2: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[8]; 4: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 6: MOV temp[1].x, const[11].xxxx; 7: MOV temp[1].y, const[12].xxxx; 8: MOV temp[1].z, const[13].xxxx; 9: MOV temp[2].x, const[11].yyyy; 10: MOV temp[2].y, const[12].yyyy; 11: MOV temp[2].z, const[13].yyyy; 12: MOV temp[3].x, const[11].zzzz; 13: MOV temp[3].y, const[12].zzzz; 14: MOV temp[3].z, const[13].zzzz; 15: MOV temp[9], -input[1].xyzz; 16: ADD temp[4].xyz, input[4].xyzz, temp[9]; 17: MAD temp[4].xyz, temp[4].xyzz, const[14].xxxx, input[1].xyzz; 18: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 19: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 20: RSQ temp[5].x, temp[5].xxxx; 21: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 22: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 23: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 24: MOV temp[1].y, temp[5].xxxx; 25: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 26: MOV temp[1].z, temp[3].xxxx; 27: DP4 
temp[4].x, temp[1].xyz0, temp[1].xyz0; 28: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 29: RSQ temp[3].x, temp[3].xxxx; 30: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 31: ADD temp[1], temp[0], const[10]; 32: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 33: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 34: MAX temp[0].x, temp[0].xxxx, const[17].xxxx; 35: MOV temp[10], const[17].xxxx; 36: SEQ temp[3].x, const[0].xxxx, temp[10]; 37: IF temp[3].xxxx; 38: ELSE; 39: ENDIF; 40: MOV temp[3], temp[1]; 41: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 42: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 43: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 44: MOV temp[2].w, const[16].zzzz; 45: MOV temp[6].xy, input[2].xyxx; 46: MOV temp[4].xw, temp[3].xxzw; 47: MOV_SAT temp[0].x, temp[0].xxxx; 48: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 49: MOV temp[4].y, temp[5].xxxx; 50: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 51: MAD temp[1].x, temp[3].zzzz, const[18].xxxx, -temp[1].wwww; 52: MOV temp[4].z, temp[1].xxxx; 53: MOV output[2], temp[0].xxxx; 54: MOV temp[7], temp[4]; 55: MOV_SAT output[1], temp[2]; 56: MOV output[3], temp[6]; 57: MOV output[0], temp[7]; 58: MOV output[4], temp[7]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV temp[8], -input[0].xyzz; 1: ADD temp[0].xyz, input[3].xyzz, temp[8]; 2: MAD temp[1].xyz, temp[0].xyzz, const[14].xxxx, input[0].xyzz; 3: MUL temp[0], temp[1].yyyy, const[8]; 4: MAD temp[0], const[7], temp[1].xxxx, temp[0]; 5: MAD temp[0], const[9], temp[1].zzzz, temp[0]; 6: MOV temp[1].x, const[11].xxxx; 7: MOV temp[1].y, const[12].xxxx; 8: MOV temp[1].z, const[13].xxxx; 9: MOV temp[2].x, const[11].yyyy; 10: MOV temp[2].y, const[12].yyyy; 11: MOV temp[2].z, const[13].yyyy; 12: MOV temp[3].x, const[11].zzzz; 13: MOV temp[3].y, const[12].zzzz; 14: MOV temp[3].z, const[13].zzzz; 15: MOV temp[9], -input[1].xyzz; 16: ADD temp[4].xyz, input[4].xyzz, temp[9]; 17: MAD temp[4].xyz, temp[4].xyzz, 
const[14].xxxx, input[1].xyzz; 18: DP4 temp[5].x, temp[4].xyz0, temp[4].xyz0; 19: MAX temp[5].x, temp[5].xxxx, -temp[5].xxxx; 20: RSQ temp[5].x, temp[5].xxxx; 21: MUL temp[4].xyz, temp[4].xyzz, temp[5].xxxx; 22: DP4 temp[1].x, temp[4].xyz0, temp[1].xyz0; 23: DP4 temp[5].x, temp[4].xyz0, temp[2].xyz0; 24: MOV temp[1].y, temp[5].xxxx; 25: DP4 temp[3].x, temp[4].xyz0, temp[3].xyz0; 26: MOV temp[1].z, temp[3].xxxx; 27: DP4 temp[4].x, temp[1].xyz0, temp[1].xyz0; 28: MAX temp[3].x, temp[4].xxxx, -temp[4].xxxx; 29: RSQ temp[3].x, temp[3].xxxx; 30: MUL temp[2].xyz, temp[1].xyzz, temp[3].xxxx; 31: ADD temp[1], temp[0], const[10]; 32: DP4 temp[0].x, temp[2].xyz0, const[5].-x-y-z0; 33: ADD temp[2].x, -temp[1].zzzz, const[15].yyyy; 34: MAX temp[0].x, temp[0].xxxx, const[17].xxxx; 35: MOV temp[10], const[17].xxxx; 36: SEQ temp[3].x, const[0].xxxx, temp[10]; 37: ME_PRED_SNEQ temp[11].w, temp[3].xxxx; 38: ME_PRED_SET_INV temp[11].w, temp[11].___w; 39: ME_PRED_SET_POP temp[11].w, temp[11].___w; 40: MOV temp[3], temp[1]; 41: MUL temp[1].xyz, temp[0].xxxx, const[3].xyzz; 42: MUL temp[0].x, temp[2].xxxx, const[15].wwww; 43: ADD temp[2].xyz, const[2].xyzz, temp[1].xyzz; 44: MOV temp[2].w, const[16].zzzz; 45: MOV temp[6].xy, input[2].xyxx; 46: MOV temp[4].xw, temp[3].xxzw; 47: MOV_SAT temp[0].x, temp[0].xxxx; 48: MUL temp[5].x, temp[3].yyyy, const[1].yyyy; 49: MOV temp[4].y, temp[5].xxxx; 50: MAD temp[4].xy, const[1].zwww, temp[1].wwww, temp[4].xyyy; 51: MAD temp[1].x, temp[3].zzzz, const[18].xxxx, -temp[1].wwww; 52: MOV temp[4].z, temp[1].xxxx; 53: MOV output[2], temp[0].xxxx; 54: MOV temp[7], temp[4]; 55: MOV_SAT output[1], temp[2]; 56: MOV output[3], temp[6]; 57: MOV output[0], temp[7]; 58: MOV output[4], temp[7]; Final vertex program code: 0: op: 0x00f10003 dst: 8t op: VE_ADD src0: 0x1e910001 reg: 0i swiz: -X/-Y/-Z/-Z src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00700003 dst: 0t op: VE_ADD src0: 0x00910061 reg: 3i swiz: X/ Y/ Z/ Z 
src1: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 2: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00910000 reg: 0t swiz: X/ Y/ Z/ Z src1: 0x000001c2 reg: 14c swiz: X/ X/ X/ X src2: 0x00910001 reg: 0i swiz: X/ Y/ Z/ Z 3: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492020 reg: 1t swiz: Y/ Y/ Y/ Y src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00000020 reg: 1t swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00924020 reg: 1t swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 6: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x00000162 reg: 11c swiz: X/ X/ X/ X src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 7: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x00000182 reg: 12c swiz: X/ X/ X/ X src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 8: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x000001a2 reg: 13c swiz: X/ X/ X/ X src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 9: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x00492162 reg: 11c swiz: Y/ Y/ Y/ Y src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 10: op: 0x00204003 dst: 2t op: VE_ADD src0: 0x00492182 reg: 12c swiz: Y/ Y/ Y/ Y src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 11: op: 0x00404003 dst: 2t op: VE_ADD src0: 0x004921a2 reg: 13c swiz: Y/ Y/ Y/ Y src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 12: op: 0x00106003 dst: 3t op: VE_ADD src0: 0x00924162 reg: 11c swiz: Z/ Z/ Z/ Z src1: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 13: op: 0x00206003 dst: 3t op: VE_ADD src0: 0x00924182 reg: 12c 
swiz: Z/ Z/ Z/ Z src1: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 14: op: 0x00406003 dst: 3t op: VE_ADD src0: 0x009241a2 reg: 13c swiz: Z/ Z/ Z/ Z src1: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 src2: 0x012481a2 reg: 13c swiz: 0/ 0/ 0/ 0 15: op: 0x00f12003 dst: 9t op: VE_ADD src0: 0x1e910021 reg: 1i swiz: -X/-Y/-Z/-Z src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00708003 dst: 4t op: VE_ADD src0: 0x00910081 reg: 4i swiz: X/ Y/ Z/ Z src1: 0x00d10120 reg: 9t swiz: X/ Y/ Z/ W src2: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 17: op: 0x00708004 dst: 4t op: VE_MULTIPLY_ADD src0: 0x00910080 reg: 4t swiz: X/ Y/ Z/ Z src1: 0x000001c2 reg: 14c swiz: X/ X/ X/ X src2: 0x00910021 reg: 1i swiz: X/ Y/ Z/ Z 18: op: 0x0010a001 dst: 5t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 19: op: 0x0010a007 dst: 5t op: VE_MAXIMUM src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x1e0000a0 reg: 5t swiz: -X/-X/-X/-X src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 20: op: 0x0010a048 dst: 5t op: ME_RECIP_SQRT_DX src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 21: op: 0x00708002 dst: 4t op: VE_MULTIPLY src0: 0x00910080 reg: 4t swiz: X/ Y/ Z/ Z src1: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 22: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 23: op: 0x0010a001 dst: 5t op: VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 24: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 25: op: 0x00106001 dst: 3t op: 
VE_DOT_PRODUCT src0: 0x01110080 reg: 4t swiz: X/ Y/ Z/ 0 src1: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 26: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 27: op: 0x00108001 dst: 4t op: VE_DOT_PRODUCT src0: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 28: op: 0x00106007 dst: 3t op: VE_MAXIMUM src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x1e000080 reg: 4t swiz: -X/-X/-X/-X src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 29: op: 0x00106048 dst: 3t op: ME_RECIP_SQRT_DX src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 30: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z src1: 0x00000060 reg: 3t swiz: X/ X/ X/ X src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 31: op: 0x00f02003 dst: 1t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src2: 0x01248142 reg: 10c swiz: 0/ 0/ 0/ 0 32: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src1: 0x0f1100a2 reg: 5c swiz: -X/-Y/-Z/ 0 src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 33: op: 0x00104003 dst: 2t op: VE_ADD src0: 0x1e924020 reg: 1t swiz: -Z/-Z/-Z/-Z src1: 0x004921e2 reg: 15c swiz: Y/ Y/ Y/ Y src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 34: op: 0x00100007 dst: 0t op: VE_MAXIMUM src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x00000222 reg: 17c swiz: X/ X/ X/ X src2: 0x01248222 reg: 17c swiz: 0/ 0/ 0/ 0 35: op: 0x00f14003 dst: 10t op: VE_ADD src0: 0x00000222 reg: 17c swiz: X/ X/ X/ X src1: 0x01248222 reg: 17c swiz: 0/ 0/ 0/ 0 src2: 0x01248222 reg: 17c swiz: 0/ 0/ 0/ 0 36: op: 0x0010601b dst: 3t op: VE_SET_EQUAL src0: 0x00000002 reg: 0c swiz: X/ X/ X/ X src1: 0x00d10140 reg: 10t swiz: X/ Y/ Z/ W src2: 0x01248140 reg: 10t swiz: 0/ 0/ 
0/ 0 37: op: 0x00816058 dst: 11t op: ME_PRED_SET_NEQ src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 38: op: 0x0081605a dst: 11t op: ME_PRED_SET_INV src0: 0x00db6160 reg: 11t swiz: W/ W/ W/ W src1: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 src2: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 39: op: 0x0081605b dst: 11t op: ME_PRED_SET_POP src0: 0x00db6160 reg: 11t swiz: W/ W/ W/ W src1: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 src2: 0x01248160 reg: 11t swiz: 0/ 0/ 0/ 0 40: op: 0x00f06003 dst: 3t op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 41: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x00910062 reg: 3c swiz: X/ Y/ Z/ Z src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 42: op: 0x00100002 dst: 0t op: VE_MULTIPLY src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x00db61e2 reg: 15c swiz: W/ W/ W/ W src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 43: op: 0x00704003 dst: 2t op: VE_ADD src0: 0x00910042 reg: 2c swiz: X/ Y/ Z/ Z src1: 0x00910020 reg: 1t swiz: X/ Y/ Z/ Z src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 44: op: 0x00804003 dst: 2t op: VE_ADD src0: 0x00924202 reg: 16c swiz: Z/ Z/ Z/ Z src1: 0x01248202 reg: 16c swiz: 0/ 0/ 0/ 0 src2: 0x01248202 reg: 16c swiz: 0/ 0/ 0/ 0 45: op: 0x0030c003 dst: 6t op: VE_ADD src0: 0x00010041 reg: 2i swiz: X/ Y/ X/ X src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 46: op: 0x00908003 dst: 4t op: VE_ADD src0: 0x00d00060 reg: 3t swiz: X/ X/ Z/ W src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 47: op: 0x01100003 dst: 0t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 48: op: 0x0010a002 dst: 5t op: VE_MULTIPLY src0: 0x00492060 reg: 3t swiz: Y/ Y/ Y/ Y src1: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y src2: 
0x01248022 reg: 1c swiz: 0/ 0/ 0/ 0 49: op: 0x00208003 dst: 4t op: VE_ADD src0: 0x000000a0 reg: 5t swiz: X/ X/ X/ X src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 50: op: 0x00308004 dst: 4t op: VE_MULTIPLY_ADD src0: 0x00db4022 reg: 1c swiz: Z/ W/ W/ W src1: 0x00db6020 reg: 1t swiz: W/ W/ W/ W src2: 0x00490080 reg: 4t swiz: X/ Y/ Y/ Y 51: op: 0x00102004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00924060 reg: 3t swiz: Z/ Z/ Z/ Z src1: 0x00000242 reg: 18c swiz: X/ X/ X/ X src2: 0x1edb6020 reg: 1t swiz: -W/-W/-W/-W 52: op: 0x00408003 dst: 4t op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 53: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 54: op: 0x00f0e003 dst: 7t op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 55: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 56: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d100c0 reg: 6t swiz: X/ Y/ Z/ W src1: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 src2: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 57: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 58: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], COLOR, COLOR DCL IN[1], FOG, PERSPECTIVE DCL IN[2], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[1..3] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: TEX 
TEMP[0], IN[2].xyyy, SAMP[0], 2D 1: MUL TEMP[1].xyz, TEMP[0].xyzz, IN[0].xyzz 2: MUL TEMP[0].x, TEMP[0].wwww, IN[0].wwww 3: MOV TEMP[1].w, TEMP[0].xxxx 4: MUL TEMP[0].xyz, TEMP[1].xyzz, CONST[1].xyzz 5: MUL TEMP[1].xyz, IMM[0].xxxx, TEMP[0].xyzz 6: MOV TEMP[2], TEMP[1] 7: ADD TEMP[0].x, CONST[3].zzzz, -IN[1].xxxx 8: ADD TEMP[1].x, CONST[3].zzzz, -CONST[3].yyyy 9: RCP TEMP[1].x, TEMP[1].xxxx 10: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx 11: LRP TEMP[2].xyz, TEMP[0].xxxx, TEMP[2].xyzz, CONST[2].xyzz 12: MOV_SAT OUT[0], TEMP[2] 13: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 12: MOV_SAT output[0], temp[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 12: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, 
input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 12: MOV_SAT output[0], temp[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 12: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 12: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL 
temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 12: MOV_SAT output[0], temp[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 12: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 13: MOV_SAT output[0], temp[2]; CONST[4] = { 2.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 12: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 13: MOV_SAT output[0], temp[2]; CONST[4] = { 2.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: MUL temp[1].xyz, temp[0].xyzz, input[0].xyzz; 2: MUL temp[0].x, temp[0].wwww, input[0].wwww; 3: MOV 
temp[1].w, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, const[1].xyzz; 5: MUL temp[1].xyz, const[4].xxxx, temp[0].xyzz; 6: MOV temp[2], temp[1]; 7: ADD temp[0].x, const[3].zzzz, -input[1].xxxx; 8: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 9: RCP temp[1].x, temp[1].xxxx; 10: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 11: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 12: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 13: MOV_SAT output[0], temp[2]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[2].xyyy, 2D[0]; 1: src0.xyz = temp[0], src1.xyz = input[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 2: src0.w = temp[0], src1.w = input[0] MAD temp[0].x, src0.www, src1.www, src0.000 3: src0.xyz = temp[0] MAD temp[1].w, src0.x, src0.1, src0.0 4: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 5: src0.xyz = const[4], src1.xyz = temp[0] MAD temp[1].xyz, src0.xxx, src1.xyz, src0.000 6: src0.xyz = temp[1], src0.w = temp[1] MAD temp[2].xyz, src0.xyz, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 7: src0.xyz = const[3], src1.xyz = input[1] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 8: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 9: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 10: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 11: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 12: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 13: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src1.xyz = input[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 3: src0.w = 
temp[0], src1.w = input[0] MAD temp[0].x, src0.www, src1.www, src0.000 4: src0.xyz = temp[0] MAD temp[1].w, src0.x, src0.1, src0.0 5: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 6: src0.xyz = const[4], src1.xyz = temp[0] MAD temp[1].xyz, src0.xxx, src1.xyz, src0.000 7: src0.xyz = temp[1], src0.w = temp[1] MAD temp[2].xyz, src0.xyz, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 8: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 9: src0.xyz = const[3], src1.xyz = input[1] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 10: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 11: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 12: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 13: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 14: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src1.xyz = input[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 3: src0.w = temp[0], src1.w = input[0] MAD temp[0].x, src0.www, src1.www, src0.000 4: src0.xyz = temp[0] MAD temp[1].w, src0.x, src0.1, src0.0 5: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 6: src0.xyz = const[4], src1.xyz = temp[0] MAD temp[1].xyz, src0.xxx, src1.xyz, src0.000 7: src0.xyz = temp[1], src0.w = temp[1] MAD temp[2].xyz, src0.xyz, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 8: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 9: src0.xyz = const[3], src1.xyz = input[1] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 10: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, 
-src0.yyy 11: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 12: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 13: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 14: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[3], input[1].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[3], src1.xyz = input[0] SEM_WAIT MAD temp[4].xyz, src0.xyz, src1.xyz, src0.000 3: src0.w = temp[3], src1.w = input[0] MAD temp[3].x, src0.www, src1.www, src0.000 4: src0.xyz = temp[3] MAD temp[4].w, src0.x, src0.1, src0.0 5: src0.xyz = temp[4], src1.xyz = const[1] MAD temp[3].xyz, src0.xyz, src1.xyz, src0.000 6: src0.xyz = const[4], src1.xyz = temp[3] MAD temp[4].xyz, src0.xxx, src1.xyz, src0.000 7: src0.xyz = temp[4], src0.w = temp[4] MAD temp[5].xyz, src0.xyz, src0.111, src0.000 MAD temp[5].w, src0.w, src0.1, src0.0 8: src0.xyz = temp[5], src1.xyz = const[2] MAD temp[6].xyz, src0.xyz, src0.111, -src1.xyz 9: src0.xyz = const[3], src1.xyz = input[2] MAD temp[3].x, src0.zzz, src0.111, -src1.xxx 10: src0.xyz = const[3] MAD temp[4].x, src0.zzz, src0.111, -src0.yyy 11: src0.xyz = temp[4] REPL_ALPHA temp[4].x RCP, src0.x 12: src0.xyz = temp[3], src1.xyz = temp[4] MAD_SAT temp[3].x, src0.xxx, src1.xxx, src0.000 13: src0.xyz = temp[3], src1.xyz = temp[6], src2.xyz = const[2] MAD temp[5].xyz, src0.xxx, src1.xyz, src2.xyz 14: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4035401: src: 1 R/G/G/G dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB 
omask: NONE 1:RGB_ADDR 0x08000003:Addr0: 3t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000003:Addr0: 3t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006da36c:rgb_A_src:0 A/A/A 0 rgb_B_src:1 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c00040:MAD dest:4 alp_A_src:0 R 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040404:Addr0: 4t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000d04:Addr0: 4c, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 6 0:CMN_INST 
0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c050:MAD dest:5 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040805:Addr0: 5t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21060:MAD dest:6 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 8 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000903:Addr0: 3c, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801030:MAD dest:3 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 9 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924040:MAD dest:4 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 10 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000004a:SOP dest:4 rgb_C_src:0 R/R/R 0 
alp_C_src:0 R 0 11 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10201803:Addr0: 3t, Addr1: 6t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222050:MAD dest:5 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 13 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL OUT[5], GENERIC[3] DCL CONST[0..8] DCL TEMP[0..4], LOCAL DCL TEMP[5..8], ARRAY(1), LOCAL IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[0].yyyy, CONST[3] 1: MAD TEMP[1].xy, IN[1].xyyy, CONST[0].xyyy, CONST[8].zzzz 2: MAD TEMP[0], CONST[2], IN[0].xxxx, TEMP[0] 3: MAD TEMP[2].xy, IN[1].xyyy, CONST[0].yzzz, CONST[8].zzzz 4: MAD TEMP[0], CONST[4], IN[0].zzzz, TEMP[0] 5: MAD TEMP[3].xy, CONST[8].zzzz, CONST[0].xwww, IN[1].xyyy 6: ADD TEMP[0], TEMP[0], CONST[5] 7: ADD TEMP[4].x, -TEMP[0].zzzz, CONST[7].yyyy 8: MUL TEMP[4].x, TEMP[4].xxxx, CONST[7].wwww 9: MOV TEMP[5].xy, 
IN[1].xyxx 10: MOV TEMP[6].xy, TEMP[1].xyxx 11: MOV TEMP[7].xy, TEMP[2].xyxx 12: MOV TEMP[8].xy, TEMP[3].xyxx 13: MOV TEMP[1].xw, TEMP[0].xxzw 14: MOV_SAT TEMP[2].x, TEMP[4].xxxx 15: MUL TEMP[3].x, TEMP[0].yyyy, CONST[1].yyyy 16: MOV TEMP[1].y, TEMP[3].xxxx 17: MAD TEMP[1].xy, CONST[1].zwww, TEMP[0].wwww, TEMP[1].xyyy 18: MAD TEMP[0].x, TEMP[0].zzzz, IMM[0].xxxx, -TEMP[0].wwww 19: MOV TEMP[1].z, TEMP[0].xxxx 20: MOV OUT[1], TEMP[2].xxxx 21: MOV OUT[0], TEMP[1] 22: MOV OUT[2], TEMP[5] 23: MOV OUT[3], TEMP[6] 24: MOV OUT[4], TEMP[7] 25: MOV OUT[5], TEMP[8] 26: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[3]; 1: MAD temp[1].xy, input[1].xyyy, const[0].xyyy, const[8].zzzz; 2: MAD temp[0], const[2], input[0].xxxx, temp[0]; 3: MAD temp[2].xy, input[1].xyyy, const[0].yzzz, const[8].zzzz; 4: MAD temp[0], const[4], input[0].zzzz, temp[0]; 5: MAD temp[3].xy, const[8].zzzz, const[0].xwww, input[1].xyyy; 6: ADD temp[0], temp[0], const[5]; 7: ADD temp[4].x, -temp[0].zzzz, const[7].yyyy; 8: MUL temp[4].x, temp[4].xxxx, const[7].wwww; 9: MOV temp[5].xy, input[1].xyxx; 10: MOV temp[6].xy, temp[1].xyxx; 11: MOV temp[7].xy, temp[2].xyxx; 12: MOV temp[8].xy, temp[3].xyxx; 13: MOV temp[1].xw, temp[0].xxzw; 14: MOV_SAT temp[2].x, temp[4].xxxx; 15: MUL temp[3].x, temp[0].yyyy, const[1].yyyy; 16: MOV temp[1].y, temp[3].xxxx; 17: MAD temp[1].xy, const[1].zwww, temp[0].wwww, temp[1].xyyy; 18: MAD temp[0].x, temp[0].zzzz, const[9].xxxx, -temp[0].wwww; 19: MOV temp[1].z, temp[0].xxxx; 20: MOV output[1], temp[2].xxxx; 21: MOV temp[9], temp[1]; 22: MOV output[2], temp[5]; 23: MOV output[3], temp[6]; 24: MOV output[4], temp[7]; 25: MOV output[5], temp[8]; 26: MOV output[0], temp[9]; 27: MOV output[6], temp[9]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[3]; 1: MAD temp[1].xy, input[1].xyyy, const[0].xyyy, const[8].zzzz; 2: MAD temp[0], const[2], input[0].xxxx, temp[0]; 
3: MAD temp[2].xy, input[1].xyyy, const[0].yzzz, const[8].zzzz; 4: MAD temp[0], const[4], input[0].zzzz, temp[0]; 5: MAD temp[3].xy, const[8].zzzz, const[0].xwww, input[1].xyyy; 6: ADD temp[0], temp[0], const[5]; 7: ADD temp[4].x, -temp[0].zzzz, const[7].yyyy; 8: MUL temp[4].x, temp[4].xxxx, const[7].wwww; 9: MOV temp[5].xy, input[1].xyxx; 10: MOV temp[6].xy, temp[1].xyxx; 11: MOV temp[7].xy, temp[2].xyxx; 12: MOV temp[8].xy, temp[3].xyxx; 13: MOV temp[1].xw, temp[0].xxzw; 14: MOV_SAT temp[2].x, temp[4].xxxx; 15: MUL temp[3].x, temp[0].yyyy, const[1].yyyy; 16: MOV temp[1].y, temp[3].xxxx; 17: MAD temp[1].xy, const[1].zwww, temp[0].wwww, temp[1].xyyy; 18: MAD temp[0].x, temp[0].zzzz, const[9].xxxx, -temp[0].wwww; 19: MOV temp[1].z, temp[0].xxxx; 20: MOV output[1], temp[2].xxxx; 21: MOV temp[9], temp[1]; 22: MOV output[2], temp[5]; 23: MOV output[3], temp[6]; 24: MOV output[4], temp[7]; 25: MOV output[5], temp[8]; 26: MOV output[0], temp[9]; 27: MOV output[6], temp[9]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[3]; 1: MAD temp[1].xy, input[1].xyyy, const[0].xyyy, const[8].zzzz; 2: MAD temp[0], const[2], input[0].xxxx, temp[0]; 3: MAD temp[2].xy, input[1].xyyy, const[0].yzzz, const[8].zzzz; 4: MAD temp[0], const[4], input[0].zzzz, temp[0]; 5: MAD temp[3].xy, const[8].zzzz, const[0].xwww, input[1].xyyy; 6: ADD temp[0], temp[0], const[5]; 7: ADD temp[4].x, -temp[0].zzzz, const[7].yyyy; 8: MUL temp[4].x, temp[4].xxxx, const[7].wwww; 9: MOV temp[5].xy, input[1].xyxx; 10: MOV temp[6].xy, temp[1].xyxx; 11: MOV temp[7].xy, temp[2].xyxx; 12: MOV temp[8].xy, temp[3].xyxx; 13: MOV temp[1].xw, temp[0].xxzw; 14: MOV_SAT temp[2].x, temp[4].xxxx; 15: MUL temp[3].x, temp[0].yyyy, const[1].yyyy; 16: MOV temp[1].y, temp[3].xxxx; 17: MAD temp[1].xy, const[1].zwww, temp[0].wwww, temp[1].xyyy; 18: MAD temp[0].x, temp[0].zzzz, const[9].xxxx, -temp[0].wwww; 19: MOV temp[1].z, temp[0].xxxx; 20: MOV output[1], temp[2].xxxx; 21: MOV 
temp[9], temp[1]; 22: MOV output[2], temp[5]; 23: MOV output[3], temp[6]; 24: MOV output[4], temp[7]; 25: MOV output[5], temp[8]; 26: MOV output[0], temp[9]; 27: MOV output[6], temp[9]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[3]; 1: MOV temp[10], const[8].zzzz; 2: MAD temp[1].xy, input[1].xyyy, const[0].xyyy, temp[10]; 3: MAD temp[0], const[2], input[0].xxxx, temp[0]; 4: MOV temp[11], const[8].zzzz; 5: MAD temp[2].xy, input[1].xyyy, const[0].yzzz, temp[11]; 6: MAD temp[0], const[4], input[0].zzzz, temp[0]; 7: MOV temp[12], const[0].xwww; 8: MAD temp[3].xy, const[8].zzzz, temp[12], input[1].xyyy; 9: ADD temp[0], temp[0], const[5]; 10: ADD temp[4].x, -temp[0].zzzz, const[7].yyyy; 11: MUL temp[4].x, temp[4].xxxx, const[7].wwww; 12: MOV temp[5].xy, input[1].xyxx; 13: MOV temp[6].xy, temp[1].xyxx; 14: MOV temp[7].xy, temp[2].xyxx; 15: MOV temp[8].xy, temp[3].xyxx; 16: MOV temp[1].xw, temp[0].xxzw; 17: MOV_SAT temp[2].x, temp[4].xxxx; 18: MUL temp[3].x, temp[0].yyyy, const[1].yyyy; 19: MOV temp[1].y, temp[3].xxxx; 20: MAD temp[1].xy, const[1].zwww, temp[0].wwww, temp[1].xyyy; 21: MAD temp[0].x, temp[0].zzzz, const[9].xxxx, -temp[0].wwww; 22: MOV temp[1].z, temp[0].xxxx; 23: MOV output[1], temp[2].xxxx; 24: MOV temp[9], temp[1]; 25: MOV output[2], temp[5]; 26: MOV output[3], temp[6]; 27: MOV output[4], temp[7]; 28: MOV output[5], temp[8]; 29: MOV output[0], temp[9]; 30: MOV output[6], temp[9]; CONST[9] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[3]; 1: MOV temp[10], const[8].zzzz; 2: MAD temp[1].xy, input[1].xyyy, const[0].xyyy, temp[10]; 3: MAD temp[0], const[2], input[0].xxxx, temp[0]; 4: MOV temp[11], const[8].zzzz; 5: MAD temp[2].xy, input[1].xyyy, const[0].yzzz, temp[11]; 6: MAD temp[0], const[4], input[0].zzzz, temp[0]; 7: MOV temp[12], const[0].xwww; 8: MAD temp[3].xy, const[8].zzzz, temp[12], 
input[1].xyyy; 9: ADD temp[0], temp[0], const[5]; 10: ADD temp[4].x, -temp[0].zzzz, const[7].yyyy; 11: MUL temp[4].x, temp[4].xxxx, const[7].wwww; 12: MOV temp[5].xy, input[1].xyxx; 13: MOV temp[6].xy, temp[1].xyxx; 14: MOV temp[7].xy, temp[2].xyxx; 15: MOV temp[8].xy, temp[3].xyxx; 16: MOV temp[1].xw, temp[0].xxzw; 17: MOV_SAT temp[2].x, temp[4].xxxx; 18: MUL temp[3].x, temp[0].yyyy, const[1].yyyy; 19: MOV temp[1].y, temp[3].xxxx; 20: MAD temp[1].xy, const[1].zwww, temp[0].wwww, temp[1].xyyy; 21: MAD temp[0].x, temp[0].zzzz, const[9].xxxx, -temp[0].wwww; 22: MOV temp[1].z, temp[0].xxxx; 23: MOV output[1], temp[2].xxxx; 24: MOV temp[9], temp[1]; 25: MOV output[2], temp[5]; 26: MOV output[3], temp[6]; 27: MOV output[4], temp[7]; 28: MOV output[5], temp[8]; 29: MOV output[0], temp[9]; 30: MOV output[6], temp[9]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].yyyy, const[3]; 1: MOV temp[10], const[8].zzzz; 2: MAD temp[1].xy, input[1].xyyy, const[0].xyyy, temp[10]; 3: MAD temp[0], const[2], input[0].xxxx, temp[0]; 4: MOV temp[11], const[8].zzzz; 5: MAD temp[2].xy, input[1].xyyy, const[0].yzzz, temp[11]; 6: MAD temp[0], const[4], input[0].zzzz, temp[0]; 7: MOV temp[12], const[0].xwww; 8: MAD temp[3].xy, const[8].zzzz, temp[12], input[1].xyyy; 9: ADD temp[0], temp[0], const[5]; 10: ADD temp[4].x, -temp[0].zzzz, const[7].yyyy; 11: MUL temp[4].x, temp[4].xxxx, const[7].wwww; 12: MOV temp[5].xy, input[1].xyxx; 13: MOV temp[6].xy, temp[1].xyxx; 14: MOV temp[7].xy, temp[2].xyxx; 15: MOV temp[8].xy, temp[3].xyxx; 16: MOV temp[1].xw, temp[0].xxzw; 17: MOV_SAT temp[2].x, temp[4].xxxx; 18: MUL temp[3].x, temp[0].yyyy, const[1].yyyy; 19: MOV temp[1].y, temp[3].xxxx; 20: MAD temp[1].xy, const[1].zwww, temp[0].wwww, temp[1].xyyy; 21: MAD temp[0].x, temp[0].zzzz, const[9].xxxx, -temp[0].wwww; 22: MOV temp[1].z, temp[0].xxxx; 23: MOV output[1], temp[2].xxxx; 24: MOV temp[9], temp[1]; 25: MOV output[2], temp[5]; 26: MOV output[3], 
temp[6]; 27: MOV output[4], temp[7]; 28: MOV output[5], temp[8]; 29: MOV output[0], temp[9]; 30: MOV output[6], temp[9]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 1: op: 0x00f14003 dst: 10t op: VE_ADD src0: 0x00924102 reg: 8c swiz: Z/ Z/ Z/ Z src1: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 2: op: 0x00302004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00490021 reg: 1i swiz: X/ Y/ Y/ Y src1: 0x00490002 reg: 0c swiz: X/ Y/ Y/ Y src2: 0x00d10140 reg: 10t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f16003 dst: 11t op: VE_ADD src0: 0x00924102 reg: 8c swiz: Z/ Z/ Z/ Z src1: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 5: op: 0x00304004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x00490021 reg: 1i swiz: X/ Y/ Y/ Y src1: 0x00922002 reg: 0c swiz: Y/ Z/ Z/ Z src2: 0x00d10160 reg: 11t swiz: X/ Y/ Z/ W 6: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 7: op: 0x00f18003 dst: 12t op: VE_ADD src0: 0x00db0002 reg: 0c swiz: X/ W/ W/ W src1: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 8: op: 0x00306004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00924102 reg: 8c swiz: Z/ Z/ Z/ Z src1: 0x00d10180 reg: 12t swiz: X/ Y/ Z/ W src2: 0x00490021 reg: 1i swiz: X/ Y/ Y/ Y 9: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 10: op: 0x00108003 dst: 4t op: VE_ADD src0: 0x1e924000 reg: 0t swiz: -Z/-Z/-Z/-Z src1: 0x004920e2 reg: 7c swiz: Y/ Y/ Y/ Y src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 
11: op: 0x00108002 dst: 4t op: VE_MULTIPLY src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x00db60e2 reg: 7c swiz: W/ W/ W/ W src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 12: op: 0x0030a003 dst: 5t op: VE_ADD src0: 0x00010021 reg: 1i swiz: X/ Y/ X/ X src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 13: op: 0x0030c003 dst: 6t op: VE_ADD src0: 0x00010020 reg: 1t swiz: X/ Y/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x0030e003 dst: 7t op: VE_ADD src0: 0x00010040 reg: 2t swiz: X/ Y/ X/ X src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00310003 dst: 8t op: VE_ADD src0: 0x00010060 reg: 3t swiz: X/ Y/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 16: op: 0x00902003 dst: 1t op: VE_ADD src0: 0x00d00000 reg: 0t swiz: X/ X/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x01104003 dst: 2t op: VE_ADD src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 18: op: 0x00106002 dst: 3t op: VE_MULTIPLY src0: 0x00492000 reg: 0t swiz: Y/ Y/ Y/ Y src1: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y src2: 0x01248022 reg: 1c swiz: 0/ 0/ 0/ 0 19: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x00000060 reg: 3t swiz: X/ X/ X/ X src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 20: op: 0x00302004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00db4022 reg: 1c swiz: Z/ W/ W/ W src1: 0x00db6000 reg: 0t swiz: W/ W/ W/ W src2: 0x00490020 reg: 1t swiz: X/ Y/ Y/ Y 21: op: 0x00100004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924000 reg: 0t swiz: Z/ Z/ Z/ Z src1: 0x00000122 reg: 9c swiz: X/ X/ X/ X src2: 0x1edb6000 reg: 0t swiz: -W/-W/-W/-W 22: op: 0x00402003 dst: 1t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 
23: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00000040 reg: 2t swiz: X/ X/ X/ X src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 24: op: 0x00f12003 dst: 9t op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 25: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 26: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d100c0 reg: 6t swiz: X/ Y/ Z/ W src1: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 src2: 0x012480c0 reg: 6t swiz: 0/ 0/ 0/ 0 27: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d100e0 reg: 7t swiz: X/ Y/ Z/ W src1: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 src2: 0x012480e0 reg: 7t swiz: 0/ 0/ 0/ 0 28: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10100 reg: 8t swiz: X/ Y/ Z/ W src1: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 src2: 0x01248100 reg: 8t swiz: 0/ 0/ 0/ 0 29: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10120 reg: 9t swiz: X/ Y/ Z/ W src1: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 src2: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 30: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10120 reg: 9t swiz: X/ Y/ Z/ W src1: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 src2: 0x01248120 reg: 9t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL IN[4], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL CONST[0..1] DCL CONST[6..7] DCL TEMP[0..4], LOCAL DCL TEMP[5], ARRAY(1), LOCAL IMM[0] FLT32 { 4.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1].xyyy, SAMP[3], 2D 1: TEX TEMP[1].w, IN[2].xyyy, SAMP[2], 2D 2: TEX TEMP[2].w, IN[3].xyyy, SAMP[1], 2D 3: TEX TEMP[3].w, IN[4].xyyy, SAMP[0], 2D 4: DP3 TEMP[4].x, CONST[0].xyzz, TEMP[0].xyzz 5: MUL TEMP[4].x, TEMP[1].wwww, TEMP[4].xxxx 
6: MUL TEMP[2].x, TEMP[2].wwww, TEMP[4].xxxx 7: MUL TEMP[2].x, IMM[0].xxxx, TEMP[2].xxxx 8: ADD TEMP[1].xyz, TEMP[2].xxxx, CONST[1].xyzz 9: MAD TEMP[0].x, TEMP[0].wwww, TEMP[3].wwww, TEMP[2].xxxx 10: MOV TEMP[1].w, TEMP[0].xxxx 11: MOV TEMP[5], TEMP[1] 12: ADD TEMP[0].x, CONST[7].zzzz, -IN[0].xxxx 13: ADD TEMP[1].x, CONST[7].zzzz, -CONST[7].yyyy 14: RCP TEMP[1].x, TEMP[1].xxxx 15: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx 16: LRP TEMP[5].xyz, TEMP[0].xxxx, TEMP[5].xyzz, CONST[6].xyzz 17: MOV_SAT OUT[0], TEMP[5] 18: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 17: MOV_SAT output[0], temp[5]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, 
-input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 17: MOV_SAT output[0], temp[5]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 17: MOV_SAT output[0], temp[5]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 17: MOV_SAT output[0], temp[5]; Fragment Program: after 'transform 
TEX' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 17: MOV_SAT output[0], temp[5]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: LRP temp[5].xyz, temp[0].xxxx, temp[5].xyzz, const[6].xyzz; 17: MOV_SAT output[0], temp[5]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 
6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: ADD temp[6].xyz, temp[5].xyzz, -const[6].xyzz; 17: MAD temp[5].xyz, temp[0].xxxx, temp[6], const[6].xyzz; 18: MOV_SAT output[0], temp[5]; CONST[8] = { 4.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: ADD temp[6].xyz, temp[5].xyzz, -const[6].xyzz; 17: MAD temp[5].xyz, temp[0].xxxx, temp[6], const[6].xyzz; 18: MOV_SAT output[0], temp[5]; CONST[8] = { 4.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: DP3 temp[4].x, const[0].xyzz, temp[0].xyzz; 5: MUL temp[4].x, temp[1].wwww, temp[4].xxxx; 6: MUL temp[2].x, temp[2].wwww, temp[4].xxxx; 7: MUL temp[2].x, const[8].xxxx, temp[2].xxxx; 8: ADD 
temp[1].xyz, temp[2].xxxx, const[1].xyzz; 9: MAD temp[0].x, temp[0].wwww, temp[3].wwww, temp[2].xxxx; 10: MOV temp[1].w, temp[0].xxxx; 11: MOV temp[5], temp[1]; 12: ADD temp[0].x, const[7].zzzz, -input[0].xxxx; 13: ADD temp[1].x, const[7].zzzz, -const[7].yyyy; 14: RCP temp[1].x, temp[1].xxxx; 15: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 16: ADD temp[6].xyz, temp[5].xyzz, -const[6].xyzz; 17: MAD temp[5].xyz, temp[0].xxxx, temp[6], const[6].xyzz; 18: MOV_SAT output[0], temp[5]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[1].xyyy, 2D[3]; 1: TEX temp[1].w, input[2].xyyy, 2D[2]; 2: TEX temp[2].w, input[3].xyyy, 2D[1]; 3: TEX temp[3].w, input[4].xyyy, 2D[0]; 4: src0.xyz = const[0], src1.xyz = temp[0] DP3 temp[4].x, src0.xyz, src1.xyz 5: src0.xyz = temp[4], src0.w = temp[1] MAD temp[4].x, src0.www, src0.xxx, src0.000 6: src0.xyz = temp[4], src0.w = temp[2] MAD temp[2].x, src0.www, src0.xxx, src0.000 7: src0.xyz = const[8], src1.xyz = temp[2] MAD temp[2].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[2], src1.xyz = const[1] MAD temp[1].xyz, src0.xxx, src0.111, src1.xyz 9: src0.xyz = temp[2], src0.w = temp[0], src1.w = temp[3] MAD temp[0].x, src0.www, src1.www, src0.xxx 10: src0.xyz = temp[0] MAD temp[1].w, src0.x, src0.1, src0.0 11: src0.xyz = temp[1], src0.w = temp[1] MAD temp[5].xyz, src0.xyz, src0.111, src0.000 MAD temp[5].w, src0.w, src0.1, src0.0 12: src0.xyz = const[7], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 13: src0.xyz = const[7] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 14: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 15: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 16: src0.xyz = temp[5], src1.xyz = const[6] MAD temp[6].xyz, src0.xyz, src0.111, -src1.xyz 17: src0.xyz = temp[0], src1.xyz = temp[6], src2.xyz = const[6] MAD temp[5].xyz, src0.xxx, src1.xyz, src2.xyz 18: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, 
src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xyyy, 2D[3]; 2: TEX temp[1].w, input[2].xyyy, 2D[2]; 3: TEX temp[2].w, input[3].xyyy, 2D[1]; 4: TEX temp[3].w, input[4].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = const[0], src1.xyz = temp[0] SEM_WAIT DP3 temp[4].x, src0.xyz, src1.xyz 6: src0.xyz = temp[4], src0.w = temp[1] MAD temp[4].x, src0.www, src0.xxx, src0.000 7: src0.xyz = temp[4], src0.w = temp[2] MAD temp[2].x, src0.www, src0.xxx, src0.000 8: src0.xyz = const[8], src1.xyz = temp[2] MAD temp[2].x, src0.xxx, src1.xxx, src0.000 9: src0.xyz = temp[2], src0.w = temp[0], src1.w = temp[3] MAD temp[0].x, src0.www, src1.www, src0.xxx 10: src0.xyz = temp[2], src1.xyz = const[1], src2.xyz = temp[0] MAD temp[1].xyz, src0.xxx, src0.111, src1.xyz MAD temp[1].w, src2.x, src0.1, src0.0 11: src0.xyz = temp[1], src0.w = temp[1] MAD temp[5].xyz, src0.xyz, src0.111, src0.000 MAD temp[5].w, src0.w, src0.1, src0.0 12: src0.xyz = temp[5], src1.xyz = const[6] MAD temp[6].xyz, src0.xyz, src0.111, -src1.xyz 13: src0.xyz = const[7], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 14: src0.xyz = const[7] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 15: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 16: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 17: src0.xyz = temp[0], src1.xyz = temp[6], src2.xyz = const[6] MAD temp[5].xyz, src0.xxx, src1.xyz, src2.xyz 18: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xyyy, 2D[3]; 2: TEX temp[1].w, input[2].xyyy, 2D[2]; 3: TEX temp[2].w, input[3].xyyy, 2D[1]; 4: TEX temp[3].w, input[4].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = const[0], src1.xyz = temp[0] 
SEM_WAIT DP3 temp[4].x, src0.xyz, src1.xyz 6: src0.xyz = temp[4], src0.w = temp[1] MAD temp[4].x, src0.www, src0.xxx, src0.000 7: src0.xyz = temp[4], src0.w = temp[2] MAD temp[2].x, src0.www, src0.xxx, src0.000 8: src0.xyz = const[8], src1.xyz = temp[2] MAD temp[2].x, src0.xxx, src1.xxx, src0.000 9: src0.xyz = temp[2], src0.w = temp[0], src1.w = temp[3] MAD temp[0].x, src0.www, src1.www, src0.xxx 10: src0.xyz = temp[2], src1.xyz = const[1], src2.xyz = temp[0] MAD temp[1].xyz, src0.xxx, src0.111, src1.xyz MAD temp[1].w, src2.x, src0.1, src0.0 11: src0.xyz = temp[1], src0.w = temp[1] MAD temp[5].xyz, src0.xyz, src0.111, src0.000 MAD temp[5].w, src0.w, src0.1, src0.0 12: src0.xyz = temp[5], src1.xyz = const[6] MAD temp[6].xyz, src0.xyz, src0.111, -src1.xyz 13: src0.xyz = const[7], src1.xyz = input[0] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 14: src0.xyz = const[7] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 15: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 16: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 17: src0.xyz = temp[0], src1.xyz = temp[6], src2.xyz = const[6] MAD temp[5].xyz, src0.xxx, src1.xyz, src2.xyz 18: src0.xyz = temp[5], src0.w = temp[5] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[0].xyyy, 2D[3]; 2: TEX temp[6].w, input[1].xyyy, 2D[2]; 3: TEX temp[7].w, input[2].xyyy, 2D[1]; 4: TEX temp[8].w, input[3].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = const[0], src1.xyz = temp[5] SEM_WAIT DP3 temp[9].x, src0.xyz, src1.xyz 6: src0.xyz = temp[9], src0.w = temp[6] MAD temp[9].x, src0.www, src0.xxx, src0.000 7: src0.xyz = temp[9], src0.w = temp[7] MAD temp[7].x, src0.www, src0.xxx, src0.000 8: src0.xyz = const[8], src1.xyz = temp[7] MAD temp[7].x, src0.xxx, src1.xxx, src0.000 9: src0.xyz = temp[7], src0.w = temp[5], src1.w = temp[8] MAD temp[5].x, 
src0.www, src1.www, src0.xxx 10: src0.xyz = temp[7], src1.xyz = const[1], src2.xyz = temp[5] MAD temp[6].xyz, src0.xxx, src0.111, src1.xyz MAD temp[6].w, src2.x, src0.1, src0.0 11: src0.xyz = temp[6], src0.w = temp[6] MAD temp[10].xyz, src0.xyz, src0.111, src0.000 MAD temp[10].w, src0.w, src0.1, src0.0 12: src0.xyz = temp[10], src1.xyz = const[6] MAD temp[11].xyz, src0.xyz, src0.111, -src1.xyz 13: src0.xyz = const[7], src1.xyz = input[4] MAD temp[5].x, src0.zzz, src0.111, -src1.xxx 14: src0.xyz = const[7] MAD temp[6].x, src0.zzz, src0.111, -src0.yyy 15: src0.xyz = temp[6] REPL_ALPHA temp[6].x RCP, src0.x 16: src0.xyz = temp[5], src1.xyz = temp[6] MAD_SAT temp[5].x, src0.xxx, src1.xxx, src0.000 17: src0.xyz = temp[5], src1.xyz = temp[11], src2.xyz = const[6] MAD temp[10].xyz, src0.xxx, src1.xyz, src2.xyz 18: src0.xyz = temp[10], src0.w = temp[10] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe4055400: src: 0 R/G/G/G dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00004003:TEX wmask: A omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe4065401: src: 1 R/G/G/G dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004003:TEX wmask: A omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe4075402: src: 2 R/G/G/G dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4085403: src: 3 R/G/G/G dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08001500:Addr0: 0c, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 
0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000091:DP3 dest:9 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020009:Addr0: 9t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0000036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020009:Addr0: 9t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020007:Addr0: 7t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0000036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001d08:Addr0: 8c, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020007:Addr0: 7t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08002005:Addr0: 5t, Addr1: 8t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006da36c:rgb_A_src:0 A/A/A 0 rgb_B_src:1 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000050:MAD dest:5 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00540407:Addr0: 7t, Addr1: 1c, Addr2: 5t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c02060:MAD dest:6 
alp_A_src:2 R 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20221060:MAD dest:6 rgb_C_src:1 R/G/B 0 alp_C_src:0 0 0 10 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c0a0:MAD dest:10 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x204900a0:MAD dest:10 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 11 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x0804180a:Addr0: 10t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a210b0:MAD dest:11 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 12 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001107:Addr0: 7c, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801050:MAD dest:5 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 13 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020107:Addr0: 7c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924060:MAD dest:6 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 14 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 
ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000006a:SOP dest:6 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10602c05:Addr0: 5t, Addr1: 11t, Addr2: 6c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x002220a0:MAD dest:10 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 17 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x0802000a:Addr0: 10t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0802000a:Addr0: 10t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL CONST[0..11] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL DCL TEMP[3..4], LOCAL IMM[0] FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[1].yyyy, CONST[3] 1: MAD TEMP[0], CONST[2], IN[1].xxxx, TEMP[0] 2: MAD TEMP[0], CONST[4], IN[1].zzzz, TEMP[0] 3: ADD TEMP[0], TEMP[0], CONST[5] 4: MAD TEMP[0], IN[0].xyxx, CONST[0].xxyy, TEMP[0] 5: MUL TEMP[1], TEMP[0].yyyy, CONST[7] 6: MAD TEMP[1], CONST[6], TEMP[0].xxxx, TEMP[1] 7: MAD TEMP[1], CONST[8], 
TEMP[0].zzzz, TEMP[1] 8: MAD TEMP[0], CONST[9], TEMP[0].wwww, TEMP[1] 9: ADD TEMP[1].x, -TEMP[0].zzzz, CONST[11].yyyy 10: MUL TEMP[1].x, TEMP[1].xxxx, CONST[11].wwww 11: MOV TEMP[2].xy, IN[3].xyxx 12: MOV TEMP[3].xw, TEMP[0].xxzw 13: MOV_SAT TEMP[1].x, TEMP[1].xxxx 14: MUL TEMP[4].x, TEMP[0].yyyy, CONST[1].yyyy 15: MOV TEMP[3].y, TEMP[4].xxxx 16: MAD TEMP[3].xy, CONST[1].zwww, TEMP[0].wwww, TEMP[3].xyyy 17: MAD TEMP[0].x, TEMP[0].zzzz, IMM[0].xxxx, -TEMP[0].wwww 18: MOV TEMP[3].z, TEMP[0].xxxx 19: MOV OUT[2], TEMP[1].xxxx 20: MOV OUT[0], TEMP[3] 21: MOV_SAT OUT[1], IN[2] 22: MOV OUT[3], TEMP[2] 23: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[1].yyyy, const[3]; 1: MAD temp[0], const[2], input[1].xxxx, temp[0]; 2: MAD temp[0], const[4], input[1].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[5]; 4: MAD temp[0], input[0].xyxx, const[0].xxyy, temp[0]; 5: MUL temp[1], temp[0].yyyy, const[7]; 6: MAD temp[1], const[6], temp[0].xxxx, temp[1]; 7: MAD temp[1], const[8], temp[0].zzzz, temp[1]; 8: MAD temp[0], const[9], temp[0].wwww, temp[1]; 9: ADD temp[1].x, -temp[0].zzzz, const[11].yyyy; 10: MUL temp[1].x, temp[1].xxxx, const[11].wwww; 11: MOV temp[2].xy, input[3].xyxx; 12: MOV temp[3].xw, temp[0].xxzw; 13: MOV_SAT temp[1].x, temp[1].xxxx; 14: MUL temp[4].x, temp[0].yyyy, const[1].yyyy; 15: MOV temp[3].y, temp[4].xxxx; 16: MAD temp[3].xy, const[1].zwww, temp[0].wwww, temp[3].xyyy; 17: MAD temp[0].x, temp[0].zzzz, const[12].xxxx, -temp[0].wwww; 18: MOV temp[3].z, temp[0].xxxx; 19: MOV output[2], temp[1].xxxx; 20: MOV temp[5], temp[3]; 21: MOV_SAT output[1], input[2]; 22: MOV output[3], temp[2]; 23: MOV output[0], temp[5]; 24: MOV output[4], temp[5]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[1].yyyy, const[3]; 1: MAD temp[0], const[2], input[1].xxxx, temp[0]; 2: MAD temp[0], const[4], input[1].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[5]; 4: MAD temp[0], input[0].xyxx, 
const[0].xxyy, temp[0]; 5: MUL temp[1], temp[0].yyyy, const[7]; 6: MAD temp[1], const[6], temp[0].xxxx, temp[1]; 7: MAD temp[1], const[8], temp[0].zzzz, temp[1]; 8: MAD temp[0], const[9], temp[0].wwww, temp[1]; 9: ADD temp[1].x, -temp[0].zzzz, const[11].yyyy; 10: MUL temp[1].x, temp[1].xxxx, const[11].wwww; 11: MOV temp[2].xy, input[3].xyxx; 12: MOV temp[3].xw, temp[0].xxzw; 13: MOV_SAT temp[1].x, temp[1].xxxx; 14: MUL temp[4].x, temp[0].yyyy, const[1].yyyy; 15: MOV temp[3].y, temp[4].xxxx; 16: MAD temp[3].xy, const[1].zwww, temp[0].wwww, temp[3].xyyy; 17: MAD temp[0].x, temp[0].zzzz, const[12].xxxx, -temp[0].wwww; 18: MOV temp[3].z, temp[0].xxxx; 19: MOV output[2], temp[1].xxxx; 20: MOV temp[5], temp[3]; 21: MOV_SAT output[1], input[2]; 22: MOV output[3], temp[2]; 23: MOV output[0], temp[5]; 24: MOV output[4], temp[5]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[1].yyyy, const[3]; 1: MAD temp[0], const[2], input[1].xxxx, temp[0]; 2: MAD temp[0], const[4], input[1].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[5]; 4: MAD temp[0], input[0].xyxx, const[0].xxyy, temp[0]; 5: MUL temp[1], temp[0].yyyy, const[7]; 6: MAD temp[1], const[6], temp[0].xxxx, temp[1]; 7: MAD temp[1], const[8], temp[0].zzzz, temp[1]; 8: MAD temp[0], const[9], temp[0].wwww, temp[1]; 9: ADD temp[1].x, -temp[0].zzzz, const[11].yyyy; 10: MUL temp[1].x, temp[1].xxxx, const[11].wwww; 11: MOV temp[2].xy, input[3].xyxx; 12: MOV temp[3].xw, temp[0].xxzw; 13: MOV_SAT temp[1].x, temp[1].xxxx; 14: MUL temp[4].x, temp[0].yyyy, const[1].yyyy; 15: MOV temp[3].y, temp[4].xxxx; 16: MAD temp[3].xy, const[1].zwww, temp[0].wwww, temp[3].xyyy; 17: MAD temp[0].x, temp[0].zzzz, const[12].xxxx, -temp[0].wwww; 18: MOV temp[3].z, temp[0].xxxx; 19: MOV output[2], temp[1].xxxx; 20: MOV temp[5], temp[3]; 21: MOV_SAT output[1], input[2]; 22: MOV output[3], temp[2]; 23: MOV output[0], temp[5]; 24: MOV output[4], temp[5]; Vertex Program: after 'source conflict resolve' # Radeon 
Compiler Program 0: MUL temp[0], input[1].yyyy, const[3]; 1: MAD temp[0], const[2], input[1].xxxx, temp[0]; 2: MAD temp[0], const[4], input[1].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[5]; 4: MAD temp[0], input[0].xyxx, const[0].xxyy, temp[0]; 5: MUL temp[1], temp[0].yyyy, const[7]; 6: MAD temp[1], const[6], temp[0].xxxx, temp[1]; 7: MAD temp[1], const[8], temp[0].zzzz, temp[1]; 8: MAD temp[0], const[9], temp[0].wwww, temp[1]; 9: ADD temp[1].x, -temp[0].zzzz, const[11].yyyy; 10: MUL temp[1].x, temp[1].xxxx, const[11].wwww; 11: MOV temp[2].xy, input[3].xyxx; 12: MOV temp[3].xw, temp[0].xxzw; 13: MOV_SAT temp[1].x, temp[1].xxxx; 14: MUL temp[4].x, temp[0].yyyy, const[1].yyyy; 15: MOV temp[3].y, temp[4].xxxx; 16: MAD temp[3].xy, const[1].zwww, temp[0].wwww, temp[3].xyyy; 17: MAD temp[0].x, temp[0].zzzz, const[12].xxxx, -temp[0].wwww; 18: MOV temp[3].z, temp[0].xxxx; 19: MOV output[2], temp[1].xxxx; 20: MOV temp[5], temp[3]; 21: MOV_SAT output[1], input[2]; 22: MOV output[3], temp[2]; 23: MOV output[0], temp[5]; 24: MOV output[4], temp[5]; CONST[12] = { 2.0000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[1].yyyy, const[3]; 1: MAD temp[0], const[2], input[1].xxxx, temp[0]; 2: MAD temp[0], const[4], input[1].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[5]; 4: MAD temp[0], input[0].xyxx, const[0].xxyy, temp[0]; 5: MUL temp[1], temp[0].yyyy, const[7]; 6: MAD temp[1], const[6], temp[0].xxxx, temp[1]; 7: MAD temp[1], const[8], temp[0].zzzz, temp[1]; 8: MAD temp[0], const[9], temp[0].wwww, temp[1]; 9: ADD temp[1].x, -temp[0].zzzz, const[11].yyyy; 10: MUL temp[1].x, temp[1].xxxx, const[11].wwww; 11: MOV temp[2].xy, input[3].xyxx; 12: MOV temp[3].xw, temp[0].xxzw; 13: MOV_SAT temp[1].x, temp[1].xxxx; 14: MUL temp[4].x, temp[0].yyyy, const[1].yyyy; 15: MOV temp[3].y, temp[4].xxxx; 16: MAD temp[3].xy, const[1].zwww, temp[0].wwww, temp[3].xyyy; 17: MAD temp[0].x, temp[0].zzzz, const[12].xxxx, -temp[0].wwww; 18: 
MOV temp[3].z, temp[0].xxxx; 19: MOV output[2], temp[1].xxxx; 20: MOV temp[5], temp[3]; 21: MOV_SAT output[1], input[2]; 22: MOV output[3], temp[2]; 23: MOV output[0], temp[5]; 24: MOV output[4], temp[5]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[1].yyyy, const[3]; 1: MAD temp[0], const[2], input[1].xxxx, temp[0]; 2: MAD temp[0], const[4], input[1].zzzz, temp[0]; 3: ADD temp[0], temp[0], const[5]; 4: MAD temp[0], input[0].xyxx, const[0].xxyy, temp[0]; 5: MUL temp[1], temp[0].yyyy, const[7]; 6: MAD temp[1], const[6], temp[0].xxxx, temp[1]; 7: MAD temp[1], const[8], temp[0].zzzz, temp[1]; 8: MAD temp[0], const[9], temp[0].wwww, temp[1]; 9: ADD temp[1].x, -temp[0].zzzz, const[11].yyyy; 10: MUL temp[1].x, temp[1].xxxx, const[11].wwww; 11: MOV temp[2].xy, input[3].xyxx; 12: MOV temp[3].xw, temp[0].xxzw; 13: MOV_SAT temp[1].x, temp[1].xxxx; 14: MUL temp[4].x, temp[0].yyyy, const[1].yyyy; 15: MOV temp[3].y, temp[4].xxxx; 16: MAD temp[3].xy, const[1].zwww, temp[0].wwww, temp[3].xyyy; 17: MAD temp[0].x, temp[0].zzzz, const[12].xxxx, -temp[0].wwww; 18: MOV temp[3].z, temp[0].xxxx; 19: MOV output[2], temp[1].xxxx; 20: MOV temp[5], temp[3]; 21: MOV_SAT output[1], input[2]; 22: MOV output[3], temp[2]; 23: MOV output[0], temp[5]; 24: MOV output[4], temp[5]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00492021 reg: 1i swiz: Y/ Y/ Y/ Y src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00000021 reg: 1i swiz: X/ X/ X/ X src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00924021 reg: 1i swiz: Z/ Z/ Z/ Z src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x012480a2 
reg: 5c swiz: 0/ 0/ 0/ 0 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00010001 reg: 0i swiz: X/ Y/ X/ X src1: 0x00480002 reg: 0c swiz: X/ X/ Y/ Y src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00492000 reg: 0t swiz: Y/ Y/ Y/ Y src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 6: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00000000 reg: 0t swiz: X/ X/ X/ X src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 7: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00924000 reg: 0t swiz: Z/ Z/ Z/ Z src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 8: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00db6000 reg: 0t swiz: W/ W/ W/ W src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 9: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x1e924000 reg: 0t swiz: -Z/-Z/-Z/-Z src1: 0x00492162 reg: 11c swiz: Y/ Y/ Y/ Y src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 10: op: 0x00102002 dst: 1t op: VE_MULTIPLY src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x00db6162 reg: 11c swiz: W/ W/ W/ W src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 11: op: 0x00304003 dst: 2t op: VE_ADD src0: 0x00010061 reg: 3i swiz: X/ Y/ X/ X src1: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0 src2: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0 12: op: 0x00906003 dst: 3t op: VE_ADD src0: 0x00d00000 reg: 0t swiz: X/ X/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 13: op: 0x01102003 dst: 1t op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00108002 dst: 4t op: VE_MULTIPLY src0: 0x00492000 reg: 0t swiz: Y/ Y/ Y/ Y src1: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y src2: 0x01248022 reg: 1c swiz: 0/ 0/ 0/ 0 15: op: 0x00206003 dst: 3t op: VE_ADD src0: 0x00000080 reg: 4t swiz: X/ X/ X/ X src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 
0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 16: op: 0x00306004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x00db4022 reg: 1c swiz: Z/ W/ W/ W src1: 0x00db6000 reg: 0t swiz: W/ W/ W/ W src2: 0x00490060 reg: 3t swiz: X/ Y/ Y/ Y 17: op: 0x00100004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924000 reg: 0t swiz: Z/ Z/ Z/ Z src1: 0x00000182 reg: 12c swiz: X/ X/ X/ X src2: 0x1edb6000 reg: 0t swiz: -W/-W/-W/-W 18: op: 0x00406003 dst: 3t op: VE_ADD src0: 0x00000000 reg: 0t swiz: X/ X/ X/ X src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 19: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00000020 reg: 1t swiz: X/ X/ X/ X src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 20: op: 0x00f0a003 dst: 5t op: VE_ADD src0: 0x00d10060 reg: 3t swiz: X/ Y/ Z/ W src1: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 21: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 22: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 23: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 24: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d100a0 reg: 5t swiz: X/ Y/ Z/ W src1: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], COLOR, COLOR DCL IN[1], COLOR[1], COLOR DCL IN[2], FOG, PERSPECTIVE DCL IN[3], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL CONST[2..3] DCL TEMP[0..1], LOCAL DCL TEMP[2], ARRAY(1), LOCAL 0: TEX TEMP[0], IN[3].xyyy, SAMP[0], 2D 1: MOV_SAT TEMP[0], TEMP[0] 2: MUL TEMP[1], IN[1], CONST[0] 3: MAD TEMP[2], TEMP[0], IN[0], TEMP[1] 4: ADD TEMP[0].x, 
CONST[3].zzzz, -IN[2].xxxx 5: ADD TEMP[1].x, CONST[3].zzzz, -CONST[3].yyyy 6: RCP TEMP[1].x, TEMP[1].xxxx 7: MUL_SAT TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx 8: LRP TEMP[2].xyz, TEMP[0].xxxx, TEMP[2].xyzz, CONST[2].xyzz 9: MOV_SAT OUT[0], TEMP[2] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, 
const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: LRP temp[2].xyz, temp[0].xxxx, temp[2].xyzz, const[2].xyzz; 9: MOV_SAT output[0], temp[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, 
const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: MOV_SAT temp[0], temp[0]; 2: MUL temp[1], input[1], const[0]; 3: MAD temp[2], temp[0], input[0], temp[1]; 4: ADD temp[0].x, const[3].zzzz, -input[2].xxxx; 5: ADD temp[1].x, const[3].zzzz, -const[3].yyyy; 6: RCP temp[1].x, temp[1].xxxx; 7: MUL_SAT temp[0].x, temp[0].xxxx, temp[1].xxxx; 8: ADD temp[3].xyz, temp[2].xyzz, -const[2].xyzz; 9: MAD temp[2].xyz, temp[0].xxxx, temp[3], const[2].xyzz; 10: MOV_SAT output[0], temp[2]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[3].xyyy, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 4: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 5: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 6: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 7: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 8: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 9: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 10: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT 
color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[3].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 9: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[3].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[1].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[1], src2.w = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[2].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[2], 
src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[2] MAD temp[0].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[1].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[1] REPL_ALPHA temp[1].x RCP, src0.x 9: src0.xyz = temp[0], src1.xyz = temp[1] MAD_SAT temp[0].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[0], src1.xyz = temp[3], src2.xyz = const[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[2], src0.w = temp[2] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[2].xyyy, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[1], src0.w = input[1], src1.xyz = const[0], src1.w = const[0] MAD temp[5].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[5].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[4], src0.w = temp[4] SEM_WAIT MAD_SAT temp[4].xyz, src0.xyz, src0.111, src0.000 MAD_SAT temp[4].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[4], src0.w = temp[4], src1.xyz = input[0], src1.w = input[0], src2.xyz = temp[5], src2.w = temp[5] MAD temp[6].xyz, src0.xyz, src1.xyz, src2.xyz MAD temp[6].w, src0.w, src1.w, src2.w 5: src0.xyz = temp[6], src1.xyz = const[2] MAD temp[7].xyz, src0.xyz, src0.111, -src1.xyz 6: src0.xyz = const[3], src1.xyz = input[3] MAD temp[4].x, src0.zzz, src0.111, -src1.xxx 7: src0.xyz = const[3] MAD temp[5].x, src0.zzz, src0.111, -src0.yyy 8: src0.xyz = temp[5] REPL_ALPHA temp[5].x RCP, src0.x 9: src0.xyz = temp[4], src1.xyz = temp[5] MAD_SAT temp[4].x, src0.xxx, src1.xxx, src0.000 10: src0.xyz = temp[4], src1.xyz = temp[7], src2.xyz = const[2] MAD temp[6].xyz, src0.xxx, src1.xyz, src2.xyz 11: src0.xyz = temp[6], src0.w = temp[6] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 
1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe4045402: src: 2 R/G/G/G dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040001:Addr0: 1t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c050:MAD dest:5 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 2 0:CMN_INST 0x00187804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c040:MAD dest:4 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00500004:Addr0: 4t, Addr1: 0t, Addr2: 5t, srcp:0 2:ALPHA_ADDR 0x00500004:Addr0: 4t, Addr1: 0t, Addr2: 5t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c060:MAD dest:6 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c222060:MAD dest:6 rgb_C_src:2 R/G/B 0 alp_C_src:2 A 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040806:Addr0: 6t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21070:MAD dest:7 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 5 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000d03:Addr0: 3c, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 
4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00801040:MAD dest:4 rgb_C_src:1 R/R/R 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0248:rgb_A_src:0 B/B/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00924050:MAD dest:5 rgb_C_src:0 G/G/G 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000005a:SOP dest:5 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001404:Addr0: 4t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x10201c04:Addr0: 4t, Addr1: 7t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222060:MAD dest:6 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 10 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 
R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], COLOR[1] DCL OUT[3], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV_SAT OUT[1], IN[1] 5: MOV_SAT OUT[2], IN[2] 6: MOV OUT[3], IN[3] 7: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], 
temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV_SAT output[1], input[1]; 5: MOV_SAT output[2], input[2]; 6: MOV output[3], input[3]; 7: MOV output[0], temp[1]; 8: MOV output[4], temp[1]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x01f04203 dst: 2o op: VE_ADD src0: 0x00d10041 
reg: 2i swiz: X/ Y/ Z/ W
  src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0
  src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0
6: op: 0x00f06203 dst: 3o op: VE_ADD
  src0: 0x00d10061 reg: 3i swiz: X/ Y/ Z/ W
  src1: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0
  src2: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0
7: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
8: op: 0x00f08203 dst: 4o op: VE_ADD
  src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W
  src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
  src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
radeon: Released access to Hyper-Z.
Rendered 130 frames in 9.97852 secs, average of 13.028 fps