AL lib: (WW) FreeContext: (0x8d3a5c0) Deleting 64 Source(s)
ATTENTION: default value of option force_s3tc_enable overridden by environment.
r300: DRM version: 2.40.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2
r300: GART size: 509 MB, VRAM size: 256 MB
r300: AA compression RAM: YES, Z compression RAM: YES, HiZ RAM: YES
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL IN[2]
DCL OUT[0], POSITION
DCL OUT[1], COLOR
DCL OUT[2], GENERIC[0]
DCL CONST[0..3]
DCL TEMP[0]
  0: MUL TEMP[0], IN[0].xxxx, CONST[0]
  1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0]
  2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0]
  3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0]
  4: MOV_SAT OUT[1], IN[1]
  5: MOV OUT[2], IN[2]
  6: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[1];
  7: MOV output[3], temp[1];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[1];
  7: MOV output[3], temp[1];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[1];
  7: MOV output[3], temp[1];
Vertex Program: after 'deadcode'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[1];
  7: MOV output[3], temp[1];
Vertex Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[1];
  7: MOV output[3], temp[1];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[1];
  7: MOV output[3], temp[1];
Vertex Program: after 'register allocation'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[0];
  7: MOV output[3], temp[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[0];
  7: MOV output[3], temp[0];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV_SAT output[1], input[1];
  5: MOV output[2], input[2];
  6: MOV output[0], temp[0];
  7: MOV output[3], temp[0];
Final vertex program code:
0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY
  src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X
  src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W
  src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y
  src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z
  src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W
  src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
4: op: 0x01f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
5: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W
  src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0
  src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0
6: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
7: op: 0x00f06203 dst: 3o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~
~ 8 Instructions
~ 0 Flow Control Instructions
~ 1 Temporary Registers
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[0]
  0: MOV OUT[0], IN[0]
  1: MOV OUT[1], IN[1]
  2: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'deadcode'
# Radeon Compiler Program
  0: MOV temp[0], input[0];
  1: MOV output[1], input[1];
  2: MOV output[0], temp[0];
  3: MOV output[2], temp[0];
Vertex Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MOV output[1], input[1];
  1: MOV output[0], input[0];
  2: MOV output[2], input[0];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MOV output[1], input[1];
  1: MOV output[0], input[0];
  2: MOV output[2], input[0];
Vertex Program: after 'register allocation'
# Radeon Compiler Program
  0: MOV output[1], input[1];
  1: MOV output[0], input[0];
  2: MOV output[2], input[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV output[1], input[1];
  1: MOV output[0], input[0];
  2: MOV output[2], input[0];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MOV output[1], input[1];
  1: MOV output[0], input[0];
  2: MOV output[2], input[0];
Final vertex program code:
0: op: 0x00f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
1: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W
  src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
  src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
2: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W
  src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
  src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[0], CONSTANT
DCL OUT[0], COLOR
  0: MOV OUT[0], IN[0]
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'deadcode'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'register rename'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'inline literals'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV output[0], input[0];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[0]
DCL CONST[0..3]
DCL TEMP[0]
  0: MUL TEMP[0], IN[0].xxxx, CONST[0]
  1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0]
  2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0]
  3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0]
  4: MOV OUT[1], IN[1]
  5: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'deadcode'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[1];
  6: MOV output[2], temp[1];
Vertex Program: after 'register allocation'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[0];
  6: MOV output[2], temp[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[0];
  6: MOV output[2], temp[0];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV output[1], input[1];
  5: MOV output[0], temp[0];
  6: MOV output[2], temp[0];
Final vertex program code:
0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY
  src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X
  src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W
  src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y
  src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z
  src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W
  src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
4: op: 0x00f02203 dst: 1o op: VE_ADD
  src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W
  src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
  src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0
5: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
6: op: 0x00f04203 dst: 2o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~
~ 7 Instructions
~ 0 Flow Control Instructions
~ 1 Temporary Registers
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL CONST[1..4]
DCL TEMP[0], LOCAL
  0: MOV TEMP[0].xy, IN[0].xyyy
  1: MOV TEMP[0].w, IN[0].wwww
  2: TXP TEMP[0].xyz, TEMP[0], SAMP[0], 2D
  3: MOV TEMP[0].xyz, TEMP[0].xyzx
  4: MOV TEMP[0].w, CONST[4].wwww
  5: MOV OUT[0], TEMP[0]
  6: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0].xyz, temp[0], 2D[0];
  3: MOV temp[0].xyz, temp[0].xyzx;
  4: MOV temp[0].w, const[4].wwww;
  5: MOV output[0], temp[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0].xyz, temp[0], 2D[0];
  3: MOV temp[0].xyz, temp[0].xyzx;
  4: MOV temp[0].w, const[4].wwww;
  5: MOV output[0], temp[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0].xyz, temp[0], 2D[0];
  3: MOV temp[0].xyz, temp[0].xyzx;
  4: MOV temp[0].w, const[4].wwww;
  5: MOV output[0], temp[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0].xyz, temp[0], 2D[0];
  3: MOV temp[0].xyz, temp[0].xyzx;
  4: MOV temp[0].w, const[4].wwww;
  5: MOV output[0], temp[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0].xyz, temp[0], 2D[0];
  3: MOV temp[0].xyz, temp[0].xyzx;
  4: MOV temp[0].w, const[4].wwww;
  5: MOV output[0], temp[0];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0].xyz, temp[0], 2D[0];
  3: MOV temp[0].xyz, temp[0].xyzx;
  4: MOV temp[0].w, const[4].wwww;
  5: MOV output[0], temp[0];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0].xyz, temp[0], 2D[0];
  3: MOV temp[0].xyz, temp[0].xyzx;
  4: MOV temp[0].w, const[4].wwww;
  5: MOV output[0], temp[0];
Fragment Program: after 'deadcode'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xy__;
  1: MOV temp[0].w, input[0].___w;
  2: TXP temp[0].xyz, temp[0].xy_w, 2D[0];
  3: MOV temp[0].xyz, temp[0].xyz_;
  4: MOV temp[0].w, const[4].___w;
  5: MOV output[0], temp[0];
Fragment Program: after 'register rename'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0];
  3: MOV temp[3].xyz, temp[2].xyz_;
  4: MOV temp[3].w, const[4].___w;
  5: MOV output[0], temp[3];
Fragment Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0];
  3: MOV temp[3].xyz, temp[2].xyz_;
  4: MOV temp[3].w, const[4].___w;
  5: MOV output[0], temp[3];
Fragment Program: after 'inline literals'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0];
  3: MOV temp[3].xyz, temp[2].xyz_;
  4: MOV temp[3].w, const[4].___w;
  5: MOV output[0], temp[3];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0];
  3: MOV temp[3].xyz, temp[2].xyz_;
  4: MOV temp[3].w, const[4].___w;
  5: MOV output[0], temp[3];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0];
  3: MOV temp[3].xyz, temp[2].xyz_;
  4: MOV temp[3].w, const[4].___w;
  5: MOV output[0], temp[3];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = input[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
  1: src0.w = input[0]
       MAD temp[1].w, src0.w, src0.1, src0.0
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0];
  3: src0.xyz = temp[2]
       MAD temp[3].xyz, src0.xyz, src0.111, src0.000
  4: src0.w = const[4]
       MAD temp[3].w, src0.w, src0.1, src0.0
  5: src0.xyz = temp[3], src0.w = temp[3]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
       MAD temp[1].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE;
  3: src0.xyz = temp[2], src0.w = const[4] SEM_WAIT
       MAD temp[3].xyz, src0.xyz, src0.111, src0.000
       MAD temp[3].w, src0.w, src0.1, src0.0
  4: src0.xyz = temp[3], src0.w = temp[3]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
       MAD temp[1].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[2].xyz, temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE;
  3: src0.xyz = temp[2], src0.w = const[4] SEM_WAIT
       MAD temp[3].xyz, src0.xyz, src0.111, src0.000
       MAD temp[3].w, src0.w, src0.1, src0.0
  4: src0.xyz = temp[3], src0.w = temp[3]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD temp[0].xy, src0.xy_, src0.11_, src0.00_
       MAD temp[0].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[0].xyz, temp[0].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE;
  3: src0.xyz = temp[0], src0.w = const[4] SEM_WAIT
       MAD temp[0].xyz, src0.xyz, src0.111, src0.000
       MAD temp[0].w, src0.w, src0.1, src0.0
  4: src0.xyz = temp[0], src0.w = temp[0]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
1
  0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE
  1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED
  2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A
  3:TEX_DXDY: 0x00000000
2
  0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020104:Addr0: 4c, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
3
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~
~ 4 Instructions
~ 3 Vector Instructions (RGB)
~ 3 Scalar Instructions (Alpha)
~ 0 Flow Control Instructions
~ 1 Texture Instructions
~ 0 Presub Operations
~ 0 OMOD Operations
~ 1 Temporary Registers
~ 0 Inline Literals
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL TEMP[0], LOCAL
  0: MOV TEMP[0].xy, IN[0].xyyy
  1: MOV TEMP[0].w, IN[0].wwww
  2: TXP TEMP[0], TEMP[0], SAMP[0], 2D
  3: MOV OUT[0], TEMP[0]
  4: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0], temp[0], 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0], temp[0], 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0], temp[0], 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0], temp[0], 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0], temp[0], 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0], temp[0], 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xyyy;
  1: MOV temp[0].w, input[0].wwww;
  2: TXP temp[0], temp[0], 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'deadcode'
# Radeon Compiler Program
  0: MOV temp[0].xy, input[0].xy__;
  1: MOV temp[0].w, input[0].___w;
  2: TXP temp[0], temp[0].xy_w, 2D[0];
  3: MOV output[0], temp[0];
Fragment Program: after 'register rename'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2], temp[1].xy_w, 2D[0];
  3: MOV output[0], temp[2];
Fragment Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2], temp[1].xy_w, 2D[0];
  3: MOV output[0], temp[2];
Fragment Program: after 'inline literals'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2], temp[1].xy_w, 2D[0];
  3: MOV output[0], temp[2];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2], temp[1].xy_w, 2D[0];
  3: MOV output[0], temp[2];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV temp[1].xy, input[0].xy__;
  1: MOV temp[1].w, input[0].___w;
  2: TXP temp[2], temp[1].xy_w, 2D[0];
  3: MOV output[0], temp[2];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = input[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
  1: src0.w = input[0]
       MAD temp[1].w, src0.w, src0.1, src0.0
  2: TXP temp[2], temp[1].xy_w, 2D[0];
  3: src0.xyz = temp[2], src0.w = temp[2]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
       MAD temp[1].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[2], temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE;
  3: src0.xyz = temp[2], src0.w = temp[2] SEM_WAIT
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
       MAD temp[1].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[2], temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE;
  3: src0.xyz = temp[2], src0.w = temp[2] SEM_WAIT
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = input[0], src0.w = input[0]
       MAD temp[0].xy, src0.xy_, src0.11_, src0.00_
       MAD temp[0].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[0], temp[0].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE;
  3: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
1
  0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE
  1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED
  2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A
  3:TEX_DXDY: 0x00000000
2
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial vertex program
VERT
DCL IN[0]
DCL OUT[0], POSITION
DCL CONST[0..3]
DCL TEMP[0]
  0: MUL TEMP[0], IN[0].xxxx, CONST[0]
  1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0]
  2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0]
  3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0]
  4: END
Vertex Program: before compilation
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[1];
  5: MOV output[1], temp[1];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[1];
  5: MOV output[1], temp[1];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[1];
  5: MOV output[1], temp[1];
Vertex Program: after 'deadcode'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[1];
  5: MOV output[1], temp[1];
Vertex Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[1];
  5: MOV output[1], temp[1];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[1], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[1];
  5: MOV output[1], temp[1];
Vertex Program: after 'register allocation'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[0];
  5: MOV output[1], temp[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[0];
  5: MOV output[1], temp[0];
Vertex Program: after 'lower control flow opcodes'
# Radeon Compiler Program
  0: MUL temp[0], input[0].xxxx, const[0];
  1: MAD temp[0], input[0].yyyy, const[1], temp[0];
  2: MAD temp[0], input[0].zzzz, const[2], temp[0];
  3: MAD temp[0], input[0].wwww, const[3], temp[0];
  4: MOV output[0], temp[0];
  5: MOV output[1], temp[0];
Final vertex program code:
0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY
  src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X
  src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W
  src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0
1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y
  src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z
  src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD
  src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W
  src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W
  src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
4: op: 0x00f00203 dst: 0o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
5: op: 0x00f02203 dst: 1o op: VE_ADD
  src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
  src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
  src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~
~ 6 Instructions
~ 0 Flow Control Instructions
~ 1 Temporary Registers
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0..3]
  0: MOV OUT[0], CONST[3]
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'deadcode'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'register rename'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'inline literals'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV output[0], const[3];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = const[3], src0.w = const[3]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = const[3], src0.w = const[3]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'dead sources'
# Radeon Compiler Program
  0: src0.xyz = const[3], src0.w = const[3]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = const[3], src0.w = const[3]
       MAD color[0].xyz, src0.xyz, src0.111, src0.000
       MAD color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial fragment program
FRAG
DCL IN[0], GENERIC[0], LINEAR
DCL OUT[0], COLOR
DCL SAMP[0]
  0: TEX OUT[0], IN[0], SAMP[0], 2D
  1: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: TEX output[0], input[0], 2D[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'transform IF'
# Radeon Compiler Program
  0: TEX temp[1], input[0], 2D[0];
  1: MOV output[0], temp[1];
Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment 
Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[1..4] DCL TEMP[0..1], LOCAL 0: MOV TEMP[0].xyz, CONST[4].xyzx 1: MOV TEMP[1].xy, IN[0].xyyy 2: MOV TEMP[1].w, IN[0].wwww 3: TXP TEMP[1].w, TEMP[1], SAMP[0], 2D 4: MUL TEMP[1].x, TEMP[1].wwww, CONST[4].wwww 5: MOV TEMP[0].w, TEMP[1].xxxx 6: MOV OUT[0], TEMP[0] 7: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyzx; 1: MOV temp[1].xy, input[0].xyyy; 2: MOV temp[1].w, input[0].wwww; 3: TXP temp[1].w, temp[1], 2D[0]; 4: MUL temp[1].x, temp[1].wwww, const[4].wwww; 5: MOV temp[0].w, temp[1].xxxx; 6: MOV output[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyzx; 1: MOV temp[1].xy, input[0].xyyy; 2: MOV temp[1].w, input[0].wwww; 3: TXP temp[1].w, temp[1], 2D[0]; 4: MUL temp[1].x, temp[1].wwww, const[4].wwww; 5: MOV temp[0].w, temp[1].xxxx; 6: MOV output[0], temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyzx; 1: MOV temp[1].xy, input[0].xyyy; 2: MOV temp[1].w, input[0].wwww; 3: TXP temp[1].w, temp[1], 2D[0]; 4: MUL temp[1].x, temp[1].wwww, const[4].wwww; 5: MOV temp[0].w, temp[1].xxxx; 6: MOV output[0], temp[0]; Fragment Program: after 'unroll 
loops' # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyzx; 1: MOV temp[1].xy, input[0].xyyy; 2: MOV temp[1].w, input[0].wwww; 3: TXP temp[1].w, temp[1], 2D[0]; 4: MUL temp[1].x, temp[1].wwww, const[4].wwww; 5: MOV temp[0].w, temp[1].xxxx; 6: MOV output[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyzx; 1: MOV temp[1].xy, input[0].xyyy; 2: MOV temp[1].w, input[0].wwww; 3: TXP temp[1].w, temp[1], 2D[0]; 4: MUL temp[1].x, temp[1].wwww, const[4].wwww; 5: MOV temp[0].w, temp[1].xxxx; 6: MOV output[0], temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyzx; 1: MOV temp[1].xy, input[0].xyyy; 2: MOV temp[1].w, input[0].wwww; 3: TXP temp[1].w, temp[1], 2D[0]; 4: MUL temp[1].x, temp[1].wwww, const[4].wwww; 5: MOV temp[0].w, temp[1].xxxx; 6: MOV output[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyzx; 1: MOV temp[1].xy, input[0].xyyy; 2: MOV temp[1].w, input[0].wwww; 3: TXP temp[1].w, temp[1], 2D[0]; 4: MUL temp[1].x, temp[1].wwww, const[4].wwww; 5: MOV temp[0].w, temp[1].xxxx; 6: MOV output[0], temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0].xyz, const[4].xyz_; 1: MOV temp[1].xy, input[0].xy__; 2: MOV temp[1].w, input[0].___w; 3: TXP temp[1].w, temp[1].xy_w, 2D[0]; 4: MUL temp[1].x, temp[1].w___, const[4].w___; 5: MOV temp[0].w, temp[1].___x; 6: MOV output[0], temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: MOV temp[2].xyz, const[4].xyz_; 1: MOV temp[3].xy, input[0].xy__; 2: MOV temp[3].w, input[0].___w; 3: TXP temp[4].w, temp[3].xy_w, 2D[0]; 4: MUL temp[5].x, temp[4].w___, const[4].w___; 5: MOV temp[2].w, temp[5].___x; 6: MOV output[0], temp[2]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV temp[2].xyz, const[4].xyz_; 1: MOV temp[3].xy, input[0].xy__; 2: MOV temp[3].w, input[0].___w; 3: 
TXP temp[4].w, temp[3].xy_w, 2D[0]; 4: MUL temp[5].x, temp[4].w___, const[4].w___; 5: MOV temp[2].w, temp[5].___x; 6: MOV output[0], temp[2]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV temp[2].xyz, const[4].xyz_; 1: MOV temp[3].xy, input[0].xy__; 2: MOV temp[3].w, input[0].___w; 3: TXP temp[4].w, temp[3].xy_w, 2D[0]; 4: MUL temp[5].x, temp[4].w___, const[4].w___; 5: MOV temp[2].w, temp[5].___x; 6: MOV output[0], temp[2]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV temp[2].xyz, const[4].xyz_; 1: MOV temp[3].xy, input[0].xy__; 2: MOV temp[3].w, input[0].___w; 3: TXP temp[4].w, temp[3].xy_w, 2D[0]; 4: MUL temp[5].x, temp[4].w___, const[4].w___; 5: MOV temp[2].w, temp[5].___x; 6: MOV output[0], temp[2]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[2].xyz, const[4].xyz_; 1: MOV temp[3].xy, input[0].xy__; 2: MOV temp[3].w, input[0].___w; 3: TXP temp[4].w, temp[3].xy_w, 2D[0]; 4: MUL temp[5].x, temp[4].w___, const[4].w___; 5: MOV temp[2].w, temp[5].___x; 6: MOV output[0], temp[2]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = const[4] MAD temp[2].xyz, src0.xyz, src0.111, src0.000 1: src0.xyz = input[0] MAD temp[3].xy, src0.xy_, src0.111, src0.000 2: src0.w = input[0] MAD temp[3].w, src0.w, src0.1, src0.0 3: TXP temp[4].w, temp[3].xy_w, 2D[0]; 4: src0.w = temp[4], src1.w = const[4] MAD temp[5].x, src0.w__, src1.w__, src0.000 5: src0.xyz = temp[5] MAD temp[2].w, src0.x, src0.1, src0.0 6: src0.xyz = temp[2], src0.w = temp[2] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = const[4], src0.w = input[0] MAD temp[2].xyz, src0.xyz, src0.111, src0.000 MAD temp[3].w, src0.w, src0.1, src0.0 1: src0.xyz = input[0] MAD temp[3].xy, src0.xy_, src0.111, src0.000 2: BEGIN_TEX; 3: TXP temp[4].w, temp[3].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 4: 
src0.w = temp[4], src1.w = const[4] SEM_WAIT MAD temp[5].x, src0.w__, src1.w__, src0.000 5: src0.xyz = temp[5] MAD temp[2].w, src0.x, src0.1, src0.0 6: src0.xyz = temp[2], src0.w = temp[2] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = const[4], src0.w = input[0] MAD temp[2].xyz, src0.xyz, src0.111, src0.000 MAD temp[3].w, src0.w, src0.1, src0.0 1: src0.xyz = input[0] MAD temp[3].xy, src0.xy_, src0.111, src0.000 2: BEGIN_TEX; 3: TXP temp[4].w, temp[3].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 4: src0.w = temp[4], src1.w = const[4] SEM_WAIT MAD temp[5].x, src0.w__, src1.w__, src0.000 5: src0.xyz = temp[5] MAD temp[2].w, src0.x, src0.1, src0.0 6: src0.xyz = temp[2], src0.w = temp[2] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = const[4], src0.w = input[0] MAD temp[1].xyz, src0.xyz, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 1: src0.xyz = input[0] MAD temp[2].xy, src0.xy_, src0.11_, src0.00_ 2: BEGIN_TEX; 3: TXP temp[0].w, temp[2].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 4: src0.w = temp[0], src1.w = const[4] SEM_WAIT MAD temp[0].x, src0.w__, src1.w__, src0.0__ 5: src0.xyz = temp[0] MAD temp[0].w, src0.x, src0.1, src0.0 6: src0.xyz = temp[1], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020104:Addr0: 4c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 1 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 
1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f402: src: 2 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 3 RGB_INST: 0x0091a48c:rgb_A_src:0 A/0/0 0 rgb_B_src:1 A/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c00000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 5 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 6 Instructions ~ 4 Vector Instructions (RGB) ~ 3 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 1 Texture Instructions ~ 0 Presub Operations ~ 0 OMOD Operations ~ 3 Temporary 
Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[1..4] DCL TEMP[0], LOCAL 0: MOV TEMP[0].xy, IN[0].xyyy 1: MOV TEMP[0].w, IN[0].wwww 2: TXP TEMP[0], TEMP[0], SAMP[0], 2D 3: MUL TEMP[0], TEMP[0], CONST[4] 4: MOV OUT[0], TEMP[0] 5: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment 
Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xy__; 1: MOV temp[0].w, input[0].___w; 2: TXP temp[0], temp[0].xy_w, 2D[0]; 3: MUL temp[0], temp[0], const[4]; 4: MOV output[0], temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MUL temp[3], temp[2], const[4]; 4: MOV output[0], temp[3]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MUL temp[3], temp[2], const[4]; 4: MOV output[0], temp[3]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MUL temp[3], temp[2], const[4]; 4: MOV output[0], temp[3]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MUL temp[3], temp[2], const[4]; 4: MOV output[0], temp[3]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MUL temp[3], temp[2], const[4]; 4: MOV output[0], temp[3]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0] MAD temp[1].xy, src0.xy_, src0.111, src0.000 1: src0.w = input[0] MAD temp[1].w, src0.w, src0.1, src0.0 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[4], src1.w = const[4] MAD temp[3].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[3].w, src0.w, src1.w, src0.0 4: src0.xyz = temp[3], src0.w = temp[3] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = 
input[0], src0.w = input[0] MAD temp[1].xy, src0.xy_, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[2], temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[4], src1.w = const[4] SEM_WAIT MAD temp[3].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[3].w, src0.w, src1.w, src0.0 4: src0.xyz = temp[3], src0.w = temp[3] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD temp[1].xy, src0.xy_, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[2], temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[4], src1.w = const[4] SEM_WAIT MAD temp[3].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[3].w, src0.w, src1.w, src0.0 4: src0.xyz = temp[3], src0.w = temp[3] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD temp[0].xy, src0.xy_, src0.11_, src0.00_ MAD temp[0].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[0], temp[0].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[4], src1.w = const[4] SEM_WAIT MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[0].w, src0.w, src1.w, src0.0 4: src0.xyz = temp[0], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 
0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 1 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 
'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, 
const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV 
output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, 
const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow 
Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[1..6] DCL TEMP[0..2], LOCAL IMM[0] FLT32 { 1.0000, 2.0000, -1.0000, 0.0000} 0: MOV TEMP[0].xy, IN[0].xyyy 1: MOV TEMP[0].w, IN[0].wwww 2: TXP TEMP[0], TEMP[0], SAMP[0], 2D 3: ADD TEMP[1], IMM[0].xxxx, -CONST[4] 4: MUL TEMP[2], TEMP[0], CONST[4] 5: MAD TEMP[1].xyz, CONST[5], TEMP[1], TEMP[2] 6: MOV TEMP[1].xyz, TEMP[1].xyzx 7: MOV TEMP[1].w, TEMP[0].wwww 8: MAD TEMP[1].xyz, TEMP[1], IMM[0].yyyy, IMM[0].zzzz 9: MAD TEMP[2].xyz, CONST[6], IMM[0].yyyy, IMM[0].zzzz 10: DP3 TEMP[1].x, TEMP[1].xyzz, TEMP[2].xyzz 11: MOV_SAT TEMP[1].xyz, TEMP[1].xxxx 12: MOV TEMP[1].w, TEMP[0].wwww 13: MOV OUT[0], TEMP[1] 14: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[1], const[7].xxxx, -const[4]; 4: MUL temp[2], temp[0], const[4]; 5: MAD temp[1].xyz, const[5], temp[1], temp[2]; 6: MOV temp[1].xyz, temp[1].xyzx; 7: MOV temp[1].w, temp[0].wwww; 8: MAD temp[1].xyz, temp[1], const[7].yyyy, const[7].zzzz; 9: MAD temp[2].xyz, const[6], const[7].yyyy, const[7].zzzz; 10: DP3 temp[1].x, temp[1].xyzz, temp[2].xyzz; 11: MOV_SAT temp[1].xyz, temp[1].xxxx; 12: MOV temp[1].w, temp[0].wwww; 13: MOV output[0], temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[1], const[7].xxxx, -const[4]; 4: MUL temp[2], temp[0], const[4]; 5: MAD temp[1].xyz, const[5], temp[1], temp[2]; 6: MOV temp[1].xyz, temp[1].xyzx; 7: MOV temp[1].w, temp[0].wwww; 8: MAD temp[1].xyz, temp[1], const[7].yyyy, const[7].zzzz; 9: MAD temp[2].xyz, const[6], 
const[7].yyyy, const[7].zzzz; 10: DP3 temp[1].x, temp[1].xyzz, temp[2].xyzz; 11: MOV_SAT temp[1].xyz, temp[1].xxxx; 12: MOV temp[1].w, temp[0].wwww; 13: MOV output[0], temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[1], const[7].xxxx, -const[4]; 4: MUL temp[2], temp[0], const[4]; 5: MAD temp[1].xyz, const[5], temp[1], temp[2]; 6: MOV temp[1].xyz, temp[1].xyzx; 7: MOV temp[1].w, temp[0].wwww; 8: MAD temp[1].xyz, temp[1], const[7].yyyy, const[7].zzzz; 9: MAD temp[2].xyz, const[6], const[7].yyyy, const[7].zzzz; 10: DP3 temp[1].x, temp[1].xyzz, temp[2].xyzz; 11: MOV_SAT temp[1].xyz, temp[1].xxxx; 12: MOV temp[1].w, temp[0].wwww; 13: MOV output[0], temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[1], const[7].xxxx, -const[4]; 4: MUL temp[2], temp[0], const[4]; 5: MAD temp[1].xyz, const[5], temp[1], temp[2]; 6: MOV temp[1].xyz, temp[1].xyzx; 7: MOV temp[1].w, temp[0].wwww; 8: MAD temp[1].xyz, temp[1], const[7].yyyy, const[7].zzzz; 9: MAD temp[2].xyz, const[6], const[7].yyyy, const[7].zzzz; 10: DP3 temp[1].x, temp[1].xyzz, temp[2].xyzz; 11: MOV_SAT temp[1].xyz, temp[1].xxxx; 12: MOV temp[1].w, temp[0].wwww; 13: MOV output[0], temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[1], const[7].xxxx, -const[4]; 4: MUL temp[2], temp[0], const[4]; 5: MAD temp[1].xyz, const[5], temp[1], temp[2]; 6: MOV temp[1].xyz, temp[1].xyzx; 7: MOV temp[1].w, temp[0].wwww; 8: MAD temp[1].xyz, temp[1], const[7].yyyy, const[7].zzzz; 9: MAD temp[2].xyz, const[6], const[7].yyyy, const[7].zzzz; 10: DP3 temp[1].x, temp[1].xyzz, temp[2].xyzz; 11: MOV_SAT temp[1].xyz, temp[1].xxxx; 12: MOV 
temp[1].w, temp[0].wwww; 13: MOV output[0], temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[1], const[7].xxxx, -const[4]; 4: MUL temp[2], temp[0], const[4]; 5: MAD temp[1].xyz, const[5], temp[1], temp[2]; 6: MOV temp[1].xyz, temp[1].xyzx; 7: MOV temp[1].w, temp[0].wwww; 8: MAD temp[1].xyz, temp[1], const[7].yyyy, const[7].zzzz; 9: MAD temp[2].xyz, const[6], const[7].yyyy, const[7].zzzz; 10: DP3 temp[1].x, temp[1].xyzz, temp[2].xyzz; 11: MOV_SAT temp[1].xyz, temp[1].xxxx; 12: MOV temp[1].w, temp[0].wwww; 13: MOV output[0], temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[1], const[7].xxxx, -const[4]; 4: MUL temp[2], temp[0], const[4]; 5: MAD temp[1].xyz, const[5], temp[1], temp[2]; 6: MOV temp[1].xyz, temp[1].xyzx; 7: MOV temp[1].w, temp[0].wwww; 8: MAD temp[1].xyz, temp[1], const[7].yyyy, const[7].zzzz; 9: MAD temp[2].xyz, const[6], const[7].yyyy, const[7].zzzz; 10: DP3 temp[1].x, temp[1].xyzz, temp[2].xyzz; 11: MOV_SAT temp[1].xyz, temp[1].xxxx; 12: MOV temp[1].w, temp[0].wwww; 13: MOV output[0], temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xy__; 1: MOV temp[0].w, input[0].___w; 2: TXP temp[0], temp[0].xy_w, 2D[0]; 3: ADD temp[1].xyz, const[7].xxx_, -const[4].xyz_; 4: MUL temp[2].xyz, temp[0].xyz_, const[4].xyz_; 5: MAD temp[1].xyz, const[5].xyz_, temp[1].xyz_, temp[2].xyz_; 6: MOV temp[1].xyz, temp[1].xyz_; 7: MAD temp[1].xyz, temp[1].xyz_, const[7].yyy_, const[7].zzz_; 8: MAD temp[2].xyz, const[6].xyz_, const[7].yyy_, const[7].zzz_; 9: DP3 temp[1].x, temp[1].xyz_, temp[2].xyz_; 10: MOV_SAT temp[1].xyz, temp[1].xxx_; 11: MOV temp[1].w, temp[0].___w; 12: MOV output[0], temp[1]; Fragment Program: after 'register rename' # Radeon Compiler 
Program 0: MOV temp[3].xy, input[0].xy__; 1: MOV temp[3].w, input[0].___w; 2: TXP temp[4], temp[3].xy_w, 2D[0]; 3: ADD temp[5].xyz, const[7].xxx_, -const[4].xyz_; 4: MUL temp[6].xyz, temp[4].xyz_, const[4].xyz_; 5: MAD temp[7].xyz, const[5].xyz_, temp[5].xyz_, temp[6].xyz_; 6: MOV temp[8].xyz, temp[7].xyz_; 7: MAD temp[9].xyz, temp[8].xyz_, const[7].yyy_, const[7].zzz_; 8: MAD temp[10].xyz, const[6].xyz_, const[7].yyy_, const[7].zzz_; 9: DP3 temp[11].x, temp[9].xyz_, temp[10].xyz_; 10: MOV_SAT temp[12].xyz, temp[11].xxx_; 11: MOV temp[12].w, temp[4].___w; 12: MOV output[0], temp[12]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV temp[3].xy, input[0].xy__; 1: MOV temp[3].w, input[0].___w; 2: TXP temp[4], temp[3].xy_w, 2D[0]; 3: MUL temp[6].xyz, temp[4].xyz_, const[4].xyz_; 4: MAD temp[7].xyz, const[5].xyz_, (1 - const[4]).xyz_, temp[6].xyz_; 5: MAD temp[9].xyz, temp[7].xyz_, const[7].yyy_, none.-1-1-1_; 6: MAD temp[10].xyz, const[6].xyz_, const[7].yyy_, none.-1-1-1_; 7: DP3 temp[11].x, temp[9].xyz_, temp[10].xyz_; 8: MOV_SAT temp[12].xyz, temp[11].xxx_; 9: MOV temp[12].w, temp[4].___w; 10: MOV output[0], temp[12]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV temp[3].xy, input[0].xy__; 1: MOV temp[3].w, input[0].___w; 2: TXP temp[4], temp[3].xy_w, 2D[0]; 3: MUL temp[6].xyz, temp[4].xyz_, const[4].xyz_; 4: MAD temp[7].xyz, const[5].xyz_, (1 - const[4]).xyz_, temp[6].xyz_; 5: MAD temp[9].xyz, temp[7].xyz_, 2.000000 (0x40).www_, none.-1-1-1_; 6: MAD temp[10].xyz, const[6].xyz_, 2.000000 (0x40).www_, none.-1-1-1_; 7: DP3 temp[11].x, temp[9].xyz_, temp[10].xyz_; 8: MOV_SAT temp[12].xyz, temp[11].xxx_; 9: MOV temp[12].w, temp[4].___w; 10: MOV output[0], temp[12]; CONST[7] = { 1.0000 2.0000 -1.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV temp[3].xy, input[0].xy__; 1: MOV temp[3].w, input[0].___w; 2: TXP temp[4], temp[3].xy_w, 2D[0]; 3: MUL temp[6].xyz, temp[4].xyz_, 
const[4].xyz_; 4: MAD temp[7].xyz, const[5].xyz_, (1 - const[4]).xyz_, temp[6].xyz_; 5: MAD temp[9].xyz, temp[7].xyz_, 2.000000 (0x40).www_, none.-1-1-1_; 6: MAD temp[10].xyz, const[6].xyz_, 2.000000 (0x40).www_, none.-1-1-1_; 7: DP3 temp[11].x, temp[9].xyz_, temp[10].xyz_; 8: MOV_SAT temp[12].xyz, temp[11].xxx_; 9: MOV temp[12].w, temp[4].___w; 10: MOV output[0], temp[12]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[3].xy, input[0].xy__; 1: MOV temp[3].w, input[0].___w; 2: TXP temp[4], temp[3].xy_w, 2D[0]; 3: MUL temp[6].xyz, temp[4].xyz_, const[4].xyz_; 4: MAD temp[7].xyz, const[5].xyz_, (1 - const[4]).xyz_, temp[6].xyz_; 5: MAD temp[9].xyz, temp[7].xyz_, 2.000000 (0x40).www_, none.-1-1-1_; 6: MAD temp[10].xyz, const[6].xyz_, 2.000000 (0x40).www_, none.-1-1-1_; 7: DP3 temp[11].x, temp[9].xyz_, temp[10].xyz_; 8: MOV_SAT temp[12].xyz, temp[11].xxx_; 9: MOV temp[12].w, temp[4].___w; 10: MOV output[0], temp[12]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0] MAD temp[3].xy, src0.xy_, src0.111, src0.000 1: src0.w = input[0] MAD temp[3].w, src0.w, src0.1, src0.0 2: TXP temp[4], temp[3].xy_w, 2D[0]; 3: src0.xyz = temp[4], src1.xyz = const[4] MAD temp[6].xyz, src0.xyz, src1.xyz, src0.000 4: src0.xyz = const[4], src1.xyz = const[5], src2.xyz = temp[6], srcp.xyz = (1 - src0) MAD temp[7].xyz, src1.xyz, srcp.xyz, src2.xyz 5: src0.xyz = temp[7] MAD temp[9].xyz, src0.xyz, src0.www, -src0.111 6: src0.xyz = const[6] MAD temp[10].xyz, src0.xyz, src0.www, -src0.111 7: src0.xyz = temp[9], src1.xyz = temp[10] DP3 temp[11].x, src0.xyz, src1.xyz 8: src0.xyz = temp[11] MAD_SAT temp[12].xyz, src0.xxx, src0.111, src0.000 9: src0.w = temp[4] MAD temp[12].w, src0.w, src0.1, src0.0 10: src0.xyz = temp[12], src0.w = temp[12] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = 
input[0] MAD temp[3].xy, src0.xy_, src0.111, src0.000 MAD temp[3].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[4], temp[3].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[4], src1.xyz = const[4] SEM_WAIT MAD temp[6].xyz, src0.xyz, src1.xyz, src0.000 4: src0.xyz = const[4], src1.xyz = const[5], src2.xyz = temp[6], srcp.xyz = (1 - src0) MAD temp[7].xyz, src1.xyz, srcp.xyz, src2.xyz 5: src0.xyz = temp[7] MAD temp[9].xyz, src0.xyz, src0.www, -src0.111 6: src0.xyz = const[6], src0.w = temp[4] MAD temp[10].xyz, src0.xyz, src0.www, -src0.111 MAD temp[12].w, src0.w, src0.1, src0.0 7: src0.xyz = temp[9], src1.xyz = temp[10] DP3 temp[11].x, src0.xyz, src1.xyz 8: src0.xyz = temp[11] MAD_SAT temp[12].xyz, src0.xxx, src0.111, src0.000 9: src0.xyz = temp[12], src0.w = temp[12] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD temp[3].xy, src0.xy_, src0.111, src0.000 MAD temp[3].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[4], temp[3].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[4], src1.xyz = const[4] SEM_WAIT MAD temp[6].xyz, src0.xyz, src1.xyz, src0.000 4: src0.xyz = const[4], src1.xyz = const[5], src2.xyz = temp[6], srcp.xyz = (1 - src0) MAD temp[7].xyz, src1.xyz, srcp.xyz, src2.xyz 5: src0.xyz = temp[7], src0.w = none MAD temp[9].xyz, src0.xyz, src0.www, -src0.111 6: src0.xyz = const[6], src0.w = temp[4] MAD temp[10].xyz, src0.xyz, src0.www, -src0.111 MAD temp[12].w, src0.w, src0.1, src0.0 7: src0.xyz = temp[9], src1.xyz = temp[10] DP3 temp[11].x, src0.xyz, src1.xyz 8: src0.xyz = temp[11] MAD_SAT temp[12].xyz, src0.xxx, src0.111, src0.000 9: src0.xyz = temp[12], src0.w = temp[12] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD temp[0].xy, src0.xy_, 
src0.11_, src0.00_ MAD temp[0].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[0], temp[0].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[0], src1.xyz = const[4] SEM_WAIT MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 4: src0.xyz = const[4], src1.xyz = const[5], src2.xyz = temp[1], srcp.xyz = (1 - src0) MAD temp[1].xyz, src1.xyz, srcp.xyz, src2.xyz 5: src0.xyz = temp[1], src0.w = none MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 6: src0.xyz = const[6], src0.w = temp[0] MAD temp[0].xyz, src0.xyz, src0.www, -src0.111 MAD temp[0].w, src0.w, src0.1, src0.0 7: src0.xyz = temp[1], src1.xyz = temp[0] DP3 temp[0].x, src0.xyz, src1.xyz 8: src0.xyz = temp[0] MAD_SAT temp[0].xyz, src0.xxx, src0.111, src0.000 9: src0.xyz = temp[0], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 1 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0xc0141504:Addr0: 4c, Addr1: 5c, Addr2: 1t, srcp:3 
2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446221:rgb_A_src:1 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222010:MAD dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 5 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020106:Addr0: 6c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20ed8000:MAD dest:0 rgb_C_src:0 1/1/1 1 alp_C_src:0 0 0 6 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000001:DP3 dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00083800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 
0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 9 Instructions ~ 8 Vector Instructions (RGB) ~ 3 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 1 Texture Instructions ~ 1 Presub Operations ~ 0 OMOD Operations ~ 2 Temporary Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[1..4] DCL TEMP[0], LOCAL 0: MOV TEMP[0].xy, IN[0].xyyy 1: MOV TEMP[0].w, IN[0].wwww 2: TXP TEMP[0], TEMP[0], SAMP[0], 2D 3: ADD TEMP[0], TEMP[0], CONST[4] 4: MOV_SAT TEMP[0], TEMP[0] 5: MOV OUT[0], TEMP[0] 6: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV output[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV output[0], temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV output[0], temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV 
output[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV output[0], temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV output[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xyyy; 1: MOV temp[0].w, input[0].wwww; 2: TXP temp[0], temp[0], 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV output[0], temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0].xy, input[0].xy__; 1: MOV temp[0].w, input[0].___w; 2: TXP temp[0], temp[0].xy_w, 2D[0]; 3: ADD temp[0], temp[0], const[4]; 4: MOV_SAT temp[0], temp[0]; 5: MOV output[0], temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: ADD temp[3], temp[2], const[4]; 4: MOV_SAT temp[4], temp[3]; 5: MOV output[0], temp[4]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MOV_SAT output[0], (const[4] + temp[2]); Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MOV_SAT output[0], (const[4] + temp[2]); Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MOV_SAT output[0], (const[4] + temp[2]); Fragment Program: after 'dead constants' 
# Radeon Compiler Program 0: MOV temp[1].xy, input[0].xy__; 1: MOV temp[1].w, input[0].___w; 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: MOV_SAT output[0], (const[4] + temp[2]); Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0] MAD temp[1].xy, src0.xy_, src0.111, src0.000 1: src0.w = input[0] MAD temp[1].w, src0.w, src0.1, src0.0 2: TXP temp[2], temp[1].xy_w, 2D[0]; 3: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[4], src1.w = const[4], srcp.xyz = (src1 + src0), srcp.w = (src1 + src0) MAD_SAT color[0].xyz, srcp.xyz, src0.111, src0.000 MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD temp[1].xy, src0.xy_, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[2], temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[4], src1.w = const[4], srcp.xyz = (src1 + src0), srcp.w = (src1 + src0) SEM_WAIT MAD_SAT color[0].xyz, srcp.xyz, src0.111, src0.000 MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD temp[1].xy, src0.xy_, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[2], temp[1].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[4], src1.w = const[4], srcp.xyz = (src1 + src0), srcp.w = (src1 + src0) SEM_WAIT MAD_SAT color[0].xyz, srcp.xyz, src0.111, src0.000 MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD temp[0].xy, src0.xy_, src0.11_, src0.00_ MAD temp[0].w, src0.w, src0.1, src0.0 1: BEGIN_TEX; 2: TXP temp[0], temp[0].xy_w, 2D[0] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[4], src1.w = const[4], srcp.xyz = (src1 + src0), 
srcp.w = (src1 + src0) SEM_WAIT MAD_SAT color[0].xyz, srcp.xyz, src0.111, src0.000 MAD_SAT color[0].w, srcp.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 1 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x88041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x88041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:2 3 RGB_INST: 0x00db0223:rgb_A_src:3 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0f000:MAD dest:0 alp_A_src:3 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 3 Instructions ~ 2 Vector Instructions (RGB) ~ 2 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 1 Texture Instructions ~ 2 Presub Operations ~ 0 OMOD Operations ~ 1 Temporary Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ AL lib: (WW) FreeContext: (0x91f3548) Deleting 64 Source(s)