Using home directory: /home/fabio/.sauerbraten/
Adding package directory: /usr/share/games/sauerbraten/
init: sdl
Library: SDL 1.2.15
init: net
init: game
init: video: mode
r300: DRM version: 2.40.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2
r300: GART size: 509 MB, VRAM size: 256 MB
r300: AA compression RAM: YES, Z compression RAM: YES, HiZ RAM: YES
init: video: misc
init: gl
Renderer: Gallium 0.4 on ATI RV530 (X.Org R300 Project)
Driver: 2.1 Mesa 10.5.0-devel (git-e660f9b 2015-01-26 utopic-oibaf-ppa+gallium-nine)
Rendering using the OpenGL assembly/GLSL shader path.
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[0]
0: MOV OUT[0], IN[0]
1: MOV OUT[1], IN[1]
2: END
Vertex Program: before compilation
# Radeon Compiler Program
0: MOV temp[0], input[0];
1: MOV output[1], input[1];
2: MOV output[0], temp[0];
3: MOV output[2], temp[0];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
0: MOV temp[0], input[0];
1: MOV output[1], input[1];
2: MOV output[0], temp[0];
3: MOV output[2], temp[0];
Vertex Program: after 'native rewrite'
# Radeon Compiler Program
0: MOV temp[0], input[0];
1: MOV output[1], input[1];
2: MOV output[0], temp[0];
3: MOV output[2], temp[0];
Vertex Program: after 'deadcode'
# Radeon Compiler Program
0: MOV temp[0], input[0];
1: MOV output[1], input[1];
2: MOV output[0], temp[0];
3: MOV output[2], temp[0];
Vertex Program: after 'dataflow optimize'
# Radeon Compiler Program
0: MOV output[1], input[1];
1: MOV output[0], input[0];
2: MOV output[2], input[0];
Vertex Program: after 'source conflict resolve'
# Radeon Compiler Program
0: MOV output[1], input[1];
1: MOV output[0], input[0];
2: MOV output[2], input[0];
Vertex Program: after 'register allocation'
# Radeon Compiler Program
0: MOV output[1], input[1];
1: MOV output[0], input[0];
2: MOV output[2], input[0];
Vertex Program: after 'dead constants'
# Radeon Compiler Program
0: MOV output[1], input[1];
1: MOV output[0],
input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], GENERIC[0], LINEAR DCL OUT[0], COLOR DCL SAMP[0] 0: TEX OUT[0], IN[0], SAMP[0], 2D 1: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX 
temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src0.w = temp[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 
targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 init: console r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[2], IN[2] 5: MOV_SAT OUT[1], IN[1] 6: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], 
temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 
0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 5: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 6: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 8 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, COLOR DCL IN[1], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0] 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MUL OUT[0], IN[0], TEMP[0] 2: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 
'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = 
temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 init: gl: effects r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[2], IN[2] 5: MOV_SAT OUT[1], IN[1] 6: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, 
const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], 
temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 
0/ 0/ 0/ 0 5: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 6: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 8 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, COLOR DCL IN[1], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0] 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MUL OUT[0], IN[0], TEMP[0] 2: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], 
temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 
0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 init: world init: sound init: cfg init: localconnect read map packages/base/force.ogz (0.5 seconds) Force by Ardelico, Hewho and Suicizer. game mode is coop edit Cooperative Editing: Edit maps with multiple players simultaneously. init: mainloop r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV 
output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY 
src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG 0: END Fragment Program: before compilation # Radeon Compiler Program Fragment Program: after 'rewrite depth out' # Radeon Compiler Program Fragment Program: after 'transform KILP' # Radeon Compiler Program Fragment Program: after 'unroll loops' # Radeon Compiler Program Fragment Program: after 'transform TEX' # Radeon Compiler Program Fragment Program: after 'transform IF' # Radeon Compiler Program Fragment Program: after 'native rewrite' # Radeon Compiler Program Fragment Program: after 'deadcode' # Radeon Compiler Program Fragment Program: after 'register rename' # Radeon Compiler Program Fragment Program: after 'dataflow optimize' # Radeon Compiler Program Fragment Program: after 'inline literals' # Radeon Compiler Program Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program Fragment Program: 
after 'dead constants' # Radeon Compiler Program Fragment Program: after 'pair translate' # Radeon Compiler Program Fragment Program: after 'pair scheduling' # Radeon Compiler Program Fragment Program: after 'dead sources' # Radeon Compiler Program Fragment Program: after 'register allocation' # Radeon Compiler Program R500 Fragment Program: -------- 0 0:CMN_INST 0x00000005:OUT TEX_WAIT wmask: NONE omask: NONE 1:RGB_ADDR 0x00000000:Addr0: 0t, Addr1: 0t, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x00000000:Addr0: 0t, Addr1: 0t, Addr2: 0t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], CONSTANT DCL OUT[0], COLOR 0: MOV OUT[0], IN[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow swizzles' # 
Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0] DCL CONST[2..6] DCL TEMP[0..1] IMM[0] FLT32 { 1.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[0].xxxx, CONST[3] 1: MAD TEMP[0], IN[0].yyyy, CONST[4], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[5], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[6], TEMP[0] 4: DP4 TEMP[1].x, IN[0], CONST[0] 5: SUB OUT[1].x, IMM[0].xxxx, TEMP[1].xxxx 6: MOV OUT[1].y, IMM[0].xxxx 7: MOV OUT[1].z, IMM[0].yyyy 8: MOV OUT[1].w, CONST[2].xxxx 9: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, 
const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[2], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: SUB output[1].x, temp[0].1111, temp[1].xxxx; 6: MOV output[1].y, temp[0].1111; 7: MOV output[1].z, temp[0].0000; 8: MOV output[1].w, const[2].xxxx; 9: MOV output[0], temp[2]; 10: MOV output[2], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[2], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: SUB output[1].x, temp[0].1111, temp[1].xxxx; 6: MOV output[1].y, temp[0].1111; 7: MOV output[1].z, temp[0].0000; 8: MOV output[1].w, const[2].xxxx; 9: MOV output[0], temp[2]; 10: MOV output[2], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[2], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, temp[0].1111, -temp[1].xxxx; 6: MOV output[1].y, temp[0].1111; 7: MOV output[1].z, temp[0].0000; 8: MOV output[1].w, const[2].xxxx; 9: MOV output[0], temp[2]; 10: MOV output[2], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[2], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, temp[0].1___, -temp[1].x___; 6: MOV output[1].y, temp[0]._1__; 7: MOV output[1].z, temp[0].__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[2]; 10: MOV output[2], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], 
input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[2], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[2]; 10: MOV output[2], temp[2]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[2], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[2]; 10: MOV output[2], temp[2]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[0], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[0]; 10: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[0], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[0]; 10: MOV output[2], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], 
input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[0], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[0]; 10: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 5: op: 0x00102203 dst: 1o op: VE_ADD src0: 0x01ffa000 reg: 0t swiz: 1/ U/ U/ U src1: 0x1fff0020 reg: 1t swiz: -X/-U/-U/-U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 6: op: 0x00202203 dst: 1o op: VE_ADD src0: 0x01fde000 reg: 0t swiz: U/ 1/ U/ U src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00402203 dst: 1o op: VE_ADD src0: 0x01e7e000 reg: 0t swiz: U/ U/ 0/ U src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00802203 dst: 1o op: VE_ADD src0: 0x003fe042 reg: 2c swiz: U/ U/ U/ X src1: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 9: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 
reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 10: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 11 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR 0: MOV OUT[0], IN[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, 
src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..5] DCL TEMP[0..1] 0: MUL TEMP[0], IN[0].xxxx, CONST[2] 1: MAD TEMP[0], IN[0].yyyy, CONST[3], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[4], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[5], TEMP[0] 4: DP4 TEMP[1], CONST[0], IN[0] 5: SUB OUT[1], CONST[1].yyyy, TEMP[1] 6: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[2], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: SUB output[1], const[1].yyyy, temp[1]; 6: MOV output[0], temp[2]; 7: MOV output[2], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], 
input[0].zzzz, const[4], temp[0]; 3: MAD temp[2], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: SUB output[1], const[1].yyyy, temp[1]; 6: MOV output[0], temp[2]; 7: MOV output[2], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[2], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[2]; 7: MOV output[2], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[2], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[2]; 7: MOV output[2], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[2], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[2]; 7: MOV output[2], temp[2]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[2], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[2]; 7: MOV output[2], temp[2]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD 
temp[0], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[0]; 7: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[0], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[0]; 7: MOV output[2], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[0], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[0]; 7: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y src1: 0x1ed10020 reg: 1t swiz: -X/-Y/-Z/-W src2: 0x01248020 reg: 1t 
swiz: 0/ 0/ 0/ 0 6: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 8 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR 0: MOV OUT[0], IN[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = 
input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL OUT[2], GENERIC[1] DCL OUT[3], GENERIC[2] DCL OUT[4], GENERIC[3] DCL OUT[5], GENERIC[4] DCL CONST[0] DCL TEMP[0..1] IMM[0] FLT32 { 1.0000, 0.0000, -1.0000, 0.0000} 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: MAD TEMP[0], CONST[0], IMM[0].xxyy, IN[1] 3: MAD TEMP[1], CONST[0], IMM[0].zzyy, IN[1] 4: MOV OUT[2], TEMP[0] 5: MOV OUT[3], TEMP[1] 6: ADD TEMP[0].x, TEMP[0], CONST[0].zzzz 7: SUB TEMP[1].x, TEMP[1], CONST[0].zzzz 8: MOV OUT[4], TEMP[0] 9: MOV OUT[5], TEMP[1] 10: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, 
temp[0], const[0].zzzz; 7: SUB temp[1].x, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, temp[0], const[0].zzzz; 7: SUB temp[1].x, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, temp[0], const[0].zzzz; 7: ADD temp[1].x, temp[1], -const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, temp[0].x___, const[0].z___; 7: ADD temp[1].x, temp[1].x___, -const[0].z___; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: 
MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 
reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0125a000 reg: 0t swiz: 1/ 1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 2: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0725a000 reg: 0t swiz: -1/-1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 3: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 5: op: 0x00100003 dst: 0t op: VE_ADD src0: 0x01ff0000 reg: 0t swiz: X/ U/ U/ U src1: 0x01ff4002 reg: 0c swiz: Z/ U/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x01ff0020 reg: 1t swiz: X/ U/ U/ U src1: 0x1fff4002 reg: 0c swiz: -Z/-U/-U/-U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 7: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 11 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL 
IN[1], GENERIC[1], PERSPECTIVE DCL IN[2], GENERIC[2], PERSPECTIVE DCL IN[3], GENERIC[3], PERSPECTIVE DCL IN[4], GENERIC[4], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0..2] 0: TEX TEMP[0], IN[0], SAMP[0], 2D 1: MUL TEMP[0], TEMP[0], CONST[0].xxxx 2: TEX TEMP[1], IN[1], SAMP[0], 2D 3: TEX TEMP[2], IN[2], SAMP[0], 2D 4: ADD TEMP[1], TEMP[1], TEMP[2] 5: MAD TEMP[0], TEMP[1], CONST[0].yyyy, TEMP[0] 6: TEX TEMP[1], IN[3], SAMP[0], 2D 7: TEX TEMP[2], IN[4], SAMP[0], 2D 8: ADD TEMP[1], TEMP[1], TEMP[2] 9: MAD OUT[0], TEMP[1], CONST[0].zzzz, TEMP[0] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: 
MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1].xy__, 2D[0]; 3: TEX temp[2], input[2].xy__, 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3].xy__, 2D[0]; 7: TEX 
temp[2], input[4].xy__, 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; 
Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[4].w, src0.w, src1.x, src0.0 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] MAD temp[7].xyz, src0.xyz, src0.111, src1.xyz MAD temp[7].w, src0.w, src0.1, src1.w 5: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[0], src1.w = temp[4], src2.xyz = temp[4] MAD temp[8].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[8].w, src0.w, src1.y, src1.w 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = temp[10], src1.w = temp[10] MAD temp[11].xyz, src0.xyz, src0.111, src1.xyz MAD temp[11].w, src0.w, src0.1, src1.w 9: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = const[0], src1.w = temp[8], src2.xyz = temp[8] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[3], input[0].xy__, 2D[0]; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: TEX temp[9], input[3].xy__, 2D[0]; 5: TEX temp[10], input[4].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] SEM_WAIT MAD temp[7].xyz, src0.xyz, src0.111, src1.xyz 
MAD temp[7].w, src0.w, src0.1, src1.w 7: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = temp[10], src1.w = temp[10] MAD temp[11].xyz, src0.xyz, src0.111, src1.xyz MAD temp[11].w, src0.w, src0.1, src1.w 8: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[4].w, src0.w, src1.x, src0.0 9: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[0], src1.w = temp[4], src2.xyz = temp[4] MAD temp[8].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[8].w, src0.w, src1.y, src1.w 10: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = const[0], src1.w = temp[8], src2.xyz = temp[8] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[3], input[0].xy__, 2D[0]; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: TEX temp[9], input[3].xy__, 2D[0]; 5: TEX temp[10], input[4].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] SEM_WAIT MAD temp[7].xyz, src0.xyz, src0.111, src1.xyz MAD temp[7].w, src0.w, src0.1, src1.w 7: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = temp[10], src1.w = temp[10] MAD temp[11].xyz, src0.xyz, src0.111, src1.xyz MAD temp[11].w, src0.w, src0.1, src1.w 8: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[4].w, src0.w, src1.x, src0.0 9: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[0], src1.w = temp[4], src2.xyz = temp[4] MAD temp[8].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[8].w, src0.w, src1.y, src1.w 10: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = const[0], src1.w = temp[8], src2.xyz = temp[8] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0]; 2: TEX 
temp[1], input[1].xy__, 2D[0]; 3: TEX temp[2], input[2].xy__, 2D[0]; 4: TEX temp[3], input[3].xy__, 2D[0]; 5: TEX temp[4], input[4].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = temp[2], src1.w = temp[2] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src1.xyz MAD temp[1].w, src0.w, src0.1, src1.w 7: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = temp[4], src1.w = temp[4] MAD temp[2].xyz, src0.xyz, src0.111, src1.xyz MAD temp[2].w, src0.w, src0.1, src1.w 8: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[0] MAD temp[0].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[0].w, src0.w, src1.x, src0.0 9: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = const[0], src1.w = temp[0], src2.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[0].w, src0.w, src1.y, src1.w 10: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[0], src1.w = temp[0], src2.xyz = temp[0] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe402f402: src: 2 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe403f403: src: 3 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f404: src: 4 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 
1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 6 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 7 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x0008c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 R 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 8 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00040001:Addr0: 1t, Addr1: 0c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0028c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 9 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x00040002:Addr0: 2t, Addr1: 0c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x0048c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ 
~ 10 Instructions ~ 5 Vector Instructions (RGB) ~ 5 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 5 Texture Instructions ~ 0 Presub Operations ~ 0 OMOD Operations ~ 5 Temporary Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL OUT[2], GENERIC[1] DCL OUT[3], GENERIC[2] DCL OUT[4], GENERIC[3] DCL OUT[5], GENERIC[4] DCL CONST[0] DCL TEMP[0..1] IMM[0] FLT32 { 1.0000, 0.0000, -1.0000, 0.0000} 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: MAD TEMP[0], CONST[0], IMM[0].xxyy, IN[1] 3: MAD TEMP[1], CONST[0], IMM[0].zzyy, IN[1] 4: MOV OUT[2], TEMP[0] 5: MOV OUT[3], TEMP[1] 6: ADD TEMP[0].y, TEMP[0], CONST[0].zzzz 7: SUB TEMP[1].y, TEMP[1], CONST[0].zzzz 8: MOV OUT[4], TEMP[0] 9: MOV OUT[5], TEMP[1] 10: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0], const[0].zzzz; 7: SUB temp[1].y, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0], const[0].zzzz; 7: SUB temp[1].y, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, 
input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0], const[0].zzzz; 7: ADD temp[1].y, temp[1], -const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0]._y__, const[0]._z__; 7: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: MOV output[0], temp[2]; 11: MOV output[6], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV 
output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: MOV output[0], input[0]; 10: MOV output[6], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0125a000 reg: 0t swiz: 1/ 1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 2: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0725a000 reg: 0t swiz: -1/-1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 3: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 5: op: 0x00200003 dst: 0t op: VE_ADD src0: 0x01f9e000 reg: 0t swiz: U/ Y/ U/ U src1: 0x01fae002 reg: 0c swiz: U/ Z/ U/ U 
src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x01f9e020 reg: 1t swiz: U/ Y/ U/ U src1: 0x1ffae002 reg: 0c swiz: -U/-Z/-U/-U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 7: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 11 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL IN[1], GENERIC[1], PERSPECTIVE DCL IN[2], GENERIC[2], PERSPECTIVE DCL IN[3], GENERIC[3], PERSPECTIVE DCL IN[4], GENERIC[4], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0..2] 0: TEX TEMP[0], IN[0], SAMP[0], 2D 1: MUL TEMP[0], TEMP[0], CONST[0].xxxx 2: TEX TEMP[1], IN[1], SAMP[0], 2D 3: TEX TEMP[2], IN[2], SAMP[0], 2D 4: ADD TEMP[1], TEMP[1], TEMP[2] 5: MAD TEMP[0], TEMP[1], CONST[0].yyyy, TEMP[0] 6: TEX TEMP[1], IN[3], SAMP[0], 2D 7: TEX TEMP[2], IN[4], SAMP[0], 2D 8: ADD TEMP[1], TEMP[1], TEMP[2] 9: MAD OUT[0], TEMP[1], CONST[0].zzzz, TEMP[0] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, 
temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX 
temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1].xy__, 2D[0]; 3: TEX temp[2], input[2].xy__, 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3].xy__, 2D[0]; 7: TEX temp[2], input[4].xy__, 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD output[0], temp[1], const[0].zzzz, temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD 
temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: MUL temp[4], temp[3], const[0].xxxx; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: ADD temp[7], temp[5], temp[6]; 5: MAD temp[8], temp[7], const[0].yyyy, temp[4]; 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: ADD temp[11], temp[9], temp[10]; 9: MAD output[0], temp[11], const[0].zzzz, temp[8]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[3], input[0].xy__, 2D[0]; 1: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[4].w, src0.w, src1.x, src0.0 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = 
temp[6], src1.w = temp[6] MAD temp[7].xyz, src0.xyz, src0.111, src1.xyz MAD temp[7].w, src0.w, src0.1, src1.w 5: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[0], src1.w = temp[4], src2.xyz = temp[4] MAD temp[8].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[8].w, src0.w, src1.y, src1.w 6: TEX temp[9], input[3].xy__, 2D[0]; 7: TEX temp[10], input[4].xy__, 2D[0]; 8: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = temp[10], src1.w = temp[10] MAD temp[11].xyz, src0.xyz, src0.111, src1.xyz MAD temp[11].w, src0.w, src0.1, src1.w 9: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = const[0], src1.w = temp[8], src2.xyz = temp[8] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[3], input[0].xy__, 2D[0]; 2: TEX temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: TEX temp[9], input[3].xy__, 2D[0]; 5: TEX temp[10], input[4].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] SEM_WAIT MAD temp[7].xyz, src0.xyz, src0.111, src1.xyz MAD temp[7].w, src0.w, src0.1, src1.w 7: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = temp[10], src1.w = temp[10] MAD temp[11].xyz, src0.xyz, src0.111, src1.xyz MAD temp[11].w, src0.w, src0.1, src1.w 8: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[4].w, src0.w, src1.x, src0.0 9: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[0], src1.w = temp[4], src2.xyz = temp[4] MAD temp[8].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[8].w, src0.w, src1.y, src1.w 10: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = const[0], src1.w = temp[8], src2.xyz = temp[8] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[3], input[0].xy__, 2D[0]; 2: TEX 
temp[5], input[1].xy__, 2D[0]; 3: TEX temp[6], input[2].xy__, 2D[0]; 4: TEX temp[9], input[3].xy__, 2D[0]; 5: TEX temp[10], input[4].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] SEM_WAIT MAD temp[7].xyz, src0.xyz, src0.111, src1.xyz MAD temp[7].w, src0.w, src0.1, src1.w 7: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = temp[10], src1.w = temp[10] MAD temp[11].xyz, src0.xyz, src0.111, src1.xyz MAD temp[11].w, src0.w, src0.1, src1.w 8: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[4].w, src0.w, src1.x, src0.0 9: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[0], src1.w = temp[4], src2.xyz = temp[4] MAD temp[8].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[8].w, src0.w, src1.y, src1.w 10: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = const[0], src1.w = temp[8], src2.xyz = temp[8] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0]; 2: TEX temp[1], input[1].xy__, 2D[0]; 3: TEX temp[2], input[2].xy__, 2D[0]; 4: TEX temp[3], input[3].xy__, 2D[0]; 5: TEX temp[4], input[4].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = temp[2], src1.w = temp[2] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src1.xyz MAD temp[1].w, src0.w, src0.1, src1.w 7: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = temp[4], src1.w = temp[4] MAD temp[2].xyz, src0.xyz, src0.111, src1.xyz MAD temp[2].w, src0.w, src0.1, src1.w 8: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[0] MAD temp[0].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[0].w, src0.w, src1.x, src0.0 9: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = const[0], src1.w = temp[0], src2.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[0].w, src0.w, src1.y, src1.w 10: src0.xyz = 
temp[2], src0.w = temp[2], src1.xyz = const[0], src1.w = temp[0], src2.xyz = temp[0] MAD color[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD color[0].w, src0.w, src1.z, src1.w R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe402f402: src: 2 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe403f403: src: 3 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f404: src: 4 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 6 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 7 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, 
Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x0008c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 R 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 8 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00040001:Addr0: 1t, Addr1: 0c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0028c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 9 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x00040002:Addr0: 2t, Addr1: 0c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x0048c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 10 Instructions ~ 5 Vector Instructions (RGB) ~ 5 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 5 Texture Instructions ~ 0 Presub Operations ~ 0 OMOD Operations ~ 5 Temporary Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL CONST[0] DCL CONST[2..6] DCL TEMP[0] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[3] 2: MAD TEMP[0], IN[0].yyyy, CONST[4], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[5], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[6], TEMP[0] 5: ADD OUT[2].xy, IN[1], CONST[0] 6: MUL OUT[3].xy, IN[2], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: END Vertex Program: before compilation # 
Radeon Compiler Program 0: MOV output[1].yzw, const[7].xxxy; 1: MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[1], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1], const[0]; 6: MUL output[3].xy, input[2], const[7].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[7].xxxy; 1: MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[1], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1], const[0]; 6: MUL output[3].xy, input[2], const[7].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[7].xxxy; 1: MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[1], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1], const[0]; 6: MUL output[3].xy, input[2], const[7].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[7]._xxy; 1: MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[1], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[7].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], 
input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[1], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[7].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[1], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[7].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[0], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[7].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[0]; 9: MOV output[4], temp[0]; CONST[7] = { 0.0000 1.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[0], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[7].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[0]; 9: MOV output[4], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: 
MUL temp[0], input[0].xxxx, const[3]; 2: MAD temp[0], input[0].yyyy, const[4], temp[0]; 3: MAD temp[0], input[0].zzzz, const[5], temp[0]; 4: MAD temp[0], input[0].wwww, const[6], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[7].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MOV output[0], temp[0]; 9: MOV output[4], temp[0]; Final vertex program code: 0: op: 0x00e06203 dst: 3o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90021 reg: 1i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90041 reg: 2i swiz: X/ Y/ U/ U src1: 0x01fa40e2 reg: 7c swiz: Z/ Z/ U/ U src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 7: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 9: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ 
Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 10 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL CONST[0..2] DCL TEMP[0..3] 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: TEX TEMP[1], IN[2], SAMP[1], 2D 2: MUL TEMP[0], TEMP[0], CONST[0] 3: MUL_SAT TEMP[2], TEMP[0], TEMP[1] 4: MAD_SAT TEMP[3].x, IN[0].xxxx, CONST[1].xxxx, CONST[1].yyyy 5: LRP OUT[0].xyz, TEMP[3].xxxx, TEMP[2], CONST[2] 6: MOV OUT[0].w, TEMP[2] 7: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: MUL temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].xxxx, const[1].xxxx, const[1].yyyy; 5: LRP output[0].xyz, temp[3].xxxx, temp[2], const[2]; 6: MOV output[0].w, temp[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: MUL temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].xxxx, const[1].xxxx, const[1].yyyy; 5: LRP output[0].xyz, temp[3].xxxx, temp[2], const[2]; 6: MOV output[0].w, temp[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: MUL temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].xxxx, const[1].xxxx, const[1].yyyy; 5: LRP output[0].xyz, temp[3].xxxx, temp[2], const[2]; 6: MOV output[0].w, temp[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: MUL 
temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].xxxx, const[1].xxxx, const[1].yyyy; 5: LRP output[0].xyz, temp[3].xxxx, temp[2], const[2]; 6: MOV output[0].w, temp[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: MUL temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].xxxx, const[1].xxxx, const[1].yyyy; 5: LRP output[0].xyz, temp[3].xxxx, temp[2], const[2]; 6: MOV output[0].w, temp[2]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: MUL temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].xxxx, const[1].xxxx, const[1].yyyy; 5: LRP output[0].xyz, temp[3].xxxx, temp[2], const[2]; 6: MOV output[0].w, temp[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: MUL temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].xxxx, const[1].xxxx, const[1].yyyy; 5: ADD temp[4].xyz, temp[2], -const[2]; 6: MAD output[0].xyz, temp[3].xxxx, temp[4], const[2]; 7: MOV output[0].w, temp[2]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: TEX temp[1], input[2].xy__, 2D[1]; 2: MUL temp[0], temp[0], const[0]; 3: MUL_SAT temp[2], temp[0], temp[1]; 4: MAD_SAT temp[3].x, input[0].x___, const[1].x___, const[1].y___; 5: ADD temp[4].xyz, temp[2].xyz_, -const[2].xyz_; 6: MAD output[0].xyz, temp[3].xxx_, temp[4].xyz_, const[2].xyz_; 7: MOV output[0].w, temp[2].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[5], input[1].xy__, 2D[0]; 1: TEX temp[6], input[2].xy__, 2D[1]; 2: MUL temp[7], temp[5], const[0]; 3: MUL_SAT temp[8], temp[7], temp[6]; 4: MAD_SAT temp[9].x, 
input[0].x___, const[1].x___, const[1].y___; 5: ADD temp[10].xyz, temp[8].xyz_, -const[2].xyz_; 6: MAD output[0].xyz, temp[9].xxx_, temp[10].xyz_, const[2].xyz_; 7: MOV output[0].w, temp[8].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[5], input[1].xy__, 2D[0]; 1: TEX temp[6], input[2].xy__, 2D[1]; 2: MUL temp[7], temp[5], const[0]; 3: MUL_SAT temp[8], temp[7], temp[6]; 4: MAD_SAT temp[9].x, input[0].x___, const[1].x___, const[1].y___; 5: MAD output[0].xyz, temp[9].xxx_, (temp[8] - const[2]).xyz_, const[2].xyz_; 6: MOV output[0].w, temp[8].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[5], input[1].xy__, 2D[0]; 1: TEX temp[6], input[2].xy__, 2D[1]; 2: MUL temp[7], temp[5], const[0]; 3: MUL_SAT temp[8], temp[7], temp[6]; 4: MAD_SAT temp[9].x, input[0].x___, const[1].x___, const[1].y___; 5: MAD output[0].xyz, temp[9].xxx_, (temp[8] - const[2]).xyz_, const[2].xyz_; 6: MOV output[0].w, temp[8].___w; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[5], input[1].xy__, 2D[0]; 1: TEX temp[6], input[2].xy__, 2D[1]; 2: MUL temp[7], temp[5], const[0]; 3: MUL_SAT temp[8], temp[7], temp[6]; 4: MAD_SAT temp[9].x, input[0].x___, const[1].x___, const[1].y___; 5: MAD output[0].xyz, temp[9].xxx_, (temp[8] - const[2]).xyz_, const[2].xyz_; 6: MOV output[0].w, temp[8].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[5], input[1].xy__, 2D[0]; 1: TEX temp[6], input[2].xy__, 2D[1]; 2: MUL temp[7], temp[5], const[0]; 3: MUL_SAT temp[8], temp[7], temp[6]; 4: MAD_SAT temp[9].x, input[0].x___, const[1].x___, const[1].y___; 5: MAD output[0].xyz, temp[9].xxx_, (temp[8] - const[2]).xyz_, const[2].xyz_; 6: MOV output[0].w, temp[8].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[5], input[1].xy__, 2D[0]; 1: TEX temp[6], input[2].xy__, 2D[1]; 2: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0], src1.w = 
const[0] MAD temp[7].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[7].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[6], src1.w = temp[6] MAD_SAT temp[8].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[8].w, src0.w, src1.w, src0.0 4: src0.xyz = input[0], src1.xyz = const[1] MAD_SAT temp[9].x, src0.x__, src1.x__, src1.y__ 5: src0.xyz = const[2], src1.xyz = temp[8], src2.xyz = temp[9], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 6: src0.w = temp[8] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[1].xy__, 2D[0]; 2: TEX temp[6], input[2].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[0], src1.xyz = const[1] MAD_SAT temp[9].x, src0.x__, src1.x__, src1.y__ 4: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0], src1.w = const[0] SEM_WAIT MAD temp[7].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[7].w, src0.w, src1.w, src0.0 5: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[6], src1.w = temp[6] MAD_SAT temp[8].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[8].w, src0.w, src1.w, src0.0 6: src0.xyz = const[2], src0.w = temp[8], src1.xyz = temp[8], src2.xyz = temp[9], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[1].xy__, 2D[0]; 2: TEX temp[6], input[2].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[0], src1.xyz = const[1] MAD_SAT temp[9].x, src0.x__, src1.x__, src1.y__ 4: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0], src1.w = const[0] SEM_WAIT MAD temp[7].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[7].w, src0.w, src1.w, src0.0 5: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[6], src1.w = temp[6] MAD_SAT temp[8].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[8].w, src0.w, src1.w, src0.0 6: src0.xyz = const[2], src0.w 
= temp[8], src1.xyz = temp[8], src2.xyz = temp[9], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0]; 2: TEX temp[1], input[1].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[2], src1.xyz = const[1] MAD_SAT temp[2].x, src0.x__, src1.x__, src1.y__ 4: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[0], src1.w = const[0] SEM_WAIT MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[0].w, src0.w, src1.w, src0.0 5: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = temp[1], src1.w = temp[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[0].w, src0.w, src1.w, src0.0 6: src0.xyz = const[2], src0.w = temp[0], src1.xyz = temp[0], src2.xyz = temp[2], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08040402:Addr0: 2t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00485020:MAD dest:2 rgb_C_src:1 G/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 
R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00187a00:ALU NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 5 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40200102:Addr0: 2c, Addr1: 0t, Addr2: 2t, srcp:1 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 6 Instructions ~ 4 Vector Instructions (RGB) ~ 3 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 2 Texture Instructions ~ 1 Presub Operations ~ 0 OMOD Operations ~ 3 Temporary Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..9] DCL TEMP[0] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[6] 2: MAD TEMP[0], IN[0].yyyy, CONST[7], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[8], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[9], TEMP[0] 5: ADD OUT[2].xy, IN[1], CONST[0] 6: MUL OUT[3].xy, IN[2], IMM[0].zzzz 7: DP4 OUT[4].x, CONST[2], IN[0] 8: DP4 OUT[4].y, CONST[3], IN[0] 9: DP4 OUT[4].z, CONST[4], IN[0] 10: DP4 OUT[1].x, -IN[0], CONST[5] 11: END Vertex Program: before 
compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[10].xxxy; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[1], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1], const[0]; 6: MUL output[3].xy, input[2], const[10].zzzz; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[1]; 12: MOV output[5], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[10].xxxy; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[1], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1], const[0]; 6: MUL output[3].xy, input[2], const[10].zzzz; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[1]; 12: MOV output[5], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[10].xxxy; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[1], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1], const[0]; 6: MUL output[3].xy, input[2], const[10].zzzz; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[1]; 12: MOV output[5], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[10]._xxy; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], 
input[0].zzzz, const[8], temp[0]; 4: MAD temp[1], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[10].zz__; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[1]; 12: MOV output[5], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[1], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[10].zz__; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[1]; 12: MOV output[5], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[1], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[10].zz__; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[1]; 12: MOV output[5], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[0], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL 
output[3].xy, input[2].xy__, const[10].zz__; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[0]; 12: MOV output[5], temp[0]; CONST[10] = { 0.0000 1.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[0], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[10].zz__; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[0]; 12: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[6]; 2: MAD temp[0], input[0].yyyy, const[7], temp[0]; 3: MAD temp[0], input[0].zzzz, const[8], temp[0]; 4: MAD temp[0], input[0].wwww, const[9], temp[0]; 5: ADD output[2].xy, input[1].xy__, const[0].xy__; 6: MUL output[3].xy, input[2].xy__, const[10].zz__; 7: DP4 output[4].x, const[2], input[0]; 8: DP4 output[4].y, const[3], input[0]; 9: DP4 output[4].z, const[4], input[0]; 10: DP4 output[1].x, -input[0], const[5]; 11: MOV output[0], temp[0]; 12: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 
0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90021 reg: 1i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90041 reg: 2i swiz: X/ Y/ U/ U src1: 0x01fa4142 reg: 10c swiz: Z/ Z/ U/ U src2: 0x01248142 reg: 10c swiz: 0/ 0/ 0/ 0 7: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 8: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 9: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 11: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 12: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 13 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY 
FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[7] DCL CONST[1..4] DCL TEMP[0..5] IMM[0] FLT32 { 0.0000, 8.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: TEX TEMP[1], IN[2], SAMP[1], 2D 2: TEX TEMP[2], IN[3], SAMP[7], 2D 3: MAD_SAT TEMP[3].xy, -IN[3].zzzz, TEMP[2].yyyy, TEMP[2].xzzz 4: CMP TEMP[2].w, -TEMP[3].xxxx, TEMP[2].wwww, IMM[0].xxxx 5: MAD_SAT TEMP[2].w, -IMM[0].yyyy, TEMP[3].yyyy, TEMP[2].wwww 6: SUB_SAT TEMP[3].xyz, TEMP[1], CONST[1] 7: MAD TEMP[1].xyz, TEMP[2].wwww, -TEMP[3], TEMP[1] 8: MUL TEMP[0], TEMP[0], CONST[2] 9: MUL_SAT TEMP[4], TEMP[0], TEMP[1] 10: MAD_SAT TEMP[5].x, IN[0].xxxx, CONST[3].xxxx, CONST[3].yyyy 11: LRP OUT[0].xyz, TEMP[5].xxxx, TEMP[4], CONST[4] 12: MOV OUT[0].w, TEMP[4] 13: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: TEX temp[2], input[3], 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zzzz, temp[2].yyyy, temp[2].xzzz; 4: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[5].xxxx; 5: MAD_SAT temp[2].w, -const[5].yyyy, temp[3].yyyy, temp[2].wwww; 6: SUB_SAT temp[3].xyz, temp[1], const[1]; 7: MAD temp[1].xyz, temp[2].wwww, -temp[3], temp[1]; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].xxxx, const[3].xxxx, const[3].yyyy; 11: LRP output[0].xyz, temp[5].xxxx, temp[4], const[4]; 12: MOV output[0].w, temp[4]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: TEX temp[2], input[3], 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zzzz, temp[2].yyyy, temp[2].xzzz; 4: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[5].xxxx; 5: MAD_SAT temp[2].w, -const[5].yyyy, temp[3].yyyy, temp[2].wwww; 6: SUB_SAT temp[3].xyz, temp[1], const[1]; 7: MAD 
temp[1].xyz, temp[2].wwww, -temp[3], temp[1]; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].xxxx, const[3].xxxx, const[3].yyyy; 11: LRP output[0].xyz, temp[5].xxxx, temp[4], const[4]; 12: MOV output[0].w, temp[4]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: TEX temp[2], input[3], 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zzzz, temp[2].yyyy, temp[2].xzzz; 4: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[5].xxxx; 5: MAD_SAT temp[2].w, -const[5].yyyy, temp[3].yyyy, temp[2].wwww; 6: SUB_SAT temp[3].xyz, temp[1], const[1]; 7: MAD temp[1].xyz, temp[2].wwww, -temp[3], temp[1]; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].xxxx, const[3].xxxx, const[3].yyyy; 11: LRP output[0].xyz, temp[5].xxxx, temp[4], const[4]; 12: MOV output[0].w, temp[4]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: TEX temp[2], input[3], 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zzzz, temp[2].yyyy, temp[2].xzzz; 4: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[5].xxxx; 5: MAD_SAT temp[2].w, -const[5].yyyy, temp[3].yyyy, temp[2].wwww; 6: SUB_SAT temp[3].xyz, temp[1], const[1]; 7: MAD temp[1].xyz, temp[2].wwww, -temp[3], temp[1]; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].xxxx, const[3].xxxx, const[3].yyyy; 11: LRP output[0].xyz, temp[5].xxxx, temp[4], const[4]; 12: MOV output[0].w, temp[4]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: TEX temp[2], input[3], 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zzzz, temp[2].yyyy, temp[2].xzzz; 4: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[5].xxxx; 5: MAD_SAT temp[2].w, -const[5].yyyy, temp[3].yyyy, temp[2].wwww; 
6: SUB_SAT temp[3].xyz, temp[1], const[1]; 7: MAD temp[1].xyz, temp[2].wwww, -temp[3], temp[1]; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].xxxx, const[3].xxxx, const[3].yyyy; 11: LRP output[0].xyz, temp[5].xxxx, temp[4], const[4]; 12: MOV output[0].w, temp[4]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: TEX temp[2], input[3], 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zzzz, temp[2].yyyy, temp[2].xzzz; 4: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[5].xxxx; 5: MAD_SAT temp[2].w, -const[5].yyyy, temp[3].yyyy, temp[2].wwww; 6: SUB_SAT temp[3].xyz, temp[1], const[1]; 7: MAD temp[1].xyz, temp[2].wwww, -temp[3], temp[1]; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].xxxx, const[3].xxxx, const[3].yyyy; 11: LRP output[0].xyz, temp[5].xxxx, temp[4], const[4]; 12: MOV output[0].w, temp[4]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2D[1]; 2: TEX temp[2], input[3], 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zzzz, temp[2].yyyy, temp[2].xzzz; 4: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[5].xxxx; 5: MAD_SAT temp[2].w, -const[5].yyyy, temp[3].yyyy, temp[2].wwww; 6: ADD_SAT temp[3].xyz, temp[1], -const[1]; 7: MAD temp[1].xyz, temp[2].wwww, -temp[3], temp[1]; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].xxxx, const[3].xxxx, const[3].yyyy; 11: ADD temp[6].xyz, temp[4], -const[4]; 12: MAD output[0].xyz, temp[5].xxxx, temp[6], const[4]; 13: MOV output[0].w, temp[4]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: TEX temp[1], input[2].xy__, 2D[1]; 2: TEX temp[2], input[3].xy__, 2D[7]; 3: MAD_SAT temp[3].xy, -input[3].zz__, temp[2].yy__, temp[2].xz__; 4: CMP temp[2].w, 
-temp[3].___x, temp[2].___w, const[5].___x; 5: MAD_SAT temp[2].w, -const[5].___y, temp[3].___y, temp[2].___w; 6: ADD_SAT temp[3].xyz, temp[1].xyz_, -const[1].xyz_; 7: MAD temp[1].xyz, temp[2].www_, -temp[3].xyz_, temp[1].xyz_; 8: MUL temp[0], temp[0], const[2]; 9: MUL_SAT temp[4], temp[0], temp[1]; 10: MAD_SAT temp[5].x, input[0].x___, const[3].x___, const[3].y___; 11: ADD temp[6].xyz, temp[4].xyz_, -const[4].xyz_; 12: MAD output[0].xyz, temp[5].xxx_, temp[6].xyz_, const[4].xyz_; 13: MOV output[0].w, temp[4].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[7], input[1].xy__, 2D[0]; 1: TEX temp[8], input[2].xy__, 2D[1]; 2: TEX temp[9], input[3].xy__, 2D[7]; 3: MAD_SAT temp[10].xy, -input[3].zz__, temp[9].yy__, temp[9].xz__; 4: CMP temp[11].w, -temp[10].___x, temp[9].___w, const[5].___x; 5: MAD_SAT temp[12].w, -const[5].___y, temp[10].___y, temp[11].___w; 6: ADD_SAT temp[13].xyz, temp[8].xyz_, -const[1].xyz_; 7: MAD temp[8].xyz, temp[12].www_, -temp[13].xyz_, temp[8].xyz_; 8: MUL temp[14], temp[7], const[2]; 9: MUL_SAT temp[15], temp[14], temp[8]; 10: MAD_SAT temp[16].x, input[0].x___, const[3].x___, const[3].y___; 11: ADD temp[17].xyz, temp[15].xyz_, -const[4].xyz_; 12: MAD output[0].xyz, temp[16].xxx_, temp[17].xyz_, const[4].xyz_; 13: MOV output[0].w, temp[15].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[7], input[1].xy__, 2D[0]; 1: TEX temp[8], input[2].xy__, 2D[1]; 2: TEX temp[9], input[3].xy__, 2D[7]; 3: MAD_SAT temp[10].xy, -input[3].zz__, temp[9].yy__, temp[9].xz__; 4: CMP temp[11].w, -temp[10].___x, temp[9].___w, none.___0; 5: MAD_SAT temp[12].w, -const[5].___y, temp[10].___y, temp[11].___w; 6: ADD_SAT temp[13].xyz, temp[8].xyz_, -const[1].xyz_; 7: MAD temp[8].xyz, temp[12].www_, -temp[13].xyz_, temp[8].xyz_; 8: MUL temp[14], temp[7], const[2]; 9: MUL_SAT temp[15], temp[14], temp[8]; 10: MAD_SAT temp[16].x, input[0].x___, const[3].x___, const[3].y___; 11: MAD output[0].xyz, 
temp[16].xxx_, (temp[15] - const[4]).xyz_, const[4].xyz_; 12: MOV output[0].w, temp[15].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[7], input[1].xy__, 2D[0]; 1: TEX temp[8], input[2].xy__, 2D[1]; 2: TEX temp[9], input[3].xy__, 2D[7]; 3: MAD_SAT temp[10].xy, -input[3].zz__, temp[9].yy__, temp[9].xz__; 4: CMP temp[11].w, -temp[10].___x, temp[9].___w, none.___0; 5: MAD_SAT temp[12].w, -8.000000 (0x50).___w, temp[10].___y, temp[11].___w; 6: ADD_SAT temp[13].xyz, temp[8].xyz_, -const[1].xyz_; 7: MAD temp[8].xyz, temp[12].www_, -temp[13].xyz_, temp[8].xyz_; 8: MUL temp[14], temp[7], const[2]; 9: MUL_SAT temp[15], temp[14], temp[8]; 10: MAD_SAT temp[16].x, input[0].x___, const[3].x___, const[3].y___; 11: MAD output[0].xyz, temp[16].xxx_, (temp[15] - const[4]).xyz_, const[4].xyz_; 12: MOV output[0].w, temp[15].___w; CONST[5] = { 0.0000 8.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[7], input[1].xy__, 2D[0]; 1: TEX temp[8], input[2].xy__, 2D[1]; 2: TEX temp[9], input[3].xy__, 2D[7]; 3: MAD_SAT temp[10].xy, -input[3].zz__, temp[9].yy__, temp[9].xz__; 4: CMP temp[11].w, -temp[10].___x, temp[9].___w, none.___0; 5: MAD_SAT temp[12].w, -8.000000 (0x50).___w, temp[10].___y, temp[11].___w; 6: ADD_SAT temp[13].xyz, temp[8].xyz_, -const[1].xyz_; 7: MAD temp[8].xyz, temp[12].www_, -temp[13].xyz_, temp[8].xyz_; 8: MUL temp[14], temp[7], const[2]; 9: MUL_SAT temp[15], temp[14], temp[8]; 10: MAD_SAT temp[16].x, input[0].x___, const[3].x___, const[3].y___; 11: MAD output[0].xyz, temp[16].xxx_, (temp[15] - const[4]).xyz_, const[4].xyz_; 12: MOV output[0].w, temp[15].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[7], input[1].xy__, 2D[0]; 1: TEX temp[8], input[2].xy__, 2D[1]; 2: TEX temp[9], input[3].xy__, 2D[7]; 3: MAD_SAT temp[10].xy, -input[3].zz__, temp[9].yy__, temp[9].xz__; 4: CMP temp[11].w, -temp[10].___x, temp[9].___w, none.___0; 5: MAD_SAT 
temp[12].w, -8.000000 (0x50).___w, temp[10].___y, temp[11].___w; 6: ADD_SAT temp[13].xyz, temp[8].xyz_, -const[1].xyz_; 7: MAD temp[8].xyz, temp[12].www_, -temp[13].xyz_, temp[8].xyz_; 8: MUL temp[14], temp[7], const[2]; 9: MUL_SAT temp[15], temp[14], temp[8]; 10: MAD_SAT temp[16].x, input[0].x___, const[3].x___, const[3].y___; 11: MAD output[0].xyz, temp[16].xxx_, (temp[15] - const[4]).xyz_, const[4].xyz_; 12: MOV output[0].w, temp[15].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[7], input[1].xy__, 2D[0]; 1: TEX temp[8], input[2].xy__, 2D[1]; 2: TEX temp[9], input[3].xy__, 2D[7]; 3: src0.xyz = input[3], src1.xyz = temp[9] MAD_SAT temp[10].xy, -src0.zz_, src1.yy_, src1.xz_ 4: src0.xyz = temp[10], src0.w = temp[9] CMP temp[11].w, src0.0, src0.w, -src0.x 5: src0.xyz = temp[10], src0.w = 8.000000 (0x50), src1.w = temp[11] MAD_SAT temp[12].w, -src0.w, src0.y, src1.w 6: src0.xyz = temp[8], src1.xyz = const[1] MAD_SAT temp[13].xyz, src0.xyz, src0.111, -src1.xyz 7: src0.xyz = temp[13], src0.w = temp[12], src1.xyz = temp[8] MAD temp[8].xyz, src0.www, -src0.xyz, src1.xyz 8: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[2], src1.w = const[2] MAD temp[14].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[14].w, src0.w, src1.w, src0.0 9: src0.xyz = temp[14], src0.w = temp[14], src1.xyz = temp[8], src1.w = temp[8] MAD_SAT temp[15].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[15].w, src0.w, src1.w, src0.0 10: src0.xyz = input[0], src1.xyz = const[3] MAD_SAT temp[16].x, src0.x__, src1.x__, src1.y__ 11: src0.xyz = const[4], src1.xyz = temp[15], src2.xyz = temp[16], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 12: src0.w = temp[15] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[7], input[1].xy__, 2D[0]; 2: TEX temp[8], input[2].xy__, 2D[1]; 3: TEX temp[9], input[3].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[0], 
src1.xyz = const[3] MAD_SAT temp[16].w, src0.x, src1.x, src1.y 5: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[2], src1.w = const[2] SEM_WAIT MAD temp[14].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[14].w, src0.w, src1.w, src0.0 6: src0.xyz = temp[8], src1.xyz = const[1] MAD_SAT temp[13].xyz, src0.xyz, src0.111, -src1.xyz 7: src0.xyz = input[3], src1.xyz = temp[9] MAD_SAT temp[10].xy, -src0.zz_, src1.yy_, src1.xz_ 8: src0.xyz = temp[10], src0.w = temp[9] CMP temp[11].w, src0.0, src0.w, -src0.x 9: src0.xyz = temp[10], src0.w = 8.000000 (0x50), src1.w = temp[11] MAD_SAT temp[12].w, -src0.w, src0.y, src1.w 10: src0.xyz = temp[13], src0.w = temp[12], src1.xyz = temp[8] MAD temp[8].xyz, src0.www, -src0.xyz, src1.xyz 11: src0.xyz = temp[14], src0.w = temp[14], src1.xyz = temp[8], src1.w = temp[8] MAD_SAT temp[15].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[15].w, src0.w, src1.w, src0.0 12: src0.xyz = const[4], src0.w = temp[16], src1.xyz = temp[15], src1.w = temp[15], src2.xyz = temp[16], srcp.xyz = (src1 - src0) MAD color[0].xyz, src0.www, srcp.xyz, src0.xyz MAD color[0].w, src1.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[7], input[1].xy__, 2D[0]; 2: TEX temp[8], input[2].xy__, 2D[1]; 3: TEX temp[9], input[3].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[0], src1.xyz = const[3] MAD_SAT temp[16].w, src0.x, src1.x, src1.y 5: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = const[2], src1.w = const[2] SEM_WAIT MAD temp[14].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[14].w, src0.w, src1.w, src0.0 6: src0.xyz = temp[8], src1.xyz = const[1] MAD_SAT temp[13].xyz, src0.xyz, src0.111, -src1.xyz 7: src0.xyz = input[3], src1.xyz = temp[9] MAD_SAT temp[10].xy, -src0.zz_, src1.yy_, src1.xz_ 8: src0.xyz = temp[10], src0.w = temp[9] CMP temp[11].w, src0.0, src0.w, -src0.x 9: src0.xyz = temp[10], src0.w = 8.000000 (0x50), src1.w = temp[11] MAD_SAT temp[12].w, -src0.w, src0.y, src1.w 10: src0.xyz = 
temp[13], src0.w = temp[12], src1.xyz = temp[8] MAD temp[8].xyz, src0.www, -src0.xyz, src1.xyz 11: src0.xyz = temp[14], src0.w = temp[14], src1.xyz = temp[8], src1.w = temp[8] MAD_SAT temp[15].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[15].w, src0.w, src1.w, src0.0 12: src0.xyz = const[4], src0.w = temp[16], src1.xyz = temp[15], src1.w = temp[15], srcp.xyz = (src1 - src0) MAD color[0].xyz, src0.www, srcp.xyz, src0.xyz MAD color[0].w, src1.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0]; 2: TEX temp[1], input[1].xy__, 2D[1]; 3: TEX temp[4], input[2].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[3], src1.xyz = const[3] MAD_SAT temp[2].w, src0.x, src1.x, src1.y 5: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[2], src1.w = const[2] SEM_WAIT MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[0].w, src0.w, src1.w, src0.0 6: src0.xyz = temp[1], src1.xyz = const[1] MAD_SAT temp[3].xyz, src0.xyz, src0.111, -src1.xyz 7: src0.xyz = input[2], src1.xyz = temp[4] MAD_SAT temp[2].xy, -src0.zz_, src1.yy_, src1.xz_ 8: src0.xyz = temp[2], src0.w = temp[4] CMP temp[3].w, src0.0, src0.w, -src0.x 9: src0.xyz = temp[2], src0.w = 8.000000 (0x50), src1.w = temp[3] MAD_SAT temp[3].w, -src0.w, src0.y, src1.w 10: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = temp[1] MAD temp[2].xyz, src0.www, -src0.xyz, src1.xyz 11: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = temp[2], src1.w = temp[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[0].w, src0.w, src1.w, src0.0 12: src0.xyz = const[4], src0.w = temp[2], src1.xyz = temp[0], src1.w = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src0.www, srcp.xyz, src0.xyz MAD color[0].w, src1.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 
0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02470000: id: 7 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f402: src: 2 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08040c03:Addr0: 3t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00080020:MAD dest:2 alp_A_src:0 R 0 alp_B_src:1 R 0 targ 0 w:0 5 RGBA_INST: 0x0a000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:1 G 0 4 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040800:Addr0: 0t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040800:Addr0: 0t, Addr1: 2c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 5 0:CMN_INST 0x00083800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040401:Addr0: 1t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21030:MAD dest:3 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00081800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08001002:Addr0: 2t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084ac48:rgb_A_src:0 B/B/0 1 rgb_B_src:1 G/G/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00441020:MAD dest:2 rgb_C_src:1 R/B/0 0 alp_C_src:0 R 0 7 0:CMN_INST 
0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00610036:CMP dest:3 alp_A_src:0 0 0 alp_B_src:0 A 0 targ 0 w:0 5 RGBA_INST: 0x40000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 1 8 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000cd0:Addr0: 208t, Addr1: 3t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0022c030:MAD dest:3 alp_A_src:0 A 1 alp_B_src:0 G 0 targ 0 w:0 5 RGBA_INST: 0x1a000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:1 A 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000403:Addr0: 3t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0144036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 1 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00187a00:ALU NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000800:Addr0: 0t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 11 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x48000104:Addr0: 4c, Addr1: 0t, Addr2: 128t, srcp:1 2:ALPHA_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0d000:MAD dest:0 alp_A_src:1 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 
alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 12 Instructions ~ 6 Vector Instructions (RGB) ~ 6 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 1 Presub Operations ~ 0 OMOD Operations ~ 5 Temporary Registers ~ 1 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..7] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'emulate negative addressing' # 
Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, 
const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD 
temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, 
-input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: 
X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 
0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 18 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL CONST[0] DCL CONST[3..6] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 32.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4], IN[1], SAMP[0], 2D 8: MUL TEMP[4].xyz, TEMP[4], CONST[0] 9: TEX TEMP[5].xyz, IN[1], SAMP[3], 2D 10: MAD TEMP[5].xyz, TEMP[5], IMM[0].xxxx, -IMM[0].yyyy 11: ADD TEMP[6].xyz, TEMP[3], TEMP[2] 12: DP3 TEMP[6].w, TEMP[6], TEMP[6] 13: RSQ TEMP[6].w, |TEMP[6].wwww| 14: MUL TEMP[6].xyz, TEMP[6].wwww, TEMP[6] 15: DP3_SAT TEMP[6].w, TEMP[6], TEMP[5] 16: POW TEMP[6].w, TEMP[6].wwww, IMM[0].wwww 17: MUL TEMP[6].w, TEMP[6], TEMP[4] 18: MAD TEMP[4].xyz, TEMP[6].wwww, CONST[3], TEMP[4] 19: DP3_SAT TEMP[2].w, TEMP[5], TEMP[2] 20: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 21: MAX TEMP[0].xyz, TEMP[0], CONST[4] 22: MUL_SAT TEMP[1].xyz, TEMP[4], TEMP[0] 23: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[5].xxxx, CONST[5].yyyy 24: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[6] 25: MOV OUT[0].w, TEMP[1] 26: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD 
temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[7].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[7].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT 
temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[7].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL 
temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[7].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[7].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ 
temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[7].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: LG2 temp[8].w, temp[6].wwww; 17: MUL temp[8].w, temp[8].wwww, const[7].wwww; 18: EX2 temp[6].w, temp[8].wwww; 19: MUL temp[6].w, temp[6], temp[4]; 20: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 21: DP3_SAT temp[2].w, temp[5], temp[2]; 22: MUL temp[0].xyz, temp[0], temp[2].wwww; 23: MAX temp[0].xyz, temp[0], const[4]; 24: MUL_SAT temp[1].xyz, temp[4], temp[0]; 25: MAD_SAT 
temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 26: ADD temp[9].xyz, temp[1], -const[6]; 27: MAD output[0].xyz, temp[7].xxxx, temp[9], const[6]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[7].xxx_, -const[7].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4], input[1].xy__, 2D[0]; 8: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 9: TEX temp[5].xyz, input[1].xy__, 2D[3]; 10: MAD temp[5].xyz, temp[5].xyz_, const[7].xxx_, -const[7].yyy_; 11: ADD temp[6].xyz, temp[3].xyz_, temp[2].xyz_; 12: DP3 temp[6].w, temp[6].xyz_, temp[6].xyz_; 13: RSQ temp[6].w, |temp[6].___w|; 14: MUL temp[6].xyz, temp[6].www_, temp[6].xyz_; 15: DP3_SAT temp[6].w, temp[6].xyz_, temp[5].xyz_; 16: LG2 temp[8].w, temp[6].___w; 17: MUL temp[8].w, temp[8].___w, const[7].___w; 18: EX2 temp[6].w, temp[8].___w; 19: MUL temp[6].w, temp[6].___w, temp[4].___w; 20: MAD temp[4].xyz, temp[6].www_, const[3].xyz_, temp[4].xyz_; 21: DP3_SAT temp[2].w, temp[5].xyz_, temp[2].xyz_; 22: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 23: MAX temp[0].xyz, temp[0].xyz_, const[4].xyz_; 24: MUL_SAT temp[1].xyz, temp[4].xyz_, temp[0].xyz_; 25: MAD_SAT temp[7].x, input[0].x___, const[5].x___, const[5].y___; 26: ADD temp[9].xyz, temp[1].xyz_, -const[6].xyz_; 27: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[6].xyz_; 28: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[7].xxx_, -const[7].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 
6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, const[7].xxx_, -const[7].yyy_; 11: ADD temp[21].xyz, temp[16].xyz_, temp[13].xyz_; 12: DP3 temp[22].w, temp[21].xyz_, temp[21].xyz_; 13: RSQ temp[23].w, |temp[22].___w|; 14: MUL temp[24].xyz, temp[23].www_, temp[21].xyz_; 15: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 16: LG2 temp[26].w, temp[25].___w; 17: MUL temp[27].w, temp[26].___w, const[7].___w; 18: EX2 temp[28].w, temp[27].___w; 19: MUL temp[29].w, temp[28].___w, temp[17].___w; 20: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 21: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 22: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 23: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 24: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 25: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 26: ADD temp[36].xyz, temp[34].xyz_, -const[6].xyz_; 27: MAD output[0].xyz, temp[35].xxx_, temp[36].xyz_, const[6].xyz_; 28: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[7].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, const[7].xxx_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, 
temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, const[7].___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, 
const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; CONST[7] = { 2.0000 1.0000 0.0000 32.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, 
input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17], input[1].xy__, 2D[0]; 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 11: 
src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 12: src0.w = temp[22] RSQ temp[23].w, |src0.w| 13: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 14: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 15: src0.w = temp[25] LG2 temp[26].w, src0.w 16: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 17: src0.w = temp[27] EX2 temp[28].w, src0.w 18: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 19: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 20: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 21: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 22: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD_SAT temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = input[0], src1.xyz = const[5] MAD_SAT temp[35].x, src0.x__, src1.x__, src1.y__ 25: src0.xyz = const[6], src1.xyz = temp[34], src2.xyz = temp[35], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 26: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 6: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 7: src0.xyz = temp[17], src1.xyz = const[0] MAD 
temp[18].xyz, src0.xyz, src1.xyz, src0.000 8: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 9: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 10: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 12: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[5], src1.w = const[0] MAD_SAT temp[35].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 13: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 14: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 15: src0.w = temp[22] RSQ temp[23].w, |src0.w| 16: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 17: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 18: src0.w = temp[25] LG2 temp[26].w, src0.w 19: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 20: src0.w = temp[27] EX2 temp[28].w, src0.w 21: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 22: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD_SAT temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = const[6], src0.w = temp[11], src1.xyz = temp[34], src2.xyz = temp[35], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 
2D[3] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 6: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 7: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 8: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 9: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 10: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 12: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[5], src1.w = const[0] MAD_SAT temp[35].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 13: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 14: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 15: src0.w = temp[22] RSQ temp[23].w, |src0.w| 16: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 17: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 18: src0.w = temp[25] LG2 temp[26].w, src0.w 19: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 20: src0.w = temp[27] EX2 temp[28].w, src0.w 21: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 22: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD_SAT temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = const[6], src0.w = temp[11], src1.xyz = temp[34], src2.xyz = temp[35], srcp.xyz = (src1 - src0) MAD color[0].xyz, 
src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xy__, 2D[1]; 2: TEX temp[1].xyz, input[1].xy__, 2D[2]; 3: TEX temp[5], input[0].xy__, 2D[0]; 4: TEX temp[0].xyz, input[0].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 6: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[0].w, |src1.w| 7: src0.xyz = temp[5], src1.xyz = const[0] MAD temp[6].xyz, src0.xyz, src1.xyz, src0.000 8: src0.xyz = temp[0], src0.w = 2.000000 (0x40) MAD temp[0].xyz, src0.xyz, src0.www, -src0.111 9: src0.xyz = temp[0], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 10: src0.xyz = temp[4], src0.w = temp[1] MAD temp[7].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[7], src1.xyz = const[4] MAX temp[7].xyz, src0.xyz, src1.xyz 12: src0.xyz = input[3], src0.w = temp[4], src1.xyz = const[5], src1.w = const[0] MAD_SAT temp[3].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[1].w, src0.w, src1.w, src0.0 13: src0.xyz = input[2], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 14: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[0].w, src0._, src0._ 15: src0.w = temp[0] RSQ temp[0].w, |src0.w| 16: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[1].xyz, src0.www, srcp.xyz, src0.000 17: src0.xyz = temp[1], src1.xyz = temp[0] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[0].w, src0._, src0._ 18: src0.w = temp[0] LG2 temp[0].w, src0.w 19: src0.w = temp[0], src1.w = 32.000000 (0x60) MAD temp[0].w, src0.w, src1.w, src0.0 20: src0.w = temp[0] EX2 temp[0].w, src0.w 21: src0.w = temp[0], src1.w = temp[5] MAD temp[0].w, src0.w, src1.w, src0.0 22: src0.xyz = const[3], src0.w = temp[0], src1.xyz = temp[6] MAD 
temp[0].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[0], src1.xyz = temp[7] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = const[6], src0.w = temp[1], src1.xyz = temp[0], src2.xyz = temp[3], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe404f401: src: 1 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080000c0:Addr0: 192t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040005:Addr0: 5t, Addr1: 0c, Addr2: 128t, 
srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8000:MAD dest:0 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 8 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001f1:DP3 dest:31 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08041007:Addr0: 7t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000075:MAX dest:7 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00184800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x08041403:Addr0: 
3t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040004:Addr0: 4t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20485030:MAD dest:3 rgb_C_src:1 G/0/0 0 alp_C_src:0 0 0 12 0:CMN_INST 0x00003a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000161:DP3 dest:22 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 
1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000191:DP3 dest:25 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c009:LN2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08038000:Addr0: 0t, Addr1: 224t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 19 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001400:Addr0: 0t, Addr1: 5t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 21 0:CMN_INST 
0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08001903:Addr0: 3c, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221000:MAD dest:0 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08001c00:Addr0: 0t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40300106:Addr0: 6c, Addr1: 0t, Addr2: 3t, srcp:1 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 24 Instructions ~ 15 Vector Instructions (RGB) ~ 12 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 4 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 8 Temporary Registers ~ 3 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL OUT[5], GENERIC[3] DCL CONST[0] DCL CONST[2..10] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[7] 2: MAD TEMP[0], IN[0].yyyy, CONST[8], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[9], TEMP[0] 4: MAD OUT[0], IN[0].wwww, 
CONST[10], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: DP4 OUT[5].x, CONST[4], IN[0] 16: DP4 OUT[5].y, CONST[5], IN[0] 17: DP4 OUT[5].z, CONST[6], IN[0] 18: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB 
temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[11]._xxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD 
temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, 
-temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; CONST[11] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, 
-none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD 
src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4162 reg: 11c swiz: Z/ Z/ U/ U src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 7: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6162 reg: 11c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 
14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 17: op: 0x00208201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 18: op: 0x00408201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 19: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 20: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 21 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL IN[4], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[7] DCL CONST[0] DCL CONST[3..7] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 8.0000} IMM[1] FLT32 { 32.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, 
IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4], IN[1], SAMP[0], 2D 8: MUL TEMP[4].xyz, TEMP[4], CONST[0] 9: TEX TEMP[5].xyz, IN[1], SAMP[3], 2D 10: MAD TEMP[5].xyz, TEMP[5], IMM[0].xxxx, -IMM[0].yyyy 11: ADD TEMP[6].xyz, TEMP[3], TEMP[2] 12: DP3 TEMP[6].w, TEMP[6], TEMP[6] 13: RSQ TEMP[6].w, |TEMP[6].wwww| 14: MUL TEMP[6].xyz, TEMP[6].wwww, TEMP[6] 15: DP3_SAT TEMP[6].w, TEMP[6], TEMP[5] 16: POW TEMP[6].w, TEMP[6].wwww, IMM[1].xxxx 17: MUL TEMP[6].w, TEMP[6], TEMP[4] 18: MAD TEMP[4].xyz, TEMP[6].wwww, CONST[3], TEMP[4] 19: DP3_SAT TEMP[2].w, TEMP[5], TEMP[2] 20: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 21: MAX TEMP[0].xyz, TEMP[0], CONST[4] 22: TEX TEMP[2], IN[4], SAMP[7], 2D 23: MAD_SAT TEMP[3].xy, -IN[4].zzzz, TEMP[2].yyyy, TEMP[2].xzzz 24: CMP TEMP[2].w, -TEMP[3].xxxx, TEMP[2].wwww, IMM[0].zzzz 25: MAD_SAT TEMP[2].w, -IMM[0].wwww, TEMP[3].yyyy, TEMP[2].wwww 26: SUB_SAT TEMP[3].xyz, TEMP[0], CONST[5] 27: MAD TEMP[0].xyz, TEMP[2].wwww, -TEMP[3], TEMP[0] 28: MUL_SAT TEMP[1].xyz, TEMP[4], TEMP[0] 29: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[6].xxxx, CONST[6].yyyy 30: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[7] 31: MOV OUT[0].w, TEMP[1] 32: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[9].xxxx; 17: MUL temp[6].w, 
temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[8].zzzz; 25: MAD_SAT temp[2].w, -const[8].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL_SAT temp[1].xyz, temp[4], temp[0]; 29: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 30: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[9].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[8].zzzz; 25: MAD_SAT temp[2].w, -const[8].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD 
temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL_SAT temp[1].xyz, temp[4], temp[0]; 29: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 30: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[9].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[8].zzzz; 25: MAD_SAT temp[2].w, -const[8].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL_SAT temp[1].xyz, temp[4], temp[0]; 29: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 30: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, 
input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[9].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[8].zzzz; 25: MAD_SAT temp[2].w, -const[8].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL_SAT temp[1].xyz, temp[4], temp[0]; 29: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 30: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, 
temp[6].wwww, const[9].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[8].zzzz; 25: MAD_SAT temp[2].w, -const[8].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL_SAT temp[1].xyz, temp[4], temp[0]; 29: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 30: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[9].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[8].zzzz; 25: MAD_SAT temp[2].w, -const[8].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT 
temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL_SAT temp[1].xyz, temp[4], temp[0]; 29: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 30: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: LG2 temp[8].w, temp[6].wwww; 17: MUL temp[8].w, temp[8].wwww, const[9].xxxx; 18: EX2 temp[6].w, temp[8].wwww; 19: MUL temp[6].w, temp[6], temp[4]; 20: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 21: DP3_SAT temp[2].w, temp[5], temp[2]; 22: MUL temp[0].xyz, temp[0], temp[2].wwww; 23: MAX temp[0].xyz, temp[0], const[4]; 24: TEX temp[2], input[4], 2D[7]; 25: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 26: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[8].zzzz; 27: MAD_SAT temp[2].w, -const[8].wwww, temp[3].yyyy, temp[2].wwww; 28: ADD_SAT temp[3].xyz, temp[0], -const[5]; 29: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 30: MUL_SAT temp[1].xyz, temp[4], temp[0]; 31: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 32: ADD temp[9].xyz, temp[1], -const[7]; 33: MAD output[0].xyz, temp[7].xxxx, temp[9], const[7]; 34: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: 
MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4], input[1].xy__, 2D[0]; 8: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 9: TEX temp[5].xyz, input[1].xy__, 2D[3]; 10: MAD temp[5].xyz, temp[5].xyz_, const[8].xxx_, -const[8].yyy_; 11: ADD temp[6].xyz, temp[3].xyz_, temp[2].xyz_; 12: DP3 temp[6].w, temp[6].xyz_, temp[6].xyz_; 13: RSQ temp[6].w, |temp[6].___w|; 14: MUL temp[6].xyz, temp[6].www_, temp[6].xyz_; 15: DP3_SAT temp[6].w, temp[6].xyz_, temp[5].xyz_; 16: LG2 temp[8].w, temp[6].___w; 17: MUL temp[8].w, temp[8].___w, const[9].___x; 18: EX2 temp[6].w, temp[8].___w; 19: MUL temp[6].w, temp[6].___w, temp[4].___w; 20: MAD temp[4].xyz, temp[6].www_, const[3].xyz_, temp[4].xyz_; 21: DP3_SAT temp[2].w, temp[5].xyz_, temp[2].xyz_; 22: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 23: MAX temp[0].xyz, temp[0].xyz_, const[4].xyz_; 24: TEX temp[2], input[4].xy__, 2D[7]; 25: MAD_SAT temp[3].xy, -input[4].zz__, temp[2].yy__, temp[2].xz__; 26: CMP temp[2].w, -temp[3].___x, temp[2].___w, const[8].___z; 27: MAD_SAT temp[2].w, -const[8].___w, temp[3].___y, temp[2].___w; 28: ADD_SAT temp[3].xyz, temp[0].xyz_, -const[5].xyz_; 29: MAD temp[0].xyz, temp[2].www_, -temp[3].xyz_, temp[0].xyz_; 30: MUL_SAT temp[1].xyz, temp[4].xyz_, temp[0].xyz_; 31: MAD_SAT temp[7].x, input[0].x___, const[6].x___, const[6].y___; 32: ADD temp[9].xyz, temp[1].xyz_, -const[7].xyz_; 33: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[7].xyz_; 34: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 
temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, const[8].xxx_, -const[8].yyy_; 11: ADD temp[21].xyz, temp[16].xyz_, temp[13].xyz_; 12: DP3 temp[22].w, temp[21].xyz_, temp[21].xyz_; 13: RSQ temp[23].w, |temp[22].___w|; 14: MUL temp[24].xyz, temp[23].www_, temp[21].xyz_; 15: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 16: LG2 temp[26].w, temp[25].___w; 17: MUL temp[27].w, temp[26].___w, const[9].___x; 18: EX2 temp[28].w, temp[27].___w; 19: MUL temp[29].w, temp[28].___w, temp[17].___w; 20: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 21: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 22: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 23: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 24: TEX temp[34], input[4].xy__, 2D[7]; 25: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 26: CMP temp[36].w, -temp[35].___x, temp[34].___w, const[8].___z; 27: MAD_SAT temp[37].w, -const[8].___w, temp[35].___y, temp[36].___w; 28: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 29: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 30: MUL_SAT temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 31: MAD_SAT temp[41].x, input[0].x___, const[6].x___, const[6].y___; 32: ADD temp[42].xyz, temp[40].xyz_, -const[7].xyz_; 33: MAD output[0].xyz, temp[41].xxx_, temp[42].xyz_, const[7].xyz_; 34: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, 
temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, const[8].xxx_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, const[9].___x; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -const[8].___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL_SAT temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: MAD_SAT temp[41].x, input[0].x___, const[6].x___, const[6].y___; 31: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[7]).xyz_, const[7].xyz_; 32: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 
2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -8.000000 (0x50).___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL_SAT temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: MAD_SAT temp[41].x, input[0].x___, const[6].x___, const[6].y___; 31: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[7]).xyz_, const[7].xyz_; 32: MOV output[0].w, temp[11].___w; CONST[8] = { 2.0000 1.0000 0.0000 8.0000 } CONST[9] = { 32.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 
2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -8.000000 (0x50).___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL_SAT temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: MAD_SAT temp[41].x, input[0].x___, const[6].x___, const[6].y___; 31: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[7]).xyz_, const[7].xyz_; 32: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, 
|temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -8.000000 (0x50).___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL_SAT temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: MAD_SAT temp[41].x, input[0].x___, const[6].x___, const[6].y___; 31: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[7]).xyz_, const[7].xyz_; 32: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17], input[1].xy__, 2D[0]; 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 11: src0.xyz = temp[16], 
src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 12: src0.w = temp[22] RSQ temp[23].w, |src0.w| 13: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 14: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 15: src0.w = temp[25] LG2 temp[26].w, src0.w 16: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 17: src0.w = temp[27] EX2 temp[28].w, src0.w 18: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 19: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 20: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 21: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 22: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 23: TEX temp[34], input[4].xy__, 2D[7]; 24: src0.xyz = input[4], src1.xyz = temp[34] MAD_SAT temp[35].xy, -src0.zz_, src1.yy_, src1.xz_ 25: src0.xyz = temp[35], src0.w = temp[34] CMP temp[36].w, src0.0, src0.w, -src0.x 26: src0.xyz = temp[35], src0.w = 8.000000 (0x50), src1.w = temp[36] MAD_SAT temp[37].w, -src0.w, src0.y, src1.w 27: src0.xyz = temp[33], src1.xyz = const[5] MAD_SAT temp[38].xyz, src0.xyz, src0.111, -src1.xyz 28: src0.xyz = temp[38], src0.w = temp[37], src1.xyz = temp[33] MAD temp[39].xyz, src0.www, -src0.xyz, src1.xyz 29: src0.xyz = temp[30], src1.xyz = temp[39] MAD_SAT temp[40].xyz, src0.xyz, src1.xyz, src0.000 30: src0.xyz = input[0], src1.xyz = const[6] MAD_SAT temp[41].x, src0.x__, src1.x__, src1.y__ 31: src0.xyz = const[7], src1.xyz = temp[40], src2.xyz = temp[41], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 32: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 
'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 2D[3]; 5: TEX temp[34], input[4].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 10: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 11: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = input[4], src1.xyz = temp[34] MAD_SAT temp[35].xy, -src0.zz_, src1.yy_, src1.xz_ 13: src0.xyz = input[3], src0.w = temp[15], src1.xyz = temp[35], src1.w = temp[34] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 CMP temp[36].w, src0.0, src1.w, -src1.x 14: src0.xyz = temp[32], src0.w = 8.000000 (0x50), src1.xyz = const[4], src1.w = temp[36], src2.xyz = temp[35] MAX temp[33].xyz, src0.xyz, src1.xyz MAD_SAT temp[37].w, -src0.w, src2.y, src1.w 15: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 16: src0.xyz = temp[33], src0.w = temp[22], src1.xyz = const[5] MAD_SAT temp[38].xyz, src0.xyz, src0.111, -src1.xyz RSQ temp[23].w, |src0.w| 17: src0.xyz = temp[38], src0.w = temp[37], src1.xyz = temp[33] MAD temp[39].xyz, src0.www, -src0.xyz, src1.xyz 18: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 19: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 20: src0.xyz = 
input[0], src0.w = temp[10], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[41].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 21: src0.w = temp[25] LG2 temp[26].w, src0.w 22: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 23: src0.w = temp[27] EX2 temp[28].w, src0.w 24: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 25: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 26: src0.xyz = temp[30], src1.xyz = temp[39] MAD_SAT temp[40].xyz, src0.xyz, src1.xyz, src0.000 27: src0.xyz = const[7], src0.w = temp[11], src1.xyz = temp[40], src2.xyz = temp[41], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 2D[3]; 5: TEX temp[34], input[4].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 10: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 11: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = input[4], src1.xyz = temp[34] MAD_SAT temp[35].xy, -src0.zz_, src1.yy_, src1.xz_ 13: src0.xyz = input[3], src0.w = temp[15], src1.xyz = temp[35], src1.w = temp[34] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 CMP temp[36].w, src0.0, src1.w, 
-src1.x 14: src0.xyz = temp[32], src0.w = 8.000000 (0x50), src1.xyz = const[4], src1.w = temp[36], src2.xyz = temp[35] MAX temp[33].xyz, src0.xyz, src1.xyz MAD_SAT temp[37].w, -src0.w, src2.y, src1.w 15: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 16: src0.xyz = temp[33], src0.w = temp[22], src1.xyz = const[5] MAD_SAT temp[38].xyz, src0.xyz, src0.111, -src1.xyz RSQ temp[23].w, |src0.w| 17: src0.xyz = temp[38], src0.w = temp[37], src1.xyz = temp[33] MAD temp[39].xyz, src0.www, -src0.xyz, src1.xyz 18: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 19: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 20: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[41].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 21: src0.w = temp[25] LG2 temp[26].w, src0.w 22: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 23: src0.w = temp[27] EX2 temp[28].w, src0.w 24: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 25: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 26: src0.xyz = temp[30], src1.xyz = temp[39] MAD_SAT temp[40].xyz, src0.xyz, src1.xyz, src0.000 27: src0.xyz = const[7], src0.w = temp[11], src1.xyz = temp[40], src2.xyz = temp[41], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[1].xy__, 2D[1]; 2: TEX temp[1].xyz, input[1].xy__, 2D[2]; 3: TEX temp[6], input[0].xy__, 2D[0]; 4: TEX temp[0].xyz, input[0].xy__, 2D[3]; 5: TEX temp[7], input[3].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[2] DP3, 
src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 7: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[0].w, |src1.w| 8: src0.xyz = temp[6], src1.xyz = const[0] MAD temp[8].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[0], src0.w = 2.000000 (0x40) MAD temp[0].xyz, src0.xyz, src0.www, -src0.111 10: src0.xyz = temp[0], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 11: src0.xyz = temp[5], src0.w = temp[1] MAD temp[9].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = input[3], src1.xyz = temp[7] MAD_SAT temp[3].xy, -src0.zz_, src1.yy_, src1.xz_ 13: src0.xyz = input[2], src0.w = temp[0], src1.xyz = temp[3], src1.w = temp[7] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 CMP temp[0].w, src0.0, src1.w, -src1.x 14: src0.xyz = temp[9], src0.w = 8.000000 (0x50), src1.xyz = const[4], src1.w = temp[0], src2.xyz = temp[3] MAX temp[3].xyz, src0.xyz, src1.xyz MAD_SAT temp[0].w, -src0.w, src2.y, src1.w 15: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[1].w, src0._, src0._ 16: src0.xyz = temp[3], src0.w = temp[1], src1.xyz = const[5] MAD_SAT temp[7].xyz, src0.xyz, src0.111, -src1.xyz RSQ temp[1].w, |src0.w| 17: src0.xyz = temp[7], src0.w = temp[0], src1.xyz = temp[3] MAD temp[3].xyz, src0.www, -src0.xyz, src1.xyz 18: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[1].xyz, src0.www, srcp.xyz, src0.000 19: src0.xyz = temp[1], src1.xyz = temp[0] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[0].w, src0._, src0._ 20: src0.xyz = input[4], src0.w = temp[5], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[0].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[1].w, src0.w, src1.w, src0.0 21: src0.w = temp[0] LG2 temp[0].w, src0.w 22: src0.w = temp[0], src1.w = 32.000000 (0x60) MAD temp[0].w, src0.w, src1.w, src0.0 23: src0.w = temp[0] EX2 temp[0].w, src0.w 24: src0.w = temp[0], src1.w = 
temp[6] MAD temp[0].w, src0.w, src1.w, src0.0 25: src0.xyz = const[3], src0.w = temp[0], src1.xyz = temp[8] MAD temp[1].xyz, src0.www, src0.xyz, src1.xyz 26: src0.xyz = temp[1], src1.xyz = temp[3] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src0.000 27: src0.xyz = const[7], src0.w = temp[1], src1.xyz = temp[1], src2.xyz = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe405f401: src: 1 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe406f400: src: 0 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02470000: id: 7 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe407f403: src: 3 R/G/A/A dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080000c0:Addr0: 192t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 
0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040006:Addr0: 6t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490080:MAD dest:8 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8000:MAD dest:0 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 9 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001f1:DP3 dest:31 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00081800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08001c03:Addr0: 3t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x0084ac48:rgb_A_src:0 B/B/0 1 rgb_B_src:1 G/G/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00441030:MAD dest:3 rgb_C_src:1 R/B/0 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000c02:Addr0: 2t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001c00:Addr0: 0t, Addr1: 7t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00690006:CMP dest:0 alp_A_src:0 0 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x42490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:1 R 1 13 0:CMN_INST 0x00107800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00341009:Addr0: 9t, Addr1: 4c, Addr2: 3t, srcp:0 2:ALPHA_ADDR 0x080000d0:Addr0: 208t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0032c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:2 G 0 targ 0 w:0 5 RGBA_INST: 0x1a000035:MAX dest:3 rgb_C_src:0 R/R/R 0 alp_C_src:1 A 0 14 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000161:DP3 dest:22 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00087800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041403:Addr0: 3t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x0004c01b:RSQ dest:1 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21070:MAD dest:7 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 16 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000c07:Addr0: 7t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, 
srcp:0 3 RGB_INST: 0x0144036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 1 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221030:MAD dest:3 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000191:DP3 dest:25 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00184800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x08041804:Addr0: 4t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040005:Addr0: 5t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20485000:MAD dest:0 rgb_C_src:1 G/0/0 0 alp_C_src:0 0 0 20 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c009:LN2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08038000:Addr0: 0t, 
Addr1: 224t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 22 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001800:Addr0: 0t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 24 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08002103:Addr0: 3c, Addr1: 8t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 25 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08000c01:Addr0: 1t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 26 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40000507:Addr0: 7c, Addr1: 1t, Addr2: 0t, 
srcp:1 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 27 Instructions ~ 18 Vector Instructions (RGB) ~ 14 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 5 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 10 Temporary Registers ~ 4 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..7] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], 
temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL 
temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], 
temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], 
const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i 
swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: 
VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 18 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[4] DCL CONST[0] DCL CONST[3..7] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 32.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4], IN[1], SAMP[0], 2D 8: MUL TEMP[4].xyz, TEMP[4], CONST[0] 9: TEX TEMP[5].xyz, IN[1], SAMP[3], 2D 10: MAD TEMP[5].xyz, TEMP[5], IMM[0].xxxx, -IMM[0].yyyy 11: ADD TEMP[6].xyz, TEMP[3], TEMP[2] 12: DP3 TEMP[6].w, TEMP[6], TEMP[6] 13: RSQ TEMP[6].w, |TEMP[6].wwww| 14: MUL TEMP[6].xyz, TEMP[6].wwww, TEMP[6] 15: DP3_SAT TEMP[6].w, TEMP[6], TEMP[5] 16: POW TEMP[6].w, TEMP[6].wwww, IMM[0].wwww 17: MUL TEMP[6].w, TEMP[6], TEMP[4] 18: MAD TEMP[4].xyz, TEMP[6].wwww, CONST[3], TEMP[4] 19: DP3_SAT TEMP[2].w, TEMP[5], TEMP[2] 20: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 21: MAX TEMP[0].xyz, TEMP[0], CONST[4] 22: MUL TEMP[4].xyz, TEMP[4], TEMP[0] 23: TEX TEMP[0].xyz, IN[1], SAMP[4], 2D 24: MAD_SAT 
TEMP[1].xyz, TEMP[0], CONST[5], TEMP[4] 25: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[6].xxxx, CONST[6].yyyy 26: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[7] 27: MOV OUT[0].w, TEMP[1] 28: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[8].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MAD_SAT temp[1].xyz, temp[0], const[5], temp[4]; 25: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 26: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 27: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, 
temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[8].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MAD_SAT temp[1].xyz, temp[0], const[5], temp[4]; 25: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 26: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 27: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[8].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MAD_SAT temp[1].xyz, temp[0], const[5], temp[4]; 25: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 26: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 27: MOV output[0].w, temp[1]; Fragment 
Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[8].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MAD_SAT temp[1].xyz, temp[0], const[5], temp[4]; 25: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 26: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 27: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, 
temp[6].wwww, const[8].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MAD_SAT temp[1].xyz, temp[0], const[5], temp[4]; 25: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 26: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 27: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[8].wwww; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MAD_SAT temp[1].xyz, temp[0], const[5], temp[4]; 25: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 26: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 27: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], 
const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: LG2 temp[8].w, temp[6].wwww; 17: MUL temp[8].w, temp[8].wwww, const[8].wwww; 18: EX2 temp[6].w, temp[8].wwww; 19: MUL temp[6].w, temp[6], temp[4]; 20: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 21: DP3_SAT temp[2].w, temp[5], temp[2]; 22: MUL temp[0].xyz, temp[0], temp[2].wwww; 23: MAX temp[0].xyz, temp[0], const[4]; 24: MUL temp[4].xyz, temp[4], temp[0]; 25: TEX temp[0].xyz, input[1], 2D[4]; 26: MAD_SAT temp[1].xyz, temp[0], const[5], temp[4]; 27: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 28: ADD temp[9].xyz, temp[1], -const[7]; 29: MAD output[0].xyz, temp[7].xxxx, temp[9], const[7]; 30: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4], input[1].xy__, 2D[0]; 8: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 9: TEX temp[5].xyz, input[1].xy__, 2D[3]; 10: MAD temp[5].xyz, temp[5].xyz_, const[8].xxx_, -const[8].yyy_; 11: ADD temp[6].xyz, temp[3].xyz_, temp[2].xyz_; 12: DP3 temp[6].w, temp[6].xyz_, temp[6].xyz_; 13: RSQ temp[6].w, |temp[6].___w|; 14: MUL temp[6].xyz, temp[6].www_, temp[6].xyz_; 15: DP3_SAT temp[6].w, temp[6].xyz_, temp[5].xyz_; 16: LG2 
temp[8].w, temp[6].___w; 17: MUL temp[8].w, temp[8].___w, const[8].___w; 18: EX2 temp[6].w, temp[8].___w; 19: MUL temp[6].w, temp[6].___w, temp[4].___w; 20: MAD temp[4].xyz, temp[6].www_, const[3].xyz_, temp[4].xyz_; 21: DP3_SAT temp[2].w, temp[5].xyz_, temp[2].xyz_; 22: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 23: MAX temp[0].xyz, temp[0].xyz_, const[4].xyz_; 24: MUL temp[4].xyz, temp[4].xyz_, temp[0].xyz_; 25: TEX temp[0].xyz, input[1].xy__, 2D[4]; 26: MAD_SAT temp[1].xyz, temp[0].xyz_, const[5].xyz_, temp[4].xyz_; 27: MAD_SAT temp[7].x, input[0].x___, const[6].x___, const[6].y___; 28: ADD temp[9].xyz, temp[1].xyz_, -const[7].xyz_; 29: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[7].xyz_; 30: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, const[8].xxx_, -const[8].yyy_; 11: ADD temp[21].xyz, temp[16].xyz_, temp[13].xyz_; 12: DP3 temp[22].w, temp[21].xyz_, temp[21].xyz_; 13: RSQ temp[23].w, |temp[22].___w|; 14: MUL temp[24].xyz, temp[23].www_, temp[21].xyz_; 15: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 16: LG2 temp[26].w, temp[25].___w; 17: MUL temp[27].w, temp[26].___w, const[8].___w; 18: EX2 temp[28].w, temp[27].___w; 19: MUL temp[29].w, temp[28].___w, temp[17].___w; 20: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 21: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 22: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 23: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 24: 
MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 25: TEX temp[35].xyz, input[1].xy__, 2D[4]; 26: MAD_SAT temp[36].xyz, temp[35].xyz_, const[5].xyz_, temp[34].xyz_; 27: MAD_SAT temp[37].x, input[0].x___, const[6].x___, const[6].y___; 28: ADD temp[38].xyz, temp[36].xyz_, -const[7].xyz_; 29: MAD output[0].xyz, temp[37].xxx_, temp[38].xyz_, const[7].xyz_; 30: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, const[8].xxx_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, const[8].___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MAD_SAT temp[36].xyz, temp[35].xyz_, const[5].xyz_, temp[34].xyz_; 26: MAD_SAT temp[37].x, input[0].x___, const[6].x___, const[6].y___; 27: MAD output[0].xyz, temp[37].xxx_, (temp[36] - const[7]).xyz_, const[7].xyz_; 28: MOV output[0].w, temp[11].___w; Fragment Program: after 
'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MAD_SAT temp[36].xyz, temp[35].xyz_, const[5].xyz_, temp[34].xyz_; 26: MAD_SAT temp[37].x, input[0].x___, const[6].x___, const[6].y___; 27: MAD output[0].xyz, temp[37].xxx_, (temp[36] - const[7]).xyz_, const[7].xyz_; 28: MOV output[0].w, temp[11].___w; CONST[8] = { 2.0000 1.0000 0.0000 32.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL 
temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MAD_SAT temp[36].xyz, temp[35].xyz_, const[5].xyz_, temp[34].xyz_; 26: MAD_SAT temp[37].x, input[0].x___, const[6].x___, const[6].y___; 27: MAD output[0].xyz, temp[37].xxx_, (temp[36] - const[7]).xyz_, const[7].xyz_; 28: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, 
(temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MAD_SAT temp[36].xyz, temp[35].xyz_, const[5].xyz_, temp[34].xyz_; 26: MAD_SAT temp[37].x, input[0].x___, const[6].x___, const[6].y___; 27: MAD output[0].xyz, temp[37].xxx_, (temp[36] - const[7]).xyz_, const[7].xyz_; 28: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17], input[1].xy__, 2D[0]; 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 11: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 12: src0.w = temp[22] RSQ temp[23].w, |src0.w| 13: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 14: src0.xyz = temp[24], src1.xyz = 
temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 15: src0.w = temp[25] LG2 temp[26].w, src0.w 16: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 17: src0.w = temp[27] EX2 temp[28].w, src0.w 18: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 19: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 20: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 21: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 22: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: src0.xyz = temp[35], src1.xyz = const[5], src2.xyz = temp[34] MAD_SAT temp[36].xyz, src0.xyz, src1.xyz, src2.xyz 26: src0.xyz = input[0], src1.xyz = const[6] MAD_SAT temp[37].x, src0.x__, src1.x__, src1.y__ 27: src0.xyz = const[7], src1.xyz = temp[36], src2.xyz = temp[37], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 28: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 2D[3]; 5: TEX temp[35].xyz, input[1].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 10: 
src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 11: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 13: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[37].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 14: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 15: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 16: src0.w = temp[22] RSQ temp[23].w, |src0.w| 17: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 18: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 19: src0.w = temp[25] LG2 temp[26].w, src0.w 20: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 21: src0.w = temp[27] EX2 temp[28].w, src0.w 22: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 23: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[30], src1.xyz = temp[33] MAD temp[34].xyz, src0.xyz, src1.xyz, src0.000 25: src0.xyz = temp[35], src1.xyz = const[5], src2.xyz = temp[34] MAD_SAT temp[36].xyz, src0.xyz, src1.xyz, src2.xyz 26: src0.xyz = const[7], src0.w = temp[11], src1.xyz = temp[36], src2.xyz = temp[37], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 2D[3]; 5: TEX temp[35].xyz, 
input[1].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 10: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 11: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 13: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[37].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 14: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 15: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 16: src0.w = temp[22] RSQ temp[23].w, |src0.w| 17: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 18: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 19: src0.w = temp[25] LG2 temp[26].w, src0.w 20: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 21: src0.w = temp[27] EX2 temp[28].w, src0.w 22: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 23: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[30], src1.xyz = temp[33] MAD temp[34].xyz, src0.xyz, src1.xyz, src0.000 25: src0.xyz = temp[35], src1.xyz = const[5], src2.xyz = temp[34] MAD_SAT temp[36].xyz, src0.xyz, src1.xyz, src2.xyz 
26: src0.xyz = const[7], src0.w = temp[11], src1.xyz = temp[36], src2.xyz = temp[37], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xy__, 2D[1]; 2: TEX temp[1].xyz, input[1].xy__, 2D[2]; 3: TEX temp[5], input[0].xy__, 2D[0]; 4: TEX temp[6].xyz, input[0].xy__, 2D[3]; 5: TEX temp[0].xyz, input[0].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 7: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[0].w, |src1.w| 8: src0.xyz = temp[5], src1.xyz = const[0] MAD temp[7].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[6], src0.w = 2.000000 (0x40) MAD temp[6].xyz, src0.xyz, src0.www, -src0.111 10: src0.xyz = temp[6], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 11: src0.xyz = temp[4], src0.w = temp[1] MAD temp[8].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[8], src1.xyz = const[4] MAX temp[8].xyz, src0.xyz, src1.xyz 13: src0.xyz = input[3], src0.w = temp[4], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[3].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[1].w, src0.w, src1.w, src0.0 14: src0.xyz = input[2], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 15: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[0].w, src0._, src0._ 16: src0.w = temp[0] RSQ temp[0].w, |src0.w| 17: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[1].xyz, src0.www, srcp.xyz, src0.000 18: src0.xyz = temp[1], src1.xyz = temp[6] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[0].w, src0._, src0._ 19: src0.w = temp[0] LG2 temp[0].w, src0.w 20: src0.w = temp[0], src1.w = 32.000000 (0x60) MAD temp[0].w, src0.w, src1.w, src0.0 21: src0.w = 
temp[0] EX2 temp[0].w, src0.w 22: src0.w = temp[0], src1.w = temp[5] MAD temp[0].w, src0.w, src1.w, src0.0 23: src0.xyz = const[3], src0.w = temp[0], src1.xyz = temp[7] MAD temp[1].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[1], src1.xyz = temp[8] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 25: src0.xyz = temp[0], src1.xyz = const[5], src2.xyz = temp[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 26: src0.xyz = const[7], src0.w = temp[1], src1.xyz = temp[0], src2.xyz = temp[3], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe404f401: src: 1 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe406f400: src: 0 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02440000: id: 4 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 
1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080000c0:Addr0: 192t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040005:Addr0: 5t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8060:MAD dest:6 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 9 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000406:Addr0: 6t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001f1:DP3 dest:31 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490080:MAD dest:8 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 
0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08041008:Addr0: 8t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000085:MAX dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00184800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x08041803:Addr0: 3t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040004:Addr0: 4t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20485030:MAD dest:3 rgb_C_src:1 G/0/0 0 alp_C_src:0 0 0 13 0:CMN_INST 0x00003a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000161:DP3 dest:22 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 
alp_C_src:0 R 0 16 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08001801:Addr0: 1t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000191:DP3 dest:25 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c009:LN2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08038000:Addr0: 0t, Addr1: 224t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 20 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD 
dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001400:Addr0: 0t, Addr1: 5t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 22 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08001d03:Addr0: 3c, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08002001:Addr0: 1t, Addr1: 8t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00141400:Addr0: 0t, Addr1: 5c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 25 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40300107:Addr0: 7c, Addr1: 0t, Addr2: 3t, srcp:1 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 
0 w:0
5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0
~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~
~ 26 Instructions
~ 16 Vector Instructions (RGB)
~ 12 Scalar Instructions (Alpha)
~ 0 Flow Control Instructions
~ 5 Texture Instructions
~ 3 Presub Operations
~ 0 OMOD Operations
~ 9 Temporary Registers
~ 3 Inline Literals
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL IN[2]
DCL IN[3]
DCL IN[4]
DCL OUT[0], POSITION
DCL OUT[1], FOG
DCL OUT[2], GENERIC[0]
DCL OUT[3], GENERIC[1]
DCL OUT[4], GENERIC[2]
DCL OUT[5], GENERIC[3]
DCL CONST[0]
DCL CONST[2..10]
DCL TEMP[0..3]
IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000}
0: MOV OUT[1].yzw, IMM[0].xxxy
1: MUL TEMP[0], IN[0].xxxx, CONST[7]
2: MAD TEMP[0], IN[0].yyyy, CONST[8], TEMP[0]
3: MAD TEMP[0], IN[0].zzzz, CONST[9], TEMP[0]
4: MAD OUT[0], IN[0].wwww, CONST[10], TEMP[0]
5: ADD OUT[2].xy, IN[3], CONST[0]
6: MUL OUT[3].xy, IN[4], IMM[0].zzzz
7: DP4 OUT[1].x, -IN[0], CONST[2]
8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy
9: XPD TEMP[2].xyz, IN[1], TEMP[1]
10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww
11: SUB TEMP[3].xyz, CONST[3], IN[0]
12: DP3 OUT[4].x, TEMP[3], TEMP[1]
13: DP3 OUT[4].y, TEMP[3], TEMP[2]
14: DP3 OUT[4].z, TEMP[3], IN[1]
15: DP4 OUT[5].x, CONST[4], IN[0]
16: DP4 OUT[5].y, CONST[5], IN[0]
17: DP4 OUT[5].z, CONST[6], IN[0]
18: END
Vertex Program: before compilation
# Radeon Compiler Program
0: MOV output[1].yzw, const[11].xxxy;
1: MUL temp[0], input[0].xxxx, const[7];
2: MAD temp[0], input[0].yyyy, const[8], temp[0];
3: MAD temp[0], input[0].zzzz, const[9], temp[0];
4: MAD temp[4], input[0].wwww, const[10], temp[0];
5: ADD output[2].xy, input[3], const[0];
6: MUL output[3].xy, input[4], const[11].zzzz;
7: DP4 output[1].x, -input[0], const[2];
8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy;
9: XPD temp[2].xyz, input[1], temp[1];
10: MUL temp[2].xyz, temp[2], temp[1].wwww;
11: SUB temp[3].xyz, const[3], input[0];
12: DP3 output[4].x, temp[3],
temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, 
input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[11]._xxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 
output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, 
const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; CONST[11] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD 
temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4162 reg: 11c swiz: Z/ Z/ U/ U src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 7: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6162 reg: 11c swiz: W/ W/ W/ W 
src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 17: op: 0x00208201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 18: op: 0x00408201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 19: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 20: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 
src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 21 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL IN[4], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[4] DCL SAMP[7] DCL CONST[0] DCL CONST[3..8] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 8.0000} IMM[1] FLT32 { 32.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4], IN[1], SAMP[0], 2D 8: MUL TEMP[4].xyz, TEMP[4], CONST[0] 9: TEX TEMP[5].xyz, IN[1], SAMP[3], 2D 10: MAD TEMP[5].xyz, TEMP[5], IMM[0].xxxx, -IMM[0].yyyy 11: ADD TEMP[6].xyz, TEMP[3], TEMP[2] 12: DP3 TEMP[6].w, TEMP[6], TEMP[6] 13: RSQ TEMP[6].w, |TEMP[6].wwww| 14: MUL TEMP[6].xyz, TEMP[6].wwww, TEMP[6] 15: DP3_SAT TEMP[6].w, TEMP[6], TEMP[5] 16: POW TEMP[6].w, TEMP[6].wwww, IMM[1].xxxx 17: MUL TEMP[6].w, TEMP[6], TEMP[4] 18: MAD TEMP[4].xyz, TEMP[6].wwww, CONST[3], TEMP[4] 19: DP3_SAT TEMP[2].w, TEMP[5], TEMP[2] 20: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 21: MAX TEMP[0].xyz, TEMP[0], CONST[4] 22: TEX TEMP[2], IN[4], SAMP[7], 2D 23: MAD_SAT TEMP[3].xy, -IN[4].zzzz, TEMP[2].yyyy, TEMP[2].xzzz 24: CMP TEMP[2].w, -TEMP[3].xxxx, TEMP[2].wwww, IMM[0].zzzz 25: MAD_SAT TEMP[2].w, -IMM[0].wwww, TEMP[3].yyyy, TEMP[2].wwww 26: SUB_SAT TEMP[3].xyz, TEMP[0], CONST[5] 27: MAD TEMP[0].xyz, TEMP[2].wwww, -TEMP[3], TEMP[0] 28: MUL TEMP[4].xyz, TEMP[4], TEMP[0] 29: TEX TEMP[0].xyz, IN[1], 
SAMP[4], 2D 30: MAD_SAT TEMP[1].xyz, TEMP[0], CONST[6], TEMP[4] 31: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[7].xxxx, CONST[7].yyyy 32: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[8] 33: MOV OUT[0].w, TEMP[1] 34: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[9].xxxx, -const[9].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[10].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 25: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL temp[4].xyz, temp[4], temp[0]; 29: TEX temp[0].xyz, input[1], 2D[4]; 30: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 31: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 32: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 33: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD 
temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[9].xxxx, -const[9].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[10].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 25: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL temp[4].xyz, temp[4], temp[0]; 29: TEX temp[0].xyz, input[1], 2D[4]; 30: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 31: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 32: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 33: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[9].xxxx, -const[9].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, 
temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[10].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 25: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL temp[4].xyz, temp[4], temp[0]; 29: TEX temp[0].xyz, input[1], 2D[4]; 30: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 31: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 32: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 33: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[9].xxxx, -const[9].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[10].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: 
TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 25: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL temp[4].xyz, temp[4], temp[0]; 29: TEX temp[0].xyz, input[1], 2D[4]; 30: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 31: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 32: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 33: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[9].xxxx, -const[9].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[10].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 25: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL temp[4].xyz, temp[4], temp[0]; 29: TEX temp[0].xyz, input[1], 2D[4]; 
30: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 31: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 32: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 33: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[9].xxxx, -const[9].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: POW temp[6].w, temp[6].wwww, const[10].xxxx; 17: MUL temp[6].w, temp[6], temp[4]; 18: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: TEX temp[2], input[4], 2D[7]; 23: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 24: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 25: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 26: SUB_SAT temp[3].xyz, temp[0], const[5]; 27: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 28: MUL temp[4].xyz, temp[4], temp[0]; 29: TEX temp[0].xyz, input[1], 2D[4]; 30: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 31: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 32: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 33: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, 
temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4], input[1], 2D[0]; 8: MUL temp[4].xyz, temp[4], const[0]; 9: TEX temp[5].xyz, input[1], 2D[3]; 10: MAD temp[5].xyz, temp[5], const[9].xxxx, -const[9].yyyy; 11: ADD temp[6].xyz, temp[3], temp[2]; 12: DP3 temp[6].w, temp[6], temp[6]; 13: RSQ temp[6].w, |temp[6].wwww|; 14: MUL temp[6].xyz, temp[6].wwww, temp[6]; 15: DP3_SAT temp[6].w, temp[6], temp[5]; 16: LG2 temp[8].w, temp[6].wwww; 17: MUL temp[8].w, temp[8].wwww, const[10].xxxx; 18: EX2 temp[6].w, temp[8].wwww; 19: MUL temp[6].w, temp[6], temp[4]; 20: MAD temp[4].xyz, temp[6].wwww, const[3], temp[4]; 21: DP3_SAT temp[2].w, temp[5], temp[2]; 22: MUL temp[0].xyz, temp[0], temp[2].wwww; 23: MAX temp[0].xyz, temp[0], const[4]; 24: TEX temp[2], input[4], 2D[7]; 25: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 26: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 27: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 28: ADD_SAT temp[3].xyz, temp[0], -const[5]; 29: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 30: MUL temp[4].xyz, temp[4], temp[0]; 31: TEX temp[0].xyz, input[1], 2D[4]; 32: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 33: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 34: ADD temp[9].xyz, temp[1], -const[8]; 35: MAD output[0].xyz, temp[7].xxxx, temp[9], const[8]; 36: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4], input[1].xy__, 2D[0]; 8: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 9: TEX 
temp[5].xyz, input[1].xy__, 2D[3]; 10: MAD temp[5].xyz, temp[5].xyz_, const[9].xxx_, -const[9].yyy_; 11: ADD temp[6].xyz, temp[3].xyz_, temp[2].xyz_; 12: DP3 temp[6].w, temp[6].xyz_, temp[6].xyz_; 13: RSQ temp[6].w, |temp[6].___w|; 14: MUL temp[6].xyz, temp[6].www_, temp[6].xyz_; 15: DP3_SAT temp[6].w, temp[6].xyz_, temp[5].xyz_; 16: LG2 temp[8].w, temp[6].___w; 17: MUL temp[8].w, temp[8].___w, const[10].___x; 18: EX2 temp[6].w, temp[8].___w; 19: MUL temp[6].w, temp[6].___w, temp[4].___w; 20: MAD temp[4].xyz, temp[6].www_, const[3].xyz_, temp[4].xyz_; 21: DP3_SAT temp[2].w, temp[5].xyz_, temp[2].xyz_; 22: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 23: MAX temp[0].xyz, temp[0].xyz_, const[4].xyz_; 24: TEX temp[2], input[4].xy__, 2D[7]; 25: MAD_SAT temp[3].xy, -input[4].zz__, temp[2].yy__, temp[2].xz__; 26: CMP temp[2].w, -temp[3].___x, temp[2].___w, const[9].___z; 27: MAD_SAT temp[2].w, -const[9].___w, temp[3].___y, temp[2].___w; 28: ADD_SAT temp[3].xyz, temp[0].xyz_, -const[5].xyz_; 29: MAD temp[0].xyz, temp[2].www_, -temp[3].xyz_, temp[0].xyz_; 30: MUL temp[4].xyz, temp[4].xyz_, temp[0].xyz_; 31: TEX temp[0].xyz, input[1].xy__, 2D[4]; 32: MAD_SAT temp[1].xyz, temp[0].xyz_, const[6].xyz_, temp[4].xyz_; 33: MAD_SAT temp[7].x, input[0].x___, const[7].x___, const[7].y___; 34: ADD temp[9].xyz, temp[1].xyz_, -const[8].xyz_; 35: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[8].xyz_; 36: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD 
temp[20].xyz, temp[19].xyz_, const[9].xxx_, -const[9].yyy_; 11: ADD temp[21].xyz, temp[16].xyz_, temp[13].xyz_; 12: DP3 temp[22].w, temp[21].xyz_, temp[21].xyz_; 13: RSQ temp[23].w, |temp[22].___w|; 14: MUL temp[24].xyz, temp[23].www_, temp[21].xyz_; 15: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 16: LG2 temp[26].w, temp[25].___w; 17: MUL temp[27].w, temp[26].___w, const[10].___x; 18: EX2 temp[28].w, temp[27].___w; 19: MUL temp[29].w, temp[28].___w, temp[17].___w; 20: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 21: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 22: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 23: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 24: TEX temp[34], input[4].xy__, 2D[7]; 25: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 26: CMP temp[36].w, -temp[35].___x, temp[34].___w, const[9].___z; 27: MAD_SAT temp[37].w, -const[9].___w, temp[35].___y, temp[36].___w; 28: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 29: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 30: MUL temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 31: TEX temp[41].xyz, input[1].xy__, 2D[4]; 32: MAD_SAT temp[42].xyz, temp[41].xyz_, const[6].xyz_, temp[40].xyz_; 33: MAD_SAT temp[43].x, input[0].x___, const[7].x___, const[7].y___; 34: ADD temp[44].xyz, temp[42].xyz_, -const[8].xyz_; 35: MAD output[0].xyz, temp[43].xxx_, temp[44].xyz_, const[8].xyz_; 36: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[9].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 
2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, const[9].xxx_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, const[10].___x; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -const[9].___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: TEX temp[41].xyz, input[1].xy__, 2D[4]; 31: MAD_SAT temp[42].xyz, temp[41].xyz_, const[6].xyz_, temp[40].xyz_; 32: MAD_SAT temp[43].x, input[0].x___, const[7].x___, const[7].y___; 33: MAD output[0].xyz, temp[43].xxx_, (temp[42] - const[8]).xyz_, const[8].xyz_; 34: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 
2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -8.000000 (0x50).___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: TEX temp[41].xyz, input[1].xy__, 2D[4]; 31: MAD_SAT temp[42].xyz, temp[41].xyz_, const[6].xyz_, temp[40].xyz_; 32: MAD_SAT temp[43].x, input[0].x___, const[7].x___, const[7].y___; 33: MAD output[0].xyz, temp[43].xxx_, (temp[42] - const[8]).xyz_, const[8].xyz_; 34: MOV output[0].w, temp[11].___w; CONST[9] = { 2.0000 1.0000 0.0000 8.0000 } CONST[10] = { 32.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX 
temp[19].xyz, input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -8.000000 (0x50).___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: TEX temp[41].xyz, input[1].xy__, 2D[4]; 31: MAD_SAT temp[42].xyz, temp[41].xyz_, const[6].xyz_, temp[40].xyz_; 32: MAD_SAT temp[43].x, input[0].x___, const[7].x___, const[7].y___; 33: MAD output[0].xyz, temp[43].xxx_, (temp[42] - const[8]).xyz_, const[8].xyz_; 34: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17], input[1].xy__, 2D[0]; 8: MUL temp[18].xyz, temp[17].xyz_, const[0].xyz_; 9: TEX temp[19].xyz, 
input[1].xy__, 2D[3]; 10: MAD temp[20].xyz, temp[19].xyz_, 2.000000 (0x40).www_, -none.111_; 11: DP3 temp[22].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 12: RSQ temp[23].w, |temp[22].___w|; 13: MUL temp[24].xyz, temp[23].www_, (temp[13] + temp[16]).xyz_; 14: DP3_SAT temp[25].w, temp[24].xyz_, temp[20].xyz_; 15: LG2 temp[26].w, temp[25].___w; 16: MUL temp[27].w, temp[26].___w, 32.000000 (0x60).___w; 17: EX2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MAD temp[30].xyz, temp[29].www_, const[3].xyz_, temp[18].xyz_; 20: DP3_SAT temp[31].w, temp[20].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: TEX temp[34], input[4].xy__, 2D[7]; 24: MAD_SAT temp[35].xy, -input[4].zz__, temp[34].yy__, temp[34].xz__; 25: CMP temp[36].w, -temp[35].___x, temp[34].___w, none.___0; 26: MAD_SAT temp[37].w, -8.000000 (0x50).___w, temp[35].___y, temp[36].___w; 27: ADD_SAT temp[38].xyz, temp[33].xyz_, -const[5].xyz_; 28: MAD temp[39].xyz, temp[37].www_, -temp[38].xyz_, temp[33].xyz_; 29: MUL temp[40].xyz, temp[30].xyz_, temp[39].xyz_; 30: TEX temp[41].xyz, input[1].xy__, 2D[4]; 31: MAD_SAT temp[42].xyz, temp[41].xyz_, const[6].xyz_, temp[40].xyz_; 32: MAD_SAT temp[43].x, input[0].x___, const[7].x___, const[7].y___; 33: MAD output[0].xyz, temp[43].xxx_, (temp[42] - const[8]).xyz_, const[8].xyz_; 34: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 
7: TEX temp[17], input[1].xy__, 2D[0]; 8: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 9: TEX temp[19].xyz, input[1].xy__, 2D[3]; 10: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 11: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 12: src0.w = temp[22] RSQ temp[23].w, |src0.w| 13: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 14: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 15: src0.w = temp[25] LG2 temp[26].w, src0.w 16: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 17: src0.w = temp[27] EX2 temp[28].w, src0.w 18: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 19: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 20: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 21: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 22: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 23: TEX temp[34], input[4].xy__, 2D[7]; 24: src0.xyz = input[4], src1.xyz = temp[34] MAD_SAT temp[35].xy, -src0.zz_, src1.yy_, src1.xz_ 25: src0.xyz = temp[35], src0.w = temp[34] CMP temp[36].w, src0.0, src0.w, -src0.x 26: src0.xyz = temp[35], src0.w = 8.000000 (0x50), src1.w = temp[36] MAD_SAT temp[37].w, -src0.w, src0.y, src1.w 27: src0.xyz = temp[33], src1.xyz = const[5] MAD_SAT temp[38].xyz, src0.xyz, src0.111, -src1.xyz 28: src0.xyz = temp[38], src0.w = temp[37], src1.xyz = temp[33] MAD temp[39].xyz, src0.www, -src0.xyz, src1.xyz 29: src0.xyz = temp[30], src1.xyz = temp[39] MAD temp[40].xyz, src0.xyz, src1.xyz, src0.000 30: TEX temp[41].xyz, 
input[1].xy__, 2D[4]; 31: src0.xyz = temp[41], src1.xyz = const[6], src2.xyz = temp[40] MAD_SAT temp[42].xyz, src0.xyz, src1.xyz, src2.xyz 32: src0.xyz = input[0], src1.xyz = const[7] MAD_SAT temp[43].x, src0.x__, src1.x__, src1.y__ 33: src0.xyz = const[8], src1.xyz = temp[42], src2.xyz = temp[43], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 34: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 2D[3]; 5: TEX temp[34], input[4].xy__, 2D[7]; 6: TEX temp[41].xyz, input[1].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 8: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 9: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 10: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 11: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 12: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 13: src0.xyz = input[4], src1.xyz = temp[34] MAD_SAT temp[35].xy, -src0.zz_, src1.yy_, src1.xz_ 14: src0.xyz = input[3], src0.w = temp[15], src1.xyz = temp[35], src1.w = temp[34] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 CMP temp[36].w, src0.0, src1.w, -src1.x 15: src0.xyz = temp[32], src0.w = 8.000000 (0x50), src1.xyz = const[4], src1.w = temp[36], src2.xyz = temp[35] MAX temp[33].xyz, src0.xyz, src1.xyz MAD_SAT temp[37].w, -src0.w, src2.y, src1.w 16: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 17: src0.xyz = 
temp[33], src0.w = temp[22], src1.xyz = const[5] MAD_SAT temp[38].xyz, src0.xyz, src0.111, -src1.xyz RSQ temp[23].w, |src0.w| 18: src0.xyz = temp[38], src0.w = temp[37], src1.xyz = temp[33] MAD temp[39].xyz, src0.www, -src0.xyz, src1.xyz 19: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 20: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 21: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[7], src1.w = const[0] MAD_SAT temp[43].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 22: src0.w = temp[25] LG2 temp[26].w, src0.w 23: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 24: src0.w = temp[27] EX2 temp[28].w, src0.w 25: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 26: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, src0.www, src0.xyz, src1.xyz 27: src0.xyz = temp[30], src1.xyz = temp[39] MAD temp[40].xyz, src0.xyz, src1.xyz, src0.000 28: src0.xyz = temp[41], src1.xyz = const[6], src2.xyz = temp[40] MAD_SAT temp[42].xyz, src0.xyz, src1.xyz, src2.xyz 29: src0.xyz = const[8], src0.w = temp[11], src1.xyz = temp[42], src2.xyz = temp[43], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17], input[1].xy__, 2D[0]; 4: TEX temp[19].xyz, input[1].xy__, 2D[3]; 5: TEX temp[34], input[4].xy__, 2D[7]; 6: TEX temp[41].xyz, input[1].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 8: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, 
|src1.w| 9: src0.xyz = temp[17], src1.xyz = const[0] MAD temp[18].xyz, src0.xyz, src1.xyz, src0.000 10: src0.xyz = temp[19], src0.w = 2.000000 (0x40) MAD temp[20].xyz, src0.xyz, src0.www, -src0.111 11: src0.xyz = temp[20], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 12: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 13: src0.xyz = input[4], src1.xyz = temp[34] MAD_SAT temp[35].xy, -src0.zz_, src1.yy_, src1.xz_ 14: src0.xyz = input[3], src0.w = temp[15], src1.xyz = temp[35], src1.w = temp[34] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 CMP temp[36].w, src0.0, src1.w, -src1.x 15: src0.xyz = temp[32], src0.w = 8.000000 (0x50), src1.xyz = const[4], src1.w = temp[36], src2.xyz = temp[35] MAX temp[33].xyz, src0.xyz, src1.xyz MAD_SAT temp[37].w, -src0.w, src2.y, src1.w 16: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[22].w, src0._, src0._ 17: src0.xyz = temp[33], src0.w = temp[22], src1.xyz = const[5] MAD_SAT temp[38].xyz, src0.xyz, src0.111, -src1.xyz RSQ temp[23].w, |src0.w| 18: src0.xyz = temp[38], src0.w = temp[37], src1.xyz = temp[33] MAD temp[39].xyz, src0.www, -src0.xyz, src1.xyz 19: src0.xyz = temp[16], src0.w = temp[23], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[24].xyz, src0.www, srcp.xyz, src0.000 20: src0.xyz = temp[24], src1.xyz = temp[20] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[25].w, src0._, src0._ 21: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[7], src1.w = const[0] MAD_SAT temp[43].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 22: src0.w = temp[25] LG2 temp[26].w, src0.w 23: src0.w = temp[26], src1.w = 32.000000 (0x60) MAD temp[27].w, src0.w, src1.w, src0.0 24: src0.w = temp[27] EX2 temp[28].w, src0.w 25: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 26: src0.xyz = const[3], src0.w = temp[29], src1.xyz = temp[18] MAD temp[30].xyz, 
src0.www, src0.xyz, src1.xyz 27: src0.xyz = temp[30], src1.xyz = temp[39] MAD temp[40].xyz, src0.xyz, src1.xyz, src0.000 28: src0.xyz = temp[41], src1.xyz = const[6], src2.xyz = temp[40] MAD_SAT temp[42].xyz, src0.xyz, src1.xyz, src2.xyz 29: src0.xyz = const[8], src0.w = temp[11], src1.xyz = temp[42], src2.xyz = temp[43], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[1].xy__, 2D[1]; 2: TEX temp[1].xyz, input[1].xy__, 2D[2]; 3: TEX temp[6], input[0].xy__, 2D[0]; 4: TEX temp[7].xyz, input[0].xy__, 2D[3]; 5: TEX temp[8], input[3].xy__, 2D[7]; 6: TEX temp[0].xyz, input[0].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 8: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[0].w, |src1.w| 9: src0.xyz = temp[6], src1.xyz = const[0] MAD temp[9].xyz, src0.xyz, src1.xyz, src0.000 10: src0.xyz = temp[7], src0.w = 2.000000 (0x40) MAD temp[7].xyz, src0.xyz, src0.www, -src0.111 11: src0.xyz = temp[7], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 12: src0.xyz = temp[5], src0.w = temp[1] MAD temp[10].xyz, src0.xyz, src0.www, src0.000 13: src0.xyz = input[3], src1.xyz = temp[8] MAD_SAT temp[3].xy, -src0.zz_, src1.yy_, src1.xz_ 14: src0.xyz = input[2], src0.w = temp[0], src1.xyz = temp[3], src1.w = temp[8] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 CMP temp[0].w, src0.0, src1.w, -src1.x 15: src0.xyz = temp[10], src0.w = 8.000000 (0x50), src1.xyz = const[4], src1.w = temp[0], src2.xyz = temp[3] MAX temp[3].xyz, src0.xyz, src1.xyz MAD_SAT temp[0].w, -src0.w, src2.y, src1.w 16: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[1].w, src0._, src0._ 17: src0.xyz = temp[3], src0.w = temp[1], 
src1.xyz = const[5] MAD_SAT temp[8].xyz, src0.xyz, src0.111, -src1.xyz RSQ temp[1].w, |src0.w| 18: src0.xyz = temp[8], src0.w = temp[0], src1.xyz = temp[3] MAD temp[3].xyz, src0.www, -src0.xyz, src1.xyz 19: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[1].xyz, src0.www, srcp.xyz, src0.000 20: src0.xyz = temp[1], src1.xyz = temp[7] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[0].w, src0._, src0._ 21: src0.xyz = input[4], src0.w = temp[5], src1.xyz = const[7], src1.w = const[0] MAD_SAT temp[1].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[1].w, src0.w, src1.w, src0.0 22: src0.w = temp[0] LG2 temp[0].w, src0.w 23: src0.w = temp[0], src1.w = 32.000000 (0x60) MAD temp[0].w, src0.w, src1.w, src0.0 24: src0.w = temp[0] EX2 temp[0].w, src0.w 25: src0.w = temp[0], src1.w = temp[6] MAD temp[0].w, src0.w, src1.w, src0.0 26: src0.xyz = const[3], src0.w = temp[0], src1.xyz = temp[9] MAD temp[2].xyz, src0.www, src0.xyz, src1.xyz 27: src0.xyz = temp[2], src1.xyz = temp[3] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 28: src0.xyz = temp[0], src1.xyz = const[6], src2.xyz = temp[2] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 29: src0.xyz = const[8], src0.w = temp[1], src1.xyz = temp[0], src2.xyz = temp[1], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe405f401: src: 1 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe406f400: src: 0 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: 
id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe407f400: src: 0 R/G/A/A dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00470000: id: 7 op:LD, , SCALED 2:TEX_ADDR: 0xe408f403: src: 3 R/G/A/A dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02440000: id: 4 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080000c0:Addr0: 192t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 8 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040006:Addr0: 6t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020007:Addr0: 7t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 
alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8070:MAD dest:7 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 10 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000407:Addr0: 7t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001f1:DP3 dest:31 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900a0:MAD dest:10 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00081800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08002003:Addr0: 3t, Addr1: 8t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084ac48:rgb_A_src:0 B/B/0 1 rgb_B_src:1 G/G/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00441030:MAD dest:3 rgb_C_src:1 R/B/0 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000c02:Addr0: 2t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08002000:Addr0: 0t, Addr1: 8t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00690006:CMP dest:0 alp_A_src:0 0 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x42490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:1 R 1 14 0:CMN_INST 0x00107800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x0034100a:Addr0: 10t, Addr1: 4c, Addr2: 3t, srcp:0 2:ALPHA_ADDR 0x080000d0:Addr0: 208t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0032c000:MAD 
dest:0 alp_A_src:0 A 1 alp_B_src:2 G 0 targ 0 w:0 5 RGBA_INST: 0x1a000035:MAX dest:3 rgb_C_src:0 R/R/R 0 alp_C_src:1 A 0 15 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000161:DP3 dest:22 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00087800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041403:Addr0: 3t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x0004c01b:RSQ dest:1 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00a21080:MAD dest:8 rgb_C_src:1 R/G/B 1 alp_C_src:0 R 0 17 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000c08:Addr0: 8t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0144036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 1 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221030:MAD dest:3 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08001c01:Addr0: 1t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 
ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000191:DP3 dest:25 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00184800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x08041c04:Addr0: 4t, Addr1: 7c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040005:Addr0: 5t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20485010:MAD dest:1 rgb_C_src:1 G/0/0 0 alp_C_src:0 0 0 21 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c009:LN2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08038000:Addr0: 0t, Addr1: 224t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 23 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001800:Addr0: 0t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 
rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 25 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08002503:Addr0: 3c, Addr1: 9t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 26 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000c02:Addr0: 2t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 27 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00241800:Addr0: 0t, Addr1: 6c, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 28 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40100108:Addr0: 8c, Addr1: 0t, Addr2: 1t, srcp:1 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 29 Instructions ~ 19 Vector Instructions (RGB) ~ 14 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 6 Texture Instructions ~ 3 Presub Operations 
~ 0 OMOD Operations ~ 11 Temporary Registers ~ 4 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..7] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD 
temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, 
-temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 
output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], 
temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: 
VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 
0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 18 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL CONST[0] DCL CONST[2] DCL CONST[4..7] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 32.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4].w, IN[1], SAMP[3], 2D 8: MAD TEMP[4].w, TEMP[4].wwww, CONST[2].xxxx, CONST[2].yyyy 9: MAD TEMP[5].xy, TEMP[4].wwww, TEMP[3], IN[1] 10: TEX TEMP[4], TEMP[5], SAMP[0], 2D 11: MUL TEMP[4].xyz, TEMP[4], CONST[0] 12: TEX TEMP[6].xyz, TEMP[5], SAMP[3], 2D 13: MAD TEMP[6].xyz, TEMP[6], IMM[0].xxxx, -IMM[0].yyyy 14: ADD TEMP[5].xyz, TEMP[3], TEMP[2] 15: DP3 TEMP[5].w, TEMP[5], TEMP[5] 16: RSQ TEMP[5].w, |TEMP[5].wwww| 17: MUL TEMP[5].xyz, TEMP[5].wwww, TEMP[5] 18: DP3_SAT TEMP[5].w, TEMP[5], TEMP[6] 19: POW TEMP[5].w, TEMP[5].wwww, IMM[0].wwww 20: MUL TEMP[5].w, TEMP[5], TEMP[4] 21: MAD TEMP[4].xyz, TEMP[5].wwww, CONST[4], TEMP[4] 22: DP3_SAT TEMP[2].w, TEMP[6], TEMP[2] 23: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 24: MAX TEMP[0].xyz, TEMP[0], CONST[5] 25: MUL_SAT TEMP[1].xyz, TEMP[4], TEMP[0] 26: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[6].xxxx, CONST[6].yyyy 27: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[7] 28: MOV OUT[0].w, TEMP[1] 29: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, 
-const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[8].wwww; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, 
temp[5].wwww, const[8].wwww; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[8].wwww; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, 
-const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[8].wwww; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, 
temp[5].wwww, const[8].wwww; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[8].wwww; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, 
-const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: LG2 temp[8].w, temp[5].wwww; 20: MUL temp[8].w, temp[8].wwww, const[8].wwww; 21: EX2 temp[5].w, temp[8].wwww; 22: MUL temp[5].w, temp[5], temp[4]; 23: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 24: DP3_SAT temp[2].w, temp[6], temp[2]; 25: MUL temp[0].xyz, temp[0], temp[2].wwww; 26: MAX temp[0].xyz, temp[0], const[5]; 27: MUL_SAT temp[1].xyz, temp[4], temp[0]; 28: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 29: ADD temp[9].xyz, temp[1], -const[7]; 30: MAD output[0].xyz, temp[7].xxxx, temp[9], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4].w, input[1].xy__, 2D[3]; 8: MAD temp[4].w, temp[4].___w, const[2].___x, const[2].___y; 9: MAD temp[5].xy, temp[4].ww__, temp[3].xy__, input[1].xy__; 10: TEX temp[4], temp[5].xy__, 2D[0]; 11: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 12: TEX temp[6].xyz, temp[5].xy__, 2D[3]; 13: MAD temp[6].xyz, temp[6].xyz_, const[8].xxx_, -const[8].yyy_; 14: ADD temp[5].xyz, temp[3].xyz_, 
temp[2].xyz_; 15: DP3 temp[5].w, temp[5].xyz_, temp[5].xyz_; 16: RSQ temp[5].w, |temp[5].___w|; 17: MUL temp[5].xyz, temp[5].www_, temp[5].xyz_; 18: DP3_SAT temp[5].w, temp[5].xyz_, temp[6].xyz_; 19: LG2 temp[8].w, temp[5].___w; 20: MUL temp[8].w, temp[8].___w, const[8].___w; 21: EX2 temp[5].w, temp[8].___w; 22: MUL temp[5].w, temp[5].___w, temp[4].___w; 23: MAD temp[4].xyz, temp[5].www_, const[4].xyz_, temp[4].xyz_; 24: DP3_SAT temp[2].w, temp[6].xyz_, temp[2].xyz_; 25: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 26: MAX temp[0].xyz, temp[0].xyz_, const[5].xyz_; 27: MUL_SAT temp[1].xyz, temp[4].xyz_, temp[0].xyz_; 28: MAD_SAT temp[7].x, input[0].x___, const[6].x___, const[6].y___; 29: ADD temp[9].xyz, temp[1].xyz_, -const[7].xyz_; 30: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[7].xyz_; 31: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, const[8].xxx_, -const[8].yyy_; 14: ADD temp[24].xyz, temp[16].xyz_, temp[13].xyz_; 15: DP3 temp[25].w, temp[24].xyz_, temp[24].xyz_; 16: RSQ temp[26].w, |temp[25].___w|; 17: MUL temp[27].xyz, temp[26].www_, temp[24].xyz_; 18: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 19: LG2 temp[29].w, temp[28].___w; 20: MUL temp[30].w, temp[29].___w, const[8].___w; 21: EX2 temp[31].w, temp[30].___w; 22: MUL 
temp[32].w, temp[31].___w, temp[20].___w; 23: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 24: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 25: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 26: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 27: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 28: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 29: ADD temp[39].xyz, temp[37].xyz_, -const[7].xyz_; 30: MAD output[0].xyz, temp[38].xxx_, temp[39].xyz_, const[7].xyz_; 31: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, const[8].xxx_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, const[8].___w; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, 
temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, 32.000000 (0x60).___w; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; CONST[8] = { 2.0000 1.0000 0.0000 32.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], 
input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, 32.000000 (0x60).___w; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX 
temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, 32.000000 (0x60).___w; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: src0.xyz = const[2], src0.w = temp[17] MAD temp[18].w, src0.w, src0.x, src0.y 9: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1] MAD temp[19].xy, 
src0.ww_, src0.xy_, src1.xy_ 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: src0.xyz = temp[20], src1.xyz = const[0] MAD temp[21].xyz, src0.xyz, src1.xyz, src0.000 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: src0.xyz = temp[22], src0.w = 2.000000 (0x40) MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 14: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 15: src0.w = temp[25] RSQ temp[26].w, |src0.w| 16: src0.xyz = temp[16], src0.w = temp[26], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 17: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 18: src0.w = temp[28] LG2 temp[29].w, src0.w 19: src0.w = temp[29], src1.w = 32.000000 (0x60) MAD temp[30].w, src0.w, src1.w, src0.0 20: src0.w = temp[30] EX2 temp[31].w, src0.w 21: src0.w = temp[31], src1.w = temp[20] MAD temp[32].w, src0.w, src1.w, src0.0 22: src0.xyz = const[4], src0.w = temp[32], src1.xyz = temp[21] MAD temp[33].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[23], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 24: src0.xyz = temp[10], src0.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.www, src0.000 25: src0.xyz = temp[35], src1.xyz = const[5] MAX temp[36].xyz, src0.xyz, src1.xyz 26: src0.xyz = temp[33], src1.xyz = temp[36] MAD_SAT temp[37].xyz, src0.xyz, src1.xyz, src0.000 27: src0.xyz = input[0], src1.xyz = const[6] MAD_SAT temp[38].x, src0.x__, src1.x__, src1.y__ 28: src0.xyz = const[7], src1.xyz = temp[37], src2.xyz = temp[38], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 29: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[14], src1.xyz = const[6] MAD_SAT temp[38].x, src0.x__, 
src1.x__, src1.y__ RSQ temp[15].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[10], input[2].xy__, 2D[1]; 5: TEX temp[12].xyz, input[2].xy__, 2D[2]; 6: TEX temp[17].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[10], src2.w = const[0] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[11].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[17] MAD temp[18].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1], src1.w = temp[25] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[26].w, |src1.w| 11: src0.xyz = temp[16], src0.w = temp[26], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 15: TEX temp[20], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 16: src0.xyz = temp[22], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 17: src0.xyz = temp[23], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 18: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 19: src0.xyz = temp[20], src0.w = temp[28], src1.xyz = const[0] MAD temp[21].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[29].w, src0.w 20: src0.xyz = temp[10], src0.w = temp[34], src1.w = temp[29], src2.w = 32.000000 (0x60) MAD temp[35].xyz, src0.xyz, src0.www, src0.000 MAD temp[30].w, src1.w, src2.w, src0.0 21: src0.xyz = temp[35], src0.w = temp[30], src1.xyz = const[5] MAX temp[36].xyz, src0.xyz, src1.xyz EX2 temp[31].w, src0.w 22: src0.w = temp[31], src1.w = temp[20] MAD temp[32].w, src0.w, src1.w, src0.0 23: 
src0.xyz = const[4], src0.w = temp[32], src1.xyz = temp[21] MAD temp[33].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[33], src1.xyz = temp[36] MAD_SAT temp[37].xyz, src0.xyz, src1.xyz, src0.000 25: src0.xyz = const[7], src1.xyz = temp[37], src2.xyz = temp[38], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[14], src1.xyz = const[6] MAD_SAT temp[38].x, src0.x__, src1.x__, src1.y__ RSQ temp[15].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[10], input[2].xy__, 2D[1]; 5: TEX temp[12].xyz, input[2].xy__, 2D[2]; 6: TEX temp[17].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[10], src2.w = const[0] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[11].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[17] MAD temp[18].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1], src1.w = temp[25] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[26].w, |src1.w| 11: src0.xyz = temp[16], src0.w = temp[26], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 15: TEX temp[20], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 16: src0.xyz = temp[22], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 17: src0.xyz = temp[23], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 18: src0.xyz = temp[27], src1.xyz = temp[23] 
DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 19: src0.xyz = temp[20], src0.w = temp[28], src1.xyz = const[0] MAD temp[21].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[29].w, src0.w 20: src0.xyz = temp[10], src0.w = temp[34], src1.w = temp[29], src2.w = 32.000000 (0x60) MAD temp[35].xyz, src0.xyz, src0.www, src0.000 MAD temp[30].w, src1.w, src2.w, src0.0 21: src0.xyz = temp[35], src0.w = temp[30], src1.xyz = const[5] MAX temp[36].xyz, src0.xyz, src1.xyz EX2 temp[31].w, src0.w 22: src0.w = temp[31], src1.w = temp[20] MAD temp[32].w, src0.w, src1.w, src0.0 23: src0.xyz = const[4], src0.w = temp[32], src1.xyz = temp[21] MAD temp[33].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[33], src1.xyz = temp[36] MAD_SAT temp[37].xyz, src0.xyz, src1.xyz, src0.000 25: src0.xyz = const[7], src1.xyz = temp[37], src2.xyz = temp[38], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 1: src0.xyz = input[3], src0.w = temp[0], src1.xyz = const[6] MAD_SAT temp[0].z, src0.__x, src1.__x, src1.__y RSQ temp[0].w, |src0.w| 2: src0.xyz = input[2], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[3], input[1].xy__, 2D[1]; 5: TEX temp[1].xyz, input[1].xy__, 2D[2]; 6: TEX temp[0].w, input[0].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[3], src2.w = const[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[1].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[2].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[0] MAD temp[0].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = input[0], src1.w = temp[2] MAD temp[0].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[0].w, |src1.w| 11: src0.xyz = 
temp[2], src0.w = temp[0], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[2].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[1] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[4].xyz, temp[0].xy__, 2D[3]; 15: TEX temp[5], temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 16: src0.xyz = temp[4], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[4].xyz, src0.xyz, src0.www, -src0.111 17: src0.xyz = temp[4], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[0].w, src0._, src0._ 18: src0.xyz = temp[2], src1.xyz = temp[4] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 19: src0.xyz = temp[5], src0.w = temp[1], src1.xyz = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[1].w, src0.w 20: src0.xyz = temp[3], src0.w = temp[0], src1.w = temp[1], src2.w = 32.000000 (0x60) MAD temp[2].xyz, src0.xyz, src0.www, src0.000 MAD temp[0].w, src1.w, src2.w, src0.0 21: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = const[5] MAX temp[2].xyz, src0.xyz, src1.xyz EX2 temp[0].w, src0.w 22: src0.w = temp[0], src1.w = temp[5] MAD temp[0].w, src0.w, src1.w, src0.0 23: src0.xyz = const[4], src0.w = temp[0], src1.xyz = temp[1] MAD temp[1].xyz, src0.www, src0.xyz, src1.xyz 24: src0.xyz = temp[1], src1.xyz = temp[2] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src0.000 25: src0.xyz = const[7], src1.xyz = temp[1], src2.xyz = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.zzz, srcp.xyz, src0.xyz R500 Fragment Program: -------- 0 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00086000:ALU wmask: AB omask: NONE 1:RGB_ADDR 0x08041803:Addr0: 3t, Addr1: 6c, Addr2: 128t, srcp:0 
2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122090:rgb_A_src:0 0/0/R 0 rgb_B_src:1 0/0/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00191000:MAD dest:0 rgb_C_src:1 0/0/G 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe403f401: src: 1 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00107a04:ALU TEX_WAIT NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x10000cc0:Addr0: 192t, Addr1: 3t, Addr2: 0c, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 0 0 7 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 
0x00000191:DP3 dest:25 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020102:Addr0: 2c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x08000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 G 0 9 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000800:Addr0: 0t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084046c:rgb_A_src:0 A/A/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00421000:MAD dest:0 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00040001:OUT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 12 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe404f400: src: 0 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 13 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A 
dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 14 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8040:MAD dest:4 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 15 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000404:Addr0: 4t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000221:DP3 dest:34 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08001002:Addr0: 2t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001c1:DP3 dest:28 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040005:Addr0: 5t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c019:LN2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0e000400:Addr0: 0t, Addr1: 1t, Addr2: 224t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d000:MAD dest:0 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 
w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 19 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041402:Addr0: 2t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000025:MAX dest:2 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001400:Addr0: 0t, Addr1: 5t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 21 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000504:Addr0: 4c, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00038005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x40000507:Addr0: 7c, Addr1: 1t, Addr2: 0t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 
alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 24 Instructions ~ 16 Vector Instructions (RGB) ~ 13 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 5 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 6 Temporary Registers ~ 3 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL OUT[5], GENERIC[3] DCL CONST[0] DCL CONST[2..10] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[7] 2: MAD TEMP[0], IN[0].yyyy, CONST[8], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[9], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[10], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: DP4 OUT[5].x, CONST[4], IN[0] 16: DP4 OUT[5].y, CONST[5], IN[0] 17: DP4 OUT[5].z, CONST[6], IN[0] 18: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], 
input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 
15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[11]._xxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, 
temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, 
temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; CONST[11] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, 
input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4162 reg: 11c swiz: Z/ Z/ U/ U src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 7: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 
0x00db6162 reg: 11c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 17: op: 0x00208201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 18: op: 0x00408201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 19: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 20: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 
0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 21 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL IN[4], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[7] DCL CONST[0] DCL CONST[2] DCL CONST[4..8] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 8.0000} IMM[1] FLT32 { 32.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4].w, IN[1], SAMP[3], 2D 8: MAD TEMP[4].w, TEMP[4].wwww, CONST[2].xxxx, CONST[2].yyyy 9: MAD TEMP[5].xy, TEMP[4].wwww, TEMP[3], IN[1] 10: TEX TEMP[4], TEMP[5], SAMP[0], 2D 11: MUL TEMP[4].xyz, TEMP[4], CONST[0] 12: TEX TEMP[6].xyz, TEMP[5], SAMP[3], 2D 13: MAD TEMP[6].xyz, TEMP[6], IMM[0].xxxx, -IMM[0].yyyy 14: ADD TEMP[5].xyz, TEMP[3], TEMP[2] 15: DP3 TEMP[5].w, TEMP[5], TEMP[5] 16: RSQ TEMP[5].w, |TEMP[5].wwww| 17: MUL TEMP[5].xyz, TEMP[5].wwww, TEMP[5] 18: DP3_SAT TEMP[5].w, TEMP[5], TEMP[6] 19: POW TEMP[5].w, TEMP[5].wwww, IMM[1].xxxx 20: MUL TEMP[5].w, TEMP[5], TEMP[4] 21: MAD TEMP[4].xyz, TEMP[5].wwww, CONST[4], TEMP[4] 22: DP3_SAT TEMP[2].w, TEMP[6], TEMP[2] 23: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 24: MAX TEMP[0].xyz, TEMP[0], CONST[5] 25: TEX TEMP[2], IN[4], SAMP[7], 2D 26: MAD_SAT TEMP[3].xy, -IN[4].zzzz, TEMP[2].yyyy, TEMP[2].xzzz 27: CMP TEMP[2].w, -TEMP[3].xxxx, TEMP[2].wwww, IMM[0].zzzz 28: MAD_SAT TEMP[2].w, -IMM[0].wwww, 
TEMP[3].yyyy, TEMP[2].wwww 29: SUB_SAT TEMP[3].xyz, TEMP[0], CONST[6] 30: MAD TEMP[0].xyz, TEMP[2].wwww, -TEMP[3], TEMP[0] 31: MUL_SAT TEMP[1].xyz, TEMP[4], TEMP[0] 32: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[7].xxxx, CONST[7].yyyy 33: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[8] 34: MOV OUT[0].w, TEMP[1] 35: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[10].xxxx; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 28: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL_SAT temp[1].xyz, temp[4], temp[0]; 32: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 33: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 34: MOV output[0].w, temp[1]; Fragment Program: after 
'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[10].xxxx; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 28: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL_SAT temp[1].xyz, temp[4], temp[0]; 32: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 33: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 34: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX 
temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[10].xxxx; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 28: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL_SAT temp[1].xyz, temp[4], temp[0]; 32: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 33: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 34: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[5].xyz, temp[3], 
temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[10].xxxx; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 28: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL_SAT temp[1].xyz, temp[4], temp[0]; 32: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 33: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 34: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[10].xxxx; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], 
temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 28: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL_SAT temp[1].xyz, temp[4], temp[0]; 32: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 33: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 34: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: POW temp[5].w, temp[5].wwww, const[10].xxxx; 20: MUL temp[5].w, temp[5], temp[4]; 21: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 28: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT 
temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL_SAT temp[1].xyz, temp[4], temp[0]; 32: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 33: LRP output[0].xyz, temp[7].xxxx, temp[1], const[8]; 34: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[5].xyz, temp[3], temp[2]; 15: DP3 temp[5].w, temp[5], temp[5]; 16: RSQ temp[5].w, |temp[5].wwww|; 17: MUL temp[5].xyz, temp[5].wwww, temp[5]; 18: DP3_SAT temp[5].w, temp[5], temp[6]; 19: LG2 temp[8].w, temp[5].wwww; 20: MUL temp[8].w, temp[8].wwww, const[10].xxxx; 21: EX2 temp[5].w, temp[8].wwww; 22: MUL temp[5].w, temp[5], temp[4]; 23: MAD temp[4].xyz, temp[5].wwww, const[4], temp[4]; 24: DP3_SAT temp[2].w, temp[6], temp[2]; 25: MUL temp[0].xyz, temp[0], temp[2].wwww; 26: MAX temp[0].xyz, temp[0], const[5]; 27: TEX temp[2], input[4], 2D[7]; 28: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 29: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[9].zzzz; 30: MAD_SAT temp[2].w, -const[9].wwww, temp[3].yyyy, temp[2].wwww; 31: ADD_SAT temp[3].xyz, temp[0], -const[6]; 32: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 33: MUL_SAT temp[1].xyz, temp[4], temp[0]; 34: MAD_SAT temp[7].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 35: ADD temp[9].xyz, temp[1], -const[8]; 36: MAD output[0].xyz, temp[7].xxxx, 
temp[9], const[8]; 37: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4].w, input[1].xy__, 2D[3]; 8: MAD temp[4].w, temp[4].___w, const[2].___x, const[2].___y; 9: MAD temp[5].xy, temp[4].ww__, temp[3].xy__, input[1].xy__; 10: TEX temp[4], temp[5].xy__, 2D[0]; 11: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 12: TEX temp[6].xyz, temp[5].xy__, 2D[3]; 13: MAD temp[6].xyz, temp[6].xyz_, const[9].xxx_, -const[9].yyy_; 14: ADD temp[5].xyz, temp[3].xyz_, temp[2].xyz_; 15: DP3 temp[5].w, temp[5].xyz_, temp[5].xyz_; 16: RSQ temp[5].w, |temp[5].___w|; 17: MUL temp[5].xyz, temp[5].www_, temp[5].xyz_; 18: DP3_SAT temp[5].w, temp[5].xyz_, temp[6].xyz_; 19: LG2 temp[8].w, temp[5].___w; 20: MUL temp[8].w, temp[8].___w, const[10].___x; 21: EX2 temp[5].w, temp[8].___w; 22: MUL temp[5].w, temp[5].___w, temp[4].___w; 23: MAD temp[4].xyz, temp[5].www_, const[4].xyz_, temp[4].xyz_; 24: DP3_SAT temp[2].w, temp[6].xyz_, temp[2].xyz_; 25: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 26: MAX temp[0].xyz, temp[0].xyz_, const[5].xyz_; 27: TEX temp[2], input[4].xy__, 2D[7]; 28: MAD_SAT temp[3].xy, -input[4].zz__, temp[2].yy__, temp[2].xz__; 29: CMP temp[2].w, -temp[3].___x, temp[2].___w, const[9].___z; 30: MAD_SAT temp[2].w, -const[9].___w, temp[3].___y, temp[2].___w; 31: ADD_SAT temp[3].xyz, temp[0].xyz_, -const[6].xyz_; 32: MAD temp[0].xyz, temp[2].www_, -temp[3].xyz_, temp[0].xyz_; 33: MUL_SAT temp[1].xyz, temp[4].xyz_, temp[0].xyz_; 34: MAD_SAT temp[7].x, input[0].x___, const[7].x___, const[7].y___; 35: ADD temp[9].xyz, temp[1].xyz_, -const[8].xyz_; 36: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[8].xyz_; 
37: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, const[9].xxx_, -const[9].yyy_; 14: ADD temp[24].xyz, temp[16].xyz_, temp[13].xyz_; 15: DP3 temp[25].w, temp[24].xyz_, temp[24].xyz_; 16: RSQ temp[26].w, |temp[25].___w|; 17: MUL temp[27].xyz, temp[26].www_, temp[24].xyz_; 18: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 19: LG2 temp[29].w, temp[28].___w; 20: MUL temp[30].w, temp[29].___w, const[10].___x; 21: EX2 temp[31].w, temp[30].___w; 22: MUL temp[32].w, temp[31].___w, temp[20].___w; 23: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 24: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 25: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 26: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 27: TEX temp[37], input[4].xy__, 2D[7]; 28: MAD_SAT temp[38].xy, -input[4].zz__, temp[37].yy__, temp[37].xz__; 29: CMP temp[39].w, -temp[38].___x, temp[37].___w, const[9].___z; 30: MAD_SAT temp[40].w, -const[9].___w, temp[38].___y, temp[39].___w; 31: ADD_SAT temp[41].xyz, temp[36].xyz_, -const[6].xyz_; 32: MAD temp[42].xyz, temp[40].www_, -temp[41].xyz_, temp[36].xyz_; 33: MUL_SAT temp[43].xyz, temp[33].xyz_, temp[42].xyz_; 34: MAD_SAT temp[44].x, input[0].x___, const[7].x___, const[7].y___; 35: ADD temp[45].xyz, temp[43].xyz_, 
-const[8].xyz_; 36: MAD output[0].xyz, temp[44].xxx_, temp[45].xyz_, const[8].xyz_; 37: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[9].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, const[9].xxx_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, const[10].___x; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: TEX temp[37], input[4].xy__, 2D[7]; 27: MAD_SAT temp[38].xy, -input[4].zz__, temp[37].yy__, temp[37].xz__; 28: CMP temp[39].w, -temp[38].___x, temp[37].___w, none.___0; 29: MAD_SAT temp[40].w, -const[9].___w, temp[38].___y, temp[39].___w; 30: ADD_SAT temp[41].xyz, temp[36].xyz_, -const[6].xyz_; 31: MAD temp[42].xyz, temp[40].www_, -temp[41].xyz_, temp[36].xyz_; 32: MUL_SAT temp[43].xyz, temp[33].xyz_, temp[42].xyz_; 33: MAD_SAT temp[44].x, input[0].x___, const[7].x___, 
const[7].y___; 34: MAD output[0].xyz, temp[44].xxx_, (temp[43] - const[8]).xyz_, const[8].xyz_; 35: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, 32.000000 (0x60).___w; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: TEX temp[37], input[4].xy__, 2D[7]; 27: MAD_SAT temp[38].xy, -input[4].zz__, temp[37].yy__, temp[37].xz__; 28: CMP temp[39].w, -temp[38].___x, temp[37].___w, none.___0; 29: MAD_SAT temp[40].w, -8.000000 (0x50).___w, temp[38].___y, temp[39].___w; 30: ADD_SAT temp[41].xyz, temp[36].xyz_, -const[6].xyz_; 31: MAD temp[42].xyz, temp[40].www_, -temp[41].xyz_, temp[36].xyz_; 32: MUL_SAT temp[43].xyz, temp[33].xyz_, temp[42].xyz_; 33: MAD_SAT 
temp[44].x, input[0].x___, const[7].x___, const[7].y___; 34: MAD output[0].xyz, temp[44].xxx_, (temp[43] - const[8]).xyz_, const[8].xyz_; 35: MOV output[0].w, temp[11].___w; CONST[9] = { 2.0000 1.0000 0.0000 8.0000 } CONST[10] = { 32.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, 32.000000 (0x60).___w; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: TEX temp[37], input[4].xy__, 2D[7]; 27: MAD_SAT temp[38].xy, -input[4].zz__, temp[37].yy__, temp[37].xz__; 28: CMP temp[39].w, -temp[38].___x, temp[37].___w, none.___0; 29: MAD_SAT temp[40].w, -8.000000 (0x50).___w, temp[38].___y, temp[39].___w; 30: ADD_SAT temp[41].xyz, temp[36].xyz_, -const[6].xyz_; 31: MAD 
temp[42].xyz, temp[40].www_, -temp[41].xyz_, temp[36].xyz_; 32: MUL_SAT temp[43].xyz, temp[33].xyz_, temp[42].xyz_; 33: MAD_SAT temp[44].x, input[0].x___, const[7].x___, const[7].y___; 34: MAD output[0].xyz, temp[44].xxx_, (temp[43] - const[8]).xyz_, const[8].xyz_; 35: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: MUL temp[21].xyz, temp[20].xyz_, const[0].xyz_; 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[25].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, (temp[13] + temp[16]).xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, 32.000000 (0x60).___w; 20: EX2 temp[31].w, temp[30].___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MAD temp[33].xyz, temp[32].www_, const[4].xyz_, temp[21].xyz_; 23: DP3_SAT temp[34].w, temp[23].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: TEX temp[37], input[4].xy__, 2D[7]; 27: MAD_SAT temp[38].xy, -input[4].zz__, temp[37].yy__, temp[37].xz__; 28: CMP temp[39].w, -temp[38].___x, temp[37].___w, none.___0; 29: MAD_SAT temp[40].w, -8.000000 (0x50).___w, temp[38].___y, temp[39].___w; 30: ADD_SAT temp[41].xyz, 
temp[36].xyz_, -const[6].xyz_; 31: MAD temp[42].xyz, temp[40].www_, -temp[41].xyz_, temp[36].xyz_; 32: MUL_SAT temp[43].xyz, temp[33].xyz_, temp[42].xyz_; 33: MAD_SAT temp[44].x, input[0].x___, const[7].x___, const[7].y___; 34: MAD output[0].xyz, temp[44].xxx_, (temp[43] - const[8]).xyz_, const[8].xyz_; 35: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: src0.xyz = const[2], src0.w = temp[17] MAD temp[18].w, src0.w, src0.x, src0.y 9: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ 10: TEX temp[20], temp[19].xy__, 2D[0]; 11: src0.xyz = temp[20], src1.xyz = const[0] MAD temp[21].xyz, src0.xyz, src1.xyz, src0.000 12: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 13: src0.xyz = temp[22], src0.w = 2.000000 (0x40) MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 14: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 15: src0.w = temp[25] RSQ temp[26].w, |src0.w| 16: src0.xyz = temp[16], src0.w = temp[26], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 17: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 18: src0.w = temp[28] LG2 temp[29].w, src0.w 19: src0.w = temp[29], src1.w = 32.000000 (0x60) MAD temp[30].w, src0.w, src1.w, src0.0 20: src0.w = temp[30] EX2 temp[31].w, src0.w 21: src0.w = 
temp[31], src1.w = temp[20] MAD temp[32].w, src0.w, src1.w, src0.0 22: src0.xyz = const[4], src0.w = temp[32], src1.xyz = temp[21] MAD temp[33].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[23], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 24: src0.xyz = temp[10], src0.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.www, src0.000 25: src0.xyz = temp[35], src1.xyz = const[5] MAX temp[36].xyz, src0.xyz, src1.xyz 26: TEX temp[37], input[4].xy__, 2D[7]; 27: src0.xyz = input[4], src1.xyz = temp[37] MAD_SAT temp[38].xy, -src0.zz_, src1.yy_, src1.xz_ 28: src0.xyz = temp[38], src0.w = temp[37] CMP temp[39].w, src0.0, src0.w, -src0.x 29: src0.xyz = temp[38], src0.w = 8.000000 (0x50), src1.w = temp[39] MAD_SAT temp[40].w, -src0.w, src0.y, src1.w 30: src0.xyz = temp[36], src1.xyz = const[6] MAD_SAT temp[41].xyz, src0.xyz, src0.111, -src1.xyz 31: src0.xyz = temp[41], src0.w = temp[40], src1.xyz = temp[36] MAD temp[42].xyz, src0.www, -src0.xyz, src1.xyz 32: src0.xyz = temp[33], src1.xyz = temp[42] MAD_SAT temp[43].xyz, src0.xyz, src1.xyz, src0.000 33: src0.xyz = input[0], src1.xyz = const[7] MAD_SAT temp[44].x, src0.x__, src1.x__, src1.y__ 34: src0.xyz = const[8], src1.xyz = temp[43], src2.xyz = temp[44], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 35: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[14], src1.xyz = const[7] MAD_SAT temp[44].x, src0.x__, src1.x__, src1.y__ RSQ temp[15].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[10], input[2].xy__, 2D[1]; 5: TEX temp[12].xyz, input[2].xy__, 2D[2]; 6: TEX temp[17].w, input[1].xy__, 2D[3]; 7: TEX temp[37], input[4].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 8: src0.xyz = input[4], src0.w = 
temp[17], src1.xyz = temp[37], src2.xyz = const[2] SEM_WAIT MAD_SAT temp[38].xy, -src0.zz_, src1.yy_, src1.xz_ MAD temp[18].w, src0.w, src2.x, src2.y 9: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1], src1.w = temp[37], src2.xyz = temp[38] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ CMP temp[39].w, src0.0, src1.w, -src2.x 10: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[10], src2.w = const[0] MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[11].w, src1.w, src2.w, src0.0 11: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 12: src0.xyz = temp[38], src0.w = 8.000000 (0x50), src1.w = temp[39] MAD_SAT temp[40].w, -src0.w, src0.y, src1.w 13: src0.w = temp[25] RSQ temp[26].w, |src0.w| 14: src0.xyz = temp[16], src0.w = temp[26], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 15: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 16: BEGIN_TEX; 17: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 18: TEX temp[20], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 19: src0.xyz = temp[22], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 20: src0.xyz = temp[23], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 21: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 22: src0.xyz = temp[20], src0.w = temp[28], src1.xyz = const[0] MAD temp[21].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[29].w, src0.w 23: src0.xyz = temp[10], src0.w = temp[34], src1.w = temp[29], src2.w = 32.000000 (0x60) MAD temp[35].xyz, src0.xyz, src0.www, src0.000 MAD temp[30].w, src1.w, src2.w, src0.0 24: src0.xyz = temp[35], src0.w = temp[30], src1.xyz = const[5] MAX temp[36].xyz, src0.xyz, src1.xyz EX2 temp[31].w, src0.w 25: src0.xyz = temp[36], src0.w = temp[31], src1.xyz = const[6], src1.w = temp[20] MAD_SAT temp[41].xyz, src0.xyz, 
src0.111, -src1.xyz MAD temp[32].w, src0.w, src1.w, src0.0 26: src0.xyz = temp[41], src0.w = temp[40], src1.xyz = temp[36] MAD temp[42].xyz, src0.www, -src0.xyz, src1.xyz 27: src0.xyz = const[4], src0.w = temp[32], src1.xyz = temp[21] MAD temp[33].xyz, src0.www, src0.xyz, src1.xyz 28: src0.xyz = temp[33], src1.xyz = temp[42] MAD_SAT temp[43].xyz, src0.xyz, src1.xyz, src0.000 29: src0.xyz = const[8], src1.xyz = temp[43], src2.xyz = temp[44], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[14], src1.xyz = const[7] MAD_SAT temp[44].x, src0.x__, src1.x__, src1.y__ RSQ temp[15].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[10], input[2].xy__, 2D[1]; 5: TEX temp[12].xyz, input[2].xy__, 2D[2]; 6: TEX temp[17].w, input[1].xy__, 2D[3]; 7: TEX temp[37], input[4].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 8: src0.xyz = input[4], src0.w = temp[17], src1.xyz = temp[37], src2.xyz = const[2] SEM_WAIT MAD_SAT temp[38].xy, -src0.zz_, src1.yy_, src1.xz_ MAD temp[18].w, src0.w, src2.x, src2.y 9: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1], src1.w = temp[37], src2.xyz = temp[38] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ CMP temp[39].w, src0.0, src1.w, -src2.x 10: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[10], src2.w = const[0] MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[11].w, src1.w, src2.w, src0.0 11: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 12: src0.xyz = temp[38], src0.w = 8.000000 (0x50), src1.w = temp[39] MAD_SAT temp[40].w, -src0.w, src0.y, src1.w 13: src0.w = temp[25] RSQ temp[26].w, |src0.w| 14: src0.xyz = temp[16], src0.w = temp[26], src1.xyz = temp[13], srcp.xyz = 
(src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 15: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 16: BEGIN_TEX; 17: TEX temp[22].xyz, temp[19].xy__, 2D[3]; 18: TEX temp[20], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 19: src0.xyz = temp[22], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 20: src0.xyz = temp[23], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 21: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 22: src0.xyz = temp[20], src0.w = temp[28], src1.xyz = const[0] MAD temp[21].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[29].w, src0.w 23: src0.xyz = temp[10], src0.w = temp[34], src1.w = temp[29], src2.w = 32.000000 (0x60) MAD temp[35].xyz, src0.xyz, src0.www, src0.000 MAD temp[30].w, src1.w, src2.w, src0.0 24: src0.xyz = temp[35], src0.w = temp[30], src1.xyz = const[5] MAX temp[36].xyz, src0.xyz, src1.xyz EX2 temp[31].w, src0.w 25: src0.xyz = temp[36], src0.w = temp[31], src1.xyz = const[6], src1.w = temp[20] MAD_SAT temp[41].xyz, src0.xyz, src0.111, -src1.xyz MAD temp[32].w, src0.w, src1.w, src0.0 26: src0.xyz = temp[41], src0.w = temp[40], src1.xyz = temp[36] MAD temp[42].xyz, src0.www, -src0.xyz, src1.xyz 27: src0.xyz = const[4], src0.w = temp[32], src1.xyz = temp[21] MAD temp[33].xyz, src0.www, src0.xyz, src1.xyz 28: src0.xyz = temp[33], src1.xyz = temp[42] MAD_SAT temp[43].xyz, src0.xyz, src1.xyz, src0.000 29: src0.xyz = const[8], src1.xyz = temp[43], src2.xyz = temp[44], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 1: src0.xyz = input[4], src0.w = temp[0], src1.xyz = const[7] MAD_SAT temp[0].z, src0.__x, src1.__x, src1.__y RSQ temp[0].w, |src0.w| 2: src0.xyz = input[2], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 
3: BEGIN_TEX; 4: TEX temp[4], input[1].xy__, 2D[1]; 5: TEX temp[1].xyz, input[1].xy__, 2D[2]; 6: TEX temp[0].w, input[0].xy__, 2D[3]; 7: TEX temp[5], input[3].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 8: src0.xyz = input[3], src0.w = temp[0], src1.xyz = temp[5], src2.xyz = const[2] SEM_WAIT MAD_SAT temp[3].xy, -src0.zz_, src1.yy_, src1.xz_ MAD temp[0].w, src0.w, src2.x, src2.y 9: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = input[0], src1.w = temp[5], src2.xyz = temp[3] MAD temp[0].xy, src0.ww_, src0.xy_, src1.xy_ CMP temp[0].w, src0.0, src1.w, -src2.x 10: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[4], src2.w = const[0] MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[1].w, src1.w, src2.w, src0.0 11: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[2].w, src0._, src0._ 12: src0.xyz = temp[3], src0.w = 8.000000 (0x50), src1.w = temp[0] MAD_SAT temp[0].w, -src0.w, src0.y, src1.w 13: src0.w = temp[2] RSQ temp[2].w, |src0.w| 14: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[2].xyz, src0.www, srcp.xyz, src0.000 15: src0.w = temp[1] MAD color[0].w, src0.w, src0.1, src0.0 16: BEGIN_TEX; 17: TEX temp[3].xyz, temp[0].xy__, 2D[3]; 18: TEX temp[5], temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 19: src0.xyz = temp[3], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[3].xyz, src0.xyz, src0.www, -src0.111 20: src0.xyz = temp[3], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 21: src0.xyz = temp[2], src1.xyz = temp[3] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 22: src0.xyz = temp[5], src0.w = temp[2], src1.xyz = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[2].w, src0.w 23: src0.xyz = temp[4], src0.w = temp[1], src1.w = temp[2], src2.w = 32.000000 (0x60) MAD temp[2].xyz, src0.xyz, src0.www, src0.000 MAD temp[1].w, src1.w, src2.w, src0.0 24: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[5] MAX 
temp[2].xyz, src0.xyz, src1.xyz EX2 temp[1].w, src0.w 25: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[6], src1.w = temp[5] MAD_SAT temp[3].xyz, src0.xyz, src0.111, -src1.xyz MAD temp[1].w, src0.w, src1.w, src0.0 26: src0.xyz = temp[3], src0.w = temp[0], src1.xyz = temp[2] MAD temp[2].xyz, src0.www, -src0.xyz, src1.xyz 27: src0.xyz = const[4], src0.w = temp[1], src1.xyz = temp[1] MAD temp[1].xyz, src0.www, src0.xyz, src1.xyz 28: src0.xyz = temp[1], src1.xyz = temp[2] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src0.000 29: src0.xyz = const[8], src1.xyz = temp[1], src2.xyz = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.zzz, srcp.xyz, src0.xyz R500 Fragment Program: -------- 0 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00086000:ALU wmask: AB omask: NONE 1:RGB_ADDR 0x08041c04:Addr0: 4t, Addr1: 7c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122090:rgb_A_src:0 0/0/R 0 rgb_B_src:1 0/0/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00191000:MAD dest:0 rgb_C_src:1 0/0/G 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 
op:LD, , SCALED 2:TEX_ADDR: 0xe404f401: src: 1 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004003:TEX wmask: A omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02470000: id: 7 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe405f403: src: 3 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 7 0:CMN_INST 0x00085804:ALU TEX_WAIT wmask: ARG omask: NONE 1:RGB_ADDR 0x10201403:Addr0: 3t, Addr1: 5t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084ac48:rgb_A_src:0 B/B/0 1 rgb_B_src:1 G/G/0 0 targ: 0 4 ALPHA_INST:0x0010c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:2 R 0 targ 0 w:0 5 RGBA_INST: 0x0c441030:MAD dest:3 rgb_C_src:1 R/B/0 0 alp_C_src:2 G 0 8 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x00300002:Addr0: 2t, Addr1: 0t, Addr2: 3t, srcp:0 2:ALPHA_ADDR 0x08001400:Addr0: 0t, Addr1: 5t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084046c:rgb_A_src:0 A/A/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x00690006:CMP dest:0 alp_A_src:0 0 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x44421000:MAD dest:0 rgb_C_src:1 R/G/0 0 alp_C_src:2 R 1 9 0:CMN_INST 0x00107a00:ALU NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x100010c0:Addr0: 192t, Addr1: 4t, Addr2: 0c, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 0 0 10 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, 
Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000191:DP3 dest:25 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080000d0:Addr0: 208t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0022c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 G 0 targ 0 w:0 5 RGBA_INST: 0x1a000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:1 A 0 12 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c02b:RSQ dest:2 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00040001:OUT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 15 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe403f400: src: 
0 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 16 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 17 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8030:MAD dest:3 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 18 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000403:Addr0: 3t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000221:DP3 dest:34 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000c02:Addr0: 2t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001c1:DP3 dest:28 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040005:Addr0: 5t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c029:LN2 dest:2 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 
0x0e000801:Addr0: 1t, Addr1: 2t, Addr2: 224t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 22 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041402:Addr0: 2t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c018:EX2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000025:MAX dest:2 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00087800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041802:Addr0: 2t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001401:Addr0: 1t, Addr1: 5t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20a21030:MAD dest:3 rgb_C_src:1 R/G/B 1 alp_C_src:0 0 0 24 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000803:Addr0: 3t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0144036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 1 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 25 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000504:Addr0: 4c, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 26 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 
128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 27 0:CMN_INST 0x00038005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x40000508:Addr0: 8c, Addr1: 1t, Addr2: 0t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 28 Instructions ~ 19 Vector Instructions (RGB) ~ 15 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 6 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 6 Temporary Registers ~ 4 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..7] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], 
input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, 
input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 
output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 
0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 
0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 
0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 18 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[4] DCL CONST[0] DCL CONST[2] DCL CONST[4..8] DCL TEMP[0..8] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 32.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4].w, IN[1], SAMP[3], 2D 8: MAD TEMP[4].w, TEMP[4].wwww, CONST[2].xxxx, CONST[2].yyyy 9: MAD TEMP[5].xy, TEMP[4].wwww, TEMP[3], IN[1] 10: TEX TEMP[4], TEMP[5], SAMP[0], 2D 11: MUL TEMP[4].xyz, TEMP[4], CONST[0] 12: TEX TEMP[6].xyz, 
TEMP[5], SAMP[3], 2D 13: MAD TEMP[6].xyz, TEMP[6], IMM[0].xxxx, -IMM[0].yyyy 14: ADD TEMP[7].xyz, TEMP[3], TEMP[2] 15: DP3 TEMP[7].w, TEMP[7], TEMP[7] 16: RSQ TEMP[7].w, |TEMP[7].wwww| 17: MUL TEMP[7].xyz, TEMP[7].wwww, TEMP[7] 18: DP3_SAT TEMP[7].w, TEMP[7], TEMP[6] 19: POW TEMP[7].w, TEMP[7].wwww, IMM[0].wwww 20: MUL TEMP[7].w, TEMP[7], TEMP[4] 21: MAD TEMP[4].xyz, TEMP[7].wwww, CONST[4], TEMP[4] 22: DP3_SAT TEMP[2].w, TEMP[6], TEMP[2] 23: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 24: MAX TEMP[0].xyz, TEMP[0], CONST[5] 25: MUL TEMP[4].xyz, TEMP[4], TEMP[0] 26: TEX TEMP[0].xyz, TEMP[5], SAMP[4], 2D 27: MAD_SAT TEMP[1].xyz, TEMP[0], CONST[6], TEMP[4] 28: MAD_SAT TEMP[8].x, IN[0].xxxx, CONST[7].xxxx, CONST[7].yyyy 29: LRP OUT[0].xyz, TEMP[8].xxxx, TEMP[1], CONST[8] 30: MOV OUT[0].w, TEMP[1] 31: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[9].wwww; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, 
temp[5], 2D[4]; 27: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 28: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 29: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 30: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[9].wwww; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 28: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 29: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 30: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX 
temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[9].wwww; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 28: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 29: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 30: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[9].wwww; 20: MUL temp[7].w, temp[7], 
temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 28: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 29: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 30: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[9].wwww; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 28: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 29: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 30: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT 
temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[9].wwww; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 28: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 29: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 30: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 14: ADD temp[7].xyz, 
temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: LG2 temp[9].w, temp[7].wwww; 20: MUL temp[9].w, temp[9].wwww, const[9].wwww; 21: EX2 temp[7].w, temp[9].wwww; 22: MUL temp[7].w, temp[7], temp[4]; 23: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 24: DP3_SAT temp[2].w, temp[6], temp[2]; 25: MUL temp[0].xyz, temp[0], temp[2].wwww; 26: MAX temp[0].xyz, temp[0], const[5]; 27: MUL temp[4].xyz, temp[4], temp[0]; 28: TEX temp[0].xyz, temp[5], 2D[4]; 29: MAD_SAT temp[1].xyz, temp[0], const[6], temp[4]; 30: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 31: ADD temp[10].xyz, temp[1], -const[8]; 32: MAD output[0].xyz, temp[8].xxxx, temp[10], const[8]; 33: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4].w, input[1].xy__, 2D[3]; 8: MAD temp[4].w, temp[4].___w, const[2].___x, const[2].___y; 9: MAD temp[5].xy, temp[4].ww__, temp[3].xy__, input[1].xy__; 10: TEX temp[4], temp[5].xy__, 2D[0]; 11: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 12: TEX temp[6].xyz, temp[5].xy__, 2D[3]; 13: MAD temp[6].xyz, temp[6].xyz_, const[9].xxx_, -const[9].yyy_; 14: ADD temp[7].xyz, temp[3].xyz_, temp[2].xyz_; 15: DP3 temp[7].w, temp[7].xyz_, temp[7].xyz_; 16: RSQ temp[7].w, |temp[7].___w|; 17: MUL temp[7].xyz, temp[7].www_, temp[7].xyz_; 18: DP3_SAT temp[7].w, temp[7].xyz_, temp[6].xyz_; 19: LG2 temp[9].w, temp[7].___w; 20: MUL temp[9].w, temp[9].___w, const[9].___w; 21: EX2 temp[7].w, temp[9].___w; 22: MUL temp[7].w, temp[7].___w, temp[4].___w; 23: MAD temp[4].xyz, 
temp[7].www_, const[4].xyz_, temp[4].xyz_; 24: DP3_SAT temp[2].w, temp[6].xyz_, temp[2].xyz_; 25: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 26: MAX temp[0].xyz, temp[0].xyz_, const[5].xyz_; 27: MUL temp[4].xyz, temp[4].xyz_, temp[0].xyz_; 28: TEX temp[0].xyz, temp[5].xy__, 2D[4]; 29: MAD_SAT temp[1].xyz, temp[0].xyz_, const[6].xyz_, temp[4].xyz_; 30: MAD_SAT temp[8].x, input[0].x___, const[7].x___, const[7].y___; 31: ADD temp[10].xyz, temp[1].xyz_, -const[8].xyz_; 32: MAD output[0].xyz, temp[8].xxx_, temp[10].xyz_, const[8].xyz_; 33: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, const[9].xxx_, -const[9].yyy_; 14: ADD temp[25].xyz, temp[17].xyz_, temp[14].xyz_; 15: DP3 temp[26].w, temp[25].xyz_, temp[25].xyz_; 16: RSQ temp[27].w, |temp[26].___w|; 17: MUL temp[28].xyz, temp[27].www_, temp[25].xyz_; 18: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 19: LG2 temp[30].w, temp[29].___w; 20: MUL temp[31].w, temp[30].___w, const[9].___w; 21: EX2 temp[32].w, temp[31].___w; 22: MUL temp[33].w, temp[32].___w, temp[21].___w; 23: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 24: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 25: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 26: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 27: 
MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 28: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 29: MAD_SAT temp[40].xyz, temp[39].xyz_, const[6].xyz_, temp[38].xyz_; 30: MAD_SAT temp[41].x, input[0].x___, const[7].x___, const[7].y___; 31: ADD temp[42].xyz, temp[40].xyz_, -const[8].xyz_; 32: MAD output[0].xyz, temp[41].xxx_, temp[42].xyz_, const[8].xyz_; 33: MOV output[0].w, temp[12].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, const[9].xxx_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, const[9].xxx_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, const[9].___w; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MAD_SAT temp[40].xyz, temp[39].xyz_, const[6].xyz_, temp[38].xyz_; 29: MAD_SAT temp[41].x, input[0].x___, 
const[7].x___, const[7].y___; 30: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[8]).xyz_, const[8].xyz_; 31: MOV output[0].w, temp[12].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, 32.000000 (0x60).___w; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MAD_SAT temp[40].xyz, temp[39].xyz_, const[6].xyz_, temp[38].xyz_; 29: MAD_SAT temp[41].x, input[0].x___, const[7].x___, const[7].y___; 30: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[8]).xyz_, const[8].xyz_; 31: MOV output[0].w, temp[12].___w; CONST[9] = { 2.0000 1.0000 0.0000 32.0000 } Fragment Program: after 'dataflow 
swizzles' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, 32.000000 (0x60).___w; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MAD_SAT temp[40].xyz, temp[39].xyz_, const[6].xyz_, temp[38].xyz_; 29: MAD_SAT temp[41].x, input[0].x___, const[7].x___, const[7].y___; 30: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[8]).xyz_, const[8].xyz_; 31: MOV output[0].w, temp[12].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 
(0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, 32.000000 (0x60).___w; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MAD_SAT temp[40].xyz, temp[39].xyz_, const[6].xyz_, temp[38].xyz_; 29: MAD_SAT temp[41].x, input[0].x___, const[7].x___, const[7].y___; 30: MAD output[0].xyz, temp[41].xxx_, (temp[40] - const[8]).xyz_, const[8].xyz_; 31: MOV output[0].w, temp[12].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: src0.w = temp[11], src1.w = const[0] MAD_SAT temp[12].w, src0.w, src1.w, src0.0 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[13], src0.w = 2.000000 (0x40) MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 5: src0.w = temp[15] RSQ temp[16].w, |src0.w| 6: src0.xyz = 
input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: src0.xyz = const[2], src0.w = temp[18] MAD temp[19].w, src0.w, src0.x, src0.y 9: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = input[1] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: src0.xyz = temp[21], src1.xyz = const[0] MAD temp[22].xyz, src0.xyz, src1.xyz, src0.000 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: src0.xyz = temp[23], src0.w = 2.000000 (0x40) MAD temp[24].xyz, src0.xyz, src0.www, -src0.111 14: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[26].w, src0._, src0._ 15: src0.w = temp[26] RSQ temp[27].w, |src0.w| 16: src0.xyz = temp[17], src0.w = temp[27], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[28].xyz, src0.www, srcp.xyz, src0.000 17: src0.xyz = temp[28], src1.xyz = temp[24] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[29].w, src0._, src0._ 18: src0.w = temp[29] LG2 temp[30].w, src0.w 19: src0.w = temp[30], src1.w = 32.000000 (0x60) MAD temp[31].w, src0.w, src1.w, src0.0 20: src0.w = temp[31] EX2 temp[32].w, src0.w 21: src0.w = temp[32], src1.w = temp[21] MAD temp[33].w, src0.w, src1.w, src0.0 22: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[22] MAD temp[34].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[24], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 24: src0.xyz = temp[11], src0.w = temp[35] MAD temp[36].xyz, src0.xyz, src0.www, src0.000 25: src0.xyz = temp[36], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz 26: src0.xyz = temp[34], src1.xyz = temp[37] MAD temp[38].xyz, src0.xyz, src1.xyz, src0.000 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: src0.xyz = temp[39], src1.xyz = const[6], src2.xyz = temp[38] MAD_SAT temp[40].xyz, src0.xyz, src1.xyz, src2.xyz 29: src0.xyz = input[0], src1.xyz = const[7] MAD_SAT temp[41].x, src0.x__, src1.x__, src1.y__ 30: 
src0.xyz = const[8], src1.xyz = temp[40], src2.xyz = temp[41], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 31: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[15], src1.xyz = const[7] MAD_SAT temp[41].x, src0.x__, src1.x__, src1.y__ RSQ temp[16].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[11], input[2].xy__, 2D[1]; 5: TEX temp[13].xyz, input[2].xy__, 2D[2]; 6: TEX temp[18].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[13], src0.w = 2.000000 (0x40), src1.w = temp[11], src2.w = const[0] SEM_WAIT MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[12].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[26].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[18] MAD temp[19].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = input[1], src1.w = temp[26] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[27].w, |src1.w| 11: src0.xyz = temp[17], src0.w = temp[27], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[28].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 15: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 16: TEX temp[21], temp[20].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 17: src0.xyz = temp[23], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[24].xyz, src0.xyz, src0.www, -src0.111 18: src0.xyz = temp[24], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 19: src0.xyz = temp[28], src1.xyz = temp[24] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[29].w, src0._, src0._ 20: src0.xyz = temp[21], src0.w = 
temp[29], src1.xyz = const[0] MAD temp[22].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[30].w, src0.w 21: src0.xyz = temp[11], src0.w = temp[35], src1.w = temp[30], src2.w = 32.000000 (0x60) MAD temp[36].xyz, src0.xyz, src0.www, src0.000 MAD temp[31].w, src1.w, src2.w, src0.0 22: src0.xyz = temp[36], src0.w = temp[31], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz EX2 temp[32].w, src0.w 23: src0.w = temp[32], src1.w = temp[21] MAD temp[33].w, src0.w, src1.w, src0.0 24: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[22] MAD temp[34].xyz, src0.www, src0.xyz, src1.xyz 25: src0.xyz = temp[34], src1.xyz = temp[37] MAD temp[38].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = temp[39], src1.xyz = const[6], src2.xyz = temp[38] MAD_SAT temp[40].xyz, src0.xyz, src1.xyz, src2.xyz 27: src0.xyz = const[8], src1.xyz = temp[40], src2.xyz = temp[41], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[15], src1.xyz = const[7] MAD_SAT temp[41].x, src0.x__, src1.x__, src1.y__ RSQ temp[16].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[11], input[2].xy__, 2D[1]; 5: TEX temp[13].xyz, input[2].xy__, 2D[2]; 6: TEX temp[18].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[13], src0.w = 2.000000 (0x40), src1.w = temp[11], src2.w = const[0] SEM_WAIT MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[12].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[26].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[18] MAD temp[19].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = input[1], src1.w = temp[26] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ RSQ 
temp[27].w, |src1.w| 11: src0.xyz = temp[17], src0.w = temp[27], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[28].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 15: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 16: TEX temp[21], temp[20].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 17: src0.xyz = temp[23], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[24].xyz, src0.xyz, src0.www, -src0.111 18: src0.xyz = temp[24], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 19: src0.xyz = temp[28], src1.xyz = temp[24] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[29].w, src0._, src0._ 20: src0.xyz = temp[21], src0.w = temp[29], src1.xyz = const[0] MAD temp[22].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[30].w, src0.w 21: src0.xyz = temp[11], src0.w = temp[35], src1.w = temp[30], src2.w = 32.000000 (0x60) MAD temp[36].xyz, src0.xyz, src0.www, src0.000 MAD temp[31].w, src1.w, src2.w, src0.0 22: src0.xyz = temp[36], src0.w = temp[31], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz EX2 temp[32].w, src0.w 23: src0.w = temp[32], src1.w = temp[21] MAD temp[33].w, src0.w, src1.w, src0.0 24: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[22] MAD temp[34].xyz, src0.www, src0.xyz, src1.xyz 25: src0.xyz = temp[34], src1.xyz = temp[37] MAD temp[38].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = temp[39], src1.xyz = const[6], src2.xyz = temp[38] MAD_SAT temp[40].xyz, src0.xyz, src1.xyz, src2.xyz 27: src0.xyz = const[8], src1.xyz = temp[40], src2.xyz = temp[41], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 1: src0.xyz = input[3], src0.w = temp[0], src1.xyz = const[7] MAD_SAT temp[0].z, src0.__x, src1.__x, src1.__y RSQ temp[0].w, |src0.w| 2: src0.xyz = input[2], src0.w = 
temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[3], input[1].xy__, 2D[1]; 5: TEX temp[1].xyz, input[1].xy__, 2D[2]; 6: TEX temp[0].w, input[0].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[3], src2.w = const[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[1].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[2].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[0] MAD temp[0].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = input[0], src1.w = temp[2] MAD temp[0].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[0].w, |src1.w| 11: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[2].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[1] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[4].xyz, temp[0].xy__, 2D[4]; 15: TEX temp[5].xyz, temp[0].xy__, 2D[3]; 16: TEX temp[6], temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 17: src0.xyz = temp[5], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[5].xyz, src0.xyz, src0.www, -src0.111 18: src0.xyz = temp[5], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[0].w, src0._, src0._ 19: src0.xyz = temp[2], src1.xyz = temp[5] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 20: src0.xyz = temp[6], src0.w = temp[1], src1.xyz = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[1].w, src0.w 21: src0.xyz = temp[3], src0.w = temp[0], src1.w = temp[1], src2.w = 32.000000 (0x60) MAD temp[2].xyz, src0.xyz, src0.www, src0.000 MAD temp[0].w, src1.w, src2.w, src0.0 22: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = const[5] MAX temp[2].xyz, src0.xyz, src1.xyz EX2 temp[0].w, src0.w 23: src0.w = temp[0], src1.w = temp[6] MAD temp[0].w, src0.w, src1.w, src0.0 24: src0.xyz = const[4], src0.w = temp[0], src1.xyz = temp[1] MAD temp[1].xyz, src0.www, 
src0.xyz, src1.xyz 25: src0.xyz = temp[1], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = temp[4], src1.xyz = const[6], src2.xyz = temp[1] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 27: src0.xyz = const[8], src1.xyz = temp[1], src2.xyz = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.zzz, srcp.xyz, src0.xyz R500 Fragment Program: -------- 0 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000f1:DP3 dest:15 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00086000:ALU wmask: AB omask: NONE 1:RGB_ADDR 0x08041c03:Addr0: 3t, Addr1: 7c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122090:rgb_A_src:0 0/0/R 0 rgb_B_src:1 0/0/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00191000:MAD dest:0 rgb_C_src:1 0/0/G 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe403f401: src: 1 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A 
omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00107a04:ALU TEX_WAIT NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x10000cc0:Addr0: 192t, Addr1: 3t, Addr2: 0c, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 0 0 7 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001a1:DP3 dest:26 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020102:Addr0: 2c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x08000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 G 0 9 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000800:Addr0: 0t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084046c:rgb_A_src:0 A/A/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00421000:MAD dest:0 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 
rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00040001:OUT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 12 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00440000: id: 4 op:LD, , SCALED 2:TEX_ADDR: 0xe404f400: src: 0 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 13 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 14 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe406f400: src: 0 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 15 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8050:MAD dest:5 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 16 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000405:Addr0: 5t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000231:DP3 dest:35 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 
1:RGB_ADDR 0x08001402:Addr0: 2t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001d1:DP3 dest:29 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040006:Addr0: 6t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c019:LN2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0e000400:Addr0: 0t, Addr1: 1t, Addr2: 224t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d000:MAD dest:0 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 20 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041402:Addr0: 2t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000025:MAX dest:2 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001800:Addr0: 0t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 22 0:CMN_INST 
0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000504:Addr0: 4c, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00141804:Addr0: 4t, Addr1: 6c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222010:MAD dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 25 0:CMN_INST 0x00038005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x40000508:Addr0: 8c, Addr1: 1t, Addr2: 0t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 26 Instructions ~ 17 Vector Instructions (RGB) ~ 13 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 6 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 7 Temporary Registers ~ 3 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL 
OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL OUT[5], GENERIC[3] DCL CONST[0] DCL CONST[2..10] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[7] 2: MAD TEMP[0], IN[0].yyyy, CONST[8], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[9], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[10], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: DP4 OUT[5].x, CONST[4], IN[0] 16: DP4 OUT[5].y, CONST[5], IN[0] 17: DP4 OUT[5].z, CONST[6], IN[0] 18: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, 
const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: DP4 output[5].x, const[4], input[0]; 16: DP4 output[5].y, const[5], input[0]; 17: DP4 output[5].z, const[6], input[0]; 18: MOV output[0], temp[4]; 19: MOV output[6], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[11].xxxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[11].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[11]._xxy; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], 
input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -const[11].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, 
const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[4], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[4]; 20: MOV output[6], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; CONST[11] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler 
Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV output[6], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[7]; 2: MAD temp[0], input[0].yyyy, const[8], temp[0]; 3: MAD temp[0], input[0].zzzz, const[9], temp[0]; 4: MAD temp[0], input[0].wwww, const[10], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[11].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[11].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: DP4 output[5].x, const[4], input[0]; 17: DP4 output[5].y, const[5], input[0]; 18: DP4 output[5].z, const[6], input[0]; 19: MOV output[0], temp[0]; 20: MOV 
output[6], temp[0]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x012480e2 reg: 7c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4162 reg: 11c swiz: Z/ Z/ U/ U src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 7: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6162 reg: 11c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ 
U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 17: op: 0x00208201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 18: op: 0x00408201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 19: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 20: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 21 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL IN[4], 
GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[4] DCL SAMP[7] DCL CONST[0] DCL CONST[2] DCL CONST[4..9] DCL TEMP[0..8] IMM[0] FLT32 { 2.0000, 1.0000, 0.0000, 8.0000} IMM[1] FLT32 { 32.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4].w, IN[1], SAMP[3], 2D 8: MAD TEMP[4].w, TEMP[4].wwww, CONST[2].xxxx, CONST[2].yyyy 9: MAD TEMP[5].xy, TEMP[4].wwww, TEMP[3], IN[1] 10: TEX TEMP[4], TEMP[5], SAMP[0], 2D 11: MUL TEMP[4].xyz, TEMP[4], CONST[0] 12: TEX TEMP[6].xyz, TEMP[5], SAMP[3], 2D 13: MAD TEMP[6].xyz, TEMP[6], IMM[0].xxxx, -IMM[0].yyyy 14: ADD TEMP[7].xyz, TEMP[3], TEMP[2] 15: DP3 TEMP[7].w, TEMP[7], TEMP[7] 16: RSQ TEMP[7].w, |TEMP[7].wwww| 17: MUL TEMP[7].xyz, TEMP[7].wwww, TEMP[7] 18: DP3_SAT TEMP[7].w, TEMP[7], TEMP[6] 19: POW TEMP[7].w, TEMP[7].wwww, IMM[1].xxxx 20: MUL TEMP[7].w, TEMP[7], TEMP[4] 21: MAD TEMP[4].xyz, TEMP[7].wwww, CONST[4], TEMP[4] 22: DP3_SAT TEMP[2].w, TEMP[6], TEMP[2] 23: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 24: MAX TEMP[0].xyz, TEMP[0], CONST[5] 25: TEX TEMP[2], IN[4], SAMP[7], 2D 26: MAD_SAT TEMP[3].xy, -IN[4].zzzz, TEMP[2].yyyy, TEMP[2].xzzz 27: CMP TEMP[2].w, -TEMP[3].xxxx, TEMP[2].wwww, IMM[0].zzzz 28: MAD_SAT TEMP[2].w, -IMM[0].wwww, TEMP[3].yyyy, TEMP[2].wwww 29: SUB_SAT TEMP[3].xyz, TEMP[0], CONST[6] 30: MAD TEMP[0].xyz, TEMP[2].wwww, -TEMP[3], TEMP[0] 31: MUL TEMP[4].xyz, TEMP[4], TEMP[0] 32: TEX TEMP[0].xyz, TEMP[5], SAMP[4], 2D 33: MAD_SAT TEMP[1].xyz, TEMP[0], CONST[7], TEMP[4] 34: MAD_SAT TEMP[8].x, IN[0].xxxx, CONST[8].xxxx, CONST[8].yyyy 35: LRP OUT[0].xyz, TEMP[8].xxxx, TEMP[1], CONST[9] 36: MOV OUT[0].w, TEMP[1] 37: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 
2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[10].xxxx, -const[10].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[10].xxxx, -const[10].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[11].xxxx; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[10].zzzz; 28: MAD_SAT temp[2].w, -const[10].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL temp[4].xyz, temp[4], temp[0]; 32: TEX temp[0].xyz, temp[5], 2D[4]; 33: MAD_SAT temp[1].xyz, temp[0], const[7], temp[4]; 34: MAD_SAT temp[8].x, input[0].xxxx, const[8].xxxx, const[8].yyyy; 35: LRP output[0].xyz, temp[8].xxxx, temp[1], const[9]; 36: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[10].xxxx, -const[10].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, 
temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[10].xxxx, -const[10].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[11].xxxx; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[10].zzzz; 28: MAD_SAT temp[2].w, -const[10].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL temp[4].xyz, temp[4], temp[0]; 32: TEX temp[0].xyz, temp[5], 2D[4]; 33: MAD_SAT temp[1].xyz, temp[0], const[7], temp[4]; 34: MAD_SAT temp[8].x, input[0].xxxx, const[8].xxxx, const[8].yyyy; 35: LRP output[0].xyz, temp[8].xxxx, temp[1], const[9]; 36: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[10].xxxx, -const[10].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: 
TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[10].xxxx, -const[10].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[11].xxxx; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[10].zzzz; 28: MAD_SAT temp[2].w, -const[10].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL temp[4].xyz, temp[4], temp[0]; 32: TEX temp[0].xyz, temp[5], 2D[4]; 33: MAD_SAT temp[1].xyz, temp[0], const[7], temp[4]; 34: MAD_SAT temp[8].x, input[0].xxxx, const[8].xxxx, const[8].yyyy; 35: LRP output[0].xyz, temp[8].xxxx, temp[1], const[9]; 36: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[10].xxxx, -const[10].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[10].xxxx, -const[10].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 
18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[11].xxxx; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[10].zzzz; 28: MAD_SAT temp[2].w, -const[10].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL temp[4].xyz, temp[4], temp[0]; 32: TEX temp[0].xyz, temp[5], 2D[4]; 33: MAD_SAT temp[1].xyz, temp[0], const[7], temp[4]; 34: MAD_SAT temp[8].x, input[0].xxxx, const[8].xxxx, const[8].yyyy; 35: LRP output[0].xyz, temp[8].xxxx, temp[1], const[9]; 36: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[10].xxxx, -const[10].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[10].xxxx, -const[10].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[11].xxxx; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, 
temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[10].zzzz; 28: MAD_SAT temp[2].w, -const[10].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL temp[4].xyz, temp[4], temp[0]; 32: TEX temp[0].xyz, temp[5], 2D[4]; 33: MAD_SAT temp[1].xyz, temp[0], const[7], temp[4]; 34: MAD_SAT temp[8].x, input[0].xxxx, const[8].xxxx, const[8].yyyy; 35: LRP output[0].xyz, temp[8].xxxx, temp[1], const[9]; 36: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[10].xxxx, -const[10].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[10].xxxx, -const[10].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: POW temp[7].w, temp[7].wwww, const[11].xxxx; 20: MUL temp[7].w, temp[7], temp[4]; 21: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: TEX temp[2], input[4], 2D[7]; 26: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 27: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[10].zzzz; 28: MAD_SAT temp[2].w, 
-const[10].wwww, temp[3].yyyy, temp[2].wwww; 29: SUB_SAT temp[3].xyz, temp[0], const[6]; 30: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 31: MUL temp[4].xyz, temp[4], temp[0]; 32: TEX temp[0].xyz, temp[5], 2D[4]; 33: MAD_SAT temp[1].xyz, temp[0], const[7], temp[4]; 34: MAD_SAT temp[8].x, input[0].xxxx, const[8].xxxx, const[8].yyyy; 35: LRP output[0].xyz, temp[8].xxxx, temp[1], const[9]; 36: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[10].xxxx, -const[10].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4], temp[5], 2D[0]; 11: MUL temp[4].xyz, temp[4], const[0]; 12: TEX temp[6].xyz, temp[5], 2D[3]; 13: MAD temp[6].xyz, temp[6], const[10].xxxx, -const[10].yyyy; 14: ADD temp[7].xyz, temp[3], temp[2]; 15: DP3 temp[7].w, temp[7], temp[7]; 16: RSQ temp[7].w, |temp[7].wwww|; 17: MUL temp[7].xyz, temp[7].wwww, temp[7]; 18: DP3_SAT temp[7].w, temp[7], temp[6]; 19: LG2 temp[9].w, temp[7].wwww; 20: MUL temp[9].w, temp[9].wwww, const[11].xxxx; 21: EX2 temp[7].w, temp[9].wwww; 22: MUL temp[7].w, temp[7], temp[4]; 23: MAD temp[4].xyz, temp[7].wwww, const[4], temp[4]; 24: DP3_SAT temp[2].w, temp[6], temp[2]; 25: MUL temp[0].xyz, temp[0], temp[2].wwww; 26: MAX temp[0].xyz, temp[0], const[5]; 27: TEX temp[2], input[4], 2D[7]; 28: MAD_SAT temp[3].xy, -input[4].zzzz, temp[2].yyyy, temp[2].xzzz; 29: CMP temp[2].w, -temp[3].xxxx, temp[2].wwww, const[10].zzzz; 30: MAD_SAT temp[2].w, -const[10].wwww, temp[3].yyyy, temp[2].wwww; 31: ADD_SAT temp[3].xyz, temp[0], -const[6]; 32: MAD temp[0].xyz, temp[2].wwww, -temp[3], temp[0]; 33: MUL temp[4].xyz, temp[4], temp[0]; 
34: TEX temp[0].xyz, temp[5], 2D[4]; 35: MAD_SAT temp[1].xyz, temp[0], const[7], temp[4]; 36: MAD_SAT temp[8].x, input[0].xxxx, const[8].xxxx, const[8].yyyy; 37: ADD temp[10].xyz, temp[1], -const[9]; 38: MAD output[0].xyz, temp[8].xxxx, temp[10], const[9]; 39: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[10].xxx_, -const[10].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4].w, input[1].xy__, 2D[3]; 8: MAD temp[4].w, temp[4].___w, const[2].___x, const[2].___y; 9: MAD temp[5].xy, temp[4].ww__, temp[3].xy__, input[1].xy__; 10: TEX temp[4], temp[5].xy__, 2D[0]; 11: MUL temp[4].xyz, temp[4].xyz_, const[0].xyz_; 12: TEX temp[6].xyz, temp[5].xy__, 2D[3]; 13: MAD temp[6].xyz, temp[6].xyz_, const[10].xxx_, -const[10].yyy_; 14: ADD temp[7].xyz, temp[3].xyz_, temp[2].xyz_; 15: DP3 temp[7].w, temp[7].xyz_, temp[7].xyz_; 16: RSQ temp[7].w, |temp[7].___w|; 17: MUL temp[7].xyz, temp[7].www_, temp[7].xyz_; 18: DP3_SAT temp[7].w, temp[7].xyz_, temp[6].xyz_; 19: LG2 temp[9].w, temp[7].___w; 20: MUL temp[9].w, temp[9].___w, const[11].___x; 21: EX2 temp[7].w, temp[9].___w; 22: MUL temp[7].w, temp[7].___w, temp[4].___w; 23: MAD temp[4].xyz, temp[7].www_, const[4].xyz_, temp[4].xyz_; 24: DP3_SAT temp[2].w, temp[6].xyz_, temp[2].xyz_; 25: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 26: MAX temp[0].xyz, temp[0].xyz_, const[5].xyz_; 27: TEX temp[2], input[4].xy__, 2D[7]; 28: MAD_SAT temp[3].xy, -input[4].zz__, temp[2].yy__, temp[2].xz__; 29: CMP temp[2].w, -temp[3].___x, temp[2].___w, const[10].___z; 30: MAD_SAT temp[2].w, -const[10].___w, temp[3].___y, temp[2].___w; 31: ADD_SAT temp[3].xyz, temp[0].xyz_, -const[6].xyz_; 32: MAD temp[0].xyz, temp[2].www_, -temp[3].xyz_, 
temp[0].xyz_; 33: MUL temp[4].xyz, temp[4].xyz_, temp[0].xyz_; 34: TEX temp[0].xyz, temp[5].xy__, 2D[4]; 35: MAD_SAT temp[1].xyz, temp[0].xyz_, const[7].xyz_, temp[4].xyz_; 36: MAD_SAT temp[8].x, input[0].x___, const[8].x___, const[8].y___; 37: ADD temp[10].xyz, temp[1].xyz_, -const[9].xyz_; 38: MAD output[0].xyz, temp[8].xxx_, temp[10].xyz_, const[9].xyz_; 39: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, const[10].xxx_, -const[10].yyy_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, const[10].xxx_, -const[10].yyy_; 14: ADD temp[25].xyz, temp[17].xyz_, temp[14].xyz_; 15: DP3 temp[26].w, temp[25].xyz_, temp[25].xyz_; 16: RSQ temp[27].w, |temp[26].___w|; 17: MUL temp[28].xyz, temp[27].www_, temp[25].xyz_; 18: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 19: LG2 temp[30].w, temp[29].___w; 20: MUL temp[31].w, temp[30].___w, const[11].___x; 21: EX2 temp[32].w, temp[31].___w; 22: MUL temp[33].w, temp[32].___w, temp[21].___w; 23: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 24: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 25: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 26: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 27: TEX temp[38], input[4].xy__, 2D[7]; 28: MAD_SAT temp[39].xy, -input[4].zz__, temp[38].yy__, temp[38].xz__; 29: CMP temp[40].w, -temp[39].___x, temp[38].___w, const[10].___z; 
30: MAD_SAT temp[41].w, -const[10].___w, temp[39].___y, temp[40].___w; 31: ADD_SAT temp[42].xyz, temp[37].xyz_, -const[6].xyz_; 32: MAD temp[43].xyz, temp[41].www_, -temp[42].xyz_, temp[37].xyz_; 33: MUL temp[44].xyz, temp[34].xyz_, temp[43].xyz_; 34: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 35: MAD_SAT temp[46].xyz, temp[45].xyz_, const[7].xyz_, temp[44].xyz_; 36: MAD_SAT temp[47].x, input[0].x___, const[8].x___, const[8].y___; 37: ADD temp[48].xyz, temp[46].xyz_, -const[9].xyz_; 38: MAD output[0].xyz, temp[47].xxx_, temp[48].xyz_, const[9].xyz_; 39: MOV output[0].w, temp[12].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, const[10].xxx_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, const[10].xxx_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, const[11].___x; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: 
TEX temp[38], input[4].xy__, 2D[7]; 27: MAD_SAT temp[39].xy, -input[4].zz__, temp[38].yy__, temp[38].xz__; 28: CMP temp[40].w, -temp[39].___x, temp[38].___w, none.___0; 29: MAD_SAT temp[41].w, -const[10].___w, temp[39].___y, temp[40].___w; 30: ADD_SAT temp[42].xyz, temp[37].xyz_, -const[6].xyz_; 31: MAD temp[43].xyz, temp[41].www_, -temp[42].xyz_, temp[37].xyz_; 32: MUL temp[44].xyz, temp[34].xyz_, temp[43].xyz_; 33: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 34: MAD_SAT temp[46].xyz, temp[45].xyz_, const[7].xyz_, temp[44].xyz_; 35: MAD_SAT temp[47].x, input[0].x___, const[8].x___, const[8].y___; 36: MAD output[0].xyz, temp[47].xxx_, (temp[46] - const[9]).xyz_, const[9].xyz_; 37: MOV output[0].w, temp[12].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, 32.000000 (0x60).___w; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT 
temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: TEX temp[38], input[4].xy__, 2D[7]; 27: MAD_SAT temp[39].xy, -input[4].zz__, temp[38].yy__, temp[38].xz__; 28: CMP temp[40].w, -temp[39].___x, temp[38].___w, none.___0; 29: MAD_SAT temp[41].w, -8.000000 (0x50).___w, temp[39].___y, temp[40].___w; 30: ADD_SAT temp[42].xyz, temp[37].xyz_, -const[6].xyz_; 31: MAD temp[43].xyz, temp[41].www_, -temp[42].xyz_, temp[37].xyz_; 32: MUL temp[44].xyz, temp[34].xyz_, temp[43].xyz_; 33: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 34: MAD_SAT temp[46].xyz, temp[45].xyz_, const[7].xyz_, temp[44].xyz_; 35: MAD_SAT temp[47].x, input[0].x___, const[8].x___, const[8].y___; 36: MAD output[0].xyz, temp[47].xxx_, (temp[46] - const[9]).xyz_, const[9].xyz_; 37: MOV output[0].w, temp[12].___w; CONST[10] = { 2.0000 1.0000 0.0000 8.0000 } CONST[11] = { 32.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 
temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, 32.000000 (0x60).___w; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: TEX temp[38], input[4].xy__, 2D[7]; 27: MAD_SAT temp[39].xy, -input[4].zz__, temp[38].yy__, temp[38].xz__; 28: CMP temp[40].w, -temp[39].___x, temp[38].___w, none.___0; 29: MAD_SAT temp[41].w, -8.000000 (0x50).___w, temp[39].___y, temp[40].___w; 30: ADD_SAT temp[42].xyz, temp[37].xyz_, -const[6].xyz_; 31: MAD temp[43].xyz, temp[41].www_, -temp[42].xyz_, temp[37].xyz_; 32: MUL temp[44].xyz, temp[34].xyz_, temp[43].xyz_; 33: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 34: MAD_SAT temp[46].xyz, temp[45].xyz_, const[7].xyz_, temp[44].xyz_; 35: MAD_SAT temp[47].x, input[0].x___, const[8].x___, const[8].y___; 36: MAD output[0].xyz, temp[47].xxx_, (temp[46] - const[9]).xyz_, const[9].xyz_; 37: MOV output[0].w, temp[12].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: MUL temp[22].xyz, temp[21].xyz_, const[0].xyz_; 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: MAD temp[24].xyz, temp[23].xyz_, 2.000000 (0x40).www_, -none.111_; 14: DP3 temp[26].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 15: RSQ 
temp[27].w, |temp[26].___w|; 16: MUL temp[28].xyz, temp[27].www_, (temp[14] + temp[17]).xyz_; 17: DP3_SAT temp[29].w, temp[28].xyz_, temp[24].xyz_; 18: LG2 temp[30].w, temp[29].___w; 19: MUL temp[31].w, temp[30].___w, 32.000000 (0x60).___w; 20: EX2 temp[32].w, temp[31].___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MAD temp[34].xyz, temp[33].www_, const[4].xyz_, temp[22].xyz_; 23: DP3_SAT temp[35].w, temp[24].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: TEX temp[38], input[4].xy__, 2D[7]; 27: MAD_SAT temp[39].xy, -input[4].zz__, temp[38].yy__, temp[38].xz__; 28: CMP temp[40].w, -temp[39].___x, temp[38].___w, none.___0; 29: MAD_SAT temp[41].w, -8.000000 (0x50).___w, temp[39].___y, temp[40].___w; 30: ADD_SAT temp[42].xyz, temp[37].xyz_, -const[6].xyz_; 31: MAD temp[43].xyz, temp[41].www_, -temp[42].xyz_, temp[37].xyz_; 32: MUL temp[44].xyz, temp[34].xyz_, temp[43].xyz_; 33: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 34: MAD_SAT temp[46].xyz, temp[45].xyz_, const[7].xyz_, temp[44].xyz_; 35: MAD_SAT temp[47].x, input[0].x___, const[8].x___, const[8].y___; 36: MAD output[0].xyz, temp[47].xxx_, (temp[46] - const[9]).xyz_, const[9].xyz_; 37: MOV output[0].w, temp[12].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: src0.w = temp[11], src1.w = const[0] MAD_SAT temp[12].w, src0.w, src1.w, src0.0 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[13], src0.w = 2.000000 (0x40) MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 5: src0.w = temp[15] RSQ temp[16].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: src0.xyz = const[2], src0.w = temp[18] MAD temp[19].w, src0.w, src0.x, src0.y 9: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = 
input[1] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ 10: TEX temp[21], temp[20].xy__, 2D[0]; 11: src0.xyz = temp[21], src1.xyz = const[0] MAD temp[22].xyz, src0.xyz, src1.xyz, src0.000 12: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 13: src0.xyz = temp[23], src0.w = 2.000000 (0x40) MAD temp[24].xyz, src0.xyz, src0.www, -src0.111 14: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[26].w, src0._, src0._ 15: src0.w = temp[26] RSQ temp[27].w, |src0.w| 16: src0.xyz = temp[17], src0.w = temp[27], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[28].xyz, src0.www, srcp.xyz, src0.000 17: src0.xyz = temp[28], src1.xyz = temp[24] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[29].w, src0._, src0._ 18: src0.w = temp[29] LG2 temp[30].w, src0.w 19: src0.w = temp[30], src1.w = 32.000000 (0x60) MAD temp[31].w, src0.w, src1.w, src0.0 20: src0.w = temp[31] EX2 temp[32].w, src0.w 21: src0.w = temp[32], src1.w = temp[21] MAD temp[33].w, src0.w, src1.w, src0.0 22: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[22] MAD temp[34].xyz, src0.www, src0.xyz, src1.xyz 23: src0.xyz = temp[24], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 24: src0.xyz = temp[11], src0.w = temp[35] MAD temp[36].xyz, src0.xyz, src0.www, src0.000 25: src0.xyz = temp[36], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz 26: TEX temp[38], input[4].xy__, 2D[7]; 27: src0.xyz = input[4], src1.xyz = temp[38] MAD_SAT temp[39].xy, -src0.zz_, src1.yy_, src1.xz_ 28: src0.xyz = temp[39], src0.w = temp[38] CMP temp[40].w, src0.0, src0.w, -src0.x 29: src0.xyz = temp[39], src0.w = 8.000000 (0x50), src1.w = temp[40] MAD_SAT temp[41].w, -src0.w, src0.y, src1.w 30: src0.xyz = temp[37], src1.xyz = const[6] MAD_SAT temp[42].xyz, src0.xyz, src0.111, -src1.xyz 31: src0.xyz = temp[42], src0.w = temp[41], src1.xyz = temp[37] MAD temp[43].xyz, src0.www, -src0.xyz, src1.xyz 32: src0.xyz = temp[34], src1.xyz = temp[43] MAD temp[44].xyz, 
src0.xyz, src1.xyz, src0.000 33: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 34: src0.xyz = temp[45], src1.xyz = const[7], src2.xyz = temp[44] MAD_SAT temp[46].xyz, src0.xyz, src1.xyz, src2.xyz 35: src0.xyz = input[0], src1.xyz = const[8] MAD_SAT temp[47].x, src0.x__, src1.x__, src1.y__ 36: src0.xyz = const[9], src1.xyz = temp[46], src2.xyz = temp[47], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 37: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[15], src1.xyz = const[8] MAD_SAT temp[47].x, src0.x__, src1.x__, src1.y__ RSQ temp[16].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[11], input[2].xy__, 2D[1]; 5: TEX temp[13].xyz, input[2].xy__, 2D[2]; 6: TEX temp[18].w, input[1].xy__, 2D[3]; 7: TEX temp[38], input[4].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 8: src0.xyz = input[4], src0.w = temp[18], src1.xyz = temp[38], src2.xyz = const[2] SEM_WAIT MAD_SAT temp[39].xy, -src0.zz_, src1.yy_, src1.xz_ MAD temp[19].w, src0.w, src2.x, src2.y 9: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = input[1], src1.w = temp[38], src2.xyz = temp[39] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ CMP temp[40].w, src0.0, src1.w, -src2.x 10: src0.xyz = temp[13], src0.w = 2.000000 (0x40), src1.w = temp[11], src2.w = const[0] MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[12].w, src1.w, src2.w, src0.0 11: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[26].w, src0._, src0._ 12: src0.xyz = temp[39], src0.w = 8.000000 (0x50), src1.w = temp[40] MAD_SAT temp[41].w, -src0.w, src0.y, src1.w 13: src0.w = temp[26] RSQ temp[27].w, |src0.w| 14: src0.xyz = temp[17], src0.w = temp[27], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[28].xyz, 
src0.www, srcp.xyz, src0.000 15: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 16: BEGIN_TEX; 17: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 18: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 19: TEX temp[21], temp[20].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 20: src0.xyz = temp[23], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[24].xyz, src0.xyz, src0.www, -src0.111 21: src0.xyz = temp[24], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 22: src0.xyz = temp[28], src1.xyz = temp[24] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[29].w, src0._, src0._ 23: src0.xyz = temp[21], src0.w = temp[29], src1.xyz = const[0] MAD temp[22].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[30].w, src0.w 24: src0.xyz = temp[11], src0.w = temp[35], src1.w = temp[30], src2.w = 32.000000 (0x60) MAD temp[36].xyz, src0.xyz, src0.www, src0.000 MAD temp[31].w, src1.w, src2.w, src0.0 25: src0.xyz = temp[36], src0.w = temp[31], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz EX2 temp[32].w, src0.w 26: src0.xyz = temp[37], src0.w = temp[32], src1.xyz = const[6], src1.w = temp[21] MAD_SAT temp[42].xyz, src0.xyz, src0.111, -src1.xyz MAD temp[33].w, src0.w, src1.w, src0.0 27: src0.xyz = temp[42], src0.w = temp[41], src1.xyz = temp[37] MAD temp[43].xyz, src0.www, -src0.xyz, src1.xyz 28: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[22] MAD temp[34].xyz, src0.www, src0.xyz, src1.xyz 29: src0.xyz = temp[34], src1.xyz = temp[43] MAD temp[44].xyz, src0.xyz, src1.xyz, src0.000 30: src0.xyz = temp[45], src1.xyz = const[7], src2.xyz = temp[44] MAD_SAT temp[46].xyz, src0.xyz, src1.xyz, src2.xyz 31: src0.xyz = const[9], src1.xyz = temp[46], src2.xyz = temp[47], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[15], src1.xyz = const[8] MAD_SAT temp[47].x, src0.x__, 
src1.x__, src1.y__ RSQ temp[16].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[11], input[2].xy__, 2D[1]; 5: TEX temp[13].xyz, input[2].xy__, 2D[2]; 6: TEX temp[18].w, input[1].xy__, 2D[3]; 7: TEX temp[38], input[4].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 8: src0.xyz = input[4], src0.w = temp[18], src1.xyz = temp[38], src2.xyz = const[2] SEM_WAIT MAD_SAT temp[39].xy, -src0.zz_, src1.yy_, src1.xz_ MAD temp[19].w, src0.w, src2.x, src2.y 9: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = input[1], src1.w = temp[38], src2.xyz = temp[39] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ CMP temp[40].w, src0.0, src1.w, -src2.x 10: src0.xyz = temp[13], src0.w = 2.000000 (0x40), src1.w = temp[11], src2.w = const[0] MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[12].w, src1.w, src2.w, src0.0 11: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[26].w, src0._, src0._ 12: src0.xyz = temp[39], src0.w = 8.000000 (0x50), src1.w = temp[40] MAD_SAT temp[41].w, -src0.w, src0.y, src1.w 13: src0.w = temp[26] RSQ temp[27].w, |src0.w| 14: src0.xyz = temp[17], src0.w = temp[27], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[28].xyz, src0.www, srcp.xyz, src0.000 15: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 16: BEGIN_TEX; 17: TEX temp[45].xyz, temp[20].xy__, 2D[4]; 18: TEX temp[23].xyz, temp[20].xy__, 2D[3]; 19: TEX temp[21], temp[20].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 20: src0.xyz = temp[23], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[24].xyz, src0.xyz, src0.www, -src0.111 21: src0.xyz = temp[24], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 22: src0.xyz = temp[28], src1.xyz = temp[24] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[29].w, src0._, src0._ 23: src0.xyz = temp[21], src0.w = temp[29], src1.xyz = const[0] MAD temp[22].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[30].w, src0.w 24: src0.xyz 
= temp[11], src0.w = temp[35], src1.w = temp[30], src2.w = 32.000000 (0x60) MAD temp[36].xyz, src0.xyz, src0.www, src0.000 MAD temp[31].w, src1.w, src2.w, src0.0 25: src0.xyz = temp[36], src0.w = temp[31], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz EX2 temp[32].w, src0.w 26: src0.xyz = temp[37], src0.w = temp[32], src1.xyz = const[6], src1.w = temp[21] MAD_SAT temp[42].xyz, src0.xyz, src0.111, -src1.xyz MAD temp[33].w, src0.w, src1.w, src0.0 27: src0.xyz = temp[42], src0.w = temp[41], src1.xyz = temp[37] MAD temp[43].xyz, src0.www, -src0.xyz, src1.xyz 28: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[22] MAD temp[34].xyz, src0.www, src0.xyz, src1.xyz 29: src0.xyz = temp[34], src1.xyz = temp[43] MAD temp[44].xyz, src0.xyz, src1.xyz, src0.000 30: src0.xyz = temp[45], src1.xyz = const[7], src2.xyz = temp[44] MAD_SAT temp[46].xyz, src0.xyz, src1.xyz, src2.xyz 31: src0.xyz = const[9], src1.xyz = temp[46], src2.xyz = temp[47], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 1: src0.xyz = input[4], src0.w = temp[0], src1.xyz = const[8] MAD_SAT temp[0].z, src0.__x, src1.__x, src1.__y RSQ temp[0].w, |src0.w| 2: src0.xyz = input[2], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[4], input[1].xy__, 2D[1]; 5: TEX temp[1].xyz, input[1].xy__, 2D[2]; 6: TEX temp[0].w, input[0].xy__, 2D[3]; 7: TEX temp[5], input[3].xy__, 2D[7] SEM_WAIT SEM_ACQUIRE; 8: src0.xyz = input[3], src0.w = temp[0], src1.xyz = temp[5], src2.xyz = const[2] SEM_WAIT MAD_SAT temp[3].xy, -src0.zz_, src1.yy_, src1.xz_ MAD temp[0].w, src0.w, src2.x, src2.y 9: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = input[0], src1.w = temp[5], src2.xyz = temp[3] MAD temp[0].xy, src0.ww_, src0.xy_, src1.xy_ CMP temp[0].w, src0.0, src1.w, -src2.x 10: src0.xyz = temp[1], src0.w = 2.000000 
(0x40), src1.w = temp[4], src2.w = const[0] MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[1].w, src1.w, src2.w, src0.0 11: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[2].w, src0._, src0._ 12: src0.xyz = temp[3], src0.w = 8.000000 (0x50), src1.w = temp[0] MAD_SAT temp[0].w, -src0.w, src0.y, src1.w 13: src0.w = temp[2] RSQ temp[2].w, |src0.w| 14: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[2].xyz, src0.www, srcp.xyz, src0.000 15: src0.w = temp[1] MAD color[0].w, src0.w, src0.1, src0.0 16: BEGIN_TEX; 17: TEX temp[3].xyz, temp[0].xy__, 2D[4]; 18: TEX temp[5].xyz, temp[0].xy__, 2D[3]; 19: TEX temp[6], temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 20: src0.xyz = temp[5], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[5].xyz, src0.xyz, src0.www, -src0.111 21: src0.xyz = temp[5], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 22: src0.xyz = temp[2], src1.xyz = temp[5] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 23: src0.xyz = temp[6], src0.w = temp[2], src1.xyz = const[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 LG2 temp[2].w, src0.w 24: src0.xyz = temp[4], src0.w = temp[1], src1.w = temp[2], src2.w = 32.000000 (0x60) MAD temp[2].xyz, src0.xyz, src0.www, src0.000 MAD temp[1].w, src1.w, src2.w, src0.0 25: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[5] MAX temp[2].xyz, src0.xyz, src1.xyz EX2 temp[1].w, src0.w 26: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[6], src1.w = temp[6] MAD_SAT temp[4].xyz, src0.xyz, src0.111, -src1.xyz MAD temp[1].w, src0.w, src1.w, src0.0 27: src0.xyz = temp[4], src0.w = temp[0], src1.xyz = temp[2] MAD temp[2].xyz, src0.www, -src0.xyz, src1.xyz 28: src0.xyz = const[4], src0.w = temp[1], src1.xyz = temp[1] MAD temp[1].xyz, src0.www, src0.xyz, src1.xyz 29: src0.xyz = temp[1], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 30: src0.xyz = temp[3], 
src1.xyz = const[7], src2.xyz = temp[1] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 31: src0.xyz = const[9], src1.xyz = temp[1], src2.xyz = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.zzz, srcp.xyz, src0.xyz R500 Fragment Program: -------- 0 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000f1:DP3 dest:15 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00086000:ALU wmask: AB omask: NONE 1:RGB_ADDR 0x08042004:Addr0: 4t, Addr1: 8c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122090:rgb_A_src:0 0/0/R 0 rgb_B_src:1 0/0/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00191000:MAD dest:0 rgb_C_src:1 0/0/G 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe404f401: src: 1 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004003:TEX wmask: A omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 6 
0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02470000: id: 7 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe405f403: src: 3 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 7 0:CMN_INST 0x00085804:ALU TEX_WAIT wmask: ARG omask: NONE 1:RGB_ADDR 0x10201403:Addr0: 3t, Addr1: 5t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084ac48:rgb_A_src:0 B/B/0 1 rgb_B_src:1 G/G/0 0 targ: 0 4 ALPHA_INST:0x0010c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:2 R 0 targ 0 w:0 5 RGBA_INST: 0x0c441030:MAD dest:3 rgb_C_src:1 R/B/0 0 alp_C_src:2 G 0 8 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x00300002:Addr0: 2t, Addr1: 0t, Addr2: 3t, srcp:0 2:ALPHA_ADDR 0x08001400:Addr0: 0t, Addr1: 5t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084046c:rgb_A_src:0 A/A/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x00690006:CMP dest:0 alp_A_src:0 0 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x44421000:MAD dest:0 rgb_C_src:1 R/G/0 0 alp_C_src:2 R 1 9 0:CMN_INST 0x00107a00:ALU NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x100010c0:Addr0: 192t, Addr1: 4t, Addr2: 0c, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 0 0 10 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001a1:DP3 dest:26 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080000d0:Addr0: 208t, Addr1: 0t, Addr2: 128t, srcp:0 3 
RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0022c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 G 0 targ 0 w:0 5 RGBA_INST: 0x1a000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:1 A 0 12 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c02b:RSQ dest:2 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00040001:OUT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 15 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00440000: id: 4 op:LD, , SCALED 2:TEX_ADDR: 0xe403f400: src: 0 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 16 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 17 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe406f400: src: 0 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 18 0:CMN_INST 
0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8050:MAD dest:5 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 19 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000405:Addr0: 5t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000231:DP3 dest:35 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08001402:Addr0: 2t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001d1:DP3 dest:29 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040006:Addr0: 6t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c029:LN2 dest:2 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0e000801:Addr0: 1t, Addr1: 2t, Addr2: 224t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 
0/0/0 0 alp_C_src:0 0 0 23 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041402:Addr0: 2t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0000c018:EX2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000025:MAX dest:2 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00087800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041802:Addr0: 2t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001801:Addr0: 1t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20a21040:MAD dest:4 rgb_C_src:1 R/G/B 1 alp_C_src:0 0 0 25 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000804:Addr0: 4t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0144036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 1 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 26 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000504:Addr0: 4c, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 27 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 
0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 28 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00141c03:Addr0: 3t, Addr1: 7c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222010:MAD dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 29 0:CMN_INST 0x00038005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x40000509:Addr0: 9c, Addr1: 1t, Addr2: 0t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 30 Instructions ~ 20 Vector Instructions (RGB) ~ 15 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 7 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 7 Temporary Registers ~ 4 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL OUT[4], GENERIC[1] DCL OUT[5], GENERIC[2] DCL CONST[0..7] DCL TEMP[0..1] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[2].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: MOV_SAT OUT[1], IN[2] 6: ADD OUT[3].xy, IN[3], CONST[0].yzww 7: SUB TEMP[1], CONST[1], IN[0] 8: DP3 TEMP[1].w, TEMP[1], TEMP[1] 9: RSQ TEMP[1].w, |TEMP[1].wwww| 10: MUL TEMP[1].xyz, TEMP[1].wwww, TEMP[1] 11: MOV OUT[4].xyz, IN[1] 12: ADD OUT[5].xyz, CONST[2], TEMP[1] 13: DP4 OUT[2].x, -IN[0], CONST[3] 14: END Vertex Program: before compilation 
# Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3], const[0].yzww; 7: SUB temp[1], const[1], input[0]; 8: DP3 temp[1].w, temp[1], temp[1]; 9: RSQ temp[1].w, |temp[1].wwww|; 10: MUL temp[1].xyz, temp[1].wwww, temp[1]; 11: MOV output[4].xyz, input[1]; 12: ADD output[5].xyz, const[2], temp[1]; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3], const[0].yzww; 7: SUB temp[1], const[1], input[0]; 8: DP3 temp[1].w, temp[1], temp[1]; 9: RSQ temp[1].w, |temp[1].wwww|; 10: MUL temp[1].xyz, temp[1].wwww, temp[1]; 11: MOV output[4].xyz, input[1]; 12: ADD output[5].xyz, const[2], temp[1]; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3], const[0].yzww; 7: ADD temp[1], const[1], -input[0]; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].wwww|; 10: MUL temp[1].xyz, temp[1].wwww, temp[1]; 11: MOV output[4].xyz, input[1]; 12: ADD output[5].xyz, const[2], temp[1]; 13: DP4 output[2].x, -input[0], 
const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0]._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, 
temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[0]; 15: MOV output[6], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[0]; 15: MOV output[6], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD 
temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[0]; 15: MOV output[6], temp[0]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 6: op: 0x00304203 dst: 2o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01fa2002 reg: 0c swiz: Y/ Z/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 7: op: 0x00702003 dst: 1t op: VE_ADD src0: 0x01d10022 reg: 1c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 8: op: 0x00802001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 
src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00802048 dst: 1t op: ME_RECIP_SQRT_DX src0: 0x00db6028 reg: 1t swiz: W/ W/ W/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src1: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 11: op: 0x00706203 dst: 3o op: VE_ADD src0: 0x01d10021 reg: 1i swiz: X/ Y/ Z/ U src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 12: op: 0x00708203 dst: 4o op: VE_ADD src0: 0x01d10042 reg: 2c swiz: X/ Y/ Z/ U src1: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 13: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 14: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 15: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 16 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, COLOR DCL IN[1], FOG, PERSPECTIVE DCL IN[2], GENERIC[0], PERSPECTIVE DCL IN[3], GENERIC[1], PERSPECTIVE DCL IN[4], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL CONST[1] DCL CONST[3..6] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 0.0000, 128.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[0], 2D 1: MUL TEMP[0].xyz, TEMP[0], IMM[0].xxxx 2: TEX TEMP[1].xy, IN[2], SAMP[1], 2D 3: MUL TEMP[2].xyz, TEMP[0], CONST[1].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, 
|TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: DP3 TEMP[4].w, IN[4], IN[4] 8: RSQ TEMP[4].w, |TEMP[4].wwww| 9: MUL TEMP[4].xyz, TEMP[4].wwww, IN[4] 10: DP3_SAT TEMP[5].x, TEMP[4], TEMP[3] 11: POW TEMP[5].x, TEMP[5].xxxx, IMM[0].zzzz 12: MUL TEMP[5].x, TEMP[5].xxxx, CONST[1].xxxx 13: MUL TEMP[5].x, TEMP[5].xxxx, TEMP[1].xxxx 14: DP3 TEMP[4].y, TEMP[3], CONST[3] 15: MAD TEMP[4].x, TEMP[4].yyyy, CONST[4].xxxx, CONST[4].yyyy 16: MAD_SAT TEMP[4].x, TEMP[4].yyyy, TEMP[4].xxxx, CONST[4].zzzz 17: MAD TEMP[0].xyz, TEMP[4].xxxx, TEMP[0], TEMP[5].xxxx 18: MUL TEMP[0].xyz, TEMP[0], IN[0] 19: LRP_SAT TEMP[6].xyz, TEMP[1].yyyy, TEMP[2], TEMP[0] 20: MUL_SAT TEMP[6].w, TEMP[0].wwww, IN[0].wwww 21: MAD_SAT TEMP[7].x, IN[1].xxxx, CONST[5].xxxx, CONST[5].yyyy 22: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[6], CONST[6] 23: MOV OUT[0].w, TEMP[6] 24: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[7].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[7].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: DP3 temp[4].y, temp[3], const[3]; 15: MAD temp[4].x, temp[4].yyyy, const[4].xxxx, const[4].yyyy; 16: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[4].zzzz; 17: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 18: MUL temp[0].xyz, temp[0], input[0]; 19: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 20: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 21: MAD_SAT temp[7].x, input[1].xxxx, const[5].xxxx, const[5].yyyy; 22: LRP output[0].xyz, temp[7].xxxx, temp[6], const[6]; 23: MOV 
output[0].w, temp[6]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[7].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[7].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: DP3 temp[4].y, temp[3], const[3]; 15: MAD temp[4].x, temp[4].yyyy, const[4].xxxx, const[4].yyyy; 16: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[4].zzzz; 17: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 18: MUL temp[0].xyz, temp[0], input[0]; 19: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 20: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 21: MAD_SAT temp[7].x, input[1].xxxx, const[5].xxxx, const[5].yyyy; 22: LRP output[0].xyz, temp[7].xxxx, temp[6], const[6]; 23: MOV output[0].w, temp[6]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[7].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[7].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: DP3 temp[4].y, temp[3], const[3]; 15: MAD temp[4].x, temp[4].yyyy, const[4].xxxx, const[4].yyyy; 16: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[4].zzzz; 
17: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 18: MUL temp[0].xyz, temp[0], input[0]; 19: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 20: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 21: MAD_SAT temp[7].x, input[1].xxxx, const[5].xxxx, const[5].yyyy; 22: LRP output[0].xyz, temp[7].xxxx, temp[6], const[6]; 23: MOV output[0].w, temp[6]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[7].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[7].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: DP3 temp[4].y, temp[3], const[3]; 15: MAD temp[4].x, temp[4].yyyy, const[4].xxxx, const[4].yyyy; 16: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[4].zzzz; 17: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 18: MUL temp[0].xyz, temp[0], input[0]; 19: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 20: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 21: MAD_SAT temp[7].x, input[1].xxxx, const[5].xxxx, const[5].yyyy; 22: LRP output[0].xyz, temp[7].xxxx, temp[6], const[6]; 23: MOV output[0].w, temp[6]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[7].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT 
temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[7].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: DP3 temp[4].y, temp[3], const[3]; 15: MAD temp[4].x, temp[4].yyyy, const[4].xxxx, const[4].yyyy; 16: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[4].zzzz; 17: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 18: MUL temp[0].xyz, temp[0], input[0]; 19: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 20: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 21: MAD_SAT temp[7].x, input[1].xxxx, const[5].xxxx, const[5].yyyy; 22: LRP output[0].xyz, temp[7].xxxx, temp[6], const[6]; 23: MOV output[0].w, temp[6]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[7].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[7].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: DP3 temp[4].y, temp[3], const[3]; 15: MAD temp[4].x, temp[4].yyyy, const[4].xxxx, const[4].yyyy; 16: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[4].zzzz; 17: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 18: MUL temp[0].xyz, temp[0], input[0]; 19: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 20: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 21: MAD_SAT temp[7].x, input[1].xxxx, const[5].xxxx, const[5].yyyy; 22: LRP output[0].xyz, temp[7].xxxx, temp[6], const[6]; 23: MOV output[0].w, temp[6]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], 
const[7].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: LG2 temp[8].w, temp[5].xxxx; 12: MUL temp[8].w, temp[8].wwww, const[7].zzzz; 13: EX2 temp[5].x, temp[8].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: DP3 temp[4].y, temp[3], const[3]; 17: MAD temp[4].x, temp[4].yyyy, const[4].xxxx, const[4].yyyy; 18: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[4].zzzz; 19: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 20: MUL temp[0].xyz, temp[0], input[0]; 21: ADD temp[6].xyz, temp[2], -temp[0]; 22: MAD_SAT temp[6].xyz, temp[1].yyyy, temp[6], temp[0]; 23: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 24: MAD_SAT temp[7].x, input[1].xxxx, const[5].xxxx, const[5].yyyy; 25: ADD temp[9].xyz, temp[6], -const[6]; 26: MAD output[0].xyz, temp[7].xxxx, temp[9], const[6]; 27: MOV output[0].w, temp[6]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[0]; 1: MUL temp[0].xyz, temp[0].xyz_, const[7].xxx_; 2: TEX temp[1].xy, input[2].xy__, 2D[1]; 3: MUL temp[2].xyz, temp[0].xyz_, const[1].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: DP3 temp[4].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[4].w, |temp[4].___w|; 9: MUL temp[4].xyz, temp[4].www_, input[4].xyz_; 10: DP3_SAT temp[5].x, temp[4].xyz_, temp[3].xyz_; 11: LG2 temp[8].w, temp[5].___x; 12: MUL temp[8].w, temp[8].___w, const[7].___z; 13: EX2 temp[5].x, temp[8].w___; 14: MUL temp[5].x, temp[5].x___, const[1].x___; 15: MUL temp[5].x, temp[5].x___, temp[1].x___; 16: DP3 temp[4].y, temp[3].xyz_, const[3].xyz_; 17: 
MAD temp[4].x, temp[4].y___, const[4].x___, const[4].y___; 18: MAD_SAT temp[4].x, temp[4].y___, temp[4].x___, const[4].z___; 19: MAD temp[0].xyz, temp[4].xxx_, temp[0].xyz_, temp[5].xxx_; 20: MUL temp[0].xyz, temp[0].xyz_, input[0].xyz_; 21: ADD temp[6].xyz, temp[2].xyz_, -temp[0].xyz_; 22: MAD_SAT temp[6].xyz, temp[1].yyy_, temp[6].xyz_, temp[0].xyz_; 23: MUL_SAT temp[6].w, temp[0].___w, input[0].___w; 24: MAD_SAT temp[7].x, input[1].x___, const[5].x___, const[5].y___; 25: ADD temp[9].xyz, temp[6].xyz_, -const[6].xyz_; 26: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[6].xyz_; 27: MOV output[0].w, temp[6].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[7].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, const[7].___z; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: DP3 temp[26].y, temp[16].xyz_, const[3].xyz_; 17: MAD temp[27].x, temp[26].y___, const[4].x___, const[4].y___; 18: MAD_SAT temp[28].x, temp[26].y___, temp[27].x___, const[4].z___; 19: MAD temp[29].xyz, temp[28].xxx_, temp[11].xyz_, temp[25].xxx_; 20: MUL temp[30].xyz, temp[29].xyz_, input[0].xyz_; 21: ADD temp[31].xyz, temp[13].xyz_, -temp[30].xyz_; 22: MAD_SAT temp[32].xyz, temp[12].yyy_, temp[31].xyz_, temp[30].xyz_; 23: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 24: MAD_SAT temp[34].x, input[1].x___, const[5].x___, const[5].y___; 25: ADD temp[35].xyz, 
temp[32].xyz_, -const[6].xyz_; 26: MAD output[0].xyz, temp[34].xxx_, temp[35].xyz_, const[6].xyz_; 27: MOV output[0].w, temp[33].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[7].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, const[7].___z; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: DP3 temp[26].y, temp[16].xyz_, const[3].xyz_; 17: MAD temp[27].x, temp[26].y___, const[4].x___, const[4].y___; 18: MAD_SAT temp[28].x, temp[26].y___, temp[27].x___, const[4].z___; 19: MAD temp[29].xyz, temp[28].xxx_, temp[11].xyz_, temp[25].xxx_; 20: MUL temp[30].xyz, temp[29].xyz_, input[0].xyz_; 21: MAD_SAT temp[32].xyz, temp[12].yyy_, (temp[13] - temp[30]).xyz_, temp[30].xyz_; 22: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 23: MAD_SAT temp[34].x, input[1].x___, const[5].x___, const[5].y___; 24: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[6]).xyz_, const[6].xyz_; 25: MOV output[0].w, temp[33].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, 
input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, 128.000000 (0x70).___w; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: DP3 temp[26].y, temp[16].xyz_, const[3].xyz_; 17: MAD temp[27].x, temp[26].y___, const[4].x___, const[4].y___; 18: MAD_SAT temp[28].x, temp[26].y___, temp[27].x___, const[4].z___; 19: MAD temp[29].xyz, temp[28].xxx_, temp[11].xyz_, temp[25].xxx_; 20: MUL temp[30].xyz, temp[29].xyz_, input[0].xyz_; 21: MAD_SAT temp[32].xyz, temp[12].yyy_, (temp[13] - temp[30]).xyz_, temp[30].xyz_; 22: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 23: MAD_SAT temp[34].x, input[1].x___, const[5].x___, const[5].y___; 24: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[6]).xyz_, const[6].xyz_; 25: MOV output[0].w, temp[33].___w; CONST[7] = { 2.0000 0.0000 128.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, 128.000000 (0x70).___w; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: DP3 temp[26].y, temp[16].xyz_, const[3].xyz_; 17: MAD temp[27].x, temp[26].y___, const[4].x___, const[4].y___; 18: MAD_SAT 
temp[28].x, temp[26].y___, temp[27].x___, const[4].z___; 19: MAD temp[29].xyz, temp[28].xxx_, temp[11].xyz_, temp[25].xxx_; 20: MUL temp[30].xyz, temp[29].xyz_, input[0].xyz_; 21: MAD_SAT temp[32].xyz, temp[12].yyy_, (temp[13] - temp[30]).xyz_, temp[30].xyz_; 22: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 23: MAD_SAT temp[34].x, input[1].x___, const[5].x___, const[5].y___; 24: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[6]).xyz_, const[6].xyz_; 25: MOV output[0].w, temp[33].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, 128.000000 (0x70).___w; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: DP3 temp[26].y, temp[16].xyz_, const[3].xyz_; 17: MAD temp[27].x, temp[26].y___, const[4].x___, const[4].y___; 18: MAD_SAT temp[28].x, temp[26].y___, temp[27].x___, const[4].z___; 19: MAD temp[29].xyz, temp[28].xxx_, temp[11].xyz_, temp[25].xxx_; 20: MUL temp[30].xyz, temp[29].xyz_, input[0].xyz_; 21: MAD_SAT temp[32].xyz, temp[12].yyy_, (temp[13] - temp[30]).xyz_, temp[30].xyz_; 22: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 23: MAD_SAT temp[34].x, input[1].x___, const[5].x___, const[5].y___; 24: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[6]).xyz_, const[6].xyz_; 25: MOV output[0].w, temp[33].___w; Fragment Program: after 'pair translate' # Radeon Compiler 
Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: src0.xyz = temp[10], src0.w = 2.000000 (0x40) MAD temp[11].xyz, src0.xyz, src0.www, src0.000 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: src0.xyz = temp[11], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[17].w, src0._, src0._ 8: src0.w = temp[17] RSQ temp[18].w, |src0.w| 9: src0.xyz = input[4], src0.w = temp[18] MAD temp[19].xyz, src0.www, src0.xyz, src0.000 10: src0.xyz = temp[19], src1.xyz = temp[16] DP3_SAT temp[20].x, src0.xyz, src1.xyz 11: src0.xyz = temp[20] LG2 temp[21].w, src0.x 12: src0.w = temp[21], src1.w = 128.000000 (0x70) MAD temp[22].w, src0.w, src1.w, src0.0 13: src0.w = temp[22] REPL_ALPHA temp[23].x EX2, src0.w 14: src0.xyz = temp[23], src1.xyz = const[1] MAD temp[24].x, src0.x__, src1.x__, src0.000 15: src0.xyz = temp[24], src1.xyz = temp[12] MAD temp[25].x, src0.x__, src1.x__, src0.000 16: src0.xyz = temp[16], src1.xyz = const[3] DP3 temp[26].y, src0.xyz, src1.xyz 17: src0.xyz = temp[26], src1.xyz = const[4] MAD temp[27].x, src0.y__, src1.x__, src1.y__ 18: src0.xyz = temp[26], src1.xyz = temp[27], src2.xyz = const[4] MAD_SAT temp[28].x, src0.y__, src1.x__, src2.z__ 19: src0.xyz = temp[28], src1.xyz = temp[11], src2.xyz = temp[25] MAD temp[29].xyz, src0.xxx, src1.xyz, src2.xxx 20: src0.xyz = temp[29], src1.xyz = input[0] MAD temp[30].xyz, src0.xyz, src1.xyz, src0.000 21: src0.xyz = temp[30], src1.xyz = temp[13], src2.xyz = temp[12], srcp.xyz = (src1 - src0) MAD_SAT temp[32].xyz, src2.yyy, srcp.xyz, src0.xyz 22: src0.w = temp[10], src1.w = input[0] MAD_SAT temp[33].w, src0.w, src1.w, src0.0 23: src0.xyz = input[1], src1.xyz = const[5] MAD_SAT temp[34].x, src0.x__, src1.x__, src1.y__ 24: src0.xyz = const[6], 
src1.xyz = temp[32], src2.xyz = temp[34], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 25: src0.w = temp[33] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 4: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[17].w, src0._, src0._ 5: src0.w = temp[17] RSQ temp[18].w, |src0.w| 6: src0.xyz = input[4], src0.w = temp[18] MAD temp[19].xyz, src0.www, src0.xyz, src0.000 7: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[11], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 9: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[5], src1.w = input[0] MAD_SAT temp[34].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[33].w, src0.w, src1.w, src0.0 10: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 11: src0.xyz = temp[19], src1.xyz = temp[16] DP3_SAT temp[20].x, src0.xyz, src1.xyz 12: src0.xyz = temp[16], src1.xyz = const[3], src2.xyz = temp[20] DP3 temp[26].y, src0.xyz, src1.xyz LG2 temp[21].w, src2.x 13: src0.xyz = temp[26], src0.w = temp[21], src1.xyz = const[4], src1.w = 128.000000 (0x70) MAD temp[27].x, src0.y__, src1.x__, src1.y__ MAD temp[22].w, src0.w, src1.w, src0.0 14: src0.xyz = temp[26], src1.xyz = temp[27], src2.xyz = const[4] MAD_SAT temp[28].x, src0.y__, src1.x__, src2.z__ 15: src0.w = temp[22] REPL_ALPHA temp[23].x EX2, src0.w 16: src0.xyz = temp[23], src1.xyz = const[1] MAD temp[24].x, src0.x__, src1.x__, src0.000 17: src0.xyz = temp[24], src1.xyz = temp[12] MAD temp[25].x, src0.x__, src1.x__, src0.000 18: src0.xyz = temp[28], src1.xyz = temp[11], src2.xyz = temp[25] MAD temp[29].xyz, src0.xxx, src1.xyz, 
src2.xxx 19: src0.xyz = temp[29], src1.xyz = input[0] MAD temp[30].xyz, src0.xyz, src1.xyz, src0.000 20: src0.xyz = temp[30], src1.xyz = temp[13], src2.xyz = temp[12], srcp.xyz = (src1 - src0) MAD_SAT temp[32].xyz, src2.yyy, srcp.xyz, src0.xyz 21: src0.xyz = const[6], src0.w = temp[33], src1.xyz = temp[32], src2.xyz = temp[34], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 4: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[17].w, src0._, src0._ 5: src0.w = temp[17] RSQ temp[18].w, |src0.w| 6: src0.xyz = input[4], src0.w = temp[18] MAD temp[19].xyz, src0.www, src0.xyz, src0.000 7: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[11], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 9: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[5], src1.w = input[0] MAD_SAT temp[34].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[33].w, src0.w, src1.w, src0.0 10: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 11: src0.xyz = temp[19], src1.xyz = temp[16] DP3_SAT temp[20].x, src0.xyz, src1.xyz 12: src0.xyz = temp[16], src1.xyz = const[3], src2.xyz = temp[20] DP3 temp[26].y, src0.xyz, src1.xyz LG2 temp[21].w, src2.x 13: src0.xyz = temp[26], src0.w = temp[21], src1.xyz = const[4], src1.w = 128.000000 (0x70) MAD temp[27].x, src0.y__, src1.x__, src1.y__ MAD temp[22].w, src0.w, src1.w, src0.0 14: src0.xyz = temp[26], src1.xyz = temp[27], src2.xyz = const[4] MAD_SAT temp[28].x, src0.y__, src1.x__, src2.z__ 15: src0.w = temp[22] REPL_ALPHA temp[23].x EX2, src0.w 16: src0.xyz = temp[23], 
src1.xyz = const[1] MAD temp[24].x, src0.x__, src1.x__, src0.000 17: src0.xyz = temp[24], src1.xyz = temp[12] MAD temp[25].x, src0.x__, src1.x__, src0.000 18: src0.xyz = temp[28], src1.xyz = temp[11], src2.xyz = temp[25] MAD temp[29].xyz, src0.xxx, src1.xyz, src2.xxx 19: src0.xyz = temp[29], src1.xyz = input[0] MAD temp[30].xyz, src0.xyz, src1.xyz, src0.000 20: src0.xyz = temp[30], src1.xyz = temp[13], src2.xyz = temp[12], srcp.xyz = (src1 - src0) MAD_SAT temp[32].xyz, src2.yyy, srcp.xyz, src0.xyz 21: src0.xyz = const[6], src0.w = temp[33], src1.xyz = temp[32], src2.xyz = temp[34], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[1].xy__, 2D[0]; 2: TEX temp[1].xy, input[1].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[1].w, src0._, src0._ 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[2].w, src0._, src0._ 5: src0.w = temp[2] RSQ temp[2].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[2] MAD temp[3].xyz, src0.www, src0.xyz, src0.000 7: src0.xyz = temp[5], src0.w = 2.000000 (0x40), src1.w = temp[1] SEM_WAIT MAD temp[6].xyz, src0.xyz, src0.www, src0.000 RSQ temp[1].w, |src1.w| 8: src0.xyz = temp[6], src1.xyz = const[1] MAD temp[7].xyz, src0.xyz, src1.yyy, src0.000 9: src0.xyz = input[4], src0.w = temp[5], src1.xyz = const[5], src1.w = input[0] MAD_SAT temp[1].z, src0.__x, src1.__x, src1.__y MAD_SAT temp[2].w, src0.w, src1.w, src0.0 10: src0.xyz = input[2], src0.w = temp[1] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 11: src0.xyz = temp[3], src1.xyz = temp[2] DP3_SAT temp[3].x, src0.xyz, src1.xyz 12: src0.xyz = temp[2], src1.xyz = const[3], src2.xyz = temp[3] DP3 temp[2].x, src0.xyz, src1.xyz LG2 temp[1].w, src2.x 13: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[4], src1.w = 128.000000 (0x70) MAD temp[2].y, src0._x_, src1._x_, src1._y_ 
MAD temp[1].w, src0.w, src1.w, src0.0 14: src0.xyz = temp[2], src1.xyz = temp[2], src2.xyz = const[4] MAD_SAT temp[2].x, src0.x__, src1.y__, src2.z__ 15: src0.w = temp[1] REPL_ALPHA temp[2].y EX2, src0.w 16: src0.xyz = temp[2], src1.xyz = const[1] MAD temp[2].y, src0._y_, src1._x_, src0._0_ 17: src0.xyz = temp[2], src1.xyz = temp[1] MAD temp[2].y, src0._y_, src1._x_, src0._0_ 18: src0.xyz = temp[2], src1.xyz = temp[6], src2.xyz = temp[2] MAD temp[2].xyz, src0.xxx, src1.xyz, src2.yyy 19: src0.xyz = temp[2], src1.xyz = input[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 20: src0.xyz = temp[0], src1.xyz = temp[7], src2.xyz = temp[1], srcp.xyz = (src1 - src0) MAD_SAT temp[0].xyz, src2.yyy, srcp.xyz, src0.xyz 21: src0.xyz = const[6], src0.w = temp[2], src1.xyz = temp[0], src2.xyz = temp[1], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.zzz, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f401: src: 1 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00001807:TEX TEX_WAIT wmask: RG omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 
alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000111:DP3 dest:17 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c02b:RSQ dest:2 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080004c0:Addr0: 192t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d01b:RSQ dest:1 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08040406:Addr0: 6t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00186000:ALU wmask: AB omask: NONE 1:RGB_ADDR 0x08041404:Addr0: 4t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000005:Addr0: 5t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122090:rgb_A_src:0 0/0/R 0 rgb_B_src:1 0/0/R 0 targ: 0 4 
ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20191010:MAD dest:1 rgb_C_src:1 0/0/G 0 alp_C_src:0 0 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000803:Addr0: 3t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000031:DP3 dest:3 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00004800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x00340c02:Addr0: 2t, Addr1: 3c, Addr2: 3t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00002019:LN2 dest:1 alp_A_src:2 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000021:DP3 dest:2 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00005000:ALU wmask: AG omask: NONE 1:RGB_ADDR 0x08041002:Addr0: 2t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803c001:Addr0: 1t, Addr1: 240t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00822410:rgb_A_src:0 0/R/0 0 rgb_B_src:1 0/R/0 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20431020:MAD dest:2 rgb_C_src:1 0/G/0 0 alp_C_src:0 0 0 13 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x10400802:Addr0: 2t, Addr1: 2t, Addr2: 4c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0090a480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 
G/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0048a020:MAD dest:2 rgb_C_src:2 B/0/0 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00001000:ALU wmask: G omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000002a:SOP dest:2 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00001000:ALU wmask: G omask: NONE 1:RGB_ADDR 0x08040402:Addr0: 2t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00822430:rgb_A_src:0 0/G/0 0 rgb_B_src:1 0/R/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00001000:ALU wmask: G omask: NONE 1:RGB_ADDR 0x08000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00822430:rgb_A_src:0 0/G/0 0 rgb_B_src:1 0/R/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x00201802:Addr0: 2t, Addr1: 6t, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00126020:MAD dest:2 rgb_C_src:2 G/G/G 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00003a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x40101c00:Addr0: 0t, Addr1: 7t, Addr2: 1t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446126:rgb_A_src:2 G/G/G 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40100106:Addr0: 6c, Addr1: 0t, Addr2: 1t, srcp:1 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 21 Instructions ~ 18 Vector Instructions (RGB) ~ 9 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 2 Texture Instructions ~ 2 Presub Operations ~ 0 OMOD Operations ~ 8 Temporary Registers ~ 2 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL IN[5] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL OUT[4], GENERIC[1] DCL OUT[5], GENERIC[2] DCL CONST[0..200] DCL TEMP[0..5] DCL ADDR[0] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[2].yzw, IMM[0].xxxy 1: ARL ADDR[0].x, IN[5].xxxx 2: MOV TEMP[0], CONST[ADDR[0].x] 3: MOV TEMP[1], CONST[ADDR[0].x+1] 4: XPD TEMP[2].xyz, TEMP[0], IN[0] 5: MAD TEMP[2].xyz, IN[0], TEMP[0].wwww, TEMP[2] 6: ADD TEMP[2].xyz, TEMP[2], TEMP[1] 7: XPD TEMP[2].xyz, TEMP[0], TEMP[2] 8: MAD TEMP[2].xyz, TEMP[1], TEMP[0].wwww, TEMP[2] 9: MAD TEMP[2].xyz, TEMP[0], 
-TEMP[1].wwww, TEMP[2] 10: MAD TEMP[2].xyz, CONST[192].xxxx, TEMP[2], IN[0] 11: MOV TEMP[2].w, IN[0].wwww 12: XPD TEMP[1].xyz, TEMP[0], IN[1] 13: MAD TEMP[1].xyz, TEMP[0].wwww, IN[1], TEMP[1] 14: XPD TEMP[1].xyz, TEMP[0], TEMP[1] 15: MAD TEMP[1].xyz, CONST[192].xxxx, TEMP[1], IN[1] 16: XPD TEMP[3].xyz, TEMP[0], IN[4] 17: MAD TEMP[3].xyz, TEMP[0].wwww, IN[4], TEMP[3] 18: XPD TEMP[3].xyz, TEMP[0], TEMP[3] 19: MAD TEMP[3].xyz, CONST[192].xxxx, TEMP[3], IN[4] 20: DP4 OUT[0].x, TEMP[2], CONST[193] 21: DP4 OUT[0].y, TEMP[2], CONST[194] 22: DP4 OUT[0].z, TEMP[2], CONST[195] 23: DP4 OUT[0].w, TEMP[2], CONST[196] 24: MOV_SAT OUT[1], IN[2] 25: ADD OUT[3].xy, IN[3], CONST[197].yzww 26: SUB TEMP[0], CONST[198], TEMP[2] 27: DP3 TEMP[0].w, TEMP[0], TEMP[0] 28: RSQ TEMP[0].w, |TEMP[0].wwww| 29: MUL TEMP[0].xyz, TEMP[0].wwww, TEMP[0] 30: XPD TEMP[4].xyz, TEMP[1], TEMP[3] 31: MUL TEMP[4].xyz, TEMP[4], IN[4].wwww 32: DP3 OUT[4].x, CONST[199], TEMP[3] 33: DP3 OUT[4].y, CONST[199], TEMP[4] 34: DP3 OUT[4].z, CONST[199], TEMP[1] 35: ADD TEMP[5].xyz, CONST[199], TEMP[0] 36: DP3 OUT[5].x, TEMP[5], TEMP[3] 37: DP3 OUT[5].y, TEMP[5], TEMP[4] 38: DP3 OUT[5].z, TEMP[5], TEMP[1] 39: DP4 OUT[2].x, -TEMP[2], CONST[200] 40: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: ARL addr[0].x, input[5].xxxx; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: XPD temp[2].xyz, temp[0], input[0]; 5: MAD temp[2].xyz, input[0], temp[0].wwww, temp[2]; 6: ADD temp[2].xyz, temp[2], temp[1]; 7: XPD temp[2].xyz, temp[0], temp[2]; 8: MAD temp[2].xyz, temp[1], temp[0].wwww, temp[2]; 9: MAD temp[2].xyz, temp[0], -temp[1].wwww, temp[2]; 10: MAD temp[2].xyz, const[192].xxxx, temp[2], input[0]; 11: MOV temp[2].w, input[0].wwww; 12: XPD temp[1].xyz, temp[0], input[1]; 13: MAD temp[1].xyz, temp[0].wwww, input[1], temp[1]; 14: XPD temp[1].xyz, temp[0], temp[1]; 15: MAD temp[1].xyz, const[192].xxxx, temp[1], input[1]; 16: XPD temp[3].xyz, 
temp[0], input[4]; 17: MAD temp[3].xyz, temp[0].wwww, input[4], temp[3]; 18: XPD temp[3].xyz, temp[0], temp[3]; 19: MAD temp[3].xyz, const[192].xxxx, temp[3], input[4]; 20: DP4 temp[6].x, temp[2], const[193]; 21: DP4 temp[6].y, temp[2], const[194]; 22: DP4 temp[6].z, temp[2], const[195]; 23: DP4 temp[6].w, temp[2], const[196]; 24: MOV_SAT output[1], input[2]; 25: ADD output[3].xy, input[3], const[197].yzww; 26: SUB temp[0], const[198], temp[2]; 27: DP3 temp[0].w, temp[0], temp[0]; 28: RSQ temp[0].w, |temp[0].wwww|; 29: MUL temp[0].xyz, temp[0].wwww, temp[0]; 30: XPD temp[4].xyz, temp[1], temp[3]; 31: MUL temp[4].xyz, temp[4], input[4].wwww; 32: DP3 output[4].x, const[199], temp[3]; 33: DP3 output[4].y, const[199], temp[4]; 34: DP3 output[4].z, const[199], temp[1]; 35: ADD temp[5].xyz, const[199], temp[0]; 36: DP3 output[5].x, temp[5], temp[3]; 37: DP3 output[5].y, temp[5], temp[4]; 38: DP3 output[5].z, temp[5], temp[1]; 39: DP4 output[2].x, -temp[2], const[200]; 40: MOV output[0], temp[6]; 41: MOV output[6], temp[6]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: ARL addr[0].x, input[5].xxxx; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: XPD temp[2].xyz, temp[0], input[0]; 5: MAD temp[2].xyz, input[0], temp[0].wwww, temp[2]; 6: ADD temp[2].xyz, temp[2], temp[1]; 7: XPD temp[2].xyz, temp[0], temp[2]; 8: MAD temp[2].xyz, temp[1], temp[0].wwww, temp[2]; 9: MAD temp[2].xyz, temp[0], -temp[1].wwww, temp[2]; 10: MAD temp[2].xyz, const[192].xxxx, temp[2], input[0]; 11: MOV temp[2].w, input[0].wwww; 12: XPD temp[1].xyz, temp[0], input[1]; 13: MAD temp[1].xyz, temp[0].wwww, input[1], temp[1]; 14: XPD temp[1].xyz, temp[0], temp[1]; 15: MAD temp[1].xyz, const[192].xxxx, temp[1], input[1]; 16: XPD temp[3].xyz, temp[0], input[4]; 17: MAD temp[3].xyz, temp[0].wwww, input[4], temp[3]; 18: XPD temp[3].xyz, temp[0], temp[3]; 19: MAD temp[3].xyz, const[192].xxxx, temp[3], input[4]; 20: 
DP4 temp[6].x, temp[2], const[193]; 21: DP4 temp[6].y, temp[2], const[194]; 22: DP4 temp[6].z, temp[2], const[195]; 23: DP4 temp[6].w, temp[2], const[196]; 24: MOV_SAT output[1], input[2]; 25: ADD output[3].xy, input[3], const[197].yzww; 26: SUB temp[0], const[198], temp[2]; 27: DP3 temp[0].w, temp[0], temp[0]; 28: RSQ temp[0].w, |temp[0].wwww|; 29: MUL temp[0].xyz, temp[0].wwww, temp[0]; 30: XPD temp[4].xyz, temp[1], temp[3]; 31: MUL temp[4].xyz, temp[4], input[4].wwww; 32: DP3 output[4].x, const[199], temp[3]; 33: DP3 output[4].y, const[199], temp[4]; 34: DP3 output[4].z, const[199], temp[1]; 35: ADD temp[5].xyz, const[199], temp[0]; 36: DP3 output[5].x, temp[5], temp[3]; 37: DP3 output[5].y, temp[5], temp[4]; 38: DP3 output[5].z, temp[5], temp[1]; 39: DP4 output[2].x, -temp[2], const[200]; 40: MOV output[0], temp[6]; 41: MOV output[6], temp[6]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: ARL addr[0].x, input[5].xxxx; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: MUL temp[2].xyz, temp[0].zxyw, input[0].yzxw; 5: MAD temp[2].xyz, temp[0].yzxw, input[0].zxyw, -temp[2]; 6: MAD temp[2].xyz, input[0], temp[0].wwww, temp[2]; 7: ADD temp[2].xyz, temp[2], temp[1]; 8: MUL temp[7].xyz, temp[0].zxyw, temp[2].yzxw; 9: MAD temp[2].xyz, temp[0].yzxw, temp[2].zxyw, -temp[7]; 10: MAD temp[2].xyz, temp[1], temp[0].wwww, temp[2]; 11: MAD temp[2].xyz, temp[0], -temp[1].wwww, temp[2]; 12: MAD temp[2].xyz, const[192].xxxx, temp[2], input[0]; 13: MOV temp[2].w, input[0].wwww; 14: MUL temp[1].xyz, temp[0].zxyw, input[1].yzxw; 15: MAD temp[1].xyz, temp[0].yzxw, input[1].zxyw, -temp[1]; 16: MAD temp[1].xyz, temp[0].wwww, input[1], temp[1]; 17: MUL temp[8].xyz, temp[0].zxyw, temp[1].yzxw; 18: MAD temp[1].xyz, temp[0].yzxw, temp[1].zxyw, -temp[8]; 19: MAD temp[1].xyz, const[192].xxxx, temp[1], input[1]; 20: MUL temp[3].xyz, temp[0].zxyw, input[4].yzxw; 21: MAD temp[3].xyz, temp[0].yzxw, input[4].zxyw, 
-temp[3]; 22: MAD temp[3].xyz, temp[0].wwww, input[4], temp[3]; 23: MUL temp[9].xyz, temp[0].zxyw, temp[3].yzxw; 24: MAD temp[3].xyz, temp[0].yzxw, temp[3].zxyw, -temp[9]; 25: MAD temp[3].xyz, const[192].xxxx, temp[3], input[4]; 26: DP4 temp[6].x, temp[2], const[193]; 27: DP4 temp[6].y, temp[2], const[194]; 28: DP4 temp[6].z, temp[2], const[195]; 29: DP4 temp[6].w, temp[2], const[196]; 30: MOV_SAT output[1], input[2]; 31: ADD output[3].xy, input[3], const[197].yzww; 32: ADD temp[0], const[198], -temp[2]; 33: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 34: RSQ temp[0].w, |temp[0].wwww|; 35: MUL temp[0].xyz, temp[0].wwww, temp[0]; 36: MUL temp[4].xyz, temp[1].zxyw, temp[3].yzxw; 37: MAD temp[4].xyz, temp[1].yzxw, temp[3].zxyw, -temp[4]; 38: MUL temp[4].xyz, temp[4], input[4].wwww; 39: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 40: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 41: DP4 output[4].z, const[199].xyz0, temp[1].xyz0; 42: ADD temp[5].xyz, const[199], temp[0]; 43: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 44: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 45: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 46: DP4 output[2].x, -temp[2], const[200]; 47: MOV output[0], temp[6]; 48: MOV output[6], temp[6]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0]._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: MUL temp[2].xyz, temp[0].zxy_, input[0].yzx_; 5: MAD temp[2].xyz, temp[0].yzx_, input[0].zxy_, -temp[2].xyz_; 6: MAD temp[2].xyz, input[0].xyz_, temp[0].www_, temp[2].xyz_; 7: ADD temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 8: MUL temp[7].xyz, temp[0].zxy_, temp[2].yzx_; 9: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[7].xyz_; 10: MAD temp[2].xyz, temp[1].xyz_, temp[0].www_, temp[2].xyz_; 11: MAD temp[2].xyz, temp[0].xyz_, -temp[1].www_, temp[2].xyz_; 12: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[0].xyz_; 13: MOV temp[2].w, input[0].___w; 14: 
MUL temp[1].xyz, temp[0].zxy_, input[1].yzx_; 15: MAD temp[1].xyz, temp[0].yzx_, input[1].zxy_, -temp[1].xyz_; 16: MAD temp[1].xyz, temp[0].www_, input[1].xyz_, temp[1].xyz_; 17: MUL temp[8].xyz, temp[0].zxy_, temp[1].yzx_; 18: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[8].xyz_; 19: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[1].xyz_; 20: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 21: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 22: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 23: MUL temp[9].xyz, temp[0].zxy_, temp[3].yzx_; 24: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[9].xyz_; 25: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 26: DP4 temp[6].x, temp[2], const[193]; 27: DP4 temp[6].y, temp[2], const[194]; 28: DP4 temp[6].z, temp[2], const[195]; 29: DP4 temp[6].w, temp[2], const[196]; 30: MOV_SAT output[1], input[2]; 31: ADD output[3].xy, input[3].xy__, const[197].yz__; 32: ADD temp[0].xyz, const[198].xyz_, -temp[2].xyz_; 33: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 34: RSQ temp[0].w, |temp[0].___w|; 35: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 36: MUL temp[4].xyz, temp[1].zxy_, temp[3].yzx_; 37: MAD temp[4].xyz, temp[1].yzx_, temp[3].zxy_, -temp[4].xyz_; 38: MUL temp[4].xyz, temp[4].xyz_, input[4].www_; 39: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 40: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 41: DP4 output[4].z, const[199].xyz0, temp[1].xyz0; 42: ADD temp[5].xyz, const[199].xyz_, temp[0].xyz_; 43: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 44: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 45: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 46: DP4 output[2].x, -temp[2], const[200]; 47: MOV output[0], temp[6]; 48: MOV output[6], temp[6]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[2].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD 
temp[2].xyz, temp[0].yzx_, input[0].zxy_, -temp[2].xyz_; 5: MAD temp[2].xyz, input[0].xyz_, temp[0].www_, temp[2].xyz_; 6: ADD temp[2].xyz, temp[2].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[7].xyz, temp[0].zxy_, temp[2].yzx_; 8: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[7].xyz_; 9: MAD temp[2].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[2].xyz_; 10: MAD temp[2].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[2].xyz_; 11: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[0].xyz_; 12: MOV temp[2].w, input[0].___w; 13: MUL temp[1].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[1].xyz, temp[0].yzx_, input[1].zxy_, -temp[1].xyz_; 15: MAD temp[1].xyz, temp[0].www_, input[1].xyz_, temp[1].xyz_; 16: MUL temp[8].xyz, temp[0].zxy_, temp[1].yzx_; 17: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[8].xyz_; 18: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[9].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[9].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[6].x, temp[2], const[193]; 26: DP4 temp[6].y, temp[2], const[194]; 27: DP4 temp[6].z, temp[2], const[195]; 28: DP4 temp[6].w, temp[2], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[2].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[4].xyz, temp[1].zxy_, temp[3].yzx_; 36: MAD temp[4].xyz, temp[1].yzx_, temp[3].zxy_, -temp[4].xyz_; 37: MUL temp[4].xyz, temp[4].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 40: DP4 output[4].z, const[199].xyz0, 
temp[1].xyz0; 41: ADD temp[5].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 44: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 45: DP4 output[2].x, -temp[2], const[200]; 46: MOV output[0], temp[6]; 47: MOV output[6], temp[6]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[2].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[2].xyz, temp[0].yzx_, input[0].zxy_, -temp[2].xyz_; 5: MAD temp[2].xyz, input[0].xyz_, temp[0].www_, temp[2].xyz_; 6: ADD temp[2].xyz, temp[2].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[7].xyz, temp[0].zxy_, temp[2].yzx_; 8: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[7].xyz_; 9: MAD temp[2].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[2].xyz_; 10: MAD temp[2].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[2].xyz_; 11: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[0].xyz_; 12: MOV temp[2].w, input[0].___w; 13: MUL temp[1].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[1].xyz, temp[0].yzx_, input[1].zxy_, -temp[1].xyz_; 15: MAD temp[1].xyz, temp[0].www_, input[1].xyz_, temp[1].xyz_; 16: MUL temp[8].xyz, temp[0].zxy_, temp[1].yzx_; 17: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[8].xyz_; 18: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[9].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[9].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[6].x, temp[2], const[193]; 26: DP4 temp[6].y, temp[2], const[194]; 27: DP4 temp[6].z, temp[2], const[195]; 28: DP4 temp[6].w, temp[2], const[196]; 29: MOV_SAT output[1], input[2]; 30: 
ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[2].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[4].xyz, temp[1].zxy_, temp[3].yzx_; 36: MAD temp[4].xyz, temp[1].yzx_, temp[3].zxy_, -temp[4].xyz_; 37: MUL temp[4].xyz, temp[4].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[1].xyz0; 41: ADD temp[5].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 44: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 45: DP4 output[2].x, -temp[2], const[200]; 46: MOV output[0], temp[6]; 47: MOV output[6], temp[6]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[1].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[1].xyz, temp[0].yzx_, input[0].zxy_, -temp[1].xyz_; 5: MAD temp[1].xyz, input[0].xyz_, temp[0].www_, temp[1].xyz_; 6: ADD temp[1].xyz, temp[1].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[2].xyz, temp[0].zxy_, temp[1].yzx_; 8: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[2].xyz_; 9: MAD temp[1].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[1].xyz_; 10: MAD temp[1].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[1].xyz_; 11: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[0].xyz_; 12: MOV temp[1].w, input[0].___w; 13: MUL temp[2].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[2].xyz, temp[0].yzx_, input[1].zxy_, -temp[2].xyz_; 15: MAD temp[2].xyz, temp[0].www_, input[1].xyz_, temp[2].xyz_; 16: MUL temp[3].xyz, temp[0].zxy_, temp[2].yzx_; 17: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[3].xyz_; 18: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, 
temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[4].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[4].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[4].x, temp[1], const[193]; 26: DP4 temp[4].y, temp[1], const[194]; 27: DP4 temp[4].z, temp[1], const[195]; 28: DP4 temp[4].w, temp[1], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[1].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[5].xyz, temp[2].zxy_, temp[3].yzx_; 36: MAD temp[5].xyz, temp[2].yzx_, temp[3].zxy_, -temp[5].xyz_; 37: MUL temp[5].xyz, temp[5].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[5].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[2].xyz0; 41: ADD temp[0].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[0].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[0].xyz0, temp[5].xyz0; 44: DP4 output[5].z, temp[0].xyz0, temp[2].xyz0; 45: DP4 output[2].x, -temp[1], const[200]; 46: MOV output[0], temp[4]; 47: MOV output[6], temp[4]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[1].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[1].xyz, temp[0].yzx_, input[0].zxy_, -temp[1].xyz_; 5: MAD temp[1].xyz, input[0].xyz_, temp[0].www_, temp[1].xyz_; 6: ADD temp[1].xyz, temp[1].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[2].xyz, temp[0].zxy_, temp[1].yzx_; 8: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[2].xyz_; 9: MAD temp[1].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[1].xyz_; 10: MAD temp[1].xyz, 
temp[0].xyz_, -const[1 + addr[0]].www_, temp[1].xyz_; 11: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[0].xyz_; 12: MOV temp[1].w, input[0].___w; 13: MUL temp[2].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[2].xyz, temp[0].yzx_, input[1].zxy_, -temp[2].xyz_; 15: MAD temp[2].xyz, temp[0].www_, input[1].xyz_, temp[2].xyz_; 16: MUL temp[3].xyz, temp[0].zxy_, temp[2].yzx_; 17: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[3].xyz_; 18: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[4].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[4].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[4].x, temp[1], const[193]; 26: DP4 temp[4].y, temp[1], const[194]; 27: DP4 temp[4].z, temp[1], const[195]; 28: DP4 temp[4].w, temp[1], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[1].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[5].xyz, temp[2].zxy_, temp[3].yzx_; 36: MAD temp[5].xyz, temp[2].yzx_, temp[3].zxy_, -temp[5].xyz_; 37: MUL temp[5].xyz, temp[5].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[5].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[2].xyz0; 41: ADD temp[0].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[0].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[0].xyz0, temp[5].xyz0; 44: DP4 output[5].z, temp[0].xyz0, temp[2].xyz0; 45: DP4 output[2].x, -temp[1], const[200]; 46: MOV output[0], temp[4]; 47: MOV output[6], temp[4]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 
0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[1].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[1].xyz, temp[0].yzx_, input[0].zxy_, -temp[1].xyz_; 5: MAD temp[1].xyz, input[0].xyz_, temp[0].www_, temp[1].xyz_; 6: ADD temp[1].xyz, temp[1].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[2].xyz, temp[0].zxy_, temp[1].yzx_; 8: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[2].xyz_; 9: MAD temp[1].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[1].xyz_; 10: MAD temp[1].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[1].xyz_; 11: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[0].xyz_; 12: MOV temp[1].w, input[0].___w; 13: MUL temp[2].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[2].xyz, temp[0].yzx_, input[1].zxy_, -temp[2].xyz_; 15: MAD temp[2].xyz, temp[0].www_, input[1].xyz_, temp[2].xyz_; 16: MUL temp[3].xyz, temp[0].zxy_, temp[2].yzx_; 17: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[3].xyz_; 18: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[4].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[4].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[4].x, temp[1], const[193]; 26: DP4 temp[4].y, temp[1], const[194]; 27: DP4 temp[4].z, temp[1], const[195]; 28: DP4 temp[4].w, temp[1], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[1].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[5].xyz, temp[2].zxy_, temp[3].yzx_; 36: MAD temp[5].xyz, temp[2].yzx_, temp[3].zxy_, -temp[5].xyz_; 37: MUL temp[5].xyz, temp[5].xyz_, 
input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[5].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[2].xyz0; 41: ADD temp[0].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[0].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[0].xyz0, temp[5].xyz0; 44: DP4 output[5].z, temp[0].xyz0, temp[2].xyz0; 45: DP4 output[2].x, -temp[1], const[200]; 46: MOV output[0], temp[4]; 47: MOV output[6], temp[4]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x0010010d dst: 0a0 op: VE_FLT2FIX_DX src0: 0x01ff00a1 reg: 5i swiz: X/ U/ U/ U src1: 0x012480a1 reg: 5i swiz: 0/ 0/ 0/ 0 src2: 0x012480a1 reg: 5i swiz: 0/ 0/ 0/ 0 2: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10012 reg: 0c swiz: X/ Y/ Z/ W src1: 0x01248012 reg: 0c swiz: 0/ 0/ 0/ 0 src2: 0x01248012 reg: 0c swiz: 0/ 0/ 0/ 0 3: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22001 reg: 0i swiz: Y/ Z/ X/ U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84001 reg: 0i swiz: Z/ X/ Y/ U src2: 0x1fd10020 reg: 1t swiz: -X/-Y/-Z/-U 5: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01d10001 reg: 0i swiz: X/ Y/ Z/ U src1: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src2: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U 6: op: 0x00702003 dst: 1t op: VE_ADD src0: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src1: 0x01d10032 reg: 1c swiz: X/ Y/ Z/ U src2: 0x01248032 reg: 1c swiz: 0/ 0/ 0/ 0 7: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 8: op: 0x00702080 dst: 1t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t 
swiz: -X/-Y/-Z/-U 9: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01d10032 reg: 1c swiz: X/ Y/ Z/ U src1: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src2: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U 10: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01d10000 reg: 0t swiz: X/ Y/ Z/ U src1: 0x1fdb6032 reg: 1c swiz: -W/-W/-W/-U src2: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U 11: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01c01802 reg: 192c swiz: X/ X/ X/ U src1: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src2: 0x01d10001 reg: 0i swiz: X/ Y/ Z/ U 12: op: 0x00802003 dst: 1t op: VE_ADD src0: 0x00ffe001 reg: 0i swiz: U/ U/ U/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 14: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 15: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src1: 0x01d10021 reg: 1i swiz: X/ Y/ Z/ U src2: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U 16: op: 0x00706002 dst: 3t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22040 reg: 2t swiz: Y/ Z/ X/ U src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 17: op: 0x00704080 dst: 2t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84040 reg: 2t swiz: Z/ X/ Y/ U src2: 0x1fd10060 reg: 3t swiz: -X/-Y/-Z/-U 18: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c01802 reg: 192c swiz: X/ X/ X/ U src1: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src2: 0x01d10021 reg: 1i swiz: X/ Y/ Z/ U 19: op: 0x00706002 dst: 3t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22081 reg: 4i swiz: Y/ Z/ X/ U src2: 0x01248081 reg: 4i swiz: 0/ 0/ 0/ 0 20: op: 0x00706004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ 
U src1: 0x01c84081 reg: 4i swiz: Z/ X/ Y/ U src2: 0x1fd10060 reg: 3t swiz: -X/-Y/-Z/-U 21: op: 0x00706004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src1: 0x01d10081 reg: 4i swiz: X/ Y/ Z/ U src2: 0x01d10060 reg: 3t swiz: X/ Y/ Z/ U 22: op: 0x00708002 dst: 4t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22060 reg: 3t swiz: Y/ Z/ X/ U src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 23: op: 0x00706080 dst: 3t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84060 reg: 3t swiz: Z/ X/ Y/ U src2: 0x1fd10080 reg: 4t swiz: -X/-Y/-Z/-U 24: op: 0x00706004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x01c01802 reg: 192c swiz: X/ X/ X/ U src1: 0x01d10060 reg: 3t swiz: X/ Y/ Z/ U src2: 0x01d10081 reg: 4i swiz: X/ Y/ Z/ U 25: op: 0x00108001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11822 reg: 193c swiz: X/ Y/ Z/ W src2: 0x01249822 reg: 193c swiz: 0/ 0/ 0/ 0 26: op: 0x00208001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11842 reg: 194c swiz: X/ Y/ Z/ W src2: 0x01249842 reg: 194c swiz: 0/ 0/ 0/ 0 27: op: 0x00408001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11862 reg: 195c swiz: X/ Y/ Z/ W src2: 0x01249862 reg: 195c swiz: 0/ 0/ 0/ 0 28: op: 0x00808001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11882 reg: 196c swiz: X/ Y/ Z/ W src2: 0x01249882 reg: 196c swiz: 0/ 0/ 0/ 0 29: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 30: op: 0x00304203 dst: 2o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01fa38a2 reg: 197c swiz: Y/ Z/ U/ U src2: 0x012498a2 reg: 197c swiz: 0/ 0/ 0/ 0 31: op: 0x00700003 dst: 0t op: VE_ADD src0: 0x01d118c2 reg: 198c swiz: X/ Y/ Z/ U src1: 0x1fd10020 reg: 1t swiz: -X/-Y/-Z/-U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 32: op: 
0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 33: op: 0x00800048 dst: 0t op: ME_RECIP_SQRT_DX src0: 0x00db6008 reg: 0t swiz: W/ W/ W/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 34: op: 0x00700002 dst: 0t op: VE_MULTIPLY src0: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src1: 0x01d10000 reg: 0t swiz: X/ Y/ Z/ U src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 35: op: 0x0070a002 dst: 5t op: VE_MULTIPLY src0: 0x01c84040 reg: 2t swiz: Z/ X/ Y/ U src1: 0x01c22060 reg: 3t swiz: Y/ Z/ X/ U src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 36: op: 0x0070a080 dst: 5t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22040 reg: 2t swiz: Y/ Z/ X/ U src1: 0x01c84060 reg: 3t swiz: Z/ X/ Y/ U src2: 0x1fd100a0 reg: 5t swiz: -X/-Y/-Z/-U 37: op: 0x0070a002 dst: 5t op: VE_MULTIPLY src0: 0x01d100a0 reg: 5t swiz: X/ Y/ Z/ U src1: 0x01db6081 reg: 4i swiz: W/ W/ W/ U src2: 0x01248081 reg: 4i swiz: 0/ 0/ 0/ 0 38: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x011118e2 reg: 199c swiz: X/ Y/ Z/ 0 src1: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 39: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x011118e2 reg: 199c swiz: X/ Y/ Z/ 0 src1: 0x011100a0 reg: 5t swiz: X/ Y/ Z/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 40: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x011118e2 reg: 199c swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 41: op: 0x00700003 dst: 0t op: VE_ADD src0: 0x01d118e2 reg: 199c swiz: X/ Y/ Z/ U src1: 0x01d10000 reg: 0t swiz: X/ Y/ Z/ U src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 42: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 43: op: 0x00208201 dst: 4o op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x011100a0 reg: 5t 
swiz: X/ Y/ Z/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 44: op: 0x00408201 dst: 4o op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 45: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10020 reg: 1t swiz: -X/-Y/-Z/-W src1: 0x00d11902 reg: 200c swiz: X/ Y/ Z/ W src2: 0x01249902 reg: 200c swiz: 0/ 0/ 0/ 0 46: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 47: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 48 Instructions ~ 0 Flow Control Instructions ~ 6 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, COLOR DCL IN[1], FOG, PERSPECTIVE DCL IN[2], GENERIC[0], PERSPECTIVE DCL IN[3], GENERIC[1], PERSPECTIVE DCL IN[4], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[3] DCL CONST[1] DCL CONST[3..5] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 0.5000, 0.0000, 128.0000} 0: TEX TEMP[0], IN[2], SAMP[0], 2D 1: MUL TEMP[0].xyz, TEMP[0], IMM[0].xxxx 2: TEX TEMP[1].xy, IN[2], SAMP[1], 2D 3: MUL TEMP[2].xyz, TEMP[0], CONST[1].yyyy 4: TEX TEMP[3], IN[2], SAMP[3], 2D 5: SUB TEMP[3].xyz, TEMP[3], IMM[0].yyyy 6: DP3 TEMP[3].w, TEMP[3], TEMP[3] 7: RSQ TEMP[3].w, |TEMP[3].wwww| 8: MUL TEMP[3].xyz, TEMP[3].wwww, TEMP[3] 9: DP3 TEMP[4].w, IN[4], IN[4] 10: RSQ TEMP[4].w, |TEMP[4].wwww| 11: MUL TEMP[4].xyz, TEMP[4].wwww, IN[4] 12: DP3_SAT TEMP[5].x, TEMP[4], TEMP[3] 13: POW TEMP[5].x, TEMP[5].xxxx, IMM[0].wwww 14: MUL TEMP[5].x, TEMP[5].xxxx, CONST[1].xxxx 15: MUL TEMP[5].x, TEMP[5].xxxx, TEMP[1].xxxx 16: DP3 TEMP[4].y, TEMP[3], IN[3] 17: MAD TEMP[4].x, TEMP[4].yyyy, CONST[3].xxxx, 
CONST[3].yyyy 18: MAD_SAT TEMP[4].x, TEMP[4].yyyy, TEMP[4].xxxx, CONST[3].zzzz 19: MAD TEMP[0].xyz, TEMP[4].xxxx, TEMP[0], TEMP[5].xxxx 20: MUL TEMP[0].xyz, TEMP[0], IN[0] 21: LRP_SAT TEMP[6].xyz, TEMP[1].yyyy, TEMP[2], TEMP[0] 22: MUL_SAT TEMP[6].w, TEMP[0].wwww, IN[0].wwww 23: MAD_SAT TEMP[7].x, IN[1].xxxx, CONST[4].xxxx, CONST[4].yyyy 24: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[6], CONST[5] 25: MOV OUT[0].w, TEMP[6] 26: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[6].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[6].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[4], input[4]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[4]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[6].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: DP3 temp[4].y, temp[3], input[3]; 17: MAD temp[4].x, temp[4].yyyy, const[3].xxxx, const[3].yyyy; 18: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[3].zzzz; 19: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 20: MUL temp[0].xyz, temp[0], input[0]; 21: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 22: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 23: MAD_SAT temp[7].x, input[1].xxxx, const[4].xxxx, const[4].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[6], const[5]; 25: MOV output[0].w, temp[6]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[6].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[6].yyyy; 6: DP3 temp[3].w, temp[3], 
temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[4], input[4]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[4]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[6].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: DP3 temp[4].y, temp[3], input[3]; 17: MAD temp[4].x, temp[4].yyyy, const[3].xxxx, const[3].yyyy; 18: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[3].zzzz; 19: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 20: MUL temp[0].xyz, temp[0], input[0]; 21: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 22: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 23: MAD_SAT temp[7].x, input[1].xxxx, const[4].xxxx, const[4].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[6], const[5]; 25: MOV output[0].w, temp[6]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[6].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[6].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[4], input[4]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[4]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[6].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: DP3 temp[4].y, temp[3], input[3]; 17: MAD temp[4].x, temp[4].yyyy, const[3].xxxx, const[3].yyyy; 18: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[3].zzzz; 19: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 20: MUL temp[0].xyz, temp[0], input[0]; 21: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 22: MUL_SAT temp[6].w, temp[0].wwww, 
input[0].wwww; 23: MAD_SAT temp[7].x, input[1].xxxx, const[4].xxxx, const[4].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[6], const[5]; 25: MOV output[0].w, temp[6]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[6].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[6].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[4], input[4]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[4]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[6].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: DP3 temp[4].y, temp[3], input[3]; 17: MAD temp[4].x, temp[4].yyyy, const[3].xxxx, const[3].yyyy; 18: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[3].zzzz; 19: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 20: MUL temp[0].xyz, temp[0], input[0]; 21: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 22: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 23: MAD_SAT temp[7].x, input[1].xxxx, const[4].xxxx, const[4].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[6], const[5]; 25: MOV output[0].w, temp[6]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[6].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[6].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[4], input[4]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[4]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW 
temp[5].x, temp[5].xxxx, const[6].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: DP3 temp[4].y, temp[3], input[3]; 17: MAD temp[4].x, temp[4].yyyy, const[3].xxxx, const[3].yyyy; 18: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[3].zzzz; 19: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 20: MUL temp[0].xyz, temp[0], input[0]; 21: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 22: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 23: MAD_SAT temp[7].x, input[1].xxxx, const[4].xxxx, const[4].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[6], const[5]; 25: MOV output[0].w, temp[6]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[6].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[6].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[4], input[4]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[4]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[6].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: DP3 temp[4].y, temp[3], input[3]; 17: MAD temp[4].x, temp[4].yyyy, const[3].xxxx, const[3].yyyy; 18: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[3].zzzz; 19: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 20: MUL temp[0].xyz, temp[0], input[0]; 21: LRP_SAT temp[6].xyz, temp[1].yyyy, temp[2], temp[0]; 22: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 23: MAD_SAT temp[7].x, input[1].xxxx, const[4].xxxx, const[4].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[6], const[5]; 25: MOV output[0].w, temp[6]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 
2D[0]; 1: MUL temp[0].xyz, temp[0], const[6].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: ADD temp[3].xyz, temp[3], -const[6].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[4], input[4]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[4]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: LG2 temp[8].w, temp[5].xxxx; 14: MUL temp[8].w, temp[8].wwww, const[6].wwww; 15: EX2 temp[5].x, temp[8].wwww; 16: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 17: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 18: DP3 temp[4].y, temp[3], input[3]; 19: MAD temp[4].x, temp[4].yyyy, const[3].xxxx, const[3].yyyy; 20: MAD_SAT temp[4].x, temp[4].yyyy, temp[4].xxxx, const[3].zzzz; 21: MAD temp[0].xyz, temp[4].xxxx, temp[0], temp[5].xxxx; 22: MUL temp[0].xyz, temp[0], input[0]; 23: ADD temp[6].xyz, temp[2], -temp[0]; 24: MAD_SAT temp[6].xyz, temp[1].yyyy, temp[6], temp[0]; 25: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 26: MAD_SAT temp[7].x, input[1].xxxx, const[4].xxxx, const[4].yyyy; 27: ADD temp[9].xyz, temp[6], -const[5]; 28: MAD output[0].xyz, temp[7].xxxx, temp[9], const[5]; 29: MOV output[0].w, temp[6]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[0]; 1: MUL temp[0].xyz, temp[0].xyz_, const[6].xxx_; 2: TEX temp[1].xy, input[2].xy__, 2D[1]; 3: MUL temp[2].xyz, temp[0].xyz_, const[1].yyy_; 4: TEX temp[3].xyz, input[2].xy__, 2D[3]; 5: ADD temp[3].xyz, temp[3].xyz_, -const[6].yyy_; 6: DP3 temp[3].w, temp[3].xyz_, temp[3].xyz_; 7: RSQ temp[3].w, |temp[3].___w|; 8: MUL temp[3].xyz, temp[3].www_, temp[3].xyz_; 9: DP3 temp[4].w, input[4].xyz_, input[4].xyz_; 10: RSQ temp[4].w, |temp[4].___w|; 11: MUL temp[4].xyz, temp[4].www_, input[4].xyz_; 12: DP3_SAT temp[5].x, temp[4].xyz_, temp[3].xyz_; 13: LG2 temp[8].w, temp[5].___x; 14: MUL temp[8].w, 
temp[8].___w, const[6].___w; 15: EX2 temp[5].x, temp[8].w___; 16: MUL temp[5].x, temp[5].x___, const[1].x___; 17: MUL temp[5].x, temp[5].x___, temp[1].x___; 18: DP3 temp[4].y, temp[3].xyz_, input[3].xyz_; 19: MAD temp[4].x, temp[4].y___, const[3].x___, const[3].y___; 20: MAD_SAT temp[4].x, temp[4].y___, temp[4].x___, const[3].z___; 21: MAD temp[0].xyz, temp[4].xxx_, temp[0].xyz_, temp[5].xxx_; 22: MUL temp[0].xyz, temp[0].xyz_, input[0].xyz_; 23: ADD temp[6].xyz, temp[2].xyz_, -temp[0].xyz_; 24: MAD_SAT temp[6].xyz, temp[1].yyy_, temp[6].xyz_, temp[0].xyz_; 25: MUL_SAT temp[6].w, temp[0].___w, input[0].___w; 26: MAD_SAT temp[7].x, input[1].x___, const[4].x___, const[4].y___; 27: ADD temp[9].xyz, temp[6].xyz_, -const[5].xyz_; 28: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[5].xyz_; 29: MOV output[0].w, temp[6].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[6].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -const[6].yyy_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[4].xyz_, input[4].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[4].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, const[6].___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: DP3 temp[28].y, temp[18].xyz_, input[3].xyz_; 19: MAD temp[29].x, temp[28].y___, const[3].x___, const[3].y___; 20: MAD_SAT temp[30].x, temp[28].y___, temp[29].x___, const[3].z___; 21: MAD temp[31].xyz, temp[30].xxx_, temp[11].xyz_, temp[27].xxx_; 22: MUL 
temp[32].xyz, temp[31].xyz_, input[0].xyz_; 23: ADD temp[33].xyz, temp[13].xyz_, -temp[32].xyz_; 24: MAD_SAT temp[34].xyz, temp[12].yyy_, temp[33].xyz_, temp[32].xyz_; 25: MUL_SAT temp[35].w, temp[10].___w, input[0].___w; 26: MAD_SAT temp[36].x, input[1].x___, const[4].x___, const[4].y___; 27: ADD temp[37].xyz, temp[34].xyz_, -const[5].xyz_; 28: MAD output[0].xyz, temp[36].xxx_, temp[37].xyz_, const[5].xyz_; 29: MOV output[0].w, temp[35].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[6].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[4].xyz_, input[4].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[4].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, const[6].___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: DP3 temp[28].y, temp[18].xyz_, input[3].xyz_; 19: MAD temp[29].x, temp[28].y___, const[3].x___, const[3].y___; 20: MAD_SAT temp[30].x, temp[28].y___, temp[29].x___, const[3].z___; 21: MAD temp[31].xyz, temp[30].xxx_, temp[11].xyz_, temp[27].xxx_; 22: MUL temp[32].xyz, temp[31].xyz_, input[0].xyz_; 23: MAD_SAT temp[34].xyz, temp[12].yyy_, (temp[13] - temp[32]).xyz_, temp[32].xyz_; 24: MUL_SAT temp[35].w, temp[10].___w, input[0].___w; 25: MAD_SAT temp[36].x, input[1].x___, const[4].x___, const[4].y___; 26: MAD output[0].xyz, temp[36].xxx_, (temp[34] - const[5]).xyz_, const[5].xyz_; 27: MOV output[0].w, temp[35].___w; Fragment Program: after 'inline 
literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[4].xyz_, input[4].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[4].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, 128.000000 (0x70).___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: DP3 temp[28].y, temp[18].xyz_, input[3].xyz_; 19: MAD temp[29].x, temp[28].y___, const[3].x___, const[3].y___; 20: MAD_SAT temp[30].x, temp[28].y___, temp[29].x___, const[3].z___; 21: MAD temp[31].xyz, temp[30].xxx_, temp[11].xyz_, temp[27].xxx_; 22: MUL temp[32].xyz, temp[31].xyz_, input[0].xyz_; 23: MAD_SAT temp[34].xyz, temp[12].yyy_, (temp[13] - temp[32]).xyz_, temp[32].xyz_; 24: MUL_SAT temp[35].w, temp[10].___w, input[0].___w; 25: MAD_SAT temp[36].x, input[1].x___, const[4].x___, const[4].y___; 26: MAD output[0].xyz, temp[36].xxx_, (temp[34] - const[5]).xyz_, const[5].xyz_; 27: MOV output[0].w, temp[35].___w; CONST[6] = { 2.0000 0.5000 0.0000 128.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL 
temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[4].xyz_, input[4].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[4].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, 128.000000 (0x70).___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: DP3 temp[28].y, temp[18].xyz_, input[3].xyz_; 19: MAD temp[29].x, temp[28].y___, const[3].x___, const[3].y___; 20: MAD_SAT temp[30].x, temp[28].y___, temp[29].x___, const[3].z___; 21: MAD temp[31].xyz, temp[30].xxx_, temp[11].xyz_, temp[27].xxx_; 22: MUL temp[32].xyz, temp[31].xyz_, input[0].xyz_; 23: MAD_SAT temp[34].xyz, temp[12].yyy_, (temp[13] - temp[32]).xyz_, temp[32].xyz_; 24: MUL_SAT temp[35].w, temp[10].___w, input[0].___w; 25: MAD_SAT temp[36].x, input[1].x___, const[4].x___, const[4].y___; 26: MAD output[0].xyz, temp[36].xxx_, (temp[34] - const[5]).xyz_, const[5].xyz_; 27: MOV output[0].w, temp[35].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[4].xyz_, input[4].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[4].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, 128.000000 (0x70).___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 
18: DP3 temp[28].y, temp[18].xyz_, input[3].xyz_; 19: MAD temp[29].x, temp[28].y___, const[3].x___, const[3].y___; 20: MAD_SAT temp[30].x, temp[28].y___, temp[29].x___, const[3].z___; 21: MAD temp[31].xyz, temp[30].xxx_, temp[11].xyz_, temp[27].xxx_; 22: MUL temp[32].xyz, temp[31].xyz_, input[0].xyz_; 23: MAD_SAT temp[34].xyz, temp[12].yyy_, (temp[13] - temp[32]).xyz_, temp[32].xyz_; 24: MUL_SAT temp[35].w, temp[10].___w, input[0].___w; 25: MAD_SAT temp[36].x, input[1].x___, const[4].x___, const[4].y___; 26: MAD output[0].xyz, temp[36].xxx_, (temp[34] - const[5]).xyz_, const[5].xyz_; 27: MOV output[0].w, temp[35].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: src0.xyz = temp[10], src0.w = 2.000000 (0x40) MAD temp[11].xyz, src0.xyz, src0.www, src0.000 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: src0.xyz = temp[11], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: src0.xyz = temp[14] MAD temp[15].xyz, src0.xyz, src0.111, -src0.HHH 6: src0.xyz = temp[15] DP3, src0.xyz, src0.xyz DP3 temp[16].w, src0._, src0._ 7: src0.w = temp[16] RSQ temp[17].w, |src0.w| 8: src0.xyz = temp[15], src0.w = temp[17] MAD temp[18].xyz, src0.www, src0.xyz, src0.000 9: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[19].w, src0._, src0._ 10: src0.w = temp[19] RSQ temp[20].w, |src0.w| 11: src0.xyz = input[4], src0.w = temp[20] MAD temp[21].xyz, src0.www, src0.xyz, src0.000 12: src0.xyz = temp[21], src1.xyz = temp[18] DP3_SAT temp[22].x, src0.xyz, src1.xyz 13: src0.xyz = temp[22] LG2 temp[23].w, src0.x 14: src0.w = temp[23], src1.w = 128.000000 (0x70) MAD temp[24].w, src0.w, src1.w, src0.0 15: src0.w = temp[24] REPL_ALPHA temp[25].x EX2, src0.w 16: src0.xyz = temp[25], src1.xyz = const[1] MAD temp[26].x, src0.x__, src1.x__, src0.000 17: src0.xyz = temp[26], src1.xyz = temp[12] MAD temp[27].x, src0.x__, src1.x__, src0.000 18: src0.xyz = temp[18], src1.xyz = 
input[3] DP3 temp[28].y, src0.xyz, src1.xyz 19: src0.xyz = temp[28], src1.xyz = const[3] MAD temp[29].x, src0.y__, src1.x__, src1.y__ 20: src0.xyz = temp[28], src1.xyz = temp[29], src2.xyz = const[3] MAD_SAT temp[30].x, src0.y__, src1.x__, src2.z__ 21: src0.xyz = temp[30], src1.xyz = temp[11], src2.xyz = temp[27] MAD temp[31].xyz, src0.xxx, src1.xyz, src2.xxx 22: src0.xyz = temp[31], src1.xyz = input[0] MAD temp[32].xyz, src0.xyz, src1.xyz, src0.000 23: src0.xyz = temp[32], src1.xyz = temp[13], src2.xyz = temp[12], srcp.xyz = (src1 - src0) MAD_SAT temp[34].xyz, src2.yyy, srcp.xyz, src0.xyz 24: src0.w = temp[10], src1.w = input[0] MAD_SAT temp[35].w, src0.w, src1.w, src0.0 25: src0.xyz = input[1], src1.xyz = const[4] MAD_SAT temp[36].x, src0.x__, src1.x__, src1.y__ 26: src0.xyz = const[5], src1.xyz = temp[34], src2.xyz = temp[36], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 27: src0.w = temp[35] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: TEX temp[14].xyz, input[2].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[19].w, src0._, src0._ 5: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.w = temp[19] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 RSQ temp[20].w, |src1.w| 6: src0.xyz = temp[14] MAD temp[15].xyz, src0.xyz, src0.111, -src0.HHH 7: src0.xyz = temp[15] DP3, src0.xyz, src0.xyz DP3 temp[16].w, src0._, src0._ 8: src0.xyz = temp[11], src0.w = temp[16], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 RSQ temp[17].w, |src0.w| 9: src0.xyz = temp[15], src0.w = temp[17] MAD temp[18].xyz, src0.www, src0.xyz, src0.000 10: src0.xyz = temp[18], src1.xyz = input[3] DP3 temp[28].y, src0.xyz, src1.xyz 11: src0.xyz = input[4], src0.w = temp[20], src1.xyz = temp[28], src2.xyz = const[3] MAD temp[21].xyz, src0.www, 
src0.xyz, src0.000 MAD temp[29].w, src1.y, src2.x, src2.y 12: src0.xyz = temp[21], src1.xyz = temp[18] DP3_SAT temp[22].x, src0.xyz, src1.xyz 13: src0.xyz = temp[28], src0.w = temp[29], src1.xyz = temp[29], src2.xyz = const[3] MAD_SAT temp[30].w, src0.y, src0.w, src2.z 14: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[4], src1.w = input[0] MAD_SAT temp[36].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[35].w, src0.w, src1.w, src0.0 15: src0.xyz = temp[22] LG2 temp[23].w, src0.x 16: src0.w = temp[23], src1.w = 128.000000 (0x70) MAD temp[24].w, src0.w, src1.w, src0.0 17: src0.w = temp[24] REPL_ALPHA temp[25].x EX2, src0.w 18: src0.xyz = temp[25], src1.xyz = const[1] MAD temp[26].x, src0.x__, src1.x__, src0.000 19: src0.xyz = temp[26], src1.xyz = temp[12] MAD temp[27].x, src0.x__, src1.x__, src0.000 20: src0.xyz = temp[30], src0.w = temp[30], src1.xyz = temp[11], src2.xyz = temp[27] MAD temp[31].xyz, src0.www, src1.xyz, src2.xxx 21: src0.xyz = temp[31], src1.xyz = input[0] MAD temp[32].xyz, src0.xyz, src1.xyz, src0.000 22: src0.xyz = temp[32], src1.xyz = temp[13], src2.xyz = temp[12], srcp.xyz = (src1 - src0) MAD_SAT temp[34].xyz, src2.yyy, srcp.xyz, src0.xyz 23: src0.xyz = const[5], src0.w = temp[35], src1.xyz = temp[34], src2.xyz = temp[36], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: TEX temp[14].xyz, input[2].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[19].w, src0._, src0._ 5: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.w = temp[19] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 RSQ temp[20].w, |src1.w| 6: src0.xyz = temp[14] MAD temp[15].xyz, src0.xyz, src0.111, -src0.HHH 7: src0.xyz = temp[15] DP3, src0.xyz, src0.xyz DP3 temp[16].w, src0._, src0._ 8: src0.xyz = 
temp[11], src0.w = temp[16], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 RSQ temp[17].w, |src0.w| 9: src0.xyz = temp[15], src0.w = temp[17] MAD temp[18].xyz, src0.www, src0.xyz, src0.000 10: src0.xyz = temp[18], src1.xyz = input[3] DP3 temp[28].y, src0.xyz, src1.xyz 11: src0.xyz = input[4], src0.w = temp[20], src1.xyz = temp[28], src2.xyz = const[3] MAD temp[21].xyz, src0.www, src0.xyz, src0.000 MAD temp[29].w, src1.y, src2.x, src2.y 12: src0.xyz = temp[21], src1.xyz = temp[18] DP3_SAT temp[22].x, src0.xyz, src1.xyz 13: src0.xyz = temp[28], src0.w = temp[29], src2.xyz = const[3] MAD_SAT temp[30].w, src0.y, src0.w, src2.z 14: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[4], src1.w = input[0] MAD_SAT temp[36].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[35].w, src0.w, src1.w, src0.0 15: src0.xyz = temp[22] LG2 temp[23].w, src0.x 16: src0.w = temp[23], src1.w = 128.000000 (0x70) MAD temp[24].w, src0.w, src1.w, src0.0 17: src0.w = temp[24] REPL_ALPHA temp[25].x EX2, src0.w 18: src0.xyz = temp[25], src1.xyz = const[1] MAD temp[26].x, src0.x__, src1.x__, src0.000 19: src0.xyz = temp[26], src1.xyz = temp[12] MAD temp[27].x, src0.x__, src1.x__, src0.000 20: src0.w = temp[30], src1.xyz = temp[11], src2.xyz = temp[27] MAD temp[31].xyz, src0.www, src1.xyz, src2.xxx 21: src0.xyz = temp[31], src1.xyz = input[0] MAD temp[32].xyz, src0.xyz, src1.xyz, src0.000 22: src0.xyz = temp[32], src1.xyz = temp[13], src2.xyz = temp[12], srcp.xyz = (src1 - src0) MAD_SAT temp[34].xyz, src2.yyy, srcp.xyz, src0.xyz 23: src0.xyz = const[5], src0.w = temp[35], src1.xyz = temp[34], src2.xyz = temp[36], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[1].xy__, 2D[0]; 2: TEX temp[4].yz, input[1].xy__, 2D[1]; 3: TEX temp[1].xyz, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[3] DP3, 
src0.xyz, src0.xyz DP3 temp[1].w, src0._, src0._ 5: src0.xyz = temp[5], src0.w = 2.000000 (0x40), src1.w = temp[1] SEM_WAIT MAD temp[6].xyz, src0.xyz, src0.www, src0.000 RSQ temp[1].w, |src1.w| 6: src0.xyz = temp[1] MAD temp[1].xyz, src0.xyz, src0.111, -src0.HHH 7: src0.xyz = temp[1] DP3, src0.xyz, src0.xyz DP3 temp[2].w, src0._, src0._ 8: src0.xyz = temp[6], src0.w = temp[2], src1.xyz = const[1] MAD temp[7].xyz, src0.xyz, src1.yyy, src0.000 RSQ temp[2].w, |src0.w| 9: src0.xyz = temp[1], src0.w = temp[2] MAD temp[1].xyz, src0.www, src0.xyz, src0.000 10: src0.xyz = temp[1], src1.xyz = input[2] DP3 temp[2].x, src0.xyz, src1.xyz 11: src0.xyz = input[3], src0.w = temp[1], src1.xyz = temp[2], src2.xyz = const[3] MAD temp[3].xyz, src0.www, src0.xyz, src0.000 MAD temp[1].w, src1.x, src2.x, src2.y 12: src0.xyz = temp[3], src1.xyz = temp[1] DP3_SAT temp[1].x, src0.xyz, src1.xyz 13: src0.xyz = temp[2], src0.w = temp[1], src2.xyz = const[3] MAD_SAT temp[1].w, src0.x, src0.w, src2.z 14: src0.xyz = input[4], src0.w = temp[5], src1.xyz = const[4], src1.w = input[0] MAD_SAT temp[1].y, src0._x_, src1._x_, src1._y_ MAD_SAT temp[2].w, src0.w, src1.w, src0.0 15: src0.xyz = temp[1] LG2 temp[3].w, src0.x 16: src0.w = temp[3], src1.w = 128.000000 (0x70) MAD temp[3].w, src0.w, src1.w, src0.0 17: src0.w = temp[3] REPL_ALPHA temp[1].x EX2, src0.w 18: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[1].x, src0.x__, src1.x__, src0.0__ 19: src0.xyz = temp[1], src1.xyz = temp[4] MAD temp[1].x, src0.x__, src1.y__, src0.0__ 20: src0.w = temp[1], src1.xyz = temp[6], src2.xyz = temp[1] MAD temp[2].xyz, src0.www, src1.xyz, src2.xxx 21: src0.xyz = temp[2], src1.xyz = input[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 22: src0.xyz = temp[0], src1.xyz = temp[7], src2.xyz = temp[4], srcp.xyz = (src1 - src0) MAD_SAT temp[0].xyz, src2.zzz, srcp.xyz, src0.xyz 23: src0.xyz = const[5], src0.w = temp[2], src1.xyz = temp[0], src2.xyz = temp[1], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.yyy, 
srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f401: src: 1 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003003:TEX wmask: GB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xd004f401: src: 1 R/G/A/A dst: 4 R/R/G/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000131:DP3 dest:19 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080004c0:Addr0: 192t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d01b:RSQ dest:1 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00db4010:MAD dest:1 rgb_C_src:0 H/H/H 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 
128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000101:DP3 dest:16 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040406:Addr0: 6t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0004c02b:RSQ dest:2 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000021:DP3 dest:2 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x10300803:Addr0: 3t, Addr1: 2t, Addr2: 3c, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00101010:MAD dest:1 alp_A_src:1 R 0 alp_B_src:2 R 0 targ 0 w:0 5 RGBA_INST: 0x0c490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:2 G 0 11 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000403:Addr0: 3t, Addr1: 1t, Addr2: 128t, srcp:0 
2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000011:DP3 dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x10320002:Addr0: 2t, Addr1: 128t, Addr2: 3c, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00600010:MAD dest:1 alp_A_src:0 R 0 alp_B_src:0 A 0 targ 0 w:0 5 RGBA_INST: 0x14000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:2 B 0 13 0:CMN_INST 0x00185000:ALU wmask: AG omask: NONE 1:RGB_ADDR 0x08041004:Addr0: 4t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000005:Addr0: 5t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00822410:rgb_A_src:0 0/R/0 0 rgb_B_src:1 0/R/0 0 targ: 0 4 ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20431010:MAD dest:1 rgb_C_src:1 0/G/0 0 alp_C_src:0 0 0 14 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000039:LN2 dest:3 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803c003:Addr0: 3t, Addr1: 240t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 16 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 
128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000001a:SOP dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08040401:Addr0: 1t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08001001:Addr0: 1t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0090a480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 G/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x00101880:Addr0: 128t, Addr1: 6t, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044236c:rgb_A_src:0 A/A/A 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00002020:MAD dest:2 rgb_C_src:2 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00003a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 
1:RGB_ADDR 0x40401c00:Addr0: 0t, Addr1: 7t, Addr2: 4t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40100105:Addr0: 5c, Addr1: 0t, Addr2: 1t, srcp:1 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446126:rgb_A_src:2 G/G/G 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 23 Instructions ~ 17 Vector Instructions (RGB) ~ 11 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 2 Presub Operations ~ 0 OMOD Operations ~ 8 Temporary Registers ~ 2 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'native rewrite' # 
Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 
2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR IMM[0] FLT32 { 0.0000, 0.0000, 0.0000, 0.0000} 0: MOV OUT[0], IMM[0].xxxx 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV 
output[0], temp[0].0000; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'register rename' # Radeon Compiler Program 0: MOV output[0], temp[0].0000; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[0], none.0000; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: MOV output[0], none.0000; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV output[0], none.0000; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[0], none.0000; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: MAD color[0].xyz, src0.000, src0.111, src0.000 MAD color[0].w, src0.0, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: MAD color[0].xyz, src0.000, src0.111, src0.000 MAD color[0].w, src0.0, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: MAD color[0].xyz, src0.000, src0.111, src0.000 MAD color[0].w, src0.0, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: MAD color[0].xyz, src0.000, src0.111, src0.000 MAD color[0].w, src0.0, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0490:rgb_A_src:0 0/0/0 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c10000:MAD dest:0 alp_A_src:0 0 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL 
OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..7] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], 
const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, 
temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon 
Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], 
temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i 
swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 18 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG 
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL CONST[0] DCL CONST[3..6] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 64.0000, 0.0000} IMM[1] FLT32 { 128.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4].w, IN[1], SAMP[0], 2D 8: TEX TEMP[5].xyz, IN[1], SAMP[3], 2D 9: MAD TEMP[5].xyz, TEMP[5], IMM[0].xxxx, -IMM[0].yyyy 10: ADD TEMP[6].xyz, TEMP[3], TEMP[2] 11: DP3 TEMP[6].w, TEMP[6], TEMP[6] 12: RSQ TEMP[6].w, |TEMP[6].wwww| 13: MUL TEMP[6].xyz, TEMP[6].wwww, TEMP[6] 14: DP3_SAT TEMP[6].w, TEMP[6], TEMP[5] 15: POW TEMP[6].w, TEMP[6].wwww, IMM[1].xxxx 16: MUL_SAT TEMP[6].w, TEMP[6], IMM[0].zzzz 17: MUL TEMP[6].w, TEMP[6], TEMP[4] 18: MUL TEMP[4].xyz, TEMP[6].wwww, CONST[3] 19: DP3_SAT TEMP[2].w, TEMP[5], TEMP[2] 20: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 21: MAX TEMP[0].xyz, TEMP[0], CONST[4] 22: MUL_SAT TEMP[1].xyz, TEMP[4], TEMP[0] 23: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[5].xxxx, CONST[5].yyyy 24: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[6] 25: MOV OUT[0].w, TEMP[1] 26: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], 
temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[8].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[7].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[8].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[7].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD 
temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[8].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[7].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[8].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[7].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, 
temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[8].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[7].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT 
temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[8].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[7].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL_SAT temp[1].xyz, temp[4], temp[0]; 23: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 24: LRP output[0].xyz, temp[7].xxxx, temp[1], const[6]; 25: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[7].xxxx, -const[7].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[7].xxxx, -const[7].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: LG2 temp[8].w, temp[6].wwww; 16: MUL temp[8].w, temp[8].wwww, const[8].xxxx; 17: EX2 temp[6].w, temp[8].wwww; 18: MUL_SAT temp[6].w, temp[6], const[7].zzzz; 19: MUL temp[6].w, temp[6], temp[4]; 20: MUL temp[4].xyz, temp[6].wwww, const[3]; 21: DP3_SAT temp[2].w, temp[5], temp[2]; 22: MUL temp[0].xyz, temp[0], temp[2].wwww; 23: MAX temp[0].xyz, temp[0], const[4]; 24: MUL_SAT temp[1].xyz, temp[4], temp[0]; 25: MAD_SAT temp[7].x, input[0].xxxx, const[5].xxxx, const[5].yyyy; 26: ADD temp[9].xyz, temp[1], -const[6]; 27: MAD output[0].xyz, temp[7].xxxx, temp[9], const[6]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 
2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[7].xxx_, -const[7].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4].w, input[1].xy__, 2D[0]; 8: TEX temp[5].xyz, input[1].xy__, 2D[3]; 9: MAD temp[5].xyz, temp[5].xyz_, const[7].xxx_, -const[7].yyy_; 10: ADD temp[6].xyz, temp[3].xyz_, temp[2].xyz_; 11: DP3 temp[6].w, temp[6].xyz_, temp[6].xyz_; 12: RSQ temp[6].w, |temp[6].___w|; 13: MUL temp[6].xyz, temp[6].www_, temp[6].xyz_; 14: DP3_SAT temp[6].w, temp[6].xyz_, temp[5].xyz_; 15: LG2 temp[8].w, temp[6].___w; 16: MUL temp[8].w, temp[8].___w, const[8].___x; 17: EX2 temp[6].w, temp[8].___w; 18: MUL_SAT temp[6].w, temp[6].___w, const[7].___z; 19: MUL temp[6].w, temp[6].___w, temp[4].___w; 20: MUL temp[4].xyz, temp[6].www_, const[3].xyz_; 21: DP3_SAT temp[2].w, temp[5].xyz_, temp[2].xyz_; 22: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 23: MAX temp[0].xyz, temp[0].xyz_, const[4].xyz_; 24: MUL_SAT temp[1].xyz, temp[4].xyz_, temp[0].xyz_; 25: MAD_SAT temp[7].x, input[0].x___, const[5].x___, const[5].y___; 26: ADD temp[9].xyz, temp[1].xyz_, -const[6].xyz_; 27: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[6].xyz_; 28: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[7].xxx_, -const[7].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, const[7].xxx_, -const[7].yyy_; 10: ADD temp[20].xyz, temp[16].xyz_, temp[13].xyz_; 11: DP3 temp[21].w, temp[20].xyz_, temp[20].xyz_; 12: RSQ temp[22].w, |temp[21].___w|; 13: MUL temp[23].xyz, temp[22].www_, 
temp[20].xyz_; 14: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 15: LG2 temp[25].w, temp[24].___w; 16: MUL temp[26].w, temp[25].___w, const[8].___x; 17: EX2 temp[27].w, temp[26].___w; 18: MUL_SAT temp[28].w, temp[27].___w, const[7].___z; 19: MUL temp[29].w, temp[28].___w, temp[17].___w; 20: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 21: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 22: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 23: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 24: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 25: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 26: ADD temp[36].xyz, temp[34].xyz_, -const[6].xyz_; 27: MAD output[0].xyz, temp[35].xxx_, temp[36].xyz_, const[6].xyz_; 28: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[7].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, const[7].xxx_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 12: MUL temp[23].xyz, temp[22].www_, (temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, const[8].___x; 16: EX2 temp[27].w, temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, const[7].___z; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, 
const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, 2.000000 (0x40).www_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 12: MUL temp[23].xyz, temp[22].www_, (temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, 128.000000 (0x70).___w; 16: EX2 temp[27].w, temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, 64.000000 (0x68).___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; CONST[7] = { 2.0000 1.0000 64.0000 0.0000 } CONST[8] = { 128.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: 
TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, 2.000000 (0x40).www_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 12: MUL temp[23].xyz, temp[22].www_, (temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, 128.000000 (0x70).___w; 16: EX2 temp[27].w, temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, 64.000000 (0x68).___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, 2.000000 (0x40).www_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 
12: MUL temp[23].xyz, temp[22].www_, (temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, 128.000000 (0x70).___w; 16: EX2 temp[27].w, temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, 64.000000 (0x68).___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL_SAT temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: MAD_SAT temp[35].x, input[0].x___, const[5].x___, const[5].y___; 25: MAD output[0].xyz, temp[35].xxx_, (temp[34] - const[6]).xyz_, const[6].xyz_; 26: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: src0.xyz = temp[18], src0.w = 2.000000 (0x40) MAD temp[19].xyz, src0.xyz, src0.www, -src0.111 10: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[21].w, src0._, src0._ 11: src0.w = temp[21] RSQ temp[22].w, |src0.w| 12: src0.xyz = temp[16], src0.w = temp[22], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[23].xyz, src0.www, srcp.xyz, src0.000 13: src0.xyz = temp[23], src1.xyz = temp[19] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[24].w, src0._, src0._ 14: src0.w = temp[24] LG2 temp[25].w, src0.w 
15: src0.w = temp[25], src1.w = 128.000000 (0x70) MAD temp[26].w, src0.w, src1.w, src0.0 16: src0.w = temp[26] EX2 temp[27].w, src0.w 17: src0.w = temp[27], src1.w = 64.000000 (0x68) MAD_SAT temp[28].w, src0.w, src1.w, src0.0 18: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 19: src0.xyz = const[3], src0.w = temp[29] MAD temp[30].xyz, src0.www, src0.xyz, src0.000 20: src0.xyz = temp[19], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 21: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 22: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD_SAT temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = input[0], src1.xyz = const[5] MAD_SAT temp[35].x, src0.x__, src1.x__, src1.y__ 25: src0.xyz = const[6], src1.xyz = temp[34], src2.xyz = temp[35], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 26: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17].w, input[1].xy__, 2D[0]; 4: TEX temp[18].xyz, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 6: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 7: src0.xyz = temp[18], src0.w = 2.000000 (0x40) MAD temp[19].xyz, src0.xyz, src0.www, -src0.111 8: src0.xyz = temp[19], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 9: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 11: src0.xyz = input[0], src0.w = temp[10], src1.xyz = 
const[5], src1.w = const[0] MAD_SAT temp[35].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 12: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 13: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[21].w, src0._, src0._ 14: src0.w = temp[21] RSQ temp[22].w, |src0.w| 15: src0.xyz = temp[16], src0.w = temp[22], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[23].xyz, src0.www, srcp.xyz, src0.000 16: src0.xyz = temp[23], src1.xyz = temp[19] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[24].w, src0._, src0._ 17: src0.w = temp[24] LG2 temp[25].w, src0.w 18: src0.w = temp[25], src1.w = 128.000000 (0x70) MAD temp[26].w, src0.w, src1.w, src0.0 19: src0.w = temp[26] EX2 temp[27].w, src0.w 20: src0.w = temp[27], src1.w = 64.000000 (0x68) MAD_SAT temp[28].w, src0.w, src1.w, src0.0 21: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 22: src0.xyz = const[3], src0.w = temp[29] MAD temp[30].xyz, src0.www, src0.xyz, src0.000 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD_SAT temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = const[6], src0.w = temp[11], src1.xyz = temp[34], src2.xyz = temp[35], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17].w, input[1].xy__, 2D[0]; 4: TEX temp[18].xyz, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 6: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 7: src0.xyz = temp[18], src0.w = 2.000000 (0x40) MAD temp[19].xyz, src0.xyz, src0.www, -src0.111 8: src0.xyz = temp[19], src1.xyz = temp[13] 
DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 9: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 11: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[5], src1.w = const[0] MAD_SAT temp[35].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 12: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 13: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[21].w, src0._, src0._ 14: src0.w = temp[21] RSQ temp[22].w, |src0.w| 15: src0.xyz = temp[16], src0.w = temp[22], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[23].xyz, src0.www, srcp.xyz, src0.000 16: src0.xyz = temp[23], src1.xyz = temp[19] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[24].w, src0._, src0._ 17: src0.w = temp[24] LG2 temp[25].w, src0.w 18: src0.w = temp[25], src1.w = 128.000000 (0x70) MAD temp[26].w, src0.w, src1.w, src0.0 19: src0.w = temp[26] EX2 temp[27].w, src0.w 20: src0.w = temp[27], src1.w = 64.000000 (0x68) MAD_SAT temp[28].w, src0.w, src1.w, src0.0 21: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 22: src0.xyz = const[3], src0.w = temp[29] MAD temp[30].xyz, src0.www, src0.xyz, src0.000 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD_SAT temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = const[6], src0.w = temp[11], src1.xyz = temp[34], src2.xyz = temp[35], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xy__, 2D[1]; 2: TEX temp[1].xyz, input[1].xy__, 2D[2]; 3: TEX temp[0].w, input[0].xy__, 2D[0]; 4: TEX temp[0].xyz, input[0].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 5: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[1].w, src0._, src0._ 6: 
src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[1] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[1].w, |src1.w| 7: src0.xyz = temp[0], src0.w = 2.000000 (0x40) MAD temp[0].xyz, src0.xyz, src0.www, -src0.111 8: src0.xyz = temp[0], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 9: src0.xyz = temp[4], src0.w = temp[2] MAD temp[5].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[5], src1.xyz = const[4] MAX temp[5].xyz, src0.xyz, src1.xyz 11: src0.xyz = input[3], src0.w = temp[4], src1.xyz = const[5], src1.w = const[0] MAD_SAT temp[3].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[2].w, src0.w, src1.w, src0.0 12: src0.xyz = input[2], src0.w = temp[1] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 13: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[1].w, src0._, src0._ 14: src0.w = temp[1] RSQ temp[1].w, |src0.w| 15: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[1].xyz, src0.www, srcp.xyz, src0.000 16: src0.xyz = temp[1], src1.xyz = temp[0] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 17: src0.w = temp[1] LG2 temp[1].w, src0.w 18: src0.w = temp[1], src1.w = 128.000000 (0x70) MAD temp[1].w, src0.w, src1.w, src0.0 19: src0.w = temp[1] EX2 temp[1].w, src0.w 20: src0.w = temp[1], src1.w = 64.000000 (0x68) MAD_SAT temp[1].w, src0.w, src1.w, src0.0 21: src0.w = temp[1], src1.w = temp[0] MAD temp[0].w, src0.w, src1.w, src0.0 22: src0.xyz = const[3], src0.w = temp[0] MAD temp[0].xyz, src0.www, src0.xyz, src0.000 23: src0.xyz = temp[0], src1.xyz = temp[5] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src0.000 24: src0.xyz = const[6], src0.w = temp[2], src1.xyz = temp[0], src2.xyz = temp[3], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: 
id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe404f401: src: 1 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004003:TEX wmask: A omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080004c0:Addr0: 192t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d01b:RSQ dest:1 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 6 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8000:MAD dest:0 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 
0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001f1:DP3 dest:31 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08041005:Addr0: 5t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000055:MAX dest:5 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00184800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x08041403:Addr0: 3t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040004:Addr0: 4t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20485030:MAD dest:3 rgb_C_src:1 G/0/0 0 alp_C_src:0 0 0 11 0:CMN_INST 0x00003a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 
128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000151:DP3 dest:21 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c01b:RSQ dest:1 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000181:DP3 dest:24 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c019:LN2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 
0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803c001:Addr0: 1t, Addr1: 240t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 18 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c018:EX2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803a001:Addr0: 1t, Addr1: 232t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 20 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 21 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00083a00:ALU NOP 
wmask: RGB omask: NONE 1:RGB_ADDR 0x08001400:Addr0: 0t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40300106:Addr0: 6c, Addr1: 0t, Addr2: 3t, srcp:1 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 24 Instructions ~ 14 Vector Instructions (RGB) ~ 13 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 4 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 6 Temporary Registers ~ 4 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..7] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: END Vertex Program: before compilation # Radeon Compiler 
Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL 
temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, 
-input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV 
output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t 
swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t 
op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 18 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[4] DCL CONST[0] DCL CONST[3..7] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 1.0000, 64.0000, 32.0000} IMM[1] FLT32 { 128.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4].w, IN[1], SAMP[0], 2D 8: TEX TEMP[5].xyz, IN[1], SAMP[3], 2D 9: MAD TEMP[5].xyz, TEMP[5], IMM[0].xxxx, -IMM[0].yyyy 10: ADD TEMP[6].xyz, 
TEMP[3], TEMP[2] 11: DP3 TEMP[6].w, TEMP[6], TEMP[6] 12: RSQ TEMP[6].w, |TEMP[6].wwww| 13: MUL TEMP[6].xyz, TEMP[6].wwww, TEMP[6] 14: DP3_SAT TEMP[6].w, TEMP[6], TEMP[5] 15: POW TEMP[6].w, TEMP[6].wwww, IMM[1].xxxx 16: MUL_SAT TEMP[6].w, TEMP[6], IMM[0].zzzz 17: MUL TEMP[6].w, TEMP[6], TEMP[4] 18: MUL TEMP[4].xyz, TEMP[6].wwww, CONST[3] 19: DP3_SAT TEMP[2].w, TEMP[5], TEMP[2] 20: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 21: MAX TEMP[0].xyz, TEMP[0], CONST[4] 22: MUL TEMP[4].xyz, TEMP[4], TEMP[0] 23: TEX TEMP[0].xyz, IN[1], SAMP[4], 2D 24: MUL TEMP[0].xyz, TEMP[0], CONST[5] 25: MAX TEMP[2].x, TEMP[0].xxxx, TEMP[0].yyyy 26: MAX TEMP[2].x, TEMP[2].xxxx, TEMP[0].zzzz 27: MUL TEMP[2].x, TEMP[2].xxxx, TEMP[2].xxxx 28: MUL_SAT TEMP[2].x, TEMP[2].xxxx, IMM[0].wwww 29: MAD_SAT TEMP[1].xyz, TEMP[2].xxxx, TEMP[0], TEMP[4] 30: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[6].xxxx, CONST[6].yyyy 31: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[7] 32: MOV OUT[0].w, TEMP[1] 33: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[9].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[8].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX 
temp[0].xyz, input[1], 2D[4]; 24: MUL temp[0].xyz, temp[0], const[5]; 25: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 26: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 27: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 28: MUL_SAT temp[2].x, temp[2].xxxx, const[8].wwww; 29: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 30: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 31: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 32: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[9].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[8].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MUL temp[0].xyz, temp[0], const[5]; 25: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 26: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 27: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 28: MUL_SAT temp[2].x, temp[2].xxxx, const[8].wwww; 29: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 30: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 31: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 32: MOV output[0].w, temp[1]; Fragment Program: 
after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[9].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[8].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MUL temp[0].xyz, temp[0], const[5]; 25: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 26: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 27: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 28: MUL_SAT temp[2].x, temp[2].xxxx, const[8].wwww; 29: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 30: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 31: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 32: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 10: ADD temp[6].xyz, temp[3], 
temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[9].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[8].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MUL temp[0].xyz, temp[0], const[5]; 25: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 26: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 27: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 28: MUL_SAT temp[2].x, temp[2].xxxx, const[8].wwww; 29: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 30: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 31: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 32: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[9].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[8].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], 
temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MUL temp[0].xyz, temp[0], const[5]; 25: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 26: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 27: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 28: MUL_SAT temp[2].x, temp[2].xxxx, const[8].wwww; 29: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 30: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 31: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 32: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: POW temp[6].w, temp[6].wwww, const[9].xxxx; 16: MUL_SAT temp[6].w, temp[6], const[8].zzzz; 17: MUL temp[6].w, temp[6], temp[4]; 18: MUL temp[4].xyz, temp[6].wwww, const[3]; 19: DP3_SAT temp[2].w, temp[5], temp[2]; 20: MUL temp[0].xyz, temp[0], temp[2].wwww; 21: MAX temp[0].xyz, temp[0], const[4]; 22: MUL temp[4].xyz, temp[4], temp[0]; 23: TEX temp[0].xyz, input[1], 2D[4]; 24: MUL temp[0].xyz, temp[0], const[5]; 25: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 26: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 27: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 28: MUL_SAT temp[2].x, temp[2].xxxx, const[8].wwww; 29: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 30: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 31: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 32: MOV output[0].w, temp[1]; Fragment 
Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[0]; 8: TEX temp[5].xyz, input[1], 2D[3]; 9: MAD temp[5].xyz, temp[5], const[8].xxxx, -const[8].yyyy; 10: ADD temp[6].xyz, temp[3], temp[2]; 11: DP3 temp[6].w, temp[6], temp[6]; 12: RSQ temp[6].w, |temp[6].wwww|; 13: MUL temp[6].xyz, temp[6].wwww, temp[6]; 14: DP3_SAT temp[6].w, temp[6], temp[5]; 15: LG2 temp[8].w, temp[6].wwww; 16: MUL temp[8].w, temp[8].wwww, const[9].xxxx; 17: EX2 temp[6].w, temp[8].wwww; 18: MUL_SAT temp[6].w, temp[6], const[8].zzzz; 19: MUL temp[6].w, temp[6], temp[4]; 20: MUL temp[4].xyz, temp[6].wwww, const[3]; 21: DP3_SAT temp[2].w, temp[5], temp[2]; 22: MUL temp[0].xyz, temp[0], temp[2].wwww; 23: MAX temp[0].xyz, temp[0], const[4]; 24: MUL temp[4].xyz, temp[4], temp[0]; 25: TEX temp[0].xyz, input[1], 2D[4]; 26: MUL temp[0].xyz, temp[0], const[5]; 27: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 28: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 29: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 30: MUL_SAT temp[2].x, temp[2].xxxx, const[8].wwww; 31: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 32: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 33: ADD temp[9].xyz, temp[1], -const[7]; 34: MAD output[0].xyz, temp[7].xxxx, temp[9], const[7]; 35: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX 
temp[4].w, input[1].xy__, 2D[0]; 8: TEX temp[5].xyz, input[1].xy__, 2D[3]; 9: MAD temp[5].xyz, temp[5].xyz_, const[8].xxx_, -const[8].yyy_; 10: ADD temp[6].xyz, temp[3].xyz_, temp[2].xyz_; 11: DP3 temp[6].w, temp[6].xyz_, temp[6].xyz_; 12: RSQ temp[6].w, |temp[6].___w|; 13: MUL temp[6].xyz, temp[6].www_, temp[6].xyz_; 14: DP3_SAT temp[6].w, temp[6].xyz_, temp[5].xyz_; 15: LG2 temp[8].w, temp[6].___w; 16: MUL temp[8].w, temp[8].___w, const[9].___x; 17: EX2 temp[6].w, temp[8].___w; 18: MUL_SAT temp[6].w, temp[6].___w, const[8].___z; 19: MUL temp[6].w, temp[6].___w, temp[4].___w; 20: MUL temp[4].xyz, temp[6].www_, const[3].xyz_; 21: DP3_SAT temp[2].w, temp[5].xyz_, temp[2].xyz_; 22: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 23: MAX temp[0].xyz, temp[0].xyz_, const[4].xyz_; 24: MUL temp[4].xyz, temp[4].xyz_, temp[0].xyz_; 25: TEX temp[0].xyz, input[1].xy__, 2D[4]; 26: MUL temp[0].xyz, temp[0].xyz_, const[5].xyz_; 27: MAX temp[2].x, temp[0].x___, temp[0].y___; 28: MAX temp[2].x, temp[2].x___, temp[0].z___; 29: MUL temp[2].x, temp[2].x___, temp[2].x___; 30: MUL_SAT temp[2].x, temp[2].x___, const[8].w___; 31: MAD_SAT temp[1].xyz, temp[2].xxx_, temp[0].xyz_, temp[4].xyz_; 32: MAD_SAT temp[7].x, input[0].x___, const[6].x___, const[6].y___; 33: ADD temp[9].xyz, temp[1].xyz_, -const[7].xyz_; 34: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[7].xyz_; 35: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, const[8].xxx_, -const[8].yyy_; 10: ADD temp[20].xyz, temp[16].xyz_, 
temp[13].xyz_; 11: DP3 temp[21].w, temp[20].xyz_, temp[20].xyz_; 12: RSQ temp[22].w, |temp[21].___w|; 13: MUL temp[23].xyz, temp[22].www_, temp[20].xyz_; 14: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 15: LG2 temp[25].w, temp[24].___w; 16: MUL temp[26].w, temp[25].___w, const[9].___x; 17: EX2 temp[27].w, temp[26].___w; 18: MUL_SAT temp[28].w, temp[27].___w, const[8].___z; 19: MUL temp[29].w, temp[28].___w, temp[17].___w; 20: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 21: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 22: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 23: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 24: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 25: TEX temp[35].xyz, input[1].xy__, 2D[4]; 26: MUL temp[36].xyz, temp[35].xyz_, const[5].xyz_; 27: MAX temp[37].x, temp[36].x___, temp[36].y___; 28: MAX temp[38].x, temp[37].x___, temp[36].z___; 29: MUL temp[39].x, temp[38].x___, temp[38].x___; 30: MUL_SAT temp[40].x, temp[39].x___, const[8].w___; 31: MAD_SAT temp[41].xyz, temp[40].xxx_, temp[36].xyz_, temp[34].xyz_; 32: MAD_SAT temp[42].x, input[0].x___, const[6].x___, const[6].y___; 33: ADD temp[43].xyz, temp[41].xyz_, -const[7].xyz_; 34: MAD output[0].xyz, temp[42].xxx_, temp[43].xyz_, const[7].xyz_; 35: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, const[8].xxx_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 12: MUL temp[23].xyz, temp[22].www_, 
(temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, const[9].___x; 16: EX2 temp[27].w, temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, const[8].___z; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MUL temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MAX temp[37].x, temp[36].x___, temp[36].y___; 27: MAX temp[38].x, temp[37].x___, temp[36].z___; 28: MUL temp[39].x, temp[38].x___, temp[38].x___; 29: MUL_SAT temp[40].x, temp[39].x___, const[8].w___; 30: MAD_SAT temp[41].xyz, temp[40].xxx_, temp[36].xyz_, temp[34].xyz_; 31: MAD_SAT temp[42].x, input[0].x___, const[6].x___, const[6].y___; 32: MAD output[0].xyz, temp[42].xxx_, (temp[41] - const[7]).xyz_, const[7].xyz_; 33: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, 2.000000 (0x40).www_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 12: MUL temp[23].xyz, temp[22].www_, (temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, 
128.000000 (0x70).___w; 16: EX2 temp[27].w, temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, 64.000000 (0x68).___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MUL temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MAX temp[37].x, temp[36].x___, temp[36].y___; 27: MAX temp[38].x, temp[37].x___, temp[36].z___; 28: MUL temp[39].x, temp[38].x___, temp[38].x___; 29: MUL_SAT temp[40].x, temp[39].x___, 32.000000 (0x60).w___; 30: MAD_SAT temp[41].xyz, temp[40].xxx_, temp[36].xyz_, temp[34].xyz_; 31: MAD_SAT temp[42].x, input[0].x___, const[6].x___, const[6].y___; 32: MAD output[0].xyz, temp[42].xxx_, (temp[41] - const[7]).xyz_, const[7].xyz_; 33: MOV output[0].w, temp[11].___w; CONST[8] = { 2.0000 1.0000 64.0000 32.0000 } CONST[9] = { 128.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, 2.000000 (0x40).www_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 12: MUL temp[23].xyz, temp[22].www_, (temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, 128.000000 (0x70).___w; 16: EX2 temp[27].w, 
temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, 64.000000 (0x68).___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MUL temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MAX temp[37].x, temp[36].x___, temp[36].y___; 27: MAX temp[38].x, temp[37].x___, temp[36].z___; 28: MUL temp[39].x, temp[38].x___, temp[38].x___; 29: MUL_SAT temp[40].x, temp[39].x___, 32.000000 (0x60).w___; 30: MAD_SAT temp[41].xyz, temp[40].xxx_, temp[36].xyz_, temp[34].xyz_; 31: MAD_SAT temp[42].x, input[0].x___, const[6].x___, const[6].y___; 32: MAD output[0].xyz, temp[42].xxx_, (temp[41] - const[7]).xyz_, const[7].xyz_; 33: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: MAD temp[19].xyz, temp[18].xyz_, 2.000000 (0x40).www_, -none.111_; 10: DP3 temp[21].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 11: RSQ temp[22].w, |temp[21].___w|; 12: MUL temp[23].xyz, temp[22].www_, (temp[13] + temp[16]).xyz_; 13: DP3_SAT temp[24].w, temp[23].xyz_, temp[19].xyz_; 14: LG2 temp[25].w, temp[24].___w; 15: MUL temp[26].w, temp[25].___w, 128.000000 (0x70).___w; 16: EX2 temp[27].w, temp[26].___w; 17: MUL_SAT temp[28].w, temp[27].___w, 64.000000 (0x68).___w; 18: MUL temp[29].w, temp[28].___w, temp[17].___w; 19: MUL 
temp[30].xyz, temp[29].www_, const[3].xyz_; 20: DP3_SAT temp[31].w, temp[19].xyz_, temp[13].xyz_; 21: MUL temp[32].xyz, temp[10].xyz_, temp[31].www_; 22: MAX temp[33].xyz, temp[32].xyz_, const[4].xyz_; 23: MUL temp[34].xyz, temp[30].xyz_, temp[33].xyz_; 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: MUL temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MAX temp[37].x, temp[36].x___, temp[36].y___; 27: MAX temp[38].x, temp[37].x___, temp[36].z___; 28: MUL temp[39].x, temp[38].x___, temp[38].x___; 29: MUL_SAT temp[40].x, temp[39].x___, 32.000000 (0x60).w___; 30: MAD_SAT temp[41].xyz, temp[40].xxx_, temp[36].xyz_, temp[34].xyz_; 31: MAD_SAT temp[42].x, input[0].x___, const[6].x___, const[6].y___; 32: MAD output[0].xyz, temp[42].xxx_, (temp[41] - const[7]).xyz_, const[7].xyz_; 33: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17].w, input[1].xy__, 2D[0]; 8: TEX temp[18].xyz, input[1].xy__, 2D[3]; 9: src0.xyz = temp[18], src0.w = 2.000000 (0x40) MAD temp[19].xyz, src0.xyz, src0.www, -src0.111 10: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[21].w, src0._, src0._ 11: src0.w = temp[21] RSQ temp[22].w, |src0.w| 12: src0.xyz = temp[16], src0.w = temp[22], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[23].xyz, src0.www, srcp.xyz, src0.000 13: src0.xyz = temp[23], src1.xyz = temp[19] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[24].w, src0._, src0._ 14: src0.w = temp[24] LG2 temp[25].w, 
src0.w 15: src0.w = temp[25], src1.w = 128.000000 (0x70) MAD temp[26].w, src0.w, src1.w, src0.0 16: src0.w = temp[26] EX2 temp[27].w, src0.w 17: src0.w = temp[27], src1.w = 64.000000 (0x68) MAD_SAT temp[28].w, src0.w, src1.w, src0.0 18: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 19: src0.xyz = const[3], src0.w = temp[29] MAD temp[30].xyz, src0.www, src0.xyz, src0.000 20: src0.xyz = temp[19], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 21: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 22: src0.xyz = temp[32], src1.xyz = const[4] MAX temp[33].xyz, src0.xyz, src1.xyz 23: src0.xyz = temp[30], src1.xyz = temp[33] MAD temp[34].xyz, src0.xyz, src1.xyz, src0.000 24: TEX temp[35].xyz, input[1].xy__, 2D[4]; 25: src0.xyz = temp[35], src1.xyz = const[5] MAD temp[36].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = temp[36] MAX temp[37].x, src0.x__, src0.y__ 27: src0.xyz = temp[37], src1.xyz = temp[36] MAX temp[38].x, src0.x__, src1.z__ 28: src0.xyz = temp[38] MAD temp[39].x, src0.x__, src0.x__, src0.000 29: src0.xyz = temp[39], src0.w = 32.000000 (0x60) MAD_SAT temp[40].x, src0.x__, src0.w__, src0.000 30: src0.xyz = temp[40], src1.xyz = temp[36], src2.xyz = temp[34] MAD_SAT temp[41].xyz, src0.xxx, src1.xyz, src2.xyz 31: src0.xyz = input[0], src1.xyz = const[6] MAD_SAT temp[42].x, src0.x__, src1.x__, src1.y__ 32: src0.xyz = const[7], src1.xyz = temp[41], src2.xyz = temp[42], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 33: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17].w, input[1].xy__, 2D[0]; 4: TEX temp[18].xyz, input[1].xy__, 2D[3]; 5: TEX temp[35].xyz, input[1].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 
temp[14].w, src0._, src0._ 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[18], src0.w = 2.000000 (0x40) MAD temp[19].xyz, src0.xyz, src0.www, -src0.111 9: src0.xyz = temp[19], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 10: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[35], src1.xyz = const[5] MAD temp[36].xyz, src0.xyz, src1.xyz, src0.000 12: src0.xyz = input[3], src0.w = temp[15], src1.xyz = temp[36] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 MAX temp[37].w, src1.x, src1.y 13: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[21].w, src0._, src0._ 14: src0.xyz = temp[32], src0.w = temp[37], src1.xyz = const[4], src2.xyz = temp[36] MAX temp[33].xyz, src0.xyz, src1.xyz MAX temp[38].w, src0.w, src2.z 15: src0.xyz = temp[38], src0.w = temp[38], src1.w = temp[21] MAD temp[39].x, src0.w__, src0.w__, src0.000 RSQ temp[22].w, |src1.w| 16: src0.xyz = temp[16], src0.w = temp[22], src1.xyz = temp[13], src1.w = 32.000000 (0x60), src2.xyz = temp[39], srcp.xyz = (src1 + src0) MAD temp[23].xyz, src0.www, srcp.xyz, src0.000 MAD_SAT temp[40].w, src2.x, src1.w, src0.0 17: src0.xyz = temp[23], src1.xyz = temp[19] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[24].w, src0._, src0._ 18: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[42].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 19: src0.w = temp[24] LG2 temp[25].w, src0.w 20: src0.w = temp[25], src1.w = 128.000000 (0x70) MAD temp[26].w, src0.w, src1.w, src0.0 21: src0.w = temp[26] EX2 temp[27].w, src0.w 22: src0.w = temp[27], src1.w = 64.000000 (0x68) MAD_SAT temp[28].w, src0.w, src1.w, src0.0 23: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 24: src0.xyz = 
const[3], src0.w = temp[29] MAD temp[30].xyz, src0.www, src0.xyz, src0.000 25: src0.xyz = temp[30], src1.xyz = temp[33] MAD temp[34].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = temp[40], src0.w = temp[40], src1.xyz = temp[36], src2.xyz = temp[34] MAD_SAT temp[41].xyz, src0.www, src1.xyz, src2.xyz 27: src0.xyz = const[7], src0.w = temp[11], src1.xyz = temp[41], src2.xyz = temp[42], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[1]; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: TEX temp[17].w, input[1].xy__, 2D[0]; 4: TEX temp[18].xyz, input[1].xy__, 2D[3]; 5: TEX temp[35].xyz, input[1].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[14] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[15].w, |src1.w| 8: src0.xyz = temp[18], src0.w = 2.000000 (0x40) MAD temp[19].xyz, src0.xyz, src0.www, -src0.111 9: src0.xyz = temp[19], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[31].w, src0._, src0._ 10: src0.xyz = temp[10], src0.w = temp[31] MAD temp[32].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[35], src1.xyz = const[5] MAD temp[36].xyz, src0.xyz, src1.xyz, src0.000 12: src0.xyz = input[3], src0.w = temp[15], src1.xyz = temp[36] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 MAX temp[37].w, src1.x, src1.y 13: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[21].w, src0._, src0._ 14: src0.xyz = temp[32], src0.w = temp[37], src1.xyz = const[4], src2.xyz = temp[36] MAX temp[33].xyz, src0.xyz, src1.xyz MAX temp[38].w, src0.w, src2.z 15: src0.w = temp[38], src1.w = temp[21] MAD temp[39].x, src0.w__, src0.w__, src0.000 RSQ temp[22].w, |src1.w| 16: src0.xyz = temp[16], src0.w = temp[22], 
src1.xyz = temp[13], src1.w = 32.000000 (0x60), src2.xyz = temp[39], srcp.xyz = (src1 + src0) MAD temp[23].xyz, src0.www, srcp.xyz, src0.000 MAD_SAT temp[40].w, src2.x, src1.w, src0.0 17: src0.xyz = temp[23], src1.xyz = temp[19] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[24].w, src0._, src0._ 18: src0.xyz = input[0], src0.w = temp[10], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[42].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[11].w, src0.w, src1.w, src0.0 19: src0.w = temp[24] LG2 temp[25].w, src0.w 20: src0.w = temp[25], src1.w = 128.000000 (0x70) MAD temp[26].w, src0.w, src1.w, src0.0 21: src0.w = temp[26] EX2 temp[27].w, src0.w 22: src0.w = temp[27], src1.w = 64.000000 (0x68) MAD_SAT temp[28].w, src0.w, src1.w, src0.0 23: src0.w = temp[28], src1.w = temp[17] MAD temp[29].w, src0.w, src1.w, src0.0 24: src0.xyz = const[3], src0.w = temp[29] MAD temp[30].xyz, src0.www, src0.xyz, src0.000 25: src0.xyz = temp[30], src1.xyz = temp[33] MAD temp[34].xyz, src0.xyz, src1.xyz, src0.000 26: src0.w = temp[40], src1.xyz = temp[36], src2.xyz = temp[34] MAD_SAT temp[41].xyz, src0.www, src1.xyz, src2.xyz 27: src0.xyz = const[7], src0.w = temp[11], src1.xyz = temp[41], src2.xyz = temp[42], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xy__, 2D[1]; 2: TEX temp[1].xyz, input[1].xy__, 2D[2]; 3: TEX temp[0].w, input[0].xy__, 2D[0]; 4: TEX temp[5].xyz, input[0].xy__, 2D[3]; 5: TEX temp[0].xyz, input[0].xy__, 2D[4] SEM_WAIT SEM_ACQUIRE; 6: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[1].w, src0._, src0._ 7: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[1] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 RSQ temp[1].w, |src1.w| 8: src0.xyz = temp[5], src0.w = 2.000000 (0x40) MAD temp[5].xyz, src0.xyz, src0.www, -src0.111 9: src0.xyz = temp[5], src1.xyz = temp[1] DP3_SAT, 
src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 10: src0.xyz = temp[4], src0.w = temp[2] MAD temp[6].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[0], src1.xyz = const[5] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 12: src0.xyz = input[2], src0.w = temp[1], src1.xyz = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 MAX temp[1].w, src1.x, src1.y 13: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[2].w, src0._, src0._ 14: src0.xyz = temp[6], src0.w = temp[1], src1.xyz = const[4], src2.xyz = temp[0] MAX temp[6].xyz, src0.xyz, src1.xyz MAX temp[1].w, src0.w, src2.z 15: src0.w = temp[1], src1.w = temp[2] MAD temp[3].y, src0._w_, src0._w_, src0._0_ RSQ temp[1].w, |src1.w| 16: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = temp[1], src1.w = 32.000000 (0x60), src2.xyz = temp[3], srcp.xyz = (src1 + src0) MAD temp[1].xyz, src0.www, srcp.xyz, src0.000 MAD_SAT temp[1].w, src2.y, src1.w, src0.0 17: src0.xyz = temp[1], src1.xyz = temp[5] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 18: src0.xyz = input[3], src0.w = temp[4], src1.xyz = const[6], src1.w = const[0] MAD_SAT temp[1].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[3].w, src0.w, src1.w, src0.0 19: src0.w = temp[2] LG2 temp[2].w, src0.w 20: src0.w = temp[2], src1.w = 128.000000 (0x70) MAD temp[2].w, src0.w, src1.w, src0.0 21: src0.w = temp[2] EX2 temp[2].w, src0.w 22: src0.w = temp[2], src1.w = 64.000000 (0x68) MAD_SAT temp[2].w, src0.w, src1.w, src0.0 23: src0.w = temp[2], src1.w = temp[0] MAD temp[0].w, src0.w, src1.w, src0.0 24: src0.xyz = const[3], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 25: src0.xyz = temp[2], src1.xyz = temp[6] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 26: src0.w = temp[1], src1.xyz = temp[0], src2.xyz = temp[2] MAD_SAT temp[0].xyz, src0.www, src1.xyz, src2.xyz 27: src0.xyz = const[7], src0.w = temp[3], src1.xyz = temp[0], src2.xyz = temp[1], srcp.xyz = (src1 - src0) MAD color[0].xyz, 
src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe404f401: src: 1 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004003:TEX wmask: A omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02440000: id: 4 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080004c0:Addr0: 192t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0004d01b:RSQ dest:1 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8050:MAD dest:5 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 8 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000405:Addr0: 5t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001f1:DP3 dest:31 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08041400:Addr0: 0t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00007a00:ALU NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00281013:MAX dest:1 alp_A_src:1 R 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, 
Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000151:DP3 dest:21 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00041006:Addr0: 6t, Addr1: 4c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0050c013:MAX dest:1 alp_A_src:0 A 0 alp_B_src:2 B 0 targ 0 w:0 5 RGBA_INST: 0x00000065:MAX dest:6 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00005000:ALU wmask: AG omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x008e0470:rgb_A_src:0 0/A/0 0 rgb_B_src:0 0/A/0 0 targ: 0 4 ALPHA_INST:0x0004d01b:RSQ dest:1 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00107800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x80300402:Addr0: 2t, Addr1: 1t, Addr2: 3t, srcp:2 2:ALPHA_ADDR 0x08038001:Addr0: 1t, Addr1: 224t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00686010:MAD dest:1 alp_A_src:2 G 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 16 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08001401:Addr0: 1t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000181:DP3 dest:24 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00184800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x08041803:Addr0: 3t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 
0x08040004:Addr0: 4t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x0068c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20485010:MAD dest:1 rgb_C_src:1 G/0/0 0 alp_C_src:0 0 0 18 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c029:LN2 dest:2 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803c002:Addr0: 2t, Addr1: 240t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 20 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c028:EX2 dest:2 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 21 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803a002:Addr0: 2t, Addr1: 232t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 22 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, 
Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 23 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020103:Addr0: 3c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08001802:Addr0: 2t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 25 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00200080:Addr0: 128t, Addr1: 0t, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044236c:rgb_A_src:0 A/A/A 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 26 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40100107:Addr0: 7c, Addr1: 0t, Addr2: 1t, srcp:1 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 27 Instructions ~ 17 
Vector Instructions (RGB)
~ 16 Scalar Instructions (Alpha)
~ 0 Flow Control Instructions
~ 5 Texture Instructions
~ 3 Presub Operations
~ 0 OMOD Operations
~ 7 Temporary Registers
~ 5 Inline Literals
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial vertex program
VERT
DCL IN[0]
DCL IN[1]
DCL IN[2]
DCL IN[3]
DCL IN[4]
DCL OUT[0], POSITION
DCL OUT[1], FOG
DCL OUT[2], GENERIC[0]
DCL OUT[3], GENERIC[1]
DCL OUT[4], GENERIC[2]
DCL CONST[0]
DCL CONST[2..7]
DCL TEMP[0..3]
IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000}
0: MOV OUT[1].yzw, IMM[0].xxxy
1: MUL TEMP[0], IN[0].xxxx, CONST[4]
2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0]
3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0]
4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0]
5: ADD OUT[2].xy, IN[3], CONST[0]
6: MUL OUT[3].xy, IN[4], IMM[0].zzzz
7: DP4 OUT[1].x, -IN[0], CONST[2]
8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy
9: XPD TEMP[2].xyz, IN[1], TEMP[1]
10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww
11: SUB TEMP[3].xyz, CONST[3], IN[0]
12: DP3 OUT[4].x, TEMP[3], TEMP[1]
13: DP3 OUT[4].y, TEMP[3], TEMP[2]
14: DP3 OUT[4].z, TEMP[3], IN[1]
15: END
Vertex Program: before compilation
# Radeon Compiler Program
0: MOV output[1].yzw, const[8].xxxy;
1: MUL temp[0], input[0].xxxx, const[4];
2: MAD temp[0], input[0].yyyy, const[5], temp[0];
3: MAD temp[0], input[0].zzzz, const[6], temp[0];
4: MAD temp[4], input[0].wwww, const[7], temp[0];
5: ADD output[2].xy, input[3], const[0];
6: MUL output[3].xy, input[4], const[8].zzzz;
7: DP4 output[1].x, -input[0], const[2];
8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy;
9: XPD temp[2].xyz, input[1], temp[1];
10: MUL temp[2].xyz, temp[2], temp[1].wwww;
11: SUB temp[3].xyz, const[3], input[0];
12: DP3 output[4].x, temp[3], temp[1];
13: DP3 output[4].y, temp[3], temp[2];
14: DP3 output[4].z, temp[3], input[1];
15: MOV output[0], temp[4];
16: MOV output[5], temp[4];
Vertex Program: after 'emulate negative addressing'
# Radeon Compiler Program
0: MOV output[1].yzw, const[8].xxxy;
1: MUL
temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], 
const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, 
temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, 
temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ 
U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD 
src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W
src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~
~ 18 Instructions
~ 0 Flow Control Instructions
~ 4 Temporary Registers
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], FOG, PERSPECTIVE
DCL IN[1], GENERIC[0], PERSPECTIVE
DCL IN[2], GENERIC[1], PERSPECTIVE
DCL IN[3], GENERIC[2], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL SAMP[1]
DCL SAMP[2]
DCL SAMP[3]
DCL CONST[0]
DCL CONST[2]
DCL CONST[4..7]
DCL TEMP[0..7]
IMM[0] FLT32 { 2.0000, 1.0000, 64.0000, 0.0000}
IMM[1] FLT32 { 128.0000, 0.0000, 0.0000, 0.0000}
0: TEX TEMP[0], IN[2], SAMP[1], 2D
1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0]
2: TEX TEMP[2], IN[2], SAMP[2], 2D
3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy
4: DP3 TEMP[3].w, IN[3], IN[3]
5: RSQ TEMP[3].w, |TEMP[3].wwww|
6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3]
7: TEX TEMP[4].w, IN[1], SAMP[3], 2D
8: MAD TEMP[4].w, TEMP[4].wwww, CONST[2].xxxx, CONST[2].yyyy
9: MAD TEMP[5].xy, TEMP[4].wwww, TEMP[3], IN[1]
10: TEX TEMP[4].w, TEMP[5], SAMP[0], 2D
11: TEX TEMP[6].xyz, TEMP[5], SAMP[3], 2D
12: MAD TEMP[6].xyz, TEMP[6], IMM[0].xxxx, -IMM[0].yyyy
13: ADD TEMP[5].xyz, TEMP[3], TEMP[2]
14: DP3 TEMP[5].w, TEMP[5], TEMP[5]
15: RSQ TEMP[5].w, |TEMP[5].wwww|
16: MUL TEMP[5].xyz, TEMP[5].wwww, TEMP[5]
17: DP3_SAT TEMP[5].w, TEMP[5], TEMP[6]
18: POW TEMP[5].w, TEMP[5].wwww, IMM[1].xxxx
19: MUL_SAT TEMP[5].w, TEMP[5], IMM[0].zzzz
20: MUL TEMP[5].w, TEMP[5], TEMP[4]
21: MUL TEMP[4].xyz, TEMP[5].wwww, CONST[4]
22: DP3_SAT TEMP[2].w, TEMP[6], TEMP[2]
23: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww
24: MAX TEMP[0].xyz, TEMP[0], CONST[5]
25: MUL_SAT TEMP[1].xyz, TEMP[4], TEMP[0]
26: MAD_SAT TEMP[7].x, IN[0].xxxx, CONST[6].xxxx, CONST[6].yyyy
27: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[1], CONST[7]
28: MOV OUT[0].w, TEMP[1]
29: END
Fragment Program: before compilation
# Radeon
Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 13: ADD temp[5].xyz, temp[3], temp[2]; 14: DP3 temp[5].w, temp[5], temp[5]; 15: RSQ temp[5].w, |temp[5].wwww|; 16: MUL temp[5].xyz, temp[5].wwww, temp[5]; 17: DP3_SAT temp[5].w, temp[5], temp[6]; 18: POW temp[5].w, temp[5].wwww, const[9].xxxx; 19: MUL_SAT temp[5].w, temp[5], const[8].zzzz; 20: MUL temp[5].w, temp[5], temp[4]; 21: MUL temp[4].xyz, temp[5].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 13: ADD temp[5].xyz, temp[3], temp[2]; 14: DP3 temp[5].w, temp[5], temp[5]; 15: RSQ 
temp[5].w, |temp[5].wwww|; 16: MUL temp[5].xyz, temp[5].wwww, temp[5]; 17: DP3_SAT temp[5].w, temp[5], temp[6]; 18: POW temp[5].w, temp[5].wwww, const[9].xxxx; 19: MUL_SAT temp[5].w, temp[5], const[8].zzzz; 20: MUL temp[5].w, temp[5], temp[4]; 21: MUL temp[4].xyz, temp[5].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 13: ADD temp[5].xyz, temp[3], temp[2]; 14: DP3 temp[5].w, temp[5], temp[5]; 15: RSQ temp[5].w, |temp[5].wwww|; 16: MUL temp[5].xyz, temp[5].wwww, temp[5]; 17: DP3_SAT temp[5].w, temp[5], temp[6]; 18: POW temp[5].w, temp[5].wwww, const[9].xxxx; 19: MUL_SAT temp[5].w, temp[5], const[8].zzzz; 20: MUL temp[5].w, temp[5], temp[4]; 21: MUL temp[4].xyz, temp[5].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler 
Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 13: ADD temp[5].xyz, temp[3], temp[2]; 14: DP3 temp[5].w, temp[5], temp[5]; 15: RSQ temp[5].w, |temp[5].wwww|; 16: MUL temp[5].xyz, temp[5].wwww, temp[5]; 17: DP3_SAT temp[5].w, temp[5], temp[6]; 18: POW temp[5].w, temp[5].wwww, const[9].xxxx; 19: MUL_SAT temp[5].w, temp[5], const[8].zzzz; 20: MUL temp[5].w, temp[5], temp[4]; 21: MUL temp[4].xyz, temp[5].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 13: ADD temp[5].xyz, temp[3], temp[2]; 14: DP3 temp[5].w, temp[5], temp[5]; 15: RSQ temp[5].w, 
|temp[5].wwww|; 16: MUL temp[5].xyz, temp[5].wwww, temp[5]; 17: DP3_SAT temp[5].w, temp[5], temp[6]; 18: POW temp[5].w, temp[5].wwww, const[9].xxxx; 19: MUL_SAT temp[5].w, temp[5], const[8].zzzz; 20: MUL temp[5].w, temp[5], temp[4]; 21: MUL temp[4].xyz, temp[5].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 13: ADD temp[5].xyz, temp[3], temp[2]; 14: DP3 temp[5].w, temp[5], temp[5]; 15: RSQ temp[5].w, |temp[5].wwww|; 16: MUL temp[5].xyz, temp[5].wwww, temp[5]; 17: DP3_SAT temp[5].w, temp[5], temp[6]; 18: POW temp[5].w, temp[5].wwww, const[9].xxxx; 19: MUL_SAT temp[5].w, temp[5], const[8].zzzz; 20: MUL temp[5].w, temp[5], temp[4]; 21: MUL temp[4].xyz, temp[5].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL_SAT temp[1].xyz, temp[4], temp[0]; 26: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 27: LRP output[0].xyz, temp[7].xxxx, temp[1], const[7]; 28: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX 
temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[8].xxxx, -const[8].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[8].xxxx, -const[8].yyyy; 13: ADD temp[5].xyz, temp[3], temp[2]; 14: DP3 temp[5].w, temp[5], temp[5]; 15: RSQ temp[5].w, |temp[5].wwww|; 16: MUL temp[5].xyz, temp[5].wwww, temp[5]; 17: DP3_SAT temp[5].w, temp[5], temp[6]; 18: LG2 temp[8].w, temp[5].wwww; 19: MUL temp[8].w, temp[8].wwww, const[9].xxxx; 20: EX2 temp[5].w, temp[8].wwww; 21: MUL_SAT temp[5].w, temp[5], const[8].zzzz; 22: MUL temp[5].w, temp[5], temp[4]; 23: MUL temp[4].xyz, temp[5].wwww, const[4]; 24: DP3_SAT temp[2].w, temp[6], temp[2]; 25: MUL temp[0].xyz, temp[0], temp[2].wwww; 26: MAX temp[0].xyz, temp[0], const[5]; 27: MUL_SAT temp[1].xyz, temp[4], temp[0]; 28: MAD_SAT temp[7].x, input[0].xxxx, const[6].xxxx, const[6].yyyy; 29: ADD temp[9].xyz, temp[1], -const[7]; 30: MAD output[0].xyz, temp[7].xxxx, temp[9], const[7]; 31: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4].w, input[1].xy__, 2D[3]; 8: MAD temp[4].w, temp[4].___w, const[2].___x, const[2].___y; 9: MAD temp[5].xy, temp[4].ww__, temp[3].xy__, input[1].xy__; 10: TEX temp[4].w, temp[5].xy__, 2D[0]; 11: TEX temp[6].xyz, temp[5].xy__, 2D[3]; 12: 
MAD temp[6].xyz, temp[6].xyz_, const[8].xxx_, -const[8].yyy_; 13: ADD temp[5].xyz, temp[3].xyz_, temp[2].xyz_; 14: DP3 temp[5].w, temp[5].xyz_, temp[5].xyz_; 15: RSQ temp[5].w, |temp[5].___w|; 16: MUL temp[5].xyz, temp[5].www_, temp[5].xyz_; 17: DP3_SAT temp[5].w, temp[5].xyz_, temp[6].xyz_; 18: LG2 temp[8].w, temp[5].___w; 19: MUL temp[8].w, temp[8].___w, const[9].___x; 20: EX2 temp[5].w, temp[8].___w; 21: MUL_SAT temp[5].w, temp[5].___w, const[8].___z; 22: MUL temp[5].w, temp[5].___w, temp[4].___w; 23: MUL temp[4].xyz, temp[5].www_, const[4].xyz_; 24: DP3_SAT temp[2].w, temp[6].xyz_, temp[2].xyz_; 25: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 26: MAX temp[0].xyz, temp[0].xyz_, const[5].xyz_; 27: MUL_SAT temp[1].xyz, temp[4].xyz_, temp[0].xyz_; 28: MAD_SAT temp[7].x, input[0].x___, const[6].x___, const[6].y___; 29: ADD temp[9].xyz, temp[1].xyz_, -const[7].xyz_; 30: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[7].xyz_; 31: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -const[8].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20].w, temp[19].xy__, 2D[0]; 11: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 12: MAD temp[22].xyz, temp[21].xyz_, const[8].xxx_, -const[8].yyy_; 13: ADD temp[23].xyz, temp[16].xyz_, temp[13].xyz_; 14: DP3 temp[24].w, temp[23].xyz_, temp[23].xyz_; 15: RSQ temp[25].w, |temp[24].___w|; 16: MUL temp[26].xyz, temp[25].www_, temp[23].xyz_; 17: DP3_SAT temp[27].w, temp[26].xyz_, temp[22].xyz_; 18: LG2 temp[28].w, temp[27].___w; 19: MUL 
temp[29].w, temp[28].___w, const[9].___x; 20: EX2 temp[30].w, temp[29].___w; 21: MUL_SAT temp[31].w, temp[30].___w, const[8].___z; 22: MUL temp[32].w, temp[31].___w, temp[20].___w; 23: MUL temp[33].xyz, temp[32].www_, const[4].xyz_; 24: DP3_SAT temp[34].w, temp[22].xyz_, temp[13].xyz_; 25: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 26: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 27: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 28: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 29: ADD temp[39].xyz, temp[37].xyz_, -const[7].xyz_; 30: MAD output[0].xyz, temp[38].xxx_, temp[39].xyz_, const[7].xyz_; 31: MOV output[0].w, temp[11].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, const[8].xxx_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20].w, temp[19].xy__, 2D[0]; 11: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 12: MAD temp[22].xyz, temp[21].xyz_, const[8].xxx_, -none.111_; 13: DP3 temp[24].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 14: RSQ temp[25].w, |temp[24].___w|; 15: MUL temp[26].xyz, temp[25].www_, (temp[13] + temp[16]).xyz_; 16: DP3_SAT temp[27].w, temp[26].xyz_, temp[22].xyz_; 17: LG2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, const[9].___x; 19: EX2 temp[30].w, temp[29].___w; 20: MUL_SAT temp[31].w, temp[30].___w, const[8].___z; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MUL temp[33].xyz, temp[32].www_, const[4].xyz_; 23: DP3_SAT temp[34].w, temp[22].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, 
temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20].w, temp[19].xy__, 2D[0]; 11: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 12: MAD temp[22].xyz, temp[21].xyz_, 2.000000 (0x40).www_, -none.111_; 13: DP3 temp[24].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 14: RSQ temp[25].w, |temp[24].___w|; 15: MUL temp[26].xyz, temp[25].www_, (temp[13] + temp[16]).xyz_; 16: DP3_SAT temp[27].w, temp[26].xyz_, temp[22].xyz_; 17: LG2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, 128.000000 (0x70).___w; 19: EX2 temp[30].w, temp[29].___w; 20: MUL_SAT temp[31].w, temp[30].___w, 64.000000 (0x68).___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MUL temp[33].xyz, temp[32].www_, const[4].xyz_; 23: DP3_SAT temp[34].w, temp[22].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; CONST[8] = { 2.0000 1.0000 
64.0000 0.0000 } CONST[9] = { 128.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20].w, temp[19].xy__, 2D[0]; 11: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 12: MAD temp[22].xyz, temp[21].xyz_, 2.000000 (0x40).www_, -none.111_; 13: DP3 temp[24].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 14: RSQ temp[25].w, |temp[24].___w|; 15: MUL temp[26].xyz, temp[25].www_, (temp[13] + temp[16]).xyz_; 16: DP3_SAT temp[27].w, temp[26].xyz_, temp[22].xyz_; 17: LG2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, 128.000000 (0x70).___w; 19: EX2 temp[30].w, temp[29].___w; 20: MUL_SAT temp[31].w, temp[30].___w, 64.000000 (0x68).___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MUL temp[33].xyz, temp[32].www_, const[4].xyz_; 23: DP3_SAT temp[34].w, temp[22].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: MUL_SAT temp[11].w, temp[10].___w, const[0].___w; 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: MAD temp[13].xyz, temp[12].xyz_, 2.000000 (0x40).www_, -none.111_; 4: 
DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: MAD temp[18].w, temp[17].___w, const[2].___x, const[2].___y; 9: MAD temp[19].xy, temp[18].ww__, temp[16].xy__, input[1].xy__; 10: TEX temp[20].w, temp[19].xy__, 2D[0]; 11: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 12: MAD temp[22].xyz, temp[21].xyz_, 2.000000 (0x40).www_, -none.111_; 13: DP3 temp[24].w, (temp[13] + temp[16]).xyz_, (temp[13] + temp[16]).xyz_; 14: RSQ temp[25].w, |temp[24].___w|; 15: MUL temp[26].xyz, temp[25].www_, (temp[13] + temp[16]).xyz_; 16: DP3_SAT temp[27].w, temp[26].xyz_, temp[22].xyz_; 17: LG2 temp[28].w, temp[27].___w; 18: MUL temp[29].w, temp[28].___w, 128.000000 (0x70).___w; 19: EX2 temp[30].w, temp[29].___w; 20: MUL_SAT temp[31].w, temp[30].___w, 64.000000 (0x68).___w; 21: MUL temp[32].w, temp[31].___w, temp[20].___w; 22: MUL temp[33].xyz, temp[32].www_, const[4].xyz_; 23: DP3_SAT temp[34].w, temp[22].xyz_, temp[13].xyz_; 24: MUL temp[35].xyz, temp[10].xyz_, temp[34].www_; 25: MAX temp[36].xyz, temp[35].xyz_, const[5].xyz_; 26: MUL_SAT temp[37].xyz, temp[33].xyz_, temp[36].xyz_; 27: MAD_SAT temp[38].x, input[0].x___, const[6].x___, const[6].y___; 28: MAD output[0].xyz, temp[38].xxx_, (temp[37] - const[7]).xyz_, const[7].xyz_; 29: MOV output[0].w, temp[11].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[1]; 1: src0.w = temp[10], src1.w = const[0] MAD_SAT temp[11].w, src0.w, src1.w, src0.0 2: TEX temp[12].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[12], src0.w = 2.000000 (0x40) MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[17].w, input[1].xy__, 2D[3]; 8: src0.xyz = const[2], src0.w 
= temp[17] MAD temp[18].w, src0.w, src0.x, src0.y 9: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ 10: TEX temp[20].w, temp[19].xy__, 2D[0]; 11: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 12: src0.xyz = temp[21], src0.w = 2.000000 (0x40) MAD temp[22].xyz, src0.xyz, src0.www, -src0.111 13: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[24].w, src0._, src0._ 14: src0.w = temp[24] RSQ temp[25].w, |src0.w| 15: src0.xyz = temp[16], src0.w = temp[25], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[26].xyz, src0.www, srcp.xyz, src0.000 16: src0.xyz = temp[26], src1.xyz = temp[22] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[27].w, src0._, src0._ 17: src0.w = temp[27] LG2 temp[28].w, src0.w 18: src0.w = temp[28], src1.w = 128.000000 (0x70) MAD temp[29].w, src0.w, src1.w, src0.0 19: src0.w = temp[29] EX2 temp[30].w, src0.w 20: src0.w = temp[30], src1.w = 64.000000 (0x68) MAD_SAT temp[31].w, src0.w, src1.w, src0.0 21: src0.w = temp[31], src1.w = temp[20] MAD temp[32].w, src0.w, src1.w, src0.0 22: src0.xyz = const[4], src0.w = temp[32] MAD temp[33].xyz, src0.www, src0.xyz, src0.000 23: src0.xyz = temp[22], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 24: src0.xyz = temp[10], src0.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.www, src0.000 25: src0.xyz = temp[35], src1.xyz = const[5] MAX temp[36].xyz, src0.xyz, src1.xyz 26: src0.xyz = temp[33], src1.xyz = temp[36] MAD_SAT temp[37].xyz, src0.xyz, src1.xyz, src0.000 27: src0.xyz = input[0], src1.xyz = const[6] MAD_SAT temp[38].x, src0.x__, src1.x__, src1.y__ 28: src0.xyz = const[7], src1.xyz = temp[37], src2.xyz = temp[38], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 29: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 
temp[14].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[14], src1.xyz = const[6] MAD_SAT temp[38].x, src0.x__, src1.x__, src1.y__ RSQ temp[15].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[10], input[2].xy__, 2D[1]; 5: TEX temp[12].xyz, input[2].xy__, 2D[2]; 6: TEX temp[17].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[10], src2.w = const[0] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[11].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[24].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[17] MAD temp[18].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1], src1.w = temp[24] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[25].w, |src1.w| 11: src0.xyz = temp[16], src0.w = temp[25], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[26].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 15: TEX temp[20].w, temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 16: src0.xyz = temp[21], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[22].xyz, src0.xyz, src0.www, -src0.111 17: src0.xyz = temp[22], src1.xyz = temp[13] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 18: src0.xyz = temp[26], src1.xyz = temp[22] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[27].w, src0._, src0._ 19: src0.xyz = temp[10], src0.w = temp[34], src1.w = temp[27] MAD temp[35].xyz, src0.xyz, src0.www, src0.000 LG2 temp[28].w, src1.w 20: src0.xyz = temp[35], src0.w = temp[28], src1.xyz = const[5], src1.w = 128.000000 (0x70) MAX temp[36].xyz, src0.xyz, src1.xyz MAD temp[29].w, src0.w, src1.w, src0.0 21: src0.w = temp[29] EX2 temp[30].w, src0.w 22: src0.w = temp[30], src1.w = 64.000000 (0x68) 
MAD_SAT temp[31].w, src0.w, src1.w, src0.0 23: src0.w = temp[31], src1.w = temp[20] MAD temp[32].w, src0.w, src1.w, src0.0 24: src0.xyz = const[4], src0.w = temp[32] MAD temp[33].xyz, src0.www, src0.xyz, src0.000 25: src0.xyz = temp[33], src1.xyz = temp[36] MAD_SAT temp[37].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = const[7], src1.xyz = temp[37], src2.xyz = temp[38], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[14], src1.xyz = const[6] MAD_SAT temp[38].x, src0.x__, src1.x__, src1.y__ RSQ temp[15].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[10], input[2].xy__, 2D[1]; 5: TEX temp[12].xyz, input[2].xy__, 2D[2]; 6: TEX temp[17].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[12], src0.w = 2.000000 (0x40), src1.w = temp[10], src2.w = const[0] SEM_WAIT MAD temp[13].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[11].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[16], src1.xyz = temp[13], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[24].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[17] MAD temp[18].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[16], src0.w = temp[18], src1.xyz = input[1], src1.w = temp[24] MAD temp[19].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[25].w, |src1.w| 11: src0.xyz = temp[16], src0.w = temp[25], src1.xyz = temp[13], srcp.xyz = (src1 + src0) MAD temp[26].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[11] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[21].xyz, temp[19].xy__, 2D[3]; 15: TEX temp[20].w, temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 16: src0.xyz = temp[21], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[22].xyz, src0.xyz, src0.www, -src0.111 17: src0.xyz = temp[22], src1.xyz = temp[13] 
DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[34].w, src0._, src0._ 18: src0.xyz = temp[26], src1.xyz = temp[22] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[27].w, src0._, src0._ 19: src0.xyz = temp[10], src0.w = temp[34], src1.w = temp[27] MAD temp[35].xyz, src0.xyz, src0.www, src0.000 LG2 temp[28].w, src1.w 20: src0.xyz = temp[35], src0.w = temp[28], src1.xyz = const[5], src1.w = 128.000000 (0x70) MAX temp[36].xyz, src0.xyz, src1.xyz MAD temp[29].w, src0.w, src1.w, src0.0 21: src0.w = temp[29] EX2 temp[30].w, src0.w 22: src0.w = temp[30], src1.w = 64.000000 (0x68) MAD_SAT temp[31].w, src0.w, src1.w, src0.0 23: src0.w = temp[31], src1.w = temp[20] MAD temp[32].w, src0.w, src1.w, src0.0 24: src0.xyz = const[4], src0.w = temp[32] MAD temp[33].xyz, src0.www, src0.xyz, src0.000 25: src0.xyz = temp[33], src1.xyz = temp[36] MAD_SAT temp[37].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = const[7], src1.xyz = temp[37], src2.xyz = temp[38], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 1: src0.xyz = input[3], src0.w = temp[0], src1.xyz = const[6] MAD_SAT temp[0].z, src0.__x, src1.__x, src1.__y RSQ temp[0].w, |src0.w| 2: src0.xyz = input[2], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[3], input[1].xy__, 2D[1]; 5: TEX temp[1].xyz, input[1].xy__, 2D[2]; 6: TEX temp[0].w, input[0].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[3], src2.w = const[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[1].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[2].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[0] MAD temp[0].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = input[0], src1.w = 
temp[2] MAD temp[0].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[0].w, |src1.w| 11: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[2].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[1] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[4].xyz, temp[0].xy__, 2D[3]; 15: TEX temp[0].w, temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 16: src0.xyz = temp[4], src0.w = 2.000000 (0x40) SEM_WAIT MAD temp[4].xyz, src0.xyz, src0.www, -src0.111 17: src0.xyz = temp[4], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[1].w, src0._, src0._ 18: src0.xyz = temp[2], src1.xyz = temp[4] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 19: src0.xyz = temp[3], src0.w = temp[1], src1.w = temp[2] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 LG2 temp[1].w, src1.w 20: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = const[5], src1.w = 128.000000 (0x70) MAX temp[1].xyz, src0.xyz, src1.xyz MAD temp[1].w, src0.w, src1.w, src0.0 21: src0.w = temp[1] EX2 temp[1].w, src0.w 22: src0.w = temp[1], src1.w = 64.000000 (0x68) MAD_SAT temp[1].w, src0.w, src1.w, src0.0 23: src0.w = temp[1], src1.w = temp[0] MAD temp[0].w, src0.w, src1.w, src0.0 24: src0.xyz = const[4], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 25: src0.xyz = temp[2], src1.xyz = temp[1] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = const[7], src1.xyz = temp[1], src2.xyz = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.zzz, srcp.xyz, src0.xyz R500 Fragment Program: -------- 0 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00086000:ALU wmask: AB 
omask: NONE 1:RGB_ADDR 0x08041803:Addr0: 3t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122090:rgb_A_src:0 0/0/R 0 rgb_B_src:1 0/0/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00191000:MAD dest:0 rgb_C_src:1 0/0/G 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe403f401: src: 1 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00107a04:ALU TEX_WAIT NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x10000cc0:Addr0: 192t, Addr1: 3t, Addr2: 0c, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 0 0 7 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP 
dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000181:DP3 dest:24 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020102:Addr0: 2c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x08000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 G 0 9 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000800:Addr0: 0t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084046c:rgb_A_src:0 A/A/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00421000:MAD dest:0 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00040001:OUT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 12 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe404f400: src: 0 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 13 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02400000: id: 0 
op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 14 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00ed8040:MAD dest:4 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 15 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000404:Addr0: 4t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000221:DP3 dest:34 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08001002:Addr0: 2t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001b1:DP3 dest:27 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0000d019:LN2 dest:1 alp_A_src:1 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08041401:Addr0: 1t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803c001:Addr0: 1t, Addr1: 240t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 
ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000015:MAX dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 19 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c018:EX2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803a001:Addr0: 1t, Addr1: 232t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 21 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 22 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020104:Addr0: 4c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x08000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 
0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00038005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x40000507:Addr0: 7c, Addr1: 1t, Addr2: 0t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 25 Instructions ~ 15 Vector Instructions (RGB) ~ 14 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 5 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 5 Temporary Registers ~ 4 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0] DCL CONST[2..7] DCL TEMP[0..3] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 2.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: ADD OUT[2].xy, IN[3], CONST[0] 6: MUL OUT[3].xy, IN[4], IMM[0].zzzz 7: DP4 OUT[1].x, -IN[0], CONST[2] 8: MAD TEMP[1], IN[2], IMM[0].wwww, -IMM[0].yyyy 9: XPD TEMP[2].xyz, IN[1], TEMP[1] 10: MUL TEMP[2].xyz, TEMP[2], TEMP[1].wwww 11: SUB TEMP[3].xyz, CONST[3], IN[0] 12: DP3 OUT[4].x, TEMP[3], TEMP[1] 13: DP3 OUT[4].y, TEMP[3], TEMP[2] 14: DP3 OUT[4].z, TEMP[3], IN[1] 15: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD 
temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: XPD temp[2].xyz, input[1], temp[1]; 10: MUL temp[2].xyz, temp[2], temp[1].wwww; 11: SUB temp[3].xyz, const[3], input[0]; 12: DP3 output[4].x, temp[3], temp[1]; 13: DP3 output[4].y, temp[3], temp[2]; 14: DP3 output[4].z, temp[3], input[1]; 15: MOV output[0], temp[4]; 16: MOV output[5], temp[4]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, const[8].xxxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3], const[0]; 6: MUL output[3].xy, input[4], const[8].zzzz; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxyw, temp[1].yzxw; 10: MAD temp[2].xyz, input[1].yzxw, temp[1].zxyw, -temp[2]; 11: MUL temp[2].xyz, temp[2], temp[1].wwww; 12: ADD temp[3].xyz, const[3], -input[0]; 13: DP4 
output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, const[8]._xxy; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -const[8].yyyy; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], 
temp[4]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[4], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[4]; 17: MOV output[5], temp[4]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; CONST[8] = { 0.0000 1.0000 0.0000 2.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], 
input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: ADD output[2].xy, input[3].xy__, const[0].xy__; 6: MUL output[3].xy, input[4].xy__, const[8].zz__; 7: DP4 output[1].x, -input[0], const[2]; 8: MAD temp[1], input[2], const[8].wwww, -none.1111; 9: MUL temp[2].xyz, input[1].zxy_, temp[1].yzx_; 10: MAD temp[2].xyz, input[1].yzx_, temp[1].zxy_, -temp[2].xyz_; 11: MUL temp[2].xyz, temp[2].xyz_, temp[1].www_; 12: ADD temp[3].xyz, const[3].xyz_, -input[0].xyz_; 13: DP4 output[4].x, temp[3].xyz0, temp[1].xyz0; 14: DP4 output[4].y, temp[3].xyz0, temp[2].xyz0; 15: DP4 output[4].z, temp[3].xyz0, input[1].xyz0; 16: MOV output[0], temp[0]; 17: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00e08203 dst: 4o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 
reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00302203 dst: 1o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01f90002 reg: 0c swiz: X/ Y/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00304202 dst: 2o op: VE_MULTIPLY src0: 0x01f90081 reg: 4i swiz: X/ Y/ U/ U src1: 0x01fa4102 reg: 8c swiz: Z/ Z/ U/ U src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 7: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x00db6102 reg: 8c swiz: W/ W/ W/ W src2: 0x1f6da040 reg: 2t swiz: -1/-1/-1/-1 9: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 11: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src1: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 12: op: 0x00706003 dst: 3t op: VE_ADD src0: 0x01d10062 reg: 3c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 
reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 14: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 15: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 16: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 18 Instructions ~ 0 Flow Control Instructions ~ 4 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL SAMP[4] DCL CONST[0] DCL CONST[2] DCL CONST[4..8] DCL TEMP[0..8] IMM[0] FLT32 { 2.0000, 1.0000, 64.0000, 32.0000} IMM[1] FLT32 { 128.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[1], 2D 1: MUL_SAT TEMP[1].w, TEMP[0], CONST[0] 2: TEX TEMP[2], IN[2], SAMP[2], 2D 3: MAD TEMP[2].xyz, TEMP[2], IMM[0].xxxx, -IMM[0].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: TEX TEMP[4].w, IN[1], SAMP[3], 2D 8: MAD TEMP[4].w, TEMP[4].wwww, CONST[2].xxxx, CONST[2].yyyy 9: MAD TEMP[5].xy, TEMP[4].wwww, TEMP[3], IN[1] 10: TEX TEMP[4].w, TEMP[5], SAMP[0], 2D 11: TEX TEMP[6].xyz, TEMP[5], SAMP[3], 2D 12: MAD TEMP[6].xyz, TEMP[6], IMM[0].xxxx, -IMM[0].yyyy 13: ADD TEMP[7].xyz, TEMP[3], TEMP[2] 14: DP3 TEMP[7].w, 
TEMP[7], TEMP[7] 15: RSQ TEMP[7].w, |TEMP[7].wwww| 16: MUL TEMP[7].xyz, TEMP[7].wwww, TEMP[7] 17: DP3_SAT TEMP[7].w, TEMP[7], TEMP[6] 18: POW TEMP[7].w, TEMP[7].wwww, IMM[1].xxxx 19: MUL_SAT TEMP[7].w, TEMP[7], IMM[0].zzzz 20: MUL TEMP[7].w, TEMP[7], TEMP[4] 21: MUL TEMP[4].xyz, TEMP[7].wwww, CONST[4] 22: DP3_SAT TEMP[2].w, TEMP[6], TEMP[2] 23: MUL TEMP[0].xyz, TEMP[0], TEMP[2].wwww 24: MAX TEMP[0].xyz, TEMP[0], CONST[5] 25: MUL TEMP[4].xyz, TEMP[4], TEMP[0] 26: TEX TEMP[0].xyz, TEMP[5], SAMP[4], 2D 27: MUL TEMP[0].xyz, TEMP[0], CONST[6] 28: MAX TEMP[2].x, TEMP[0].xxxx, TEMP[0].yyyy 29: MAX TEMP[2].x, TEMP[2].xxxx, TEMP[0].zzzz 30: MUL TEMP[2].x, TEMP[2].xxxx, TEMP[2].xxxx 31: MUL_SAT TEMP[2].x, TEMP[2].xxxx, IMM[0].wwww 32: MAD_SAT TEMP[1].xyz, TEMP[2].xxxx, TEMP[0], TEMP[4] 33: MAD_SAT TEMP[8].x, IN[0].xxxx, CONST[7].xxxx, CONST[7].yyyy 34: LRP OUT[0].xyz, TEMP[8].xxxx, TEMP[1], CONST[8] 35: MOV OUT[0].w, TEMP[1] 36: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 13: ADD temp[7].xyz, temp[3], temp[2]; 14: DP3 temp[7].w, temp[7], temp[7]; 15: RSQ temp[7].w, |temp[7].wwww|; 16: MUL temp[7].xyz, temp[7].wwww, temp[7]; 17: DP3_SAT temp[7].w, temp[7], temp[6]; 18: POW temp[7].w, temp[7].wwww, const[10].xxxx; 19: MUL_SAT temp[7].w, temp[7], const[9].zzzz; 20: MUL temp[7].w, temp[7], temp[4]; 21: MUL temp[4].xyz, temp[7].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL 
temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MUL temp[0].xyz, temp[0], const[6]; 28: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 29: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 30: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 31: MUL_SAT temp[2].x, temp[2].xxxx, const[9].wwww; 32: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 33: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 34: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 35: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 13: ADD temp[7].xyz, temp[3], temp[2]; 14: DP3 temp[7].w, temp[7], temp[7]; 15: RSQ temp[7].w, |temp[7].wwww|; 16: MUL temp[7].xyz, temp[7].wwww, temp[7]; 17: DP3_SAT temp[7].w, temp[7], temp[6]; 18: POW temp[7].w, temp[7].wwww, const[10].xxxx; 19: MUL_SAT temp[7].w, temp[7], const[9].zzzz; 20: MUL temp[7].w, temp[7], temp[4]; 21: MUL temp[4].xyz, temp[7].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MUL temp[0].xyz, temp[0], const[6]; 28: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 29: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 30: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 31: MUL_SAT 
temp[2].x, temp[2].xxxx, const[9].wwww; 32: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 33: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 34: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 35: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 13: ADD temp[7].xyz, temp[3], temp[2]; 14: DP3 temp[7].w, temp[7], temp[7]; 15: RSQ temp[7].w, |temp[7].wwww|; 16: MUL temp[7].xyz, temp[7].wwww, temp[7]; 17: DP3_SAT temp[7].w, temp[7], temp[6]; 18: POW temp[7].w, temp[7].wwww, const[10].xxxx; 19: MUL_SAT temp[7].w, temp[7], const[9].zzzz; 20: MUL temp[7].w, temp[7], temp[4]; 21: MUL temp[4].xyz, temp[7].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MUL temp[0].xyz, temp[0], const[6]; 28: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 29: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 30: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 31: MUL_SAT temp[2].x, temp[2].xxxx, const[9].wwww; 32: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 33: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 34: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 35: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 
1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 13: ADD temp[7].xyz, temp[3], temp[2]; 14: DP3 temp[7].w, temp[7], temp[7]; 15: RSQ temp[7].w, |temp[7].wwww|; 16: MUL temp[7].xyz, temp[7].wwww, temp[7]; 17: DP3_SAT temp[7].w, temp[7], temp[6]; 18: POW temp[7].w, temp[7].wwww, const[10].xxxx; 19: MUL_SAT temp[7].w, temp[7], const[9].zzzz; 20: MUL temp[7].w, temp[7], temp[4]; 21: MUL temp[4].xyz, temp[7].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MUL temp[0].xyz, temp[0], const[6]; 28: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 29: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 30: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 31: MUL_SAT temp[2].x, temp[2].xxxx, const[9].wwww; 32: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 33: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 34: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 35: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: 
MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 13: ADD temp[7].xyz, temp[3], temp[2]; 14: DP3 temp[7].w, temp[7], temp[7]; 15: RSQ temp[7].w, |temp[7].wwww|; 16: MUL temp[7].xyz, temp[7].wwww, temp[7]; 17: DP3_SAT temp[7].w, temp[7], temp[6]; 18: POW temp[7].w, temp[7].wwww, const[10].xxxx; 19: MUL_SAT temp[7].w, temp[7], const[9].zzzz; 20: MUL temp[7].w, temp[7], temp[4]; 21: MUL temp[4].xyz, temp[7].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MUL temp[0].xyz, temp[0], const[6]; 28: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 29: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 30: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 31: MUL_SAT temp[2].x, temp[2].xxxx, const[9].wwww; 32: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 33: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 34: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 35: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 13: ADD temp[7].xyz, temp[3], temp[2]; 14: DP3 temp[7].w, temp[7], temp[7]; 15: RSQ temp[7].w, |temp[7].wwww|; 16: MUL temp[7].xyz, temp[7].wwww, temp[7]; 17: DP3_SAT 
temp[7].w, temp[7], temp[6]; 18: POW temp[7].w, temp[7].wwww, const[10].xxxx; 19: MUL_SAT temp[7].w, temp[7], const[9].zzzz; 20: MUL temp[7].w, temp[7], temp[4]; 21: MUL temp[4].xyz, temp[7].wwww, const[4]; 22: DP3_SAT temp[2].w, temp[6], temp[2]; 23: MUL temp[0].xyz, temp[0], temp[2].wwww; 24: MAX temp[0].xyz, temp[0], const[5]; 25: MUL temp[4].xyz, temp[4], temp[0]; 26: TEX temp[0].xyz, temp[5], 2D[4]; 27: MUL temp[0].xyz, temp[0], const[6]; 28: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 29: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 30: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 31: MUL_SAT temp[2].x, temp[2].xxxx, const[9].wwww; 32: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 33: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 34: LRP output[0].xyz, temp[8].xxxx, temp[1], const[8]; 35: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[1]; 1: MUL_SAT temp[1].w, temp[0], const[0]; 2: TEX temp[2], input[2], 2D[2]; 3: MAD temp[2].xyz, temp[2], const[9].xxxx, -const[9].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: TEX temp[4].w, input[1], 2D[3]; 8: MAD temp[4].w, temp[4].wwww, const[2].xxxx, const[2].yyyy; 9: MAD temp[5].xy, temp[4].wwww, temp[3], input[1]; 10: TEX temp[4].w, temp[5], 2D[0]; 11: TEX temp[6].xyz, temp[5], 2D[3]; 12: MAD temp[6].xyz, temp[6], const[9].xxxx, -const[9].yyyy; 13: ADD temp[7].xyz, temp[3], temp[2]; 14: DP3 temp[7].w, temp[7], temp[7]; 15: RSQ temp[7].w, |temp[7].wwww|; 16: MUL temp[7].xyz, temp[7].wwww, temp[7]; 17: DP3_SAT temp[7].w, temp[7], temp[6]; 18: LG2 temp[9].w, temp[7].wwww; 19: MUL temp[9].w, temp[9].wwww, const[10].xxxx; 20: EX2 temp[7].w, temp[9].wwww; 21: MUL_SAT temp[7].w, temp[7], const[9].zzzz; 22: MUL temp[7].w, temp[7], temp[4]; 23: MUL temp[4].xyz, temp[7].wwww, const[4]; 24: DP3_SAT temp[2].w, temp[6], temp[2]; 25: MUL temp[0].xyz, temp[0], 
temp[2].wwww; 26: MAX temp[0].xyz, temp[0], const[5]; 27: MUL temp[4].xyz, temp[4], temp[0]; 28: TEX temp[0].xyz, temp[5], 2D[4]; 29: MUL temp[0].xyz, temp[0], const[6]; 30: MAX temp[2].x, temp[0].xxxx, temp[0].yyyy; 31: MAX temp[2].x, temp[2].xxxx, temp[0].zzzz; 32: MUL temp[2].x, temp[2].xxxx, temp[2].xxxx; 33: MUL_SAT temp[2].x, temp[2].xxxx, const[9].wwww; 34: MAD_SAT temp[1].xyz, temp[2].xxxx, temp[0], temp[4]; 35: MAD_SAT temp[8].x, input[0].xxxx, const[7].xxxx, const[7].yyyy; 36: ADD temp[10].xyz, temp[1], -const[8]; 37: MAD output[0].xyz, temp[8].xxxx, temp[10], const[8]; 38: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[1]; 1: MUL_SAT temp[1].w, temp[0].___w, const[0].___w; 2: TEX temp[2].xyz, input[2].xy__, 2D[2]; 3: MAD temp[2].xyz, temp[2].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: TEX temp[4].w, input[1].xy__, 2D[3]; 8: MAD temp[4].w, temp[4].___w, const[2].___x, const[2].___y; 9: MAD temp[5].xy, temp[4].ww__, temp[3].xy__, input[1].xy__; 10: TEX temp[4].w, temp[5].xy__, 2D[0]; 11: TEX temp[6].xyz, temp[5].xy__, 2D[3]; 12: MAD temp[6].xyz, temp[6].xyz_, const[9].xxx_, -const[9].yyy_; 13: ADD temp[7].xyz, temp[3].xyz_, temp[2].xyz_; 14: DP3 temp[7].w, temp[7].xyz_, temp[7].xyz_; 15: RSQ temp[7].w, |temp[7].___w|; 16: MUL temp[7].xyz, temp[7].www_, temp[7].xyz_; 17: DP3_SAT temp[7].w, temp[7].xyz_, temp[6].xyz_; 18: LG2 temp[9].w, temp[7].___w; 19: MUL temp[9].w, temp[9].___w, const[10].___x; 20: EX2 temp[7].w, temp[9].___w; 21: MUL_SAT temp[7].w, temp[7].___w, const[9].___z; 22: MUL temp[7].w, temp[7].___w, temp[4].___w; 23: MUL temp[4].xyz, temp[7].www_, const[4].xyz_; 24: DP3_SAT temp[2].w, temp[6].xyz_, temp[2].xyz_; 25: MUL temp[0].xyz, temp[0].xyz_, temp[2].www_; 26: MAX temp[0].xyz, temp[0].xyz_, const[5].xyz_; 27: MUL temp[4].xyz, temp[4].xyz_, 
temp[0].xyz_; 28: TEX temp[0].xyz, temp[5].xy__, 2D[4]; 29: MUL temp[0].xyz, temp[0].xyz_, const[6].xyz_; 30: MAX temp[2].x, temp[0].x___, temp[0].y___; 31: MAX temp[2].x, temp[2].x___, temp[0].z___; 32: MUL temp[2].x, temp[2].x___, temp[2].x___; 33: MUL_SAT temp[2].x, temp[2].x___, const[9].w___; 34: MAD_SAT temp[1].xyz, temp[2].xxx_, temp[0].xyz_, temp[4].xyz_; 35: MAD_SAT temp[8].x, input[0].x___, const[7].x___, const[7].y___; 36: ADD temp[10].xyz, temp[1].xyz_, -const[8].xyz_; 37: MAD output[0].xyz, temp[8].xxx_, temp[10].xyz_, const[8].xyz_; 38: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, const[9].xxx_, -const[9].yyy_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21].w, temp[20].xy__, 2D[0]; 11: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 12: MAD temp[23].xyz, temp[22].xyz_, const[9].xxx_, -const[9].yyy_; 13: ADD temp[24].xyz, temp[17].xyz_, temp[14].xyz_; 14: DP3 temp[25].w, temp[24].xyz_, temp[24].xyz_; 15: RSQ temp[26].w, |temp[25].___w|; 16: MUL temp[27].xyz, temp[26].www_, temp[24].xyz_; 17: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 18: LG2 temp[29].w, temp[28].___w; 19: MUL temp[30].w, temp[29].___w, const[10].___x; 20: EX2 temp[31].w, temp[30].___w; 21: MUL_SAT temp[32].w, temp[31].___w, const[9].___z; 22: MUL temp[33].w, temp[32].___w, temp[21].___w; 23: MUL temp[34].xyz, temp[33].www_, const[4].xyz_; 24: DP3_SAT temp[35].w, temp[23].xyz_, temp[14].xyz_; 25: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 26: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 
27: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 28: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 29: MUL temp[40].xyz, temp[39].xyz_, const[6].xyz_; 30: MAX temp[41].x, temp[40].x___, temp[40].y___; 31: MAX temp[42].x, temp[41].x___, temp[40].z___; 32: MUL temp[43].x, temp[42].x___, temp[42].x___; 33: MUL_SAT temp[44].x, temp[43].x___, const[9].w___; 34: MAD_SAT temp[45].xyz, temp[44].xxx_, temp[40].xyz_, temp[38].xyz_; 35: MAD_SAT temp[46].x, input[0].x___, const[7].x___, const[7].y___; 36: ADD temp[47].xyz, temp[45].xyz_, -const[8].xyz_; 37: MAD output[0].xyz, temp[46].xxx_, temp[47].xyz_, const[8].xyz_; 38: MOV output[0].w, temp[12].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, const[9].xxx_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21].w, temp[20].xy__, 2D[0]; 11: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 12: MAD temp[23].xyz, temp[22].xyz_, const[9].xxx_, -none.111_; 13: DP3 temp[25].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 14: RSQ temp[26].w, |temp[25].___w|; 15: MUL temp[27].xyz, temp[26].www_, (temp[14] + temp[17]).xyz_; 16: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 17: LG2 temp[29].w, temp[28].___w; 18: MUL temp[30].w, temp[29].___w, const[10].___x; 19: EX2 temp[31].w, temp[30].___w; 20: MUL_SAT temp[32].w, temp[31].___w, const[9].___z; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MUL temp[34].xyz, temp[33].www_, const[4].xyz_; 23: DP3_SAT temp[35].w, temp[23].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX 
temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MUL temp[40].xyz, temp[39].xyz_, const[6].xyz_; 29: MAX temp[41].x, temp[40].x___, temp[40].y___; 30: MAX temp[42].x, temp[41].x___, temp[40].z___; 31: MUL temp[43].x, temp[42].x___, temp[42].x___; 32: MUL_SAT temp[44].x, temp[43].x___, const[9].w___; 33: MAD_SAT temp[45].xyz, temp[44].xxx_, temp[40].xyz_, temp[38].xyz_; 34: MAD_SAT temp[46].x, input[0].x___, const[7].x___, const[7].y___; 35: MAD output[0].xyz, temp[46].xxx_, (temp[45] - const[8]).xyz_, const[8].xyz_; 36: MOV output[0].w, temp[12].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21].w, temp[20].xy__, 2D[0]; 11: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 12: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 13: DP3 temp[25].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 14: RSQ temp[26].w, |temp[25].___w|; 15: MUL temp[27].xyz, temp[26].www_, (temp[14] + temp[17]).xyz_; 16: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 17: LG2 temp[29].w, temp[28].___w; 18: MUL temp[30].w, temp[29].___w, 128.000000 (0x70).___w; 19: EX2 temp[31].w, temp[30].___w; 20: MUL_SAT temp[32].w, temp[31].___w, 64.000000 (0x68).___w; 21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MUL temp[34].xyz, temp[33].www_, const[4].xyz_; 23: DP3_SAT temp[35].w, temp[23].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, 
temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MUL temp[40].xyz, temp[39].xyz_, const[6].xyz_; 29: MAX temp[41].x, temp[40].x___, temp[40].y___; 30: MAX temp[42].x, temp[41].x___, temp[40].z___; 31: MUL temp[43].x, temp[42].x___, temp[42].x___; 32: MUL_SAT temp[44].x, temp[43].x___, 32.000000 (0x60).w___; 33: MAD_SAT temp[45].xyz, temp[44].xxx_, temp[40].xyz_, temp[38].xyz_; 34: MAD_SAT temp[46].x, input[0].x___, const[7].x___, const[7].y___; 35: MAD output[0].xyz, temp[46].xxx_, (temp[45] - const[8]).xyz_, const[8].xyz_; 36: MOV output[0].w, temp[12].___w; CONST[9] = { 2.0000 1.0000 64.0000 32.0000 } CONST[10] = { 128.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21].w, temp[20].xy__, 2D[0]; 11: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 12: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 13: DP3 temp[25].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 14: RSQ temp[26].w, |temp[25].___w|; 15: MUL temp[27].xyz, temp[26].www_, (temp[14] + temp[17]).xyz_; 16: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 17: LG2 temp[29].w, temp[28].___w; 18: MUL temp[30].w, temp[29].___w, 128.000000 (0x70).___w; 19: EX2 temp[31].w, temp[30].___w; 20: MUL_SAT temp[32].w, temp[31].___w, 64.000000 (0x68).___w; 21: MUL temp[33].w, temp[32].___w, 
temp[21].___w; 22: MUL temp[34].xyz, temp[33].www_, const[4].xyz_; 23: DP3_SAT temp[35].w, temp[23].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MUL temp[40].xyz, temp[39].xyz_, const[6].xyz_; 29: MAX temp[41].x, temp[40].x___, temp[40].y___; 30: MAX temp[42].x, temp[41].x___, temp[40].z___; 31: MUL temp[43].x, temp[42].x___, temp[42].x___; 32: MUL_SAT temp[44].x, temp[43].x___, 32.000000 (0x60).w___; 33: MAD_SAT temp[45].xyz, temp[44].xxx_, temp[40].xyz_, temp[38].xyz_; 34: MAD_SAT temp[46].x, input[0].x___, const[7].x___, const[7].y___; 35: MAD output[0].xyz, temp[46].xxx_, (temp[45] - const[8]).xyz_, const[8].xyz_; 36: MOV output[0].w, temp[12].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: MUL_SAT temp[12].w, temp[11].___w, const[0].___w; 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: MAD temp[14].xyz, temp[13].xyz_, 2.000000 (0x40).www_, -none.111_; 4: DP3 temp[15].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[16].w, |temp[15].___w|; 6: MUL temp[17].xyz, temp[16].www_, input[3].xyz_; 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: MAD temp[19].w, temp[18].___w, const[2].___x, const[2].___y; 9: MAD temp[20].xy, temp[19].ww__, temp[17].xy__, input[1].xy__; 10: TEX temp[21].w, temp[20].xy__, 2D[0]; 11: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 12: MAD temp[23].xyz, temp[22].xyz_, 2.000000 (0x40).www_, -none.111_; 13: DP3 temp[25].w, (temp[14] + temp[17]).xyz_, (temp[14] + temp[17]).xyz_; 14: RSQ temp[26].w, |temp[25].___w|; 15: MUL temp[27].xyz, temp[26].www_, (temp[14] + temp[17]).xyz_; 16: DP3_SAT temp[28].w, temp[27].xyz_, temp[23].xyz_; 17: LG2 temp[29].w, temp[28].___w; 18: MUL temp[30].w, temp[29].___w, 128.000000 (0x70).___w; 19: EX2 temp[31].w, temp[30].___w; 20: MUL_SAT temp[32].w, temp[31].___w, 64.000000 (0x68).___w; 
21: MUL temp[33].w, temp[32].___w, temp[21].___w; 22: MUL temp[34].xyz, temp[33].www_, const[4].xyz_; 23: DP3_SAT temp[35].w, temp[23].xyz_, temp[14].xyz_; 24: MUL temp[36].xyz, temp[11].xyz_, temp[35].www_; 25: MAX temp[37].xyz, temp[36].xyz_, const[5].xyz_; 26: MUL temp[38].xyz, temp[34].xyz_, temp[37].xyz_; 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: MUL temp[40].xyz, temp[39].xyz_, const[6].xyz_; 29: MAX temp[41].x, temp[40].x___, temp[40].y___; 30: MAX temp[42].x, temp[41].x___, temp[40].z___; 31: MUL temp[43].x, temp[42].x___, temp[42].x___; 32: MUL_SAT temp[44].x, temp[43].x___, 32.000000 (0x60).w___; 33: MAD_SAT temp[45].xyz, temp[44].xxx_, temp[40].xyz_, temp[38].xyz_; 34: MAD_SAT temp[46].x, input[0].x___, const[7].x___, const[7].y___; 35: MAD output[0].xyz, temp[46].xxx_, (temp[45] - const[8]).xyz_, const[8].xyz_; 36: MOV output[0].w, temp[12].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[11], input[2].xy__, 2D[1]; 1: src0.w = temp[11], src1.w = const[0] MAD_SAT temp[12].w, src0.w, src1.w, src0.0 2: TEX temp[13].xyz, input[2].xy__, 2D[2]; 3: src0.xyz = temp[13], src0.w = 2.000000 (0x40) MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 5: src0.w = temp[15] RSQ temp[16].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 7: TEX temp[18].w, input[1].xy__, 2D[3]; 8: src0.xyz = const[2], src0.w = temp[18] MAD temp[19].w, src0.w, src0.x, src0.y 9: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = input[1] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ 10: TEX temp[21].w, temp[20].xy__, 2D[0]; 11: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 12: src0.xyz = temp[22], src0.w = 2.000000 (0x40) MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 13: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 14: src0.w = temp[25] RSQ temp[26].w, 
|src0.w| 15: src0.xyz = temp[17], src0.w = temp[26], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 16: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 17: src0.w = temp[28] LG2 temp[29].w, src0.w 18: src0.w = temp[29], src1.w = 128.000000 (0x70) MAD temp[30].w, src0.w, src1.w, src0.0 19: src0.w = temp[30] EX2 temp[31].w, src0.w 20: src0.w = temp[31], src1.w = 64.000000 (0x68) MAD_SAT temp[32].w, src0.w, src1.w, src0.0 21: src0.w = temp[32], src1.w = temp[21] MAD temp[33].w, src0.w, src1.w, src0.0 22: src0.xyz = const[4], src0.w = temp[33] MAD temp[34].xyz, src0.www, src0.xyz, src0.000 23: src0.xyz = temp[23], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 24: src0.xyz = temp[11], src0.w = temp[35] MAD temp[36].xyz, src0.xyz, src0.www, src0.000 25: src0.xyz = temp[36], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz 26: src0.xyz = temp[34], src1.xyz = temp[37] MAD temp[38].xyz, src0.xyz, src1.xyz, src0.000 27: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 28: src0.xyz = temp[39], src1.xyz = const[6] MAD temp[40].xyz, src0.xyz, src1.xyz, src0.000 29: src0.xyz = temp[40] MAX temp[41].x, src0.x__, src0.y__ 30: src0.xyz = temp[41], src1.xyz = temp[40] MAX temp[42].x, src0.x__, src1.z__ 31: src0.xyz = temp[42] MAD temp[43].x, src0.x__, src0.x__, src0.000 32: src0.xyz = temp[43], src0.w = 32.000000 (0x60) MAD_SAT temp[44].x, src0.x__, src0.w__, src0.000 33: src0.xyz = temp[44], src1.xyz = temp[40], src2.xyz = temp[38] MAD_SAT temp[45].xyz, src0.xxx, src1.xyz, src2.xyz 34: src0.xyz = input[0], src1.xyz = const[7] MAD_SAT temp[46].x, src0.x__, src1.x__, src1.y__ 35: src0.xyz = const[8], src1.xyz = temp[45], src2.xyz = temp[46], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 36: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: 
src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[15], src1.xyz = const[7] MAD_SAT temp[46].x, src0.x__, src1.x__, src1.y__ RSQ temp[16].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[11], input[2].xy__, 2D[1]; 5: TEX temp[13].xyz, input[2].xy__, 2D[2]; 6: TEX temp[18].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[13], src0.w = 2.000000 (0x40), src1.w = temp[11], src2.w = const[0] SEM_WAIT MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[12].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[18] MAD temp[19].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[17], src0.w = temp[19], src1.xyz = input[1], src1.w = temp[25] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[26].w, |src1.w| 11: src0.xyz = temp[17], src0.w = temp[26], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 15: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 16: TEX temp[21].w, temp[20].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 17: src0.xyz = temp[39], src1.xyz = const[6] SEM_WAIT MAD temp[40].xyz, src0.xyz, src1.xyz, src0.000 18: src0.xyz = temp[22], src0.w = 2.000000 (0x40), src1.xyz = temp[40] MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 MAX temp[41].w, src1.x, src1.y 19: src0.xyz = temp[23], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 20: src0.xyz = temp[11], src0.w = temp[35], src1.xyz = temp[40], src1.w = temp[41] MAD temp[36].xyz, src0.xyz, src0.www, src0.000 MAX temp[42].w, src1.w, src1.z 21: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 
22: src0.xyz = temp[36], src0.w = temp[42], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz MAD temp[43].w, src0.w, src0.w, src0.0 23: src0.xyz = temp[43], src0.w = 32.000000 (0x60), src1.w = temp[43], src2.w = temp[28] MAD_SAT temp[44].x, src1.w__, src0.w__, src0.000 LG2 temp[29].w, src2.w 24: src0.w = temp[29], src1.w = 128.000000 (0x70) MAD temp[30].w, src0.w, src1.w, src0.0 25: src0.w = temp[30] EX2 temp[31].w, src0.w 26: src0.w = temp[31], src1.w = 64.000000 (0x68) MAD_SAT temp[32].w, src0.w, src1.w, src0.0 27: src0.w = temp[32], src1.w = temp[21] MAD temp[33].w, src0.w, src1.w, src0.0 28: src0.xyz = const[4], src0.w = temp[33] MAD temp[34].xyz, src0.www, src0.xyz, src0.000 29: src0.xyz = temp[34], src1.xyz = temp[37] MAD temp[38].xyz, src0.xyz, src1.xyz, src0.000 30: src0.xyz = temp[44], src1.xyz = temp[40], src2.xyz = temp[38] MAD_SAT temp[45].xyz, src0.xxx, src1.xyz, src2.xyz 31: src0.xyz = const[8], src1.xyz = temp[45], src2.xyz = temp[46], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[15].w, src0._, src0._ 1: src0.xyz = input[0], src0.w = temp[15], src1.xyz = const[7] MAD_SAT temp[46].x, src0.x__, src1.x__, src1.y__ RSQ temp[16].w, |src0.w| 2: src0.xyz = input[3], src0.w = temp[16] MAD temp[17].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[11], input[2].xy__, 2D[1]; 5: TEX temp[13].xyz, input[2].xy__, 2D[2]; 6: TEX temp[18].w, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[13], src0.w = 2.000000 (0x40), src1.w = temp[11], src2.w = const[0] SEM_WAIT MAD temp[14].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[12].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[17], src1.xyz = temp[14], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[25].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[18] MAD temp[19].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[17], src0.w = 
temp[19], src1.xyz = input[1], src1.w = temp[25] MAD temp[20].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[26].w, |src1.w| 11: src0.xyz = temp[17], src0.w = temp[26], src1.xyz = temp[14], srcp.xyz = (src1 + src0) MAD temp[27].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[12] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[39].xyz, temp[20].xy__, 2D[4]; 15: TEX temp[22].xyz, temp[20].xy__, 2D[3]; 16: TEX temp[21].w, temp[20].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 17: src0.xyz = temp[39], src1.xyz = const[6] SEM_WAIT MAD temp[40].xyz, src0.xyz, src1.xyz, src0.000 18: src0.xyz = temp[22], src0.w = 2.000000 (0x40), src1.xyz = temp[40] MAD temp[23].xyz, src0.xyz, src0.www, -src0.111 MAX temp[41].w, src1.x, src1.y 19: src0.xyz = temp[23], src1.xyz = temp[14] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[35].w, src0._, src0._ 20: src0.xyz = temp[11], src0.w = temp[35], src1.xyz = temp[40], src1.w = temp[41] MAD temp[36].xyz, src0.xyz, src0.www, src0.000 MAX temp[42].w, src1.w, src1.z 21: src0.xyz = temp[27], src1.xyz = temp[23] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[28].w, src0._, src0._ 22: src0.xyz = temp[36], src0.w = temp[42], src1.xyz = const[5] MAX temp[37].xyz, src0.xyz, src1.xyz MAD temp[43].w, src0.w, src0.w, src0.0 23: src0.w = 32.000000 (0x60), src1.w = temp[43], src2.w = temp[28] MAD_SAT temp[44].x, src1.w__, src0.w__, src0.000 LG2 temp[29].w, src2.w 24: src0.w = temp[29], src1.w = 128.000000 (0x70) MAD temp[30].w, src0.w, src1.w, src0.0 25: src0.w = temp[30] EX2 temp[31].w, src0.w 26: src0.w = temp[31], src1.w = 64.000000 (0x68) MAD_SAT temp[32].w, src0.w, src1.w, src0.0 27: src0.w = temp[32], src1.w = temp[21] MAD temp[33].w, src0.w, src1.w, src0.0 28: src0.xyz = const[4], src0.w = temp[33] MAD temp[34].xyz, src0.www, src0.xyz, src0.000 29: src0.xyz = temp[34], src1.xyz = temp[37] MAD temp[38].xyz, src0.xyz, src1.xyz, src0.000 30: src0.xyz = temp[44], src1.xyz = temp[40], src2.xyz = temp[38] MAD_SAT temp[45].xyz, src0.xxx, src1.xyz, src2.xyz 
31: src0.xyz = const[8], src1.xyz = temp[45], src2.xyz = temp[46], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[0].w, src0._, src0._ 1: src0.xyz = input[3], src0.w = temp[0], src1.xyz = const[7] MAD_SAT temp[0].z, src0.__x, src1.__x, src1.__y RSQ temp[0].w, |src0.w| 2: src0.xyz = input[2], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 3: BEGIN_TEX; 4: TEX temp[3], input[1].xy__, 2D[1]; 5: TEX temp[1].xyz, input[1].xy__, 2D[2]; 6: TEX temp[0].w, input[0].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 7: src0.xyz = temp[1], src0.w = 2.000000 (0x40), src1.w = temp[3], src2.w = const[0] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.www, -src0.111 MAD_SAT temp[1].w, src1.w, src2.w, src0.0 8: src0.xyz = temp[2], src1.xyz = temp[1], srcp.xyz = (src1 + src0) DP3, srcp.xyz, srcp.xyz DP3 temp[2].w, src0._, src0._ 9: src0.xyz = const[2], src0.w = temp[0] MAD temp[0].w, src0.w, src0.x, src0.y 10: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = input[0], src1.w = temp[2] MAD temp[0].xy, src0.ww_, src0.xy_, src1.xy_ RSQ temp[0].w, |src1.w| 11: src0.xyz = temp[2], src0.w = temp[0], src1.xyz = temp[1], srcp.xyz = (src1 + src0) MAD temp[2].xyz, src0.www, srcp.xyz, src0.000 12: src0.w = temp[1] MAD color[0].w, src0.w, src0.1, src0.0 13: BEGIN_TEX; 14: TEX temp[4].xyz, temp[0].xy__, 2D[4]; 15: TEX temp[5].xyz, temp[0].xy__, 2D[3]; 16: TEX temp[0].w, temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 17: src0.xyz = temp[4], src1.xyz = const[6] SEM_WAIT MAD temp[4].xyz, src0.xyz, src1.xyz, src0.000 18: src0.xyz = temp[5], src0.w = 2.000000 (0x40), src1.xyz = temp[4] MAD temp[5].xyz, src0.xyz, src0.www, -src0.111 MAX temp[1].w, src1.x, src1.y 19: src0.xyz = temp[5], src1.xyz = temp[1] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 20: src0.xyz = temp[3], src0.w = temp[2], src1.xyz = temp[4], src1.w = temp[1] MAD temp[1].xyz, 
src0.xyz, src0.www, src0.000 MAX temp[1].w, src1.w, src1.z 21: src0.xyz = temp[2], src1.xyz = temp[5] DP3_SAT, src0.xyz, src1.xyz DP3_SAT temp[2].w, src0._, src0._ 22: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = const[5] MAX temp[1].xyz, src0.xyz, src1.xyz MAD temp[1].w, src0.w, src0.w, src0.0 23: src0.w = 32.000000 (0x60), src1.w = temp[1], src2.w = temp[2] MAD_SAT temp[0].x, src1.w__, src0.w__, src0.0__ LG2 temp[1].w, src2.w 24: src0.w = temp[1], src1.w = 128.000000 (0x70) MAD temp[1].w, src0.w, src1.w, src0.0 25: src0.w = temp[1] EX2 temp[1].w, src0.w 26: src0.w = temp[1], src1.w = 64.000000 (0x68) MAD_SAT temp[1].w, src0.w, src1.w, src0.0 27: src0.w = temp[1], src1.w = temp[0] MAD temp[0].w, src0.w, src1.w, src0.0 28: src0.xyz = const[4], src0.w = temp[0] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 29: src0.xyz = temp[2], src1.xyz = temp[1] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 30: src0.xyz = temp[0], src1.xyz = temp[4], src2.xyz = temp[1] MAD_SAT temp[1].xyz, src0.xxx, src1.xyz, src2.xyz 31: src0.xyz = const[8], src1.xyz = temp[1], src2.xyz = temp[0], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.zzz, srcp.xyz, src0.xyz R500 Fragment Program: -------- 0 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810001:DP dest:0 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000f1:DP3 dest:15 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00086000:ALU wmask: AB omask: NONE 1:RGB_ADDR 0x08041c03:Addr0: 3t, Addr1: 7c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122090:rgb_A_src:0 0/0/R 0 rgb_B_src:1 0/0/R 0 targ: 0 4 ALPHA_INST:0x0004c00b:RSQ dest:0 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00191000:MAD dest:0 rgb_C_src:1 0/0/G 0 alp_C_src:0 R 0 
2 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xe403f401: src: 1 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00420000: id: 2 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00107a04:ALU TEX_WAIT NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x10000cc0:Addr0: 192t, Addr1: 3t, Addr2: 0c, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0070d010:MAD dest:1 alp_A_src:1 A 0 alp_B_src:2 A 0 targ 0 w:0 5 RGBA_INST: 0x20ed8010:MAD dest:1 rgb_C_src:0 1/1/1 1 alp_C_src:0 0 0 7 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000191:DP3 dest:25 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020102:Addr0: 2c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 
R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x08000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 G 0 9 0:CMN_INST 0x00005800:ALU wmask: ARG omask: NONE 1:RGB_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000800:Addr0: 0t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0084046c:rgb_A_src:0 A/A/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x0004d00b:RSQ dest:0 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00421000:MAD dest:0 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x88000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:2 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044636c:rgb_A_src:0 A/A/A 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00040001:OUT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 12 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00440000: id: 4 op:LD, , SCALED 2:TEX_ADDR: 0xe404f400: src: 0 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 13 0:CMN_INST 0x00003803:TEX wmask: RGB omask: NONE 1:TEX_INST: 0x00430000: id: 3 op:LD, , SCALED 2:TEX_ADDR: 0xe405f400: src: 0 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 14 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 15 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 
0x08041804:Addr0: 4t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001005:Addr0: 5t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00281013:MAX dest:1 alp_A_src:1 R 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x00ed8050:MAD dest:5 rgb_C_src:0 1/1/1 1 alp_C_src:0 R 0 17 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08000405:Addr0: 5t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000231:DP3 dest:35 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x0048d013:MAX dest:1 alp_A_src:1 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00184000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08001402:Addr0: 2t, Addr1: 5t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810021:DP dest:2 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000001c1:DP3 dest:28 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00007800:ALU wmask: 
ARGB omask: NONE 1:RGB_ADDR 0x08041401:Addr0: 1t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0060c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:0 A 0 targ 0 w:0 5 RGBA_INST: 0x20000015:MAX dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 21 0:CMN_INST 0x00084800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x002004e0:Addr0: 224t, Addr1: 1t, Addr2: 2t, srcp:0 3 RGB_INST: 0x0091848d:rgb_A_src:1 A/0/0 0 rgb_B_src:0 A/0/0 0 targ: 0 4 ALPHA_INST:0x0000e019:LN2 dest:1 alp_A_src:2 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803c001:Addr0: 1t, Addr1: 240t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 23 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c018:EX2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00104000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803a001:Addr0: 1t, Addr1: 232t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 25 
0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 26 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020104:Addr0: 4c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 27 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 28 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00101000:Addr0: 0t, Addr1: 4t, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222010:MAD dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 29 0:CMN_INST 0x00038005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x40000508:Addr0: 8c, Addr1: 1t, Addr2: 0t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044624a:rgb_A_src:2 B/B/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00220000:MAD dest:0 
rgb_C_src:0 R/G/B 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 30 Instructions ~ 18 Vector Instructions (RGB) ~ 17 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 6 Texture Instructions ~ 3 Presub Operations ~ 0 OMOD Operations ~ 6 Temporary Registers ~ 5 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0..1] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MUL TEMP[1], IN[0].xxxx, CONST[0] 5: MAD TEMP[1], IN[0].yyyy, CONST[1], TEMP[1] 6: MAD TEMP[1], IN[0].zzzz, CONST[2], TEMP[1] 7: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[1] 8: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[2], input[0].wwww, const[3], temp[0]; 4: MUL temp[1], input[0].xxxx, const[0]; 5: MAD temp[1], input[0].yyyy, const[1], temp[1]; 6: MAD temp[1], input[0].zzzz, const[2], temp[1]; 7: MAD temp[2], input[0].wwww, const[3], temp[1]; 8: MOV output[0], temp[2]; 9: MOV output[1], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[2], input[0].wwww, const[3], temp[0]; 4: MUL temp[1], input[0].xxxx, const[0]; 5: MAD temp[1], input[0].yyyy, const[1], temp[1]; 6: MAD temp[1], input[0].zzzz, const[2], temp[1]; 7: MAD temp[2], input[0].wwww, const[3], temp[1]; 8: MOV output[0], temp[2]; 9: MOV output[1], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: 
MAD temp[2], input[0].wwww, const[3], temp[0]; 4: MUL temp[1], input[0].xxxx, const[0]; 5: MAD temp[1], input[0].yyyy, const[1], temp[1]; 6: MAD temp[1], input[0].zzzz, const[2], temp[1]; 7: MAD temp[2], input[0].wwww, const[3], temp[1]; 8: MOV output[0], temp[2]; 9: MOV output[1], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[0]; 1: MAD temp[1], input[0].yyyy, const[1], temp[1]; 2: MAD temp[1], input[0].zzzz, const[2], temp[1]; 3: MAD temp[2], input[0].wwww, const[3], temp[1]; 4: MOV output[0], temp[2]; 5: MOV output[1], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[0]; 1: MAD temp[1], input[0].yyyy, const[1], temp[1]; 2: MAD temp[1], input[0].zzzz, const[2], temp[1]; 3: MAD temp[2], input[0].wwww, const[3], temp[1]; 4: MOV output[0], temp[2]; 5: MOV output[1], temp[2]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[0]; 1: MAD temp[1], input[0].yyyy, const[1], temp[1]; 2: MAD temp[1], input[0].zzzz, const[2], temp[1]; 3: MAD temp[2], input[0].wwww, const[3], temp[1]; 4: MOV output[0], temp[2]; 5: MOV output[1], temp[2]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], 
input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[2], IN[2] 5: MOV_SAT OUT[1], IN[1] 6: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], 
input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV 
output[3], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 
reg: 2i swiz: 0/ 0/ 0/ 0 5: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 6: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 8 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, COLOR DCL IN[1], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0] IMM[0] FLT32 { 10.5600, 10.8800, -30.4000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MUL TEMP[0], TEMP[0], IN[0] 2: DPH OUT[0].xyz, TEMP[0], IMM[0].xyxz 3: MOV OUT[0].w, TEMP[0].wwww 4: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DPH output[0].xyz, temp[0], const[0].xyxz; 3: MOV output[0].w, temp[0].wwww; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DPH output[0].xyz, temp[0], const[0].xyxz; 3: MOV output[0].w, temp[0].wwww; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DPH output[0].xyz, temp[0], const[0].xyxz; 3: MOV output[0].w, temp[0].wwww; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DPH output[0].xyz, temp[0], const[0].xyxz; 3: MOV output[0].w, temp[0].wwww; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 
2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DPH output[0].xyz, temp[0], const[0].xyxz; 3: MOV output[0].w, temp[0].wwww; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DPH output[0].xyz, temp[0], const[0].xyxz; 3: MOV output[0].w, temp[0].wwww; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DP4 output[0].xyz, temp[0].xyz1, const[0].xyxz; 3: MOV output[0].w, temp[0].wwww; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MUL temp[0], temp[0], input[0]; 2: DP4 output[0].xyz, temp[0].xyz1, const[0].xyxz; 3: MOV output[0].w, temp[0].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL temp[2], temp[1], input[0]; 2: DP4 output[0].xyz, temp[2].xyz1, const[0].xyxz; 3: MOV output[0].w, temp[2].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL temp[2], temp[1], input[0]; 2: DP4 output[0].xyz, temp[2].xyz1, const[0].xyxz; 3: MOV output[0].w, temp[2].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL temp[2], temp[1], input[0]; 2: DP4 output[0].xyz, temp[2].xyz1, const[0].xyxz; 3: MOV output[0].w, temp[2].___w; CONST[0] = { 10.5600 10.8800 -30.4000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL temp[2], temp[1], input[0]; 2: DP4 output[0].xyz, temp[2].xyz1, const[0].xyxz; 3: MOV output[0].w, temp[2].___w; CONST[0] = { 10.5600 10.8800 -30.4000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: MUL temp[2], temp[1], input[0]; 2: DP4 output[0].xyz, temp[2].xyz1, const[0].xyxz; 3: MOV output[0].w, 
temp[2].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, 2D[0]; 1: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = input[0], src1.w = input[0] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[2].w, src0.w, src1.w, src0.0 2: src0.xyz = temp[2], src1.xyz = const[0] DP4 color[0].xyz, src0.xyz, src1.xyx DP4, src0.1, src1.z 3: src0.w = temp[2] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = input[0], src1.w = input[0] SEM_WAIT MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[2].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[2], src1.xyz = const[0] DP4 color[0].xyz, src0.xyz, src1.xyx DP4, src0.1, src1.z 4: src0.w = temp[2] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = input[0], src1.w = input[0] SEM_WAIT MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[2].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[2], src1.xyz = const[0] DP4 color[0].xyz, src0.xyz, src1.xyx DP4, src0.1, src1.z 4: src0.w = temp[2] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = input[0], src1.w = input[0] SEM_WAIT MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[0].w, src0.w, src1.w, src0.0 3: src0.xyz = temp[0], src1.xyz = const[0] DP4 color[0].xyz, src0.xyz, src1.xyx DP4, src0.1, src1.z 4: src0.w = temp[0] MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, 
SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 2 0:CMN_INST 0x00038001:OUT wmask: NONE omask: RGB 1:RGB_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00042220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/R 0 targ: 0 4 ALPHA_INST:0x00498001:DP dest:0 alp_A_src:0 1 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x00000002:DP4 dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00040005:OUT TEX_WAIT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL OUT[4], GENERIC[1] DCL OUT[5], GENERIC[2] DCL CONST[0..7] DCL TEMP[0..1] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[2].yzw, IMM[0].xxxy 1: MUL TEMP[0], IN[0].xxxx, CONST[4] 2: MAD TEMP[0], IN[0].yyyy, CONST[5], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[6], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[7], TEMP[0] 5: MOV_SAT OUT[1], IN[2] 6: ADD OUT[3].xy, IN[3], CONST[0].yzww 7: SUB TEMP[1], CONST[1], IN[0] 8: DP3 TEMP[1].w, TEMP[1], TEMP[1] 9: RSQ TEMP[1].w, |TEMP[1].wwww| 10: MUL TEMP[1].xyz, TEMP[1].wwww, 
TEMP[1] 11: MOV OUT[4].xyz, IN[1] 12: ADD OUT[5].xyz, CONST[2], TEMP[1] 13: DP4 OUT[2].x, -IN[0], CONST[3] 14: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3], const[0].yzww; 7: SUB temp[1], const[1], input[0]; 8: DP3 temp[1].w, temp[1], temp[1]; 9: RSQ temp[1].w, |temp[1].wwww|; 10: MUL temp[1].xyz, temp[1].wwww, temp[1]; 11: MOV output[4].xyz, input[1]; 12: ADD output[5].xyz, const[2], temp[1]; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3], const[0].yzww; 7: SUB temp[1], const[1], input[0]; 8: DP3 temp[1].w, temp[1], temp[1]; 9: RSQ temp[1].w, |temp[1].wwww|; 10: MUL temp[1].xyz, temp[1].wwww, temp[1]; 11: MOV output[4].xyz, input[1]; 12: ADD output[5].xyz, const[2], temp[1]; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3], const[0].yzww; 7: ADD temp[1], const[1], -input[0]; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].wwww|; 
10: MUL temp[1].xyz, temp[1].wwww, temp[1]; 11: MOV output[4].xyz, input[1]; 12: ADD output[5].xyz, const[2], temp[1]; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0]._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[2], input[0].wwww, const[7], temp[0]; 5: 
MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[2]; 15: MOV output[6], temp[2]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[0]; 15: MOV output[6], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[0]; 15: MOV output[6], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon 
Compiler Program 0: MOV output[2].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[4]; 2: MAD temp[0], input[0].yyyy, const[5], temp[0]; 3: MAD temp[0], input[0].zzzz, const[6], temp[0]; 4: MAD temp[0], input[0].wwww, const[7], temp[0]; 5: MOV_SAT output[1], input[2]; 6: ADD output[3].xy, input[3].xy__, const[0].yz__; 7: ADD temp[1].xyz, const[1].xyz_, -input[0].xyz_; 8: DP4 temp[1].w, temp[1].xyz0, temp[1].xyz0; 9: RSQ temp[1].w, |temp[1].___w|; 10: MUL temp[1].xyz, temp[1].www_, temp[1].xyz_; 11: MOV output[4].xyz, input[1].xyz_; 12: ADD output[5].xyz, const[2].xyz_, temp[1].xyz_; 13: DP4 output[2].x, -input[0], const[3]; 14: MOV output[0], temp[0]; 15: MOV output[6], temp[0]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 6: op: 0x00304203 dst: 2o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01fa2002 reg: 0c swiz: Y/ Z/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 7: op: 0x00702003 dst: 1t op: VE_ADD src0: 0x01d10022 reg: 1c swiz: X/ Y/ Z/ U src1: 0x1fd10001 reg: 0i 
swiz: -X/-Y/-Z/-U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 8: op: 0x00802001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src1: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00802048 dst: 1t op: ME_RECIP_SQRT_DX src0: 0x00db6028 reg: 1t swiz: W/ W/ W/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x01db6020 reg: 1t swiz: W/ W/ W/ U src1: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 11: op: 0x00706203 dst: 3o op: VE_ADD src0: 0x01d10021 reg: 1i swiz: X/ Y/ Z/ U src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 12: op: 0x00708203 dst: 4o op: VE_ADD src0: 0x01d10042 reg: 2c swiz: X/ Y/ Z/ U src1: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 13: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 14: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 15: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 16 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, COLOR DCL IN[1], FOG, PERSPECTIVE DCL IN[2], GENERIC[0], PERSPECTIVE DCL IN[3], GENERIC[1], PERSPECTIVE DCL IN[4], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL CONST[1] DCL CONST[3..4] DCL TEMP[0..7] IMM[0] FLT32 { 2.0000, 0.0000, 256.0000, 0.0000} 0: TEX TEMP[0], IN[2], SAMP[0], 2D 1: MUL TEMP[0].xyz, 
TEMP[0], IMM[0].xxxx 2: TEX TEMP[1].xy, IN[2], SAMP[1], 2D 3: MUL TEMP[2].xyz, TEMP[0], CONST[1].yyyy 4: DP3 TEMP[3].w, IN[3], IN[3] 5: RSQ TEMP[3].w, |TEMP[3].wwww| 6: MUL TEMP[3].xyz, TEMP[3].wwww, IN[3] 7: DP3 TEMP[4].w, IN[4], IN[4] 8: RSQ TEMP[4].w, |TEMP[4].wwww| 9: MUL TEMP[4].xyz, TEMP[4].wwww, IN[4] 10: DP3_SAT TEMP[5].x, TEMP[4], TEMP[3] 11: POW TEMP[5].x, TEMP[5].xxxx, IMM[0].zzzz 12: MUL TEMP[5].x, TEMP[5].xxxx, CONST[1].xxxx 13: MUL TEMP[5].x, TEMP[5].xxxx, TEMP[1].xxxx 14: MUL TEMP[5].x, TEMP[5].xxxx, CONST[1].zzzz 15: MUL TEMP[0].xyz, TEMP[5].xxxx, IN[0] 16: MUL TEMP[3].x, TEMP[1].yyyy, TEMP[1].yyyy 17: MUL_SAT TEMP[3].x, TEMP[3].xxxx, CONST[1].wwww 18: MAD_SAT TEMP[6].xyz, TEMP[3].xxxx, TEMP[2], TEMP[0] 19: MUL_SAT TEMP[6].w, TEMP[0].wwww, IN[0].wwww 20: MAD_SAT TEMP[7].x, IN[1].xxxx, CONST[3].xxxx, CONST[3].yyyy 21: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[6], CONST[4] 22: MOV OUT[0].w, TEMP[6] 23: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[5].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 15: MUL temp[0].xyz, temp[5].xxxx, input[0]; 16: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 17: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 18: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 19: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 20: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 21: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 22: MOV 
output[0].w, temp[6]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[5].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 15: MUL temp[0].xyz, temp[5].xxxx, input[0]; 16: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 17: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 18: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 19: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 20: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 21: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 22: MOV output[0].w, temp[6]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[5].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 15: MUL temp[0].xyz, temp[5].xxxx, input[0]; 16: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 17: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 18: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], 
temp[0]; 19: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 20: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 21: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 22: MOV output[0].w, temp[6]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[5].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 15: MUL temp[0].xyz, temp[5].xxxx, input[0]; 16: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 17: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 18: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 19: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 20: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 21: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 22: MOV output[0].w, temp[6]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[5].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 15: 
MUL temp[0].xyz, temp[5].xxxx, input[0]; 16: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 17: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 18: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 19: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 20: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 21: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 22: MOV output[0].w, temp[6]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: POW temp[5].x, temp[5].xxxx, const[5].zzzz; 12: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 13: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 14: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 15: MUL temp[0].xyz, temp[5].xxxx, input[0]; 16: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 17: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 18: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 19: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 20: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 21: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 22: MOV output[0].w, temp[6]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: DP3 temp[3].w, input[3], input[3]; 5: RSQ temp[3].w, |temp[3].wwww|; 6: MUL temp[3].xyz, temp[3].wwww, input[3]; 7: DP3 temp[4].w, input[4], input[4]; 8: RSQ temp[4].w, |temp[4].wwww|; 9: MUL temp[4].xyz, temp[4].wwww, input[4]; 10: DP3_SAT temp[5].x, temp[4], temp[3]; 11: 
LG2 temp[8].w, temp[5].xxxx; 12: MUL temp[8].w, temp[8].wwww, const[5].zzzz; 13: EX2 temp[5].x, temp[8].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 17: MUL temp[0].xyz, temp[5].xxxx, input[0]; 18: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 19: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 20: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 21: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 22: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 23: ADD temp[9].xyz, temp[6], -const[4]; 24: MAD output[0].xyz, temp[7].xxxx, temp[9], const[4]; 25: MOV output[0].w, temp[6]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[0]; 1: MUL temp[0].xyz, temp[0].xyz_, const[5].xxx_; 2: TEX temp[1].xy, input[2].xy__, 2D[1]; 3: MUL temp[2].xyz, temp[0].xyz_, const[1].yyy_; 4: DP3 temp[3].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[3].w, |temp[3].___w|; 6: MUL temp[3].xyz, temp[3].www_, input[3].xyz_; 7: DP3 temp[4].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[4].w, |temp[4].___w|; 9: MUL temp[4].xyz, temp[4].www_, input[4].xyz_; 10: DP3_SAT temp[5].x, temp[4].xyz_, temp[3].xyz_; 11: LG2 temp[8].w, temp[5].___x; 12: MUL temp[8].w, temp[8].___w, const[5].___z; 13: EX2 temp[5].x, temp[8].w___; 14: MUL temp[5].x, temp[5].x___, const[1].x___; 15: MUL temp[5].x, temp[5].x___, temp[1].x___; 16: MUL temp[5].x, temp[5].x___, const[1].z___; 17: MUL temp[0].xyz, temp[5].xxx_, input[0].xyz_; 18: MUL temp[3].x, temp[1].y___, temp[1].y___; 19: MUL_SAT temp[3].x, temp[3].x___, const[1].w___; 20: MAD_SAT temp[6].xyz, temp[3].xxx_, temp[2].xyz_, temp[0].xyz_; 21: MUL_SAT temp[6].w, temp[0].___w, input[0].___w; 22: MAD_SAT temp[7].x, input[1].x___, const[3].x___, const[3].y___; 23: ADD temp[9].xyz, temp[6].xyz_, -const[4].xyz_; 24: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[4].xyz_; 25: MOV output[0].w, temp[6].___w; 
Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[5].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, const[5].___z; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: MUL temp[26].x, temp[25].x___, const[1].z___; 17: MUL temp[27].xyz, temp[26].xxx_, input[0].xyz_; 18: MUL temp[28].x, temp[12].y___, temp[12].y___; 19: MUL_SAT temp[29].x, temp[28].x___, const[1].w___; 20: MAD_SAT temp[30].xyz, temp[29].xxx_, temp[13].xyz_, temp[27].xyz_; 21: MUL_SAT temp[31].w, temp[10].___w, input[0].___w; 22: MAD_SAT temp[32].x, input[1].x___, const[3].x___, const[3].y___; 23: ADD temp[33].xyz, temp[30].xyz_, -const[4].xyz_; 24: MAD output[0].xyz, temp[32].xxx_, temp[33].xyz_, const[4].xyz_; 25: MOV output[0].w, temp[31].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[5].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, 
temp[21].___w, const[5].___z; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: MUL temp[26].x, temp[25].x___, const[1].z___; 17: MUL temp[27].xyz, temp[26].xxx_, input[0].xyz_; 18: MUL temp[28].x, temp[12].y___, temp[12].y___; 19: MUL_SAT temp[29].x, temp[28].x___, const[1].w___; 20: MAD_SAT temp[30].xyz, temp[29].xxx_, temp[13].xyz_, temp[27].xyz_; 21: MUL_SAT temp[31].w, temp[10].___w, input[0].___w; 22: MAD_SAT temp[32].x, input[1].x___, const[3].x___, const[3].y___; 23: MAD output[0].xyz, temp[32].xxx_, (temp[30] - const[4]).xyz_, const[4].xyz_; 24: MOV output[0].w, temp[31].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, 256.000000 (0x78).___w; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: MUL temp[26].x, temp[25].x___, const[1].z___; 17: MUL temp[27].xyz, temp[26].xxx_, input[0].xyz_; 18: MUL temp[28].x, temp[12].y___, temp[12].y___; 19: MUL_SAT temp[29].x, temp[28].x___, const[1].w___; 20: MAD_SAT temp[30].xyz, temp[29].xxx_, temp[13].xyz_, temp[27].xyz_; 21: MUL_SAT temp[31].w, temp[10].___w, input[0].___w; 22: MAD_SAT temp[32].x, input[1].x___, const[3].x___, const[3].y___; 23: MAD output[0].xyz, temp[32].xxx_, (temp[30] - const[4]).xyz_, const[4].xyz_; 24: MOV output[0].w, 
temp[31].___w; CONST[5] = { 2.0000 0.0000 256.0000 0.0000 } Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, temp[20].___x; 12: MUL temp[22].w, temp[21].___w, 256.000000 (0x78).___w; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: MUL temp[26].x, temp[25].x___, const[1].z___; 17: MUL temp[27].xyz, temp[26].xxx_, input[0].xyz_; 18: MUL temp[28].x, temp[12].y___, temp[12].y___; 19: MUL_SAT temp[29].x, temp[28].x___, const[1].w___; 20: MAD_SAT temp[30].xyz, temp[29].xxx_, temp[13].xyz_, temp[27].xyz_; 21: MUL_SAT temp[31].w, temp[10].___w, input[0].___w; 22: MAD_SAT temp[32].x, input[1].x___, const[3].x___, const[3].y___; 23: MAD output[0].xyz, temp[32].xxx_, (temp[30] - const[4]).xyz_, const[4].xyz_; 24: MOV output[0].w, temp[31].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: DP3 temp[14].w, input[3].xyz_, input[3].xyz_; 5: RSQ temp[15].w, |temp[14].___w|; 6: MUL temp[16].xyz, temp[15].www_, input[3].xyz_; 7: DP3 temp[17].w, input[4].xyz_, input[4].xyz_; 8: RSQ temp[18].w, |temp[17].___w|; 9: MUL temp[19].xyz, temp[18].www_, input[4].xyz_; 10: DP3_SAT temp[20].x, temp[19].xyz_, temp[16].xyz_; 11: LG2 temp[21].w, 
temp[20].___x; 12: MUL temp[22].w, temp[21].___w, 256.000000 (0x78).___w; 13: EX2 temp[23].x, temp[22].w___; 14: MUL temp[24].x, temp[23].x___, const[1].x___; 15: MUL temp[25].x, temp[24].x___, temp[12].x___; 16: MUL temp[26].x, temp[25].x___, const[1].z___; 17: MUL temp[27].xyz, temp[26].xxx_, input[0].xyz_; 18: MUL temp[28].x, temp[12].y___, temp[12].y___; 19: MUL_SAT temp[29].x, temp[28].x___, const[1].w___; 20: MAD_SAT temp[30].xyz, temp[29].xxx_, temp[13].xyz_, temp[27].xyz_; 21: MUL_SAT temp[31].w, temp[10].___w, input[0].___w; 22: MAD_SAT temp[32].x, input[1].x___, const[3].x___, const[3].y___; 23: MAD output[0].xyz, temp[32].xxx_, (temp[30] - const[4]).xyz_, const[4].xyz_; 24: MOV output[0].w, temp[31].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: src0.xyz = temp[10], src0.w = 2.000000 (0x40) MAD temp[11].xyz, src0.xyz, src0.www, src0.000 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: src0.xyz = temp[11], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 5: src0.w = temp[14] RSQ temp[15].w, |src0.w| 6: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 7: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[17].w, src0._, src0._ 8: src0.w = temp[17] RSQ temp[18].w, |src0.w| 9: src0.xyz = input[4], src0.w = temp[18] MAD temp[19].xyz, src0.www, src0.xyz, src0.000 10: src0.xyz = temp[19], src1.xyz = temp[16] DP3_SAT temp[20].x, src0.xyz, src1.xyz 11: src0.xyz = temp[20] LG2 temp[21].w, src0.x 12: src0.w = temp[21], src1.w = 256.000000 (0x78) MAD temp[22].w, src0.w, src1.w, src0.0 13: src0.w = temp[22] REPL_ALPHA temp[23].x EX2, src0.w 14: src0.xyz = temp[23], src1.xyz = const[1] MAD temp[24].x, src0.x__, src1.x__, src0.000 15: src0.xyz = temp[24], src1.xyz = temp[12] MAD temp[25].x, src0.x__, src1.x__, src0.000 16: src0.xyz = temp[25], src1.xyz = const[1] MAD 
temp[26].x, src0.x__, src1.z__, src0.000 17: src0.xyz = temp[26], src1.xyz = input[0] MAD temp[27].xyz, src0.xxx, src1.xyz, src0.000 18: src0.xyz = temp[12] MAD temp[28].x, src0.y__, src0.y__, src0.000 19: src0.xyz = temp[28], src0.w = const[1] MAD_SAT temp[29].x, src0.x__, src0.w__, src0.000 20: src0.xyz = temp[29], src1.xyz = temp[13], src2.xyz = temp[27] MAD_SAT temp[30].xyz, src0.xxx, src1.xyz, src2.xyz 21: src0.w = temp[10], src1.w = input[0] MAD_SAT temp[31].w, src0.w, src1.w, src0.0 22: src0.xyz = input[1], src1.xyz = const[3] MAD_SAT temp[32].x, src0.x__, src1.x__, src1.y__ 23: src0.xyz = const[4], src1.xyz = temp[30], src2.xyz = temp[32], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 24: src0.w = temp[31] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 4: src0.w = temp[14] RSQ temp[15].w, |src0.w| 5: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 6: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[17].w, src0._, src0._ 7: src0.w = temp[17] RSQ temp[18].w, |src0.w| 8: src0.xyz = input[4], src0.w = temp[18] MAD temp[19].xyz, src0.www, src0.xyz, src0.000 9: src0.xyz = temp[19], src1.xyz = temp[16] DP3_SAT temp[20].x, src0.xyz, src1.xyz 10: src0.xyz = temp[20] LG2 temp[21].w, src0.x 11: src0.w = temp[21], src1.w = 256.000000 (0x78) MAD temp[22].w, src0.w, src1.w, src0.0 12: src0.w = temp[22] EX2 temp[23].w, src0.w 13: src0.xyz = temp[23], src0.w = temp[23], src1.xyz = const[1] MAD temp[24].x, src0.w__, src1.x__, src0.000 14: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.xyz = temp[12] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 MAD temp[28].w, src1.y, src1.y, src0.0 15: src0.xyz = temp[28], src0.w = const[1], src1.xyz = 
temp[24], src1.w = temp[28], src2.xyz = temp[12] MAD_SAT temp[29].x, src1.w__, src0.w__, src0.000 MAD temp[25].w, src1.x, src2.x, src0.0 16: src0.xyz = temp[11], src0.w = temp[25], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 MAD temp[26].w, src0.w, src1.z, src0.0 17: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[3], src1.w = input[0] MAD_SAT temp[32].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[31].w, src0.w, src1.w, src0.0 18: src0.xyz = temp[26], src0.w = temp[26], src1.xyz = input[0] MAD temp[27].xyz, src0.www, src1.xyz, src0.000 19: src0.xyz = temp[29], src1.xyz = temp[13], src2.xyz = temp[27] MAD_SAT temp[30].xyz, src0.xxx, src1.xyz, src2.xyz 20: src0.xyz = const[4], src0.w = temp[31], src1.xyz = temp[30], src2.xyz = temp[32], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[14].w, src0._, src0._ 4: src0.w = temp[14] RSQ temp[15].w, |src0.w| 5: src0.xyz = input[3], src0.w = temp[15] MAD temp[16].xyz, src0.www, src0.xyz, src0.000 6: src0.xyz = input[4] DP3, src0.xyz, src0.xyz DP3 temp[17].w, src0._, src0._ 7: src0.w = temp[17] RSQ temp[18].w, |src0.w| 8: src0.xyz = input[4], src0.w = temp[18] MAD temp[19].xyz, src0.www, src0.xyz, src0.000 9: src0.xyz = temp[19], src1.xyz = temp[16] DP3_SAT temp[20].x, src0.xyz, src1.xyz 10: src0.xyz = temp[20] LG2 temp[21].w, src0.x 11: src0.w = temp[21], src1.w = 256.000000 (0x78) MAD temp[22].w, src0.w, src1.w, src0.0 12: src0.w = temp[22] EX2 temp[23].w, src0.w 13: src0.w = temp[23], src1.xyz = const[1] MAD temp[24].x, src0.w__, src1.x__, src0.000 14: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.xyz = temp[12] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 MAD temp[28].w, src1.y, 
src1.y, src0.0 15: src0.w = const[1], src1.xyz = temp[24], src1.w = temp[28], src2.xyz = temp[12] MAD_SAT temp[29].x, src1.w__, src0.w__, src0.000 MAD temp[25].w, src1.x, src2.x, src0.0 16: src0.xyz = temp[11], src0.w = temp[25], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 MAD temp[26].w, src0.w, src1.z, src0.0 17: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[3], src1.w = input[0] MAD_SAT temp[32].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[31].w, src0.w, src1.w, src0.0 18: src0.w = temp[26], src1.xyz = input[0] MAD temp[27].xyz, src0.www, src1.xyz, src0.000 19: src0.xyz = temp[29], src1.xyz = temp[13], src2.xyz = temp[27] MAD_SAT temp[30].xyz, src0.xxx, src1.xyz, src2.xyz 20: src0.xyz = const[4], src0.w = temp[31], src1.xyz = temp[30], src2.xyz = temp[32], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[5], input[1].xy__, 2D[0]; 2: TEX temp[1].xy, input[1].xy__, 2D[1] SEM_WAIT SEM_ACQUIRE; 3: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[1].w, src0._, src0._ 4: src0.w = temp[1] RSQ temp[1].w, |src0.w| 5: src0.xyz = input[2], src0.w = temp[1] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 6: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[1].w, src0._, src0._ 7: src0.w = temp[1] RSQ temp[1].w, |src0.w| 8: src0.xyz = input[3], src0.w = temp[1] MAD temp[3].xyz, src0.www, src0.xyz, src0.000 9: src0.xyz = temp[3], src1.xyz = temp[2] DP3_SAT temp[1].z, src0.xyz, src1.xyz 10: src0.xyz = temp[1] LG2 temp[1].w, src0.z 11: src0.w = temp[1], src1.w = 256.000000 (0x78) MAD temp[1].w, src0.w, src1.w, src0.0 12: src0.w = temp[1] EX2 temp[1].w, src0.w 13: src0.w = temp[1], src1.xyz = const[1] MAD temp[1].z, src0.__w, src1.__x, src0.__0 14: src0.xyz = temp[5], src0.w = 2.000000 (0x40), src1.xyz = temp[1] SEM_WAIT MAD temp[2].xyz, src0.xyz, src0.www, src0.000 MAD temp[1].w, 
src1.y, src1.y, src0.0 15: src0.w = const[1], src1.xyz = temp[1], src1.w = temp[1], src2.xyz = temp[1] MAD_SAT temp[1].x, src1.w__, src0.w__, src0.0__ MAD temp[1].w, src1.z, src2.x, src0.0 16: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[1] MAD temp[2].xyz, src0.xyz, src1.yyy, src0.000 MAD temp[1].w, src0.w, src1.z, src0.0 17: src0.xyz = input[4], src0.w = temp[5], src1.xyz = const[3], src1.w = input[0] MAD_SAT temp[1].y, src0._x_, src1._x_, src1._y_ MAD_SAT temp[2].w, src0.w, src1.w, src0.0 18: src0.w = temp[1], src1.xyz = input[0] MAD temp[0].xyz, src0.www, src1.xyz, src0.000 19: src0.xyz = temp[1], src1.xyz = temp[2], src2.xyz = temp[0] MAD_SAT temp[0].xyz, src0.xxx, src1.xyz, src2.xyz 20: src0.xyz = const[4], src0.w = temp[2], src1.xyz = temp[0], src2.xyz = temp[1], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.yyy, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f401: src: 1 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00001807:TEX TEX_WAIT wmask: RG omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x000000e1:DP3 dest:14 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c01b:RSQ dest:1 
alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000111:DP3 dest:17 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004c01b:RSQ dest:1 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00082000:ALU wmask: B omask: NONE 1:RGB_ADDR 0x08000803:Addr0: 3t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 
ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000011:DP3 dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00008019:LN2 dest:1 alp_A_src:0 B 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803e001:Addr0: 1t, Addr1: 248t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 11 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c018:EX2 dest:1 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00002000:ALU wmask: B omask: NONE 1:RGB_ADDR 0x08040480:Addr0: 128t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00122390:rgb_A_src:0 0/0/A 0 rgb_B_src:1 0/0/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000405:Addr0: 5t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 
R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00285010:MAD dest:1 alp_A_src:1 G 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 14 0:CMN_INST 0x00084800:ALU wmask: AR omask: NONE 1:RGB_ADDR 0x00100480:Addr0: 128t, Addr1: 1t, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08000501:Addr0: 1c, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0091848d:rgb_A_src:1 A/0/0 0 rgb_B_src:0 A/0/0 0 targ: 0 4 ALPHA_INST:0x00109010:MAD dest:1 alp_A_src:1 B 0 alp_B_src:2 R 0 targ 0 w:0 5 RGBA_INST: 0x20490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 15 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040402:Addr0: 2t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0048c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 16 0:CMN_INST 0x00185000:ALU wmask: AG omask: NONE 1:RGB_ADDR 0x08040c04:Addr0: 4t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000005:Addr0: 5t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00822410:rgb_A_src:0 0/R/0 0 rgb_B_src:1 0/R/0 0 targ: 0 4 ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20431010:MAD dest:1 rgb_C_src:1 0/G/0 0 alp_C_src:0 0 0 17 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000080:Addr0: 128t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044236c:rgb_A_src:0 A/A/A 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00000801:Addr0: 1t, Addr1: 2t, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40100104:Addr0: 4c, Addr1: 0t, Addr2: 1t, srcp:1 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446126:rgb_A_src:2 G/G/G 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 20 Instructions ~ 13 Vector Instructions (RGB) ~ 12 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 2 Texture Instructions ~ 1 Presub Operations ~ 0 OMOD Operations ~ 6 Temporary Registers ~ 2 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL IN[4] DCL IN[5] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], FOG DCL OUT[3], GENERIC[0] DCL OUT[4], GENERIC[1] DCL OUT[5], GENERIC[2] DCL CONST[0..200] DCL TEMP[0..5] DCL ADDR[0] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[2].yzw, IMM[0].xxxy 1: ARL ADDR[0].x, IN[5].xxxx 2: MOV TEMP[0], CONST[ADDR[0].x] 3: MOV TEMP[1], CONST[ADDR[0].x+1] 4: XPD TEMP[2].xyz, TEMP[0], IN[0] 5: MAD TEMP[2].xyz, IN[0], TEMP[0].wwww, TEMP[2] 6: ADD TEMP[2].xyz, TEMP[2], TEMP[1] 7: XPD TEMP[2].xyz, TEMP[0], TEMP[2] 8: MAD TEMP[2].xyz, TEMP[1], TEMP[0].wwww, TEMP[2] 9: MAD TEMP[2].xyz, TEMP[0], -TEMP[1].wwww, TEMP[2] 10: MAD TEMP[2].xyz, CONST[192].xxxx, TEMP[2], IN[0] 11: MOV TEMP[2].w, IN[0].wwww 12: XPD TEMP[1].xyz, TEMP[0], IN[1] 13: MAD TEMP[1].xyz, TEMP[0].wwww, IN[1], TEMP[1] 14: XPD TEMP[1].xyz, TEMP[0], TEMP[1] 15: MAD TEMP[1].xyz, CONST[192].xxxx, TEMP[1], IN[1] 16: XPD TEMP[3].xyz, TEMP[0], IN[4] 17: MAD TEMP[3].xyz, TEMP[0].wwww, IN[4], TEMP[3] 18: XPD TEMP[3].xyz, TEMP[0], TEMP[3] 19: 
MAD TEMP[3].xyz, CONST[192].xxxx, TEMP[3], IN[4] 20: DP4 OUT[0].x, TEMP[2], CONST[193] 21: DP4 OUT[0].y, TEMP[2], CONST[194] 22: DP4 OUT[0].z, TEMP[2], CONST[195] 23: DP4 OUT[0].w, TEMP[2], CONST[196] 24: MOV_SAT OUT[1], IN[2] 25: ADD OUT[3].xy, IN[3], CONST[197].yzww 26: SUB TEMP[0], CONST[198], TEMP[2] 27: DP3 TEMP[0].w, TEMP[0], TEMP[0] 28: RSQ TEMP[0].w, |TEMP[0].wwww| 29: MUL TEMP[0].xyz, TEMP[0].wwww, TEMP[0] 30: XPD TEMP[4].xyz, TEMP[1], TEMP[3] 31: MUL TEMP[4].xyz, TEMP[4], IN[4].wwww 32: DP3 OUT[4].x, CONST[199], TEMP[3] 33: DP3 OUT[4].y, CONST[199], TEMP[4] 34: DP3 OUT[4].z, CONST[199], TEMP[1] 35: ADD TEMP[5].xyz, CONST[199], TEMP[0] 36: DP3 OUT[5].x, TEMP[5], TEMP[3] 37: DP3 OUT[5].y, TEMP[5], TEMP[4] 38: DP3 OUT[5].z, TEMP[5], TEMP[1] 39: DP4 OUT[2].x, -TEMP[2], CONST[200] 40: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: ARL addr[0].x, input[5].xxxx; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: XPD temp[2].xyz, temp[0], input[0]; 5: MAD temp[2].xyz, input[0], temp[0].wwww, temp[2]; 6: ADD temp[2].xyz, temp[2], temp[1]; 7: XPD temp[2].xyz, temp[0], temp[2]; 8: MAD temp[2].xyz, temp[1], temp[0].wwww, temp[2]; 9: MAD temp[2].xyz, temp[0], -temp[1].wwww, temp[2]; 10: MAD temp[2].xyz, const[192].xxxx, temp[2], input[0]; 11: MOV temp[2].w, input[0].wwww; 12: XPD temp[1].xyz, temp[0], input[1]; 13: MAD temp[1].xyz, temp[0].wwww, input[1], temp[1]; 14: XPD temp[1].xyz, temp[0], temp[1]; 15: MAD temp[1].xyz, const[192].xxxx, temp[1], input[1]; 16: XPD temp[3].xyz, temp[0], input[4]; 17: MAD temp[3].xyz, temp[0].wwww, input[4], temp[3]; 18: XPD temp[3].xyz, temp[0], temp[3]; 19: MAD temp[3].xyz, const[192].xxxx, temp[3], input[4]; 20: DP4 temp[6].x, temp[2], const[193]; 21: DP4 temp[6].y, temp[2], const[194]; 22: DP4 temp[6].z, temp[2], const[195]; 23: DP4 temp[6].w, temp[2], const[196]; 24: MOV_SAT output[1], input[2]; 25: ADD output[3].xy, input[3], const[197].yzww; 26: 
SUB temp[0], const[198], temp[2]; 27: DP3 temp[0].w, temp[0], temp[0]; 28: RSQ temp[0].w, |temp[0].wwww|; 29: MUL temp[0].xyz, temp[0].wwww, temp[0]; 30: XPD temp[4].xyz, temp[1], temp[3]; 31: MUL temp[4].xyz, temp[4], input[4].wwww; 32: DP3 output[4].x, const[199], temp[3]; 33: DP3 output[4].y, const[199], temp[4]; 34: DP3 output[4].z, const[199], temp[1]; 35: ADD temp[5].xyz, const[199], temp[0]; 36: DP3 output[5].x, temp[5], temp[3]; 37: DP3 output[5].y, temp[5], temp[4]; 38: DP3 output[5].z, temp[5], temp[1]; 39: DP4 output[2].x, -temp[2], const[200]; 40: MOV output[0], temp[6]; 41: MOV output[6], temp[6]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: ARL addr[0].x, input[5].xxxx; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: XPD temp[2].xyz, temp[0], input[0]; 5: MAD temp[2].xyz, input[0], temp[0].wwww, temp[2]; 6: ADD temp[2].xyz, temp[2], temp[1]; 7: XPD temp[2].xyz, temp[0], temp[2]; 8: MAD temp[2].xyz, temp[1], temp[0].wwww, temp[2]; 9: MAD temp[2].xyz, temp[0], -temp[1].wwww, temp[2]; 10: MAD temp[2].xyz, const[192].xxxx, temp[2], input[0]; 11: MOV temp[2].w, input[0].wwww; 12: XPD temp[1].xyz, temp[0], input[1]; 13: MAD temp[1].xyz, temp[0].wwww, input[1], temp[1]; 14: XPD temp[1].xyz, temp[0], temp[1]; 15: MAD temp[1].xyz, const[192].xxxx, temp[1], input[1]; 16: XPD temp[3].xyz, temp[0], input[4]; 17: MAD temp[3].xyz, temp[0].wwww, input[4], temp[3]; 18: XPD temp[3].xyz, temp[0], temp[3]; 19: MAD temp[3].xyz, const[192].xxxx, temp[3], input[4]; 20: DP4 temp[6].x, temp[2], const[193]; 21: DP4 temp[6].y, temp[2], const[194]; 22: DP4 temp[6].z, temp[2], const[195]; 23: DP4 temp[6].w, temp[2], const[196]; 24: MOV_SAT output[1], input[2]; 25: ADD output[3].xy, input[3], const[197].yzww; 26: SUB temp[0], const[198], temp[2]; 27: DP3 temp[0].w, temp[0], temp[0]; 28: RSQ temp[0].w, |temp[0].wwww|; 29: MUL temp[0].xyz, temp[0].wwww, temp[0]; 30: XPD temp[4].xyz, 
temp[1], temp[3]; 31: MUL temp[4].xyz, temp[4], input[4].wwww; 32: DP3 output[4].x, const[199], temp[3]; 33: DP3 output[4].y, const[199], temp[4]; 34: DP3 output[4].z, const[199], temp[1]; 35: ADD temp[5].xyz, const[199], temp[0]; 36: DP3 output[5].x, temp[5], temp[3]; 37: DP3 output[5].y, temp[5], temp[4]; 38: DP3 output[5].z, temp[5], temp[1]; 39: DP4 output[2].x, -temp[2], const[200]; 40: MOV output[0], temp[6]; 41: MOV output[6], temp[6]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0].0001; 1: ARL addr[0].x, input[5].xxxx; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: MUL temp[2].xyz, temp[0].zxyw, input[0].yzxw; 5: MAD temp[2].xyz, temp[0].yzxw, input[0].zxyw, -temp[2]; 6: MAD temp[2].xyz, input[0], temp[0].wwww, temp[2]; 7: ADD temp[2].xyz, temp[2], temp[1]; 8: MUL temp[7].xyz, temp[0].zxyw, temp[2].yzxw; 9: MAD temp[2].xyz, temp[0].yzxw, temp[2].zxyw, -temp[7]; 10: MAD temp[2].xyz, temp[1], temp[0].wwww, temp[2]; 11: MAD temp[2].xyz, temp[0], -temp[1].wwww, temp[2]; 12: MAD temp[2].xyz, const[192].xxxx, temp[2], input[0]; 13: MOV temp[2].w, input[0].wwww; 14: MUL temp[1].xyz, temp[0].zxyw, input[1].yzxw; 15: MAD temp[1].xyz, temp[0].yzxw, input[1].zxyw, -temp[1]; 16: MAD temp[1].xyz, temp[0].wwww, input[1], temp[1]; 17: MUL temp[8].xyz, temp[0].zxyw, temp[1].yzxw; 18: MAD temp[1].xyz, temp[0].yzxw, temp[1].zxyw, -temp[8]; 19: MAD temp[1].xyz, const[192].xxxx, temp[1], input[1]; 20: MUL temp[3].xyz, temp[0].zxyw, input[4].yzxw; 21: MAD temp[3].xyz, temp[0].yzxw, input[4].zxyw, -temp[3]; 22: MAD temp[3].xyz, temp[0].wwww, input[4], temp[3]; 23: MUL temp[9].xyz, temp[0].zxyw, temp[3].yzxw; 24: MAD temp[3].xyz, temp[0].yzxw, temp[3].zxyw, -temp[9]; 25: MAD temp[3].xyz, const[192].xxxx, temp[3], input[4]; 26: DP4 temp[6].x, temp[2], const[193]; 27: DP4 temp[6].y, temp[2], const[194]; 28: DP4 temp[6].z, temp[2], const[195]; 29: DP4 temp[6].w, temp[2], const[196]; 30: MOV_SAT output[1], 
input[2]; 31: ADD output[3].xy, input[3], const[197].yzww; 32: ADD temp[0], const[198], -temp[2]; 33: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 34: RSQ temp[0].w, |temp[0].wwww|; 35: MUL temp[0].xyz, temp[0].wwww, temp[0]; 36: MUL temp[4].xyz, temp[1].zxyw, temp[3].yzxw; 37: MAD temp[4].xyz, temp[1].yzxw, temp[3].zxyw, -temp[4]; 38: MUL temp[4].xyz, temp[4], input[4].wwww; 39: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 40: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 41: DP4 output[4].z, const[199].xyz0, temp[1].xyz0; 42: ADD temp[5].xyz, const[199], temp[0]; 43: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 44: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 45: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 46: DP4 output[2].x, -temp[2], const[200]; 47: MOV output[0], temp[6]; 48: MOV output[6], temp[6]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[2].yzw, temp[0]._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MOV temp[1], const[1 + addr[0]]; 4: MUL temp[2].xyz, temp[0].zxy_, input[0].yzx_; 5: MAD temp[2].xyz, temp[0].yzx_, input[0].zxy_, -temp[2].xyz_; 6: MAD temp[2].xyz, input[0].xyz_, temp[0].www_, temp[2].xyz_; 7: ADD temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 8: MUL temp[7].xyz, temp[0].zxy_, temp[2].yzx_; 9: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[7].xyz_; 10: MAD temp[2].xyz, temp[1].xyz_, temp[0].www_, temp[2].xyz_; 11: MAD temp[2].xyz, temp[0].xyz_, -temp[1].www_, temp[2].xyz_; 12: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[0].xyz_; 13: MOV temp[2].w, input[0].___w; 14: MUL temp[1].xyz, temp[0].zxy_, input[1].yzx_; 15: MAD temp[1].xyz, temp[0].yzx_, input[1].zxy_, -temp[1].xyz_; 16: MAD temp[1].xyz, temp[0].www_, input[1].xyz_, temp[1].xyz_; 17: MUL temp[8].xyz, temp[0].zxy_, temp[1].yzx_; 18: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[8].xyz_; 19: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[1].xyz_; 20: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 21: 
MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 22: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 23: MUL temp[9].xyz, temp[0].zxy_, temp[3].yzx_; 24: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[9].xyz_; 25: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 26: DP4 temp[6].x, temp[2], const[193]; 27: DP4 temp[6].y, temp[2], const[194]; 28: DP4 temp[6].z, temp[2], const[195]; 29: DP4 temp[6].w, temp[2], const[196]; 30: MOV_SAT output[1], input[2]; 31: ADD output[3].xy, input[3].xy__, const[197].yz__; 32: ADD temp[0].xyz, const[198].xyz_, -temp[2].xyz_; 33: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 34: RSQ temp[0].w, |temp[0].___w|; 35: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 36: MUL temp[4].xyz, temp[1].zxy_, temp[3].yzx_; 37: MAD temp[4].xyz, temp[1].yzx_, temp[3].zxy_, -temp[4].xyz_; 38: MUL temp[4].xyz, temp[4].xyz_, input[4].www_; 39: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 40: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 41: DP4 output[4].z, const[199].xyz0, temp[1].xyz0; 42: ADD temp[5].xyz, const[199].xyz_, temp[0].xyz_; 43: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 44: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 45: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 46: DP4 output[2].x, -temp[2], const[200]; 47: MOV output[0], temp[6]; 48: MOV output[6], temp[6]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[2].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[2].xyz, temp[0].yzx_, input[0].zxy_, -temp[2].xyz_; 5: MAD temp[2].xyz, input[0].xyz_, temp[0].www_, temp[2].xyz_; 6: ADD temp[2].xyz, temp[2].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[7].xyz, temp[0].zxy_, temp[2].yzx_; 8: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[7].xyz_; 9: MAD temp[2].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[2].xyz_; 10: MAD temp[2].xyz, temp[0].xyz_, -const[1 + 
addr[0]].www_, temp[2].xyz_; 11: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[0].xyz_; 12: MOV temp[2].w, input[0].___w; 13: MUL temp[1].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[1].xyz, temp[0].yzx_, input[1].zxy_, -temp[1].xyz_; 15: MAD temp[1].xyz, temp[0].www_, input[1].xyz_, temp[1].xyz_; 16: MUL temp[8].xyz, temp[0].zxy_, temp[1].yzx_; 17: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[8].xyz_; 18: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[9].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[9].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[6].x, temp[2], const[193]; 26: DP4 temp[6].y, temp[2], const[194]; 27: DP4 temp[6].z, temp[2], const[195]; 28: DP4 temp[6].w, temp[2], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[2].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[4].xyz, temp[1].zxy_, temp[3].yzx_; 36: MAD temp[4].xyz, temp[1].yzx_, temp[3].zxy_, -temp[4].xyz_; 37: MUL temp[4].xyz, temp[4].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[1].xyz0; 41: ADD temp[5].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 44: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 45: DP4 output[2].x, -temp[2], const[200]; 46: MOV output[0], temp[6]; 47: MOV output[6], temp[6]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[2].yzw, 
none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[2].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[2].xyz, temp[0].yzx_, input[0].zxy_, -temp[2].xyz_; 5: MAD temp[2].xyz, input[0].xyz_, temp[0].www_, temp[2].xyz_; 6: ADD temp[2].xyz, temp[2].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[7].xyz, temp[0].zxy_, temp[2].yzx_; 8: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[7].xyz_; 9: MAD temp[2].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[2].xyz_; 10: MAD temp[2].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[2].xyz_; 11: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[0].xyz_; 12: MOV temp[2].w, input[0].___w; 13: MUL temp[1].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[1].xyz, temp[0].yzx_, input[1].zxy_, -temp[1].xyz_; 15: MAD temp[1].xyz, temp[0].www_, input[1].xyz_, temp[1].xyz_; 16: MUL temp[8].xyz, temp[0].zxy_, temp[1].yzx_; 17: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[8].xyz_; 18: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[9].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[9].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[6].x, temp[2], const[193]; 26: DP4 temp[6].y, temp[2], const[194]; 27: DP4 temp[6].z, temp[2], const[195]; 28: DP4 temp[6].w, temp[2], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[2].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[4].xyz, temp[1].zxy_, temp[3].yzx_; 36: MAD temp[4].xyz, temp[1].yzx_, temp[3].zxy_, -temp[4].xyz_; 37: MUL temp[4].xyz, temp[4].xyz_, input[4].www_; 38: DP4 
output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[4].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[1].xyz0; 41: ADD temp[5].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[5].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[5].xyz0, temp[4].xyz0; 44: DP4 output[5].z, temp[5].xyz0, temp[1].xyz0; 45: DP4 output[2].x, -temp[2], const[200]; 46: MOV output[0], temp[6]; 47: MOV output[6], temp[6]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[1].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[1].xyz, temp[0].yzx_, input[0].zxy_, -temp[1].xyz_; 5: MAD temp[1].xyz, input[0].xyz_, temp[0].www_, temp[1].xyz_; 6: ADD temp[1].xyz, temp[1].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[2].xyz, temp[0].zxy_, temp[1].yzx_; 8: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[2].xyz_; 9: MAD temp[1].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[1].xyz_; 10: MAD temp[1].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[1].xyz_; 11: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[0].xyz_; 12: MOV temp[1].w, input[0].___w; 13: MUL temp[2].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[2].xyz, temp[0].yzx_, input[1].zxy_, -temp[2].xyz_; 15: MAD temp[2].xyz, temp[0].www_, input[1].xyz_, temp[2].xyz_; 16: MUL temp[3].xyz, temp[0].zxy_, temp[2].yzx_; 17: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[3].xyz_; 18: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[4].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[4].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[4].x, temp[1], const[193]; 26: DP4 temp[4].y, temp[1], 
const[194]; 27: DP4 temp[4].z, temp[1], const[195]; 28: DP4 temp[4].w, temp[1], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[1].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[5].xyz, temp[2].zxy_, temp[3].yzx_; 36: MAD temp[5].xyz, temp[2].yzx_, temp[3].zxy_, -temp[5].xyz_; 37: MUL temp[5].xyz, temp[5].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[5].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[2].xyz0; 41: ADD temp[0].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[0].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[0].xyz0, temp[5].xyz0; 44: DP4 output[5].z, temp[0].xyz0, temp[2].xyz0; 45: DP4 output[2].x, -temp[1], const[200]; 46: MOV output[0], temp[4]; 47: MOV output[6], temp[4]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[1].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[1].xyz, temp[0].yzx_, input[0].zxy_, -temp[1].xyz_; 5: MAD temp[1].xyz, input[0].xyz_, temp[0].www_, temp[1].xyz_; 6: ADD temp[1].xyz, temp[1].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[2].xyz, temp[0].zxy_, temp[1].yzx_; 8: MAD temp[1].xyz, temp[0].yzx_, temp[1].zxy_, -temp[2].xyz_; 9: MAD temp[1].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[1].xyz_; 10: MAD temp[1].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[1].xyz_; 11: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[0].xyz_; 12: MOV temp[1].w, input[0].___w; 13: MUL temp[2].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[2].xyz, temp[0].yzx_, input[1].zxy_, -temp[2].xyz_; 15: MAD temp[2].xyz, temp[0].www_, input[1].xyz_, temp[2].xyz_; 16: MUL temp[3].xyz, temp[0].zxy_, temp[2].yzx_; 17: MAD temp[2].xyz, 
temp[0].yzx_, temp[2].zxy_, -temp[3].xyz_; 18: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[4].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[4].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[4].x, temp[1], const[193]; 26: DP4 temp[4].y, temp[1], const[194]; 27: DP4 temp[4].z, temp[1], const[195]; 28: DP4 temp[4].w, temp[1], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[1].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[5].xyz, temp[2].zxy_, temp[3].yzx_; 36: MAD temp[5].xyz, temp[2].yzx_, temp[3].zxy_, -temp[5].xyz_; 37: MUL temp[5].xyz, temp[5].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[5].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[2].xyz0; 41: ADD temp[0].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[0].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[0].xyz0, temp[5].xyz0; 44: DP4 output[5].z, temp[0].xyz0, temp[2].xyz0; 45: DP4 output[2].x, -temp[1], const[200]; 46: MOV output[0], temp[4]; 47: MOV output[6], temp[4]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[2].yzw, none._001; 1: ARL addr[0].x, input[5].x___; 2: MOV temp[0], const[0 + addr[0]]; 3: MUL temp[1].xyz, temp[0].zxy_, input[0].yzx_; 4: MAD temp[1].xyz, temp[0].yzx_, input[0].zxy_, -temp[1].xyz_; 5: MAD temp[1].xyz, input[0].xyz_, temp[0].www_, temp[1].xyz_; 6: ADD temp[1].xyz, temp[1].xyz_, const[1 + addr[0]].xyz_; 7: MUL temp[2].xyz, temp[0].zxy_, temp[1].yzx_; 8: MAD temp[1].xyz, 
temp[0].yzx_, temp[1].zxy_, -temp[2].xyz_; 9: MAD temp[1].xyz, const[1 + addr[0]].xyz_, temp[0].www_, temp[1].xyz_; 10: MAD temp[1].xyz, temp[0].xyz_, -const[1 + addr[0]].www_, temp[1].xyz_; 11: MAD temp[1].xyz, const[192].xxx_, temp[1].xyz_, input[0].xyz_; 12: MOV temp[1].w, input[0].___w; 13: MUL temp[2].xyz, temp[0].zxy_, input[1].yzx_; 14: MAD temp[2].xyz, temp[0].yzx_, input[1].zxy_, -temp[2].xyz_; 15: MAD temp[2].xyz, temp[0].www_, input[1].xyz_, temp[2].xyz_; 16: MUL temp[3].xyz, temp[0].zxy_, temp[2].yzx_; 17: MAD temp[2].xyz, temp[0].yzx_, temp[2].zxy_, -temp[3].xyz_; 18: MAD temp[2].xyz, const[192].xxx_, temp[2].xyz_, input[1].xyz_; 19: MUL temp[3].xyz, temp[0].zxy_, input[4].yzx_; 20: MAD temp[3].xyz, temp[0].yzx_, input[4].zxy_, -temp[3].xyz_; 21: MAD temp[3].xyz, temp[0].www_, input[4].xyz_, temp[3].xyz_; 22: MUL temp[4].xyz, temp[0].zxy_, temp[3].yzx_; 23: MAD temp[3].xyz, temp[0].yzx_, temp[3].zxy_, -temp[4].xyz_; 24: MAD temp[3].xyz, const[192].xxx_, temp[3].xyz_, input[4].xyz_; 25: DP4 temp[4].x, temp[1], const[193]; 26: DP4 temp[4].y, temp[1], const[194]; 27: DP4 temp[4].z, temp[1], const[195]; 28: DP4 temp[4].w, temp[1], const[196]; 29: MOV_SAT output[1], input[2]; 30: ADD output[3].xy, input[3].xy__, const[197].yz__; 31: ADD temp[0].xyz, const[198].xyz_, -temp[1].xyz_; 32: DP4 temp[0].w, temp[0].xyz0, temp[0].xyz0; 33: RSQ temp[0].w, |temp[0].___w|; 34: MUL temp[0].xyz, temp[0].www_, temp[0].xyz_; 35: MUL temp[5].xyz, temp[2].zxy_, temp[3].yzx_; 36: MAD temp[5].xyz, temp[2].yzx_, temp[3].zxy_, -temp[5].xyz_; 37: MUL temp[5].xyz, temp[5].xyz_, input[4].www_; 38: DP4 output[4].x, const[199].xyz0, temp[3].xyz0; 39: DP4 output[4].y, const[199].xyz0, temp[5].xyz0; 40: DP4 output[4].z, const[199].xyz0, temp[2].xyz0; 41: ADD temp[0].xyz, const[199].xyz_, temp[0].xyz_; 42: DP4 output[5].x, temp[0].xyz0, temp[3].xyz0; 43: DP4 output[5].y, temp[0].xyz0, temp[5].xyz0; 44: DP4 output[5].z, temp[0].xyz0, temp[2].xyz0; 45: DP4 output[2].x, -temp[1], 
const[200]; 46: MOV output[0], temp[4]; 47: MOV output[6], temp[4]; Final vertex program code: 0: op: 0x00e0a203 dst: 5o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x0010010d dst: 0a0 op: VE_FLT2FIX_DX src0: 0x01ff00a1 reg: 5i swiz: X/ U/ U/ U src1: 0x012480a1 reg: 5i swiz: 0/ 0/ 0/ 0 src2: 0x012480a1 reg: 5i swiz: 0/ 0/ 0/ 0 2: op: 0x00f00003 dst: 0t op: VE_ADD src0: 0x00d10012 reg: 0c swiz: X/ Y/ Z/ W src1: 0x01248012 reg: 0c swiz: 0/ 0/ 0/ 0 src2: 0x01248012 reg: 0c swiz: 0/ 0/ 0/ 0 3: op: 0x00702002 dst: 1t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22001 reg: 0i swiz: Y/ Z/ X/ U src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84001 reg: 0i swiz: Z/ X/ Y/ U src2: 0x1fd10020 reg: 1t swiz: -X/-Y/-Z/-U 5: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01d10001 reg: 0i swiz: X/ Y/ Z/ U src1: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src2: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U 6: op: 0x00702003 dst: 1t op: VE_ADD src0: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src1: 0x01d10032 reg: 1c swiz: X/ Y/ Z/ U src2: 0x01248032 reg: 1c swiz: 0/ 0/ 0/ 0 7: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22020 reg: 1t swiz: Y/ Z/ X/ U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 8: op: 0x00702080 dst: 1t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84020 reg: 1t swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 9: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01d10032 reg: 1c swiz: X/ Y/ Z/ U src1: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src2: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U 10: op: 0x00702004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x01d10000 reg: 0t swiz: X/ Y/ Z/ U src1: 0x1fdb6032 reg: 1c swiz: -W/-W/-W/-U src2: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U 11: op: 0x00702004 dst: 1t op: 
VE_MULTIPLY_ADD src0: 0x01c01802 reg: 192c swiz: X/ X/ X/ U src1: 0x01d10020 reg: 1t swiz: X/ Y/ Z/ U src2: 0x01d10001 reg: 0i swiz: X/ Y/ Z/ U 12: op: 0x00802003 dst: 1t op: VE_ADD src0: 0x00ffe001 reg: 0i swiz: U/ U/ U/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00704002 dst: 2t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22021 reg: 1i swiz: Y/ Z/ X/ U src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 14: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84021 reg: 1i swiz: Z/ X/ Y/ U src2: 0x1fd10040 reg: 2t swiz: -X/-Y/-Z/-U 15: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src1: 0x01d10021 reg: 1i swiz: X/ Y/ Z/ U src2: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U 16: op: 0x00706002 dst: 3t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22040 reg: 2t swiz: Y/ Z/ X/ U src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 17: op: 0x00704080 dst: 2t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84040 reg: 2t swiz: Z/ X/ Y/ U src2: 0x1fd10060 reg: 3t swiz: -X/-Y/-Z/-U 18: op: 0x00704004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x01c01802 reg: 192c swiz: X/ X/ X/ U src1: 0x01d10040 reg: 2t swiz: X/ Y/ Z/ U src2: 0x01d10021 reg: 1i swiz: X/ Y/ Z/ U 19: op: 0x00706002 dst: 3t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22081 reg: 4i swiz: Y/ Z/ X/ U src2: 0x01248081 reg: 4i swiz: 0/ 0/ 0/ 0 20: op: 0x00706004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84081 reg: 4i swiz: Z/ X/ Y/ U src2: 0x1fd10060 reg: 3t swiz: -X/-Y/-Z/-U 21: op: 0x00706004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x01db6000 reg: 0t swiz: W/ W/ W/ U src1: 0x01d10081 reg: 4i swiz: X/ Y/ Z/ U src2: 0x01d10060 reg: 3t swiz: X/ Y/ Z/ U 22: op: 0x00708002 dst: 4t op: VE_MULTIPLY src0: 0x01c84000 reg: 0t swiz: Z/ X/ Y/ U src1: 0x01c22060 reg: 3t swiz: Y/ Z/ X/ U src2: 
0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 23: op: 0x00706080 dst: 3t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22000 reg: 0t swiz: Y/ Z/ X/ U src1: 0x01c84060 reg: 3t swiz: Z/ X/ Y/ U src2: 0x1fd10080 reg: 4t swiz: -X/-Y/-Z/-U 24: op: 0x00706004 dst: 3t op: VE_MULTIPLY_ADD src0: 0x01c01802 reg: 192c swiz: X/ X/ X/ U src1: 0x01d10060 reg: 3t swiz: X/ Y/ Z/ U src2: 0x01d10081 reg: 4i swiz: X/ Y/ Z/ U 25: op: 0x00108001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11822 reg: 193c swiz: X/ Y/ Z/ W src2: 0x01249822 reg: 193c swiz: 0/ 0/ 0/ 0 26: op: 0x00208001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11842 reg: 194c swiz: X/ Y/ Z/ W src2: 0x01249842 reg: 194c swiz: 0/ 0/ 0/ 0 27: op: 0x00408001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11862 reg: 195c swiz: X/ Y/ Z/ W src2: 0x01249862 reg: 195c swiz: 0/ 0/ 0/ 0 28: op: 0x00808001 dst: 4t op: VE_DOT_PRODUCT src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d11882 reg: 196c swiz: X/ Y/ Z/ W src2: 0x01249882 reg: 196c swiz: 0/ 0/ 0/ 0 29: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 30: op: 0x00304203 dst: 2o op: VE_ADD src0: 0x01f90061 reg: 3i swiz: X/ Y/ U/ U src1: 0x01fa38a2 reg: 197c swiz: Y/ Z/ U/ U src2: 0x012498a2 reg: 197c swiz: 0/ 0/ 0/ 0 31: op: 0x00700003 dst: 0t op: VE_ADD src0: 0x01d118c2 reg: 198c swiz: X/ Y/ Z/ U src1: 0x1fd10020 reg: 1t swiz: -X/-Y/-Z/-U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 32: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 33: op: 0x00800048 dst: 0t op: ME_RECIP_SQRT_DX src0: 0x00db6008 reg: 0t swiz: W/ W/ W/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 34: op: 0x00700002 dst: 0t op: VE_MULTIPLY src0: 
0x01db6000 reg: 0t swiz: W/ W/ W/ U src1: 0x01d10000 reg: 0t swiz: X/ Y/ Z/ U src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 35: op: 0x0070a002 dst: 5t op: VE_MULTIPLY src0: 0x01c84040 reg: 2t swiz: Z/ X/ Y/ U src1: 0x01c22060 reg: 3t swiz: Y/ Z/ X/ U src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 36: op: 0x0070a080 dst: 5t op: PVS_MACRO_OP_2CLK_MADD src0: 0x01c22040 reg: 2t swiz: Y/ Z/ X/ U src1: 0x01c84060 reg: 3t swiz: Z/ X/ Y/ U src2: 0x1fd100a0 reg: 5t swiz: -X/-Y/-Z/-U 37: op: 0x0070a002 dst: 5t op: VE_MULTIPLY src0: 0x01d100a0 reg: 5t swiz: X/ Y/ Z/ U src1: 0x01db6081 reg: 4i swiz: W/ W/ W/ U src2: 0x01248081 reg: 4i swiz: 0/ 0/ 0/ 0 38: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x011118e2 reg: 199c swiz: X/ Y/ Z/ 0 src1: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 39: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x011118e2 reg: 199c swiz: X/ Y/ Z/ 0 src1: 0x011100a0 reg: 5t swiz: X/ Y/ Z/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 40: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x011118e2 reg: 199c swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 41: op: 0x00700003 dst: 0t op: VE_ADD src0: 0x01d118e2 reg: 199c swiz: X/ Y/ Z/ U src1: 0x01d10000 reg: 0t swiz: X/ Y/ Z/ U src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 42: op: 0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x01110060 reg: 3t swiz: X/ Y/ Z/ 0 src2: 0x01248060 reg: 3t swiz: 0/ 0/ 0/ 0 43: op: 0x00208201 dst: 4o op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x011100a0 reg: 5t swiz: X/ Y/ Z/ 0 src2: 0x012480a0 reg: 5t swiz: 0/ 0/ 0/ 0 44: op: 0x00408201 dst: 4o op: VE_DOT_PRODUCT src0: 0x01110000 reg: 0t swiz: X/ Y/ Z/ 0 src1: 0x01110040 reg: 2t swiz: X/ Y/ Z/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 45: op: 0x0010a201 dst: 5o op: VE_DOT_PRODUCT src0: 0x1ed10020 reg: 1t swiz: -X/-Y/-Z/-W src1: 0x00d11902 reg: 200c swiz: X/ Y/ Z/ W src2: 0x01249902 reg: 
200c swiz: 0/ 0/ 0/ 0
46: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0
47: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10080 reg: 4t swiz: X/ Y/ Z/ W src1: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0 src2: 0x01248080 reg: 4t swiz: 0/ 0/ 0/ 0
Flow Control Ops: 0x00000000
~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~
~ 48 Instructions
~ 0 Flow Control Instructions
~ 6 Temporary Registers
~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~
r300: Initial fragment program
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], COLOR, COLOR
DCL IN[1], FOG, PERSPECTIVE
DCL IN[2], GENERIC[0], PERSPECTIVE
DCL IN[3], GENERIC[2], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL SAMP[1]
DCL SAMP[3]
DCL CONST[1]
DCL CONST[3..4]
DCL TEMP[0..7]
IMM[0] FLT32 { 2.0000, 0.5000, 0.0000, 256.0000}
0: TEX TEMP[0], IN[2], SAMP[0], 2D
1: MUL TEMP[0].xyz, TEMP[0], IMM[0].xxxx
2: TEX TEMP[1].xy, IN[2], SAMP[1], 2D
3: MUL TEMP[2].xyz, TEMP[0], CONST[1].yyyy
4: TEX TEMP[3], IN[2], SAMP[3], 2D
5: SUB TEMP[3].xyz, TEMP[3], IMM[0].yyyy
6: DP3 TEMP[3].w, TEMP[3], TEMP[3]
7: RSQ TEMP[3].w, |TEMP[3].wwww|
8: MUL TEMP[3].xyz, TEMP[3].wwww, TEMP[3]
9: DP3 TEMP[4].w, IN[3], IN[3]
10: RSQ TEMP[4].w, |TEMP[4].wwww|
11: MUL TEMP[4].xyz, TEMP[4].wwww, IN[3]
12: DP3_SAT TEMP[5].x, TEMP[4], TEMP[3]
13: POW TEMP[5].x, TEMP[5].xxxx, IMM[0].wwww
14: MUL TEMP[5].x, TEMP[5].xxxx, CONST[1].xxxx
15: MUL TEMP[5].x, TEMP[5].xxxx, TEMP[1].xxxx
16: MUL TEMP[5].x, TEMP[5].xxxx, CONST[1].zzzz
17: MUL TEMP[0].xyz, TEMP[5].xxxx, IN[0]
18: MUL TEMP[3].x, TEMP[1].yyyy, TEMP[1].yyyy
19: MUL_SAT TEMP[3].x, TEMP[3].xxxx, CONST[1].wwww
20: MAD_SAT TEMP[6].xyz, TEMP[3].xxxx, TEMP[2], TEMP[0]
21: MUL_SAT TEMP[6].w, TEMP[0].wwww, IN[0].wwww
22: MAD_SAT TEMP[7].x, IN[1].xxxx, CONST[3].xxxx, CONST[3].yyyy
23: LRP OUT[0].xyz, TEMP[7].xxxx, TEMP[6], CONST[4]
24: MOV OUT[0].w, TEMP[6]
25: END
Fragment Program: before compilation
# Radeon Compiler Program
0: TEX
temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[5].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[3], input[3]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[3]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[5].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 17: MUL temp[0].xyz, temp[5].xxxx, input[0]; 18: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 19: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 20: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 21: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 22: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 23: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 24: MOV output[0].w, temp[6]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[5].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[3], input[3]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[3]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[5].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 17: MUL temp[0].xyz, temp[5].xxxx, input[0]; 18: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 19: MUL_SAT temp[3].x, temp[3].xxxx, 
const[1].wwww; 20: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 21: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 22: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 23: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 24: MOV output[0].w, temp[6]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[5].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[3], input[3]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[3]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[5].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 17: MUL temp[0].xyz, temp[5].xxxx, input[0]; 18: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 19: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 20: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 21: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 22: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 23: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 24: MOV output[0].w, temp[6]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[5].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[3], input[3]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[3]; 12: DP3_SAT 
temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[5].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 17: MUL temp[0].xyz, temp[5].xxxx, input[0]; 18: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 19: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 20: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 21: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 22: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 23: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 24: MOV output[0].w, temp[6]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[5].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[3], input[3]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[3]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[5].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 17: MUL temp[0].xyz, temp[5].xxxx, input[0]; 18: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 19: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 20: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 21: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 22: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 23: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 24: MOV output[0].w, temp[6]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, 
temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: SUB temp[3].xyz, temp[3], const[5].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[3], input[3]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[3]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: POW temp[5].x, temp[5].xxxx, const[5].wwww; 14: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 15: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 16: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 17: MUL temp[0].xyz, temp[5].xxxx, input[0]; 18: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 19: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 20: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], temp[0]; 21: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 22: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 23: LRP output[0].xyz, temp[7].xxxx, temp[6], const[4]; 24: MOV output[0].w, temp[6]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[2], 2D[0]; 1: MUL temp[0].xyz, temp[0], const[5].xxxx; 2: TEX temp[1].xy, input[2], 2D[1]; 3: MUL temp[2].xyz, temp[0], const[1].yyyy; 4: TEX temp[3], input[2], 2D[3]; 5: ADD temp[3].xyz, temp[3], -const[5].yyyy; 6: DP3 temp[3].w, temp[3], temp[3]; 7: RSQ temp[3].w, |temp[3].wwww|; 8: MUL temp[3].xyz, temp[3].wwww, temp[3]; 9: DP3 temp[4].w, input[3], input[3]; 10: RSQ temp[4].w, |temp[4].wwww|; 11: MUL temp[4].xyz, temp[4].wwww, input[3]; 12: DP3_SAT temp[5].x, temp[4], temp[3]; 13: LG2 temp[8].w, temp[5].xxxx; 14: MUL temp[8].w, temp[8].wwww, const[5].wwww; 15: EX2 temp[5].x, temp[8].wwww; 16: MUL temp[5].x, temp[5].xxxx, const[1].xxxx; 17: MUL temp[5].x, temp[5].xxxx, temp[1].xxxx; 18: MUL temp[5].x, temp[5].xxxx, const[1].zzzz; 19: MUL temp[0].xyz, temp[5].xxxx, input[0]; 20: MUL temp[3].x, temp[1].yyyy, temp[1].yyyy; 21: MUL_SAT temp[3].x, temp[3].xxxx, const[1].wwww; 22: MAD_SAT temp[6].xyz, temp[3].xxxx, temp[2], 
temp[0]; 23: MUL_SAT temp[6].w, temp[0].wwww, input[0].wwww; 24: MAD_SAT temp[7].x, input[1].xxxx, const[3].xxxx, const[3].yyyy; 25: ADD temp[9].xyz, temp[6], -const[4]; 26: MAD output[0].xyz, temp[7].xxxx, temp[9], const[4]; 27: MOV output[0].w, temp[6]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[2].xy__, 2D[0]; 1: MUL temp[0].xyz, temp[0].xyz_, const[5].xxx_; 2: TEX temp[1].xy, input[2].xy__, 2D[1]; 3: MUL temp[2].xyz, temp[0].xyz_, const[1].yyy_; 4: TEX temp[3].xyz, input[2].xy__, 2D[3]; 5: ADD temp[3].xyz, temp[3].xyz_, -const[5].yyy_; 6: DP3 temp[3].w, temp[3].xyz_, temp[3].xyz_; 7: RSQ temp[3].w, |temp[3].___w|; 8: MUL temp[3].xyz, temp[3].www_, temp[3].xyz_; 9: DP3 temp[4].w, input[3].xyz_, input[3].xyz_; 10: RSQ temp[4].w, |temp[4].___w|; 11: MUL temp[4].xyz, temp[4].www_, input[3].xyz_; 12: DP3_SAT temp[5].x, temp[4].xyz_, temp[3].xyz_; 13: LG2 temp[8].w, temp[5].___x; 14: MUL temp[8].w, temp[8].___w, const[5].___w; 15: EX2 temp[5].x, temp[8].w___; 16: MUL temp[5].x, temp[5].x___, const[1].x___; 17: MUL temp[5].x, temp[5].x___, temp[1].x___; 18: MUL temp[5].x, temp[5].x___, const[1].z___; 19: MUL temp[0].xyz, temp[5].xxx_, input[0].xyz_; 20: MUL temp[3].x, temp[1].y___, temp[1].y___; 21: MUL_SAT temp[3].x, temp[3].x___, const[1].w___; 22: MAD_SAT temp[6].xyz, temp[3].xxx_, temp[2].xyz_, temp[0].xyz_; 23: MUL_SAT temp[6].w, temp[0].___w, input[0].___w; 24: MAD_SAT temp[7].x, input[1].x___, const[3].x___, const[3].y___; 25: ADD temp[9].xyz, temp[6].xyz_, -const[4].xyz_; 26: MAD output[0].xyz, temp[7].xxx_, temp[9].xyz_, const[4].xyz_; 27: MOV output[0].w, temp[6].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[5].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -const[5].yyy_; 6: DP3 
temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[3].xyz_, input[3].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[3].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, const[5].___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: MUL temp[28].x, temp[27].x___, const[1].z___; 19: MUL temp[29].xyz, temp[28].xxx_, input[0].xyz_; 20: MUL temp[30].x, temp[12].y___, temp[12].y___; 21: MUL_SAT temp[31].x, temp[30].x___, const[1].w___; 22: MAD_SAT temp[32].xyz, temp[31].xxx_, temp[13].xyz_, temp[29].xyz_; 23: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 24: MAD_SAT temp[34].x, input[1].x___, const[3].x___, const[3].y___; 25: ADD temp[35].xyz, temp[32].xyz_, -const[4].xyz_; 26: MAD output[0].xyz, temp[34].xxx_, temp[35].xyz_, const[4].xyz_; 27: MOV output[0].w, temp[33].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, const[5].xxx_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[3].xyz_, input[3].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[3].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, const[5].___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: MUL 
temp[28].x, temp[27].x___, const[1].z___; 19: MUL temp[29].xyz, temp[28].xxx_, input[0].xyz_; 20: MUL temp[30].x, temp[12].y___, temp[12].y___; 21: MUL_SAT temp[31].x, temp[30].x___, const[1].w___; 22: MAD_SAT temp[32].xyz, temp[31].xxx_, temp[13].xyz_, temp[29].xyz_; 23: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 24: MAD_SAT temp[34].x, input[1].x___, const[3].x___, const[3].y___; 25: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[4]).xyz_, const[4].xyz_; 26: MOV output[0].w, temp[33].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[3].xyz_, input[3].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[3].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, 256.000000 (0x78).___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: MUL temp[28].x, temp[27].x___, const[1].z___; 19: MUL temp[29].xyz, temp[28].xxx_, input[0].xyz_; 20: MUL temp[30].x, temp[12].y___, temp[12].y___; 21: MUL_SAT temp[31].x, temp[30].x___, const[1].w___; 22: MAD_SAT temp[32].xyz, temp[31].xxx_, temp[13].xyz_, temp[29].xyz_; 23: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 24: MAD_SAT temp[34].x, input[1].x___, const[3].x___, const[3].y___; 25: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[4]).xyz_, const[4].xyz_; 26: MOV output[0].w, temp[33].___w; CONST[5] = { 2.0000 0.5000 0.0000 256.0000 } Fragment Program: 
after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[3].xyz_, input[3].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, temp[20].www_, input[3].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, 256.000000 (0x78).___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: MUL temp[28].x, temp[27].x___, const[1].z___; 19: MUL temp[29].xyz, temp[28].xxx_, input[0].xyz_; 20: MUL temp[30].x, temp[12].y___, temp[12].y___; 21: MUL_SAT temp[31].x, temp[30].x___, const[1].w___; 22: MAD_SAT temp[32].xyz, temp[31].xxx_, temp[13].xyz_, temp[29].xyz_; 23: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 24: MAD_SAT temp[34].x, input[1].x___, const[3].x___, const[3].y___; 25: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[4]).xyz_, const[4].xyz_; 26: MOV output[0].w, temp[33].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: MUL temp[11].xyz, temp[10].xyz_, 2.000000 (0x40).www_; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: MUL temp[13].xyz, temp[11].xyz_, const[1].yyy_; 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: ADD temp[15].xyz, temp[14].xyz_, -none.HHH_; 6: DP3 temp[16].w, temp[15].xyz_, temp[15].xyz_; 7: RSQ temp[17].w, |temp[16].___w|; 8: MUL temp[18].xyz, temp[17].www_, temp[15].xyz_; 9: DP3 temp[19].w, input[3].xyz_, input[3].xyz_; 10: RSQ temp[20].w, |temp[19].___w|; 11: MUL temp[21].xyz, 
temp[20].www_, input[3].xyz_; 12: DP3_SAT temp[22].x, temp[21].xyz_, temp[18].xyz_; 13: LG2 temp[23].w, temp[22].___x; 14: MUL temp[24].w, temp[23].___w, 256.000000 (0x78).___w; 15: EX2 temp[25].x, temp[24].w___; 16: MUL temp[26].x, temp[25].x___, const[1].x___; 17: MUL temp[27].x, temp[26].x___, temp[12].x___; 18: MUL temp[28].x, temp[27].x___, const[1].z___; 19: MUL temp[29].xyz, temp[28].xxx_, input[0].xyz_; 20: MUL temp[30].x, temp[12].y___, temp[12].y___; 21: MUL_SAT temp[31].x, temp[30].x___, const[1].w___; 22: MAD_SAT temp[32].xyz, temp[31].xxx_, temp[13].xyz_, temp[29].xyz_; 23: MUL_SAT temp[33].w, temp[10].___w, input[0].___w; 24: MAD_SAT temp[34].x, input[1].x___, const[3].x___, const[3].y___; 25: MAD output[0].xyz, temp[34].xxx_, (temp[32] - const[4]).xyz_, const[4].xyz_; 26: MOV output[0].w, temp[33].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[10], input[2].xy__, 2D[0]; 1: src0.xyz = temp[10], src0.w = 2.000000 (0x40) MAD temp[11].xyz, src0.xyz, src0.www, src0.000 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: src0.xyz = temp[11], src1.xyz = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 4: TEX temp[14].xyz, input[2].xy__, 2D[3]; 5: src0.xyz = temp[14] MAD temp[15].xyz, src0.xyz, src0.111, -src0.HHH 6: src0.xyz = temp[15] DP3, src0.xyz, src0.xyz DP3 temp[16].w, src0._, src0._ 7: src0.w = temp[16] RSQ temp[17].w, |src0.w| 8: src0.xyz = temp[15], src0.w = temp[17] MAD temp[18].xyz, src0.www, src0.xyz, src0.000 9: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[19].w, src0._, src0._ 10: src0.w = temp[19] RSQ temp[20].w, |src0.w| 11: src0.xyz = input[3], src0.w = temp[20] MAD temp[21].xyz, src0.www, src0.xyz, src0.000 12: src0.xyz = temp[21], src1.xyz = temp[18] DP3_SAT temp[22].x, src0.xyz, src1.xyz 13: src0.xyz = temp[22] LG2 temp[23].w, src0.x 14: src0.w = temp[23], src1.w = 256.000000 (0x78) MAD temp[24].w, src0.w, src1.w, src0.0 15: src0.w = temp[24] REPL_ALPHA temp[25].x EX2, src0.w 16: src0.xyz = 
temp[25], src1.xyz = const[1] MAD temp[26].x, src0.x__, src1.x__, src0.000 17: src0.xyz = temp[26], src1.xyz = temp[12] MAD temp[27].x, src0.x__, src1.x__, src0.000 18: src0.xyz = temp[27], src1.xyz = const[1] MAD temp[28].x, src0.x__, src1.z__, src0.000 19: src0.xyz = temp[28], src1.xyz = input[0] MAD temp[29].xyz, src0.xxx, src1.xyz, src0.000 20: src0.xyz = temp[12] MAD temp[30].x, src0.y__, src0.y__, src0.000 21: src0.xyz = temp[30], src0.w = const[1] MAD_SAT temp[31].x, src0.x__, src0.w__, src0.000 22: src0.xyz = temp[31], src1.xyz = temp[13], src2.xyz = temp[29] MAD_SAT temp[32].xyz, src0.xxx, src1.xyz, src2.xyz 23: src0.w = temp[10], src1.w = input[0] MAD_SAT temp[33].w, src0.w, src1.w, src0.0 24: src0.xyz = input[1], src1.xyz = const[3] MAD_SAT temp[34].x, src0.x__, src1.x__, src1.y__ 25: src0.xyz = const[4], src1.xyz = temp[32], src2.xyz = temp[34], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 26: src0.w = temp[33] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: TEX temp[14].xyz, input[2].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[19].w, src0._, src0._ 5: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.xyz = temp[12] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 MAD temp[30].w, src1.y, src1.y, src0.0 6: src0.xyz = temp[11], src0.w = temp[30], src1.xyz = const[1], src1.w = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 MAD_SAT temp[31].w, src0.w, src1.w, src0.0 7: src0.xyz = temp[14], src0.w = temp[19] MAD temp[15].xyz, src0.xyz, src0.111, -src0.HHH RSQ temp[20].w, |src0.w| 8: src0.xyz = temp[15] DP3, src0.xyz, src0.xyz DP3 temp[16].w, src0._, src0._ 9: src0.xyz = input[3], src0.w = temp[20], src1.w = temp[16] MAD temp[21].xyz, src0.www, src0.xyz, src0.000 RSQ temp[17].w, |src1.w| 10: src0.xyz = temp[15], 
src0.w = temp[17] MAD temp[18].xyz, src0.www, src0.xyz, src0.000 11: src0.xyz = temp[21], src1.xyz = temp[18] DP3_SAT temp[22].x, src0.xyz, src1.xyz 12: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[3], src1.w = input[0] MAD_SAT temp[34].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[33].w, src0.w, src1.w, src0.0 13: src0.xyz = temp[22] LG2 temp[23].w, src0.x 14: src0.w = temp[23], src1.w = 256.000000 (0x78) MAD temp[24].w, src0.w, src1.w, src0.0 15: src0.w = temp[24] REPL_ALPHA temp[25].x EX2, src0.w 16: src0.xyz = temp[25], src1.xyz = const[1] MAD temp[26].x, src0.x__, src1.x__, src0.000 17: src0.xyz = temp[26], src1.xyz = temp[12] MAD temp[27].x, src0.x__, src1.x__, src0.000 18: src0.xyz = temp[27], src1.xyz = const[1] MAD temp[28].x, src0.x__, src1.z__, src0.000 19: src0.xyz = temp[28], src1.xyz = input[0] MAD temp[29].xyz, src0.xxx, src1.xyz, src0.000 20: src0.xyz = temp[31], src0.w = temp[31], src1.xyz = temp[13], src2.xyz = temp[29] MAD_SAT temp[32].xyz, src0.www, src1.xyz, src2.xyz 21: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[32], src2.xyz = temp[34], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[10], input[2].xy__, 2D[0]; 2: TEX temp[12].xy, input[2].xy__, 2D[1]; 3: TEX temp[14].xyz, input[2].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[3] DP3, src0.xyz, src0.xyz DP3 temp[19].w, src0._, src0._ 5: src0.xyz = temp[10], src0.w = 2.000000 (0x40), src1.xyz = temp[12] SEM_WAIT MAD temp[11].xyz, src0.xyz, src0.www, src0.000 MAD temp[30].w, src1.y, src1.y, src0.0 6: src0.xyz = temp[11], src0.w = temp[30], src1.xyz = const[1], src1.w = const[1] MAD temp[13].xyz, src0.xyz, src1.yyy, src0.000 MAD_SAT temp[31].w, src0.w, src1.w, src0.0 7: src0.xyz = temp[14], src0.w = temp[19] MAD temp[15].xyz, src0.xyz, src0.111, -src0.HHH RSQ temp[20].w, |src0.w| 8: src0.xyz = temp[15] DP3, 
src0.xyz, src0.xyz DP3 temp[16].w, src0._, src0._ 9: src0.xyz = input[3], src0.w = temp[20], src1.w = temp[16] MAD temp[21].xyz, src0.www, src0.xyz, src0.000 RSQ temp[17].w, |src1.w| 10: src0.xyz = temp[15], src0.w = temp[17] MAD temp[18].xyz, src0.www, src0.xyz, src0.000 11: src0.xyz = temp[21], src1.xyz = temp[18] DP3_SAT temp[22].x, src0.xyz, src1.xyz 12: src0.xyz = input[1], src0.w = temp[10], src1.xyz = const[3], src1.w = input[0] MAD_SAT temp[34].x, src0.x__, src1.x__, src1.y__ MAD_SAT temp[33].w, src0.w, src1.w, src0.0 13: src0.xyz = temp[22] LG2 temp[23].w, src0.x 14: src0.w = temp[23], src1.w = 256.000000 (0x78) MAD temp[24].w, src0.w, src1.w, src0.0 15: src0.w = temp[24] REPL_ALPHA temp[25].x EX2, src0.w 16: src0.xyz = temp[25], src1.xyz = const[1] MAD temp[26].x, src0.x__, src1.x__, src0.000 17: src0.xyz = temp[26], src1.xyz = temp[12] MAD temp[27].x, src0.x__, src1.x__, src0.000 18: src0.xyz = temp[27], src1.xyz = const[1] MAD temp[28].x, src0.x__, src1.z__, src0.000 19: src0.xyz = temp[28], src1.xyz = input[0] MAD temp[29].xyz, src0.xxx, src1.xyz, src0.000 20: src0.w = temp[31], src1.xyz = temp[13], src2.xyz = temp[29] MAD_SAT temp[32].xyz, src0.www, src1.xyz, src2.xyz 21: src0.xyz = const[4], src0.w = temp[33], src1.xyz = temp[32], src2.xyz = temp[34], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xy__, 2D[0]; 2: TEX temp[3].yz, input[1].xy__, 2D[1]; 3: TEX temp[1].xyz, input[1].xy__, 2D[3] SEM_WAIT SEM_ACQUIRE; 4: src0.xyz = input[2] DP3, src0.xyz, src0.xyz DP3 temp[1].w, src0._, src0._ 5: src0.xyz = temp[4], src0.w = 2.000000 (0x40), src1.xyz = temp[3] SEM_WAIT MAD temp[5].xyz, src0.xyz, src0.www, src0.000 MAD temp[2].w, src1.z, src1.z, src0.0 6: src0.xyz = temp[5], src0.w = temp[2], src1.xyz = const[1], src1.w = const[1] MAD temp[5].xyz, src0.xyz, src1.yyy, src0.000 MAD_SAT 
temp[2].w, src0.w, src1.w, src0.0 7: src0.xyz = temp[1], src0.w = temp[1] MAD temp[1].xyz, src0.xyz, src0.111, -src0.HHH RSQ temp[1].w, |src0.w| 8: src0.xyz = temp[1] DP3, src0.xyz, src0.xyz DP3 temp[3].w, src0._, src0._ 9: src0.xyz = input[2], src0.w = temp[1], src1.w = temp[3] MAD temp[2].xyz, src0.www, src0.xyz, src0.000 RSQ temp[1].w, |src1.w| 10: src0.xyz = temp[1], src0.w = temp[1] MAD temp[1].xyz, src0.www, src0.xyz, src0.000 11: src0.xyz = temp[2], src1.xyz = temp[1] DP3_SAT temp[1].x, src0.xyz, src1.xyz 12: src0.xyz = input[3], src0.w = temp[4], src1.xyz = const[3], src1.w = input[0] MAD_SAT temp[1].y, src0._x_, src1._x_, src1._y_ MAD_SAT temp[1].w, src0.w, src1.w, src0.0 13: src0.xyz = temp[1] LG2 temp[3].w, src0.x 14: src0.w = temp[3], src1.w = 256.000000 (0x78) MAD temp[3].w, src0.w, src1.w, src0.0 15: src0.w = temp[3] REPL_ALPHA temp[1].x EX2, src0.w 16: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[1].x, src0.x__, src1.x__, src0.0__ 17: src0.xyz = temp[1], src1.xyz = temp[3] MAD temp[1].x, src0.x__, src1.y__, src0.0__ 18: src0.xyz = temp[1], src1.xyz = const[1] MAD temp[1].x, src0.x__, src1.z__, src0.0__ 19: src0.xyz = temp[1], src1.xyz = input[0] MAD temp[0].xyz, src0.xxx, src1.xyz, src0.000 20: src0.w = temp[2], src1.xyz = temp[5], src2.xyz = temp[0] MAD_SAT temp[0].xyz, src0.www, src1.xyz, src2.xyz 21: src0.xyz = const[4], src0.w = temp[1], src1.xyz = temp[0], src2.xyz = temp[1], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.yyy, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe404f401: src: 1 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003003:TEX wmask: GB omask: NONE 1:TEX_INST: 0x00410000: id: 1 op:LD, , SCALED 2:TEX_ADDR: 0xd003f401: src: 1 R/G/A/A dst: 3 R/R/G/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02430000: 
id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00810011:DP dest:1 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000131:DP3 dest:19 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000c04:Addr0: 4t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x080200c0:Addr0: 192t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00489020:MAD dest:2 alp_A_src:1 B 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x20490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 5 0:CMN_INST 0x00107800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040405:Addr0: 5t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040402:Addr0: 2t, Addr1: 1c, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0068c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 6 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x0004c01b:RSQ dest:1 alp_A_src:0 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00db4010:MAD dest:1 rgb_C_src:0 H/H/H 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 
ALPHA_INST:0x00810031:DP dest:3 alp_A_src:0 0 0 alp_B_src:0 0 0 targ 0 w:0 5 RGBA_INST: 0x00000101:DP3 dest:16 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000c01:Addr0: 1t, Addr1: 3t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x0004d01b:RSQ dest:1 alp_A_src:1 A 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000402:Addr0: 2t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000011:DP3 dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00185000:ALU wmask: AG omask: NONE 1:RGB_ADDR 0x08040c03:Addr0: 3t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000004:Addr0: 4t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00822410:rgb_A_src:0 0/R/0 0 rgb_B_src:1 0/R/0 0 targ: 0 4 ALPHA_INST:0x0068c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20431010:MAD dest:1 rgb_C_src:1 0/G/0 0 alp_C_src:0 0 0 12 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 
rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000039:LN2 dest:3 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00004000:ALU wmask: A omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0803e003:Addr0: 3t, Addr1: 248t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0068c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 14 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000001a:SOP dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08040401:Addr0: 1t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08000c01:Addr0: 1t, Addr1: 3t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0090a480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 G/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00000800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08040401:Addr0: 1t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00912480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 B/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00003800:ALU wmask: RGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 19 0:CMN_INST 0x00083a00:ALU NOP wmask: RGB omask: NONE 1:RGB_ADDR 0x00001480:Addr0: 128t, Addr1: 5t, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044236c:rgb_A_src:0 A/A/A 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40100104:Addr0: 4c, Addr1: 0t, Addr2: 1t, srcp:1 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446126:rgb_A_src:2 G/G/G 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 21 Instructions ~ 16 Vector Instructions (RGB) ~ 11 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 1 Presub Operations ~ 0 OMOD Operations ~ 6 Temporary Registers ~ 2 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], FOG DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL CONST[0..5] DCL TEMP[0] IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[1].yzw, IMM[0].xxxy 1: MUL TEMP[0], 
IN[0].xxxx, CONST[2] 2: MAD TEMP[0], IN[0].yyyy, CONST[3], TEMP[0] 3: MAD TEMP[0], IN[0].zzzz, CONST[4], TEMP[0] 4: MAD OUT[0], IN[0].wwww, CONST[5], TEMP[0] 5: MOV OUT[2], IN[2] 6: MUL OUT[3], IN[1], CONST[0] 7: DP4 OUT[1].x, -IN[0], CONST[1] 8: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV output[1].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[1], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[1].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[1], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[1].yzw, temp[0].0001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[1], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[1].yzw, temp[0]._001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[1], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, 
-input[0], const[1]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[1], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[1], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[1]; 9: MOV output[4], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[0], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[0]; 9: MOV output[4], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1].yzw, none._001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[0], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[0]; 9: MOV output[4], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1].yzw, 
none._001; 1: MUL temp[0], input[0].xxxx, const[2]; 2: MAD temp[0], input[0].yyyy, const[3], temp[0]; 3: MAD temp[0], input[0].zzzz, const[4], temp[0]; 4: MAD temp[0], input[0].wwww, const[5], temp[0]; 5: MOV output[2], input[2]; 6: MUL output[3], input[1], const[0]; 7: DP4 output[1].x, -input[0], const[1]; 8: MOV output[0], temp[0]; 9: MOV output[4], temp[0]; Final vertex program code: 0: op: 0x00e06203 dst: 3o op: VE_ADD src0: 0x0164e000 reg: 0t swiz: U/ 0/ 0/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 1: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 6: op: 0x00f04202 dst: 2o op: VE_MULTIPLY src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 7: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x1ed10001 reg: 0i swiz: -X/-Y/-Z/-W src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x01248022 reg: 1c swiz: 0/ 0/ 0/ 0 8: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 9: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 
reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 10 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], FOG, PERSPECTIVE DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0..1] DCL TEMP[0..2] 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MUL_SAT TEMP[1], IN[2], TEMP[0] 2: MAD_SAT TEMP[2].x, IN[0].xxxx, CONST[0].xxxx, CONST[0].yyyy 3: LRP OUT[0].xyz, TEMP[2].xxxx, TEMP[1], CONST[1] 4: MOV OUT[0].w, TEMP[1] 5: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT temp[2].x, input[0].xxxx, const[0].xxxx, const[0].yyyy; 3: LRP output[0].xyz, temp[2].xxxx, temp[1], const[1]; 4: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT temp[2].x, input[0].xxxx, const[0].xxxx, const[0].yyyy; 3: LRP output[0].xyz, temp[2].xxxx, temp[1], const[1]; 4: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT temp[2].x, input[0].xxxx, const[0].xxxx, const[0].yyyy; 3: LRP output[0].xyz, temp[2].xxxx, temp[1], const[1]; 4: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT temp[2].x, input[0].xxxx, const[0].xxxx, const[0].yyyy; 3: LRP output[0].xyz, temp[2].xxxx, temp[1], const[1]; 4: MOV output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT 
temp[2].x, input[0].xxxx, const[0].xxxx, const[0].yyyy; 3: LRP output[0].xyz, temp[2].xxxx, temp[1], const[1]; 4: MOV output[0].w, temp[1]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT temp[2].x, input[0].xxxx, const[0].xxxx, const[0].yyyy; 3: LRP output[0].xyz, temp[2].xxxx, temp[1], const[1]; 4: MOV output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT temp[2].x, input[0].xxxx, const[0].xxxx, const[0].yyyy; 3: ADD temp[3].xyz, temp[1], -const[1]; 4: MAD output[0].xyz, temp[2].xxxx, temp[3], const[1]; 5: MOV output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MUL_SAT temp[1], input[2], temp[0]; 2: MAD_SAT temp[2].x, input[0].x___, const[0].x___, const[0].y___; 3: ADD temp[3].xyz, temp[1].xyz_, -const[1].xyz_; 4: MAD output[0].xyz, temp[2].xxx_, temp[3].xyz_, const[1].xyz_; 5: MOV output[0].w, temp[1].___w; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[4], input[1].xy__, 2D[0]; 1: MUL_SAT temp[5], input[2], temp[4]; 2: MAD_SAT temp[6].x, input[0].x___, const[0].x___, const[0].y___; 3: ADD temp[7].xyz, temp[5].xyz_, -const[1].xyz_; 4: MAD output[0].xyz, temp[6].xxx_, temp[7].xyz_, const[1].xyz_; 5: MOV output[0].w, temp[5].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[4], input[1].xy__, 2D[0]; 1: MUL_SAT temp[5], input[2], temp[4]; 2: MAD_SAT temp[6].x, input[0].x___, const[0].x___, const[0].y___; 3: MAD output[0].xyz, temp[6].xxx_, (temp[5] - const[1]).xyz_, const[1].xyz_; 4: MOV output[0].w, temp[5].___w; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[4], input[1].xy__, 2D[0]; 1: MUL_SAT temp[5], input[2], temp[4]; 2: MAD_SAT temp[6].x, input[0].x___, const[0].x___, 
const[0].y___; 3: MAD output[0].xyz, temp[6].xxx_, (temp[5] - const[1]).xyz_, const[1].xyz_; 4: MOV output[0].w, temp[5].___w; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[4], input[1].xy__, 2D[0]; 1: MUL_SAT temp[5], input[2], temp[4]; 2: MAD_SAT temp[6].x, input[0].x___, const[0].x___, const[0].y___; 3: MAD output[0].xyz, temp[6].xxx_, (temp[5] - const[1]).xyz_, const[1].xyz_; 4: MOV output[0].w, temp[5].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[4], input[1].xy__, 2D[0]; 1: MUL_SAT temp[5], input[2], temp[4]; 2: MAD_SAT temp[6].x, input[0].x___, const[0].x___, const[0].y___; 3: MAD output[0].xyz, temp[6].xxx_, (temp[5] - const[1]).xyz_, const[1].xyz_; 4: MOV output[0].w, temp[5].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[4], input[1].xy__, 2D[0]; 1: src0.xyz = input[2], src0.w = input[2], src1.xyz = temp[4], src1.w = temp[4] MAD_SAT temp[5].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[5].w, src0.w, src1.w, src0.0 2: src0.xyz = input[0], src1.xyz = const[0] MAD_SAT temp[6].x, src0.x__, src1.x__, src1.y__ 3: src0.xyz = const[1], src1.xyz = temp[5], src2.xyz = temp[6], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz 4: src0.w = temp[5] MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src1.xyz = const[0] MAD_SAT temp[6].x, src0.x__, src1.x__, src1.y__ 3: src0.xyz = input[2], src0.w = input[2], src1.xyz = temp[4], src1.w = temp[4] SEM_WAIT MAD_SAT temp[5].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[5].w, src0.w, src1.w, src0.0 4: src0.xyz = const[1], src0.w = temp[5], src1.xyz = temp[5], src2.xyz = temp[6], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'dead sources' # Radeon 
Compiler Program 0: BEGIN_TEX; 1: TEX temp[4], input[1].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src1.xyz = const[0] MAD_SAT temp[6].x, src0.x__, src1.x__, src1.y__ 3: src0.xyz = input[2], src0.w = input[2], src1.xyz = temp[4], src1.w = temp[4] SEM_WAIT MAD_SAT temp[5].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[5].w, src0.w, src1.w, src0.0 4: src0.xyz = const[1], src0.w = temp[5], src1.xyz = temp[5], src2.xyz = temp[6], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[2], src1.xyz = const[0] MAD_SAT temp[2].x, src0.x__, src1.x__, src1.y__ 3: src0.xyz = input[1], src0.w = input[1], src1.xyz = temp[0], src1.w = temp[0] SEM_WAIT MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT temp[0].w, src0.w, src1.w, src0.0 4: src0.xyz = const[1], src0.w = temp[0], src1.xyz = temp[0], src2.xyz = temp[2], srcp.xyz = (src1 - src0) MAD color[0].xyz, src2.xxx, srcp.xyz, src0.xyz MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00080800:ALU wmask: R omask: NONE 1:RGB_ADDR 0x08040002:Addr0: 2t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00485020:MAD dest:2 rgb_C_src:1 G/0/0 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00187a04:ALU TEX_WAIT NOP wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x40200101:Addr0: 1c, Addr1: 0t, Addr2: 2t, srcp:1 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446002:rgb_A_src:2 R/R/R 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20220000:MAD dest:0 rgb_C_src:0 R/G/B 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL OUT[2], GENERIC[1] DCL OUT[3], GENERIC[2] DCL OUT[4], GENERIC[3] DCL OUT[5], GENERIC[4] DCL OUT[6], GENERIC[5] DCL OUT[7], GENERIC[6] DCL CONST[0] DCL TEMP[0..1] IMM[0] FLT32 { 1.0000, 0.0000, -1.0000, 0.0000} 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: MAD TEMP[0], CONST[0], IMM[0].xxyy, IN[1] 3: MAD TEMP[1], CONST[0], IMM[0].zzyy, IN[1] 4: MOV OUT[2], TEMP[0] 5: MOV OUT[3], TEMP[1] 6: ADD TEMP[0].x, TEMP[0], CONST[0].zzzz 7: SUB TEMP[1].x, TEMP[1], CONST[0].zzzz 8: MOV OUT[4], TEMP[0] 9: MOV OUT[5], TEMP[1] 10: ADD TEMP[0].x, TEMP[0], CONST[0].wwww 11: SUB TEMP[1].x, TEMP[1], CONST[0].wwww 12: MOV OUT[6], TEMP[0] 13: MOV OUT[7], TEMP[1] 14: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, temp[0], const[0].zzzz; 7: SUB temp[1].x, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].x, temp[0], const[0].wwww; 11: SUB temp[1].x, temp[1], const[0].wwww; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: 
after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, temp[0], const[0].zzzz; 7: SUB temp[1].x, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].x, temp[0], const[0].wwww; 11: SUB temp[1].x, temp[1], const[0].wwww; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, temp[0], const[0].zzzz; 7: ADD temp[1].x, temp[1], -const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].x, temp[0], const[0].wwww; 11: ADD temp[1].x, temp[1], -const[0].wwww; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].x, temp[0].x___, const[0].z___; 7: ADD temp[1].x, temp[1].x___, -const[0].z___; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].x, temp[0].x___, const[0].w___; 11: ADD temp[1].x, temp[1].x___, -const[0].w___; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD 
temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].x, temp[0].x___, const[0].w___; 10: ADD temp[1].x, temp[1].x___, -const[0].w___; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].x, temp[0].x___, const[0].w___; 10: ADD temp[1].x, temp[1].x___, -const[0].w___; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].x, temp[0].x___, const[0].w___; 10: ADD temp[1].x, temp[1].x___, -const[0].w___; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, 
-const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].x, temp[0].x___, const[0].w___; 10: ADD temp[1].x, temp[1].x___, -const[0].w___; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].x, temp[0].x___, const[0].z___; 6: ADD temp[1].x, temp[1].x___, -const[0].z___; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].x, temp[0].x___, const[0].w___; 10: ADD temp[1].x, temp[1].x___, -const[0].w___; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0125a000 reg: 0t swiz: 1/ 1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 2: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0725a000 reg: 0t swiz: -1/-1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 3: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 5: op: 0x00100003 dst: 0t op: VE_ADD src0: 0x01ff0000 reg: 0t swiz: X/ U/ U/ U src1: 0x01ff4002 reg: 0c swiz: Z/ U/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x01ff0020 reg: 1t swiz: X/ U/ U/ U 
src1: 0x1fff4002 reg: 0c swiz: -Z/-U/-U/-U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 7: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00100003 dst: 0t op: VE_ADD src0: 0x01ff0000 reg: 0t swiz: X/ U/ U/ U src1: 0x01ff6002 reg: 0c swiz: W/ U/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 10: op: 0x00102003 dst: 1t op: VE_ADD src0: 0x01ff0020 reg: 1t swiz: X/ U/ U/ U src1: 0x1fff6002 reg: 0c swiz: -W/-U/-U/-U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 11: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 12: op: 0x00f0e203 dst: 7o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 13: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 14: op: 0x00f10203 dst: 8o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 15 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL IN[1], GENERIC[1], PERSPECTIVE DCL IN[2], GENERIC[2], PERSPECTIVE DCL IN[3], GENERIC[3], PERSPECTIVE DCL IN[4], GENERIC[4], PERSPECTIVE DCL IN[5], GENERIC[5], PERSPECTIVE DCL IN[6], GENERIC[6], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0..5] DCL TEMP[0..4] 0: TEX TEMP[0], IN[0], SAMP[0], 2D 1: MUL TEMP[0], 
TEMP[0], CONST[0].xxxx 2: TEX TEMP[1], IN[1], SAMP[0], 2D 3: TEX TEMP[2], IN[2], SAMP[0], 2D 4: ADD TEMP[1], TEMP[1], TEMP[2] 5: MAD TEMP[0], TEMP[1], CONST[0].yyyy, TEMP[0] 6: TEX TEMP[1], IN[3], SAMP[0], 2D 7: TEX TEMP[2], IN[4], SAMP[0], 2D 8: ADD TEMP[1], TEMP[1], TEMP[2] 9: MAD TEMP[0], TEMP[1], CONST[0].zzzz, TEMP[0] 10: TEX TEMP[1], IN[5], SAMP[0], 2D 11: TEX TEMP[2], IN[6], SAMP[0], 2D 12: ADD TEMP[1], TEMP[1], TEMP[2] 13: MAD TEMP[0], TEMP[1], CONST[0].wwww, TEMP[0] 14: ADD TEMP[3], IN[0], CONST[1] 15: SUB TEMP[4], IN[0], CONST[1] 16: TEX TEMP[1], TEMP[3], SAMP[0], 2D 17: TEX TEMP[2], TEMP[4], SAMP[0], 2D 18: ADD TEMP[1], TEMP[1], TEMP[2] 19: MAD TEMP[0], TEMP[1], CONST[2].xxxx, TEMP[0] 20: ADD TEMP[3], IN[0], CONST[3] 21: SUB TEMP[4], IN[0], CONST[3] 22: TEX TEMP[1], TEMP[3], SAMP[0], 2D 23: TEX TEMP[2], TEMP[4], SAMP[0], 2D 24: ADD TEMP[1], TEMP[1], TEMP[2] 25: MAD TEMP[0], TEMP[1], CONST[2].yyyy, TEMP[0] 26: ADD TEMP[3], IN[0], CONST[4] 27: SUB TEMP[4], IN[0], CONST[4] 28: TEX TEMP[1], TEMP[3], SAMP[0], 2D 29: TEX TEMP[2], TEMP[4], SAMP[0], 2D 30: ADD TEMP[1], TEMP[1], TEMP[2] 31: MAD TEMP[0], TEMP[1], CONST[2].zzzz, TEMP[0] 32: ADD TEMP[3], IN[0], CONST[5] 33: SUB TEMP[4], IN[0], CONST[5] 34: TEX TEMP[1], TEMP[3], SAMP[0], 2D 35: TEX TEMP[2], TEMP[4], SAMP[0], 2D 36: ADD TEMP[1], TEMP[1], TEMP[2] 37: MAD OUT[0], TEMP[1], CONST[2].wwww, TEMP[0] 38: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], 
input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], 
const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], 
temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], 
const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX 
temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: ADD temp[4], input[0], -const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: ADD temp[4], input[0], -const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: ADD temp[4], input[0], -const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: ADD temp[4], input[0], -const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MUL temp[0], temp[0], 
const[0].xxxx; 2: TEX temp[1], input[1].xy__, 2D[0]; 3: TEX temp[2], input[2].xy__, 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3].xy__, 2D[0]; 7: TEX temp[2], input[4].xy__, 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5].xy__, 2D[0]; 11: TEX temp[2], input[6].xy__, 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3].xy, input[0].xy__, const[1].xy__; 15: ADD temp[4].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[1], temp[3].xy__, 2D[0]; 17: TEX temp[2], temp[4].xy__, 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3].xy, input[0].xy__, const[3].xy__; 21: ADD temp[4].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[1], temp[3].xy__, 2D[0]; 23: TEX temp[2], temp[4].xy__, 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3].xy, input[0].xy__, const[4].xy__; 27: ADD temp[4].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[1], temp[3].xy__, 2D[0]; 29: TEX temp[2], temp[4].xy__, 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3].xy, input[0].xy__, const[5].xy__; 33: ADD temp[4].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[1], temp[3].xy__, 2D[0]; 35: TEX temp[2], temp[4].xy__, 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], 
const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD 
temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, 
const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD 
temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, 
temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0] MAD temp[6].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[6].w, src0.w, src1.x, src0.0 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[8], src1.w = temp[8] MAD temp[9].xyz, src0.xyz, src0.111, src1.xyz MAD temp[9].w, src0.w, src0.1, src1.w 5: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = const[0], src1.w = temp[6], src2.xyz = temp[6] MAD temp[10].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[10].w, src0.w, src1.y, src1.w 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = temp[12], src1.w = temp[12] MAD temp[13].xyz, src0.xyz, src0.111, src1.xyz MAD temp[13].w, src0.w, src0.1, src1.w 9: src0.xyz = temp[13], src0.w = temp[13], src1.xyz = const[0], src1.w = temp[10], src2.xyz = temp[10] MAD temp[14].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[14].w, src0.w, src1.z, src1.w 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: src0.xyz = temp[15], src0.w = temp[15], src1.xyz = temp[16], src1.w = temp[16] MAD temp[17].xyz, src0.xyz, src0.111, src1.xyz MAD temp[17].w, src0.w, src0.1, src1.w 13: src0.xyz = temp[17], src0.w = temp[17], src1.xyz = temp[14], src1.w = const[0], src2.w = temp[14] MAD temp[18].xyz, src0.xyz, src1.www, src1.xyz MAD temp[18].w, src0.w, src1.w, src2.w 14: src0.xyz = input[0], src1.xyz = const[1] MAD temp[19].xy, src0.xy_, src0.111, src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[1] MAD temp[20].xy, src0.xy_, 
src0.111, -src1.xy_ 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: src0.xyz = temp[21], src0.w = temp[21], src1.xyz = temp[22], src1.w = temp[22] MAD temp[23].xyz, src0.xyz, src0.111, src1.xyz MAD temp[23].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[23], src0.w = temp[23], src1.xyz = const[2], src1.w = temp[18], src2.xyz = temp[18] MAD temp[24].xyz, src0.xyz, src1.xxx, src2.xyz MAD temp[24].w, src0.w, src1.x, src1.w 20: src0.xyz = input[0], src1.xyz = const[3] MAD temp[25].xy, src0.xy_, src0.111, src1.xy_ 21: src0.xyz = input[0], src1.xyz = const[3] MAD temp[26].xy, src0.xy_, src0.111, -src1.xy_ 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: src0.xyz = temp[27], src0.w = temp[27], src1.xyz = temp[28], src1.w = temp[28] MAD temp[29].xyz, src0.xyz, src0.111, src1.xyz MAD temp[29].w, src0.w, src0.1, src1.w 25: src0.xyz = temp[29], src0.w = temp[29], src1.xyz = const[2], src1.w = temp[24], src2.xyz = temp[24] MAD temp[30].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[30].w, src0.w, src1.y, src1.w 26: src0.xyz = input[0], src1.xyz = const[4] MAD temp[31].xy, src0.xy_, src0.111, src1.xy_ 27: src0.xyz = input[0], src1.xyz = const[4] MAD temp[32].xy, src0.xy_, src0.111, -src1.xy_ 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: src0.xyz = temp[33], src0.w = temp[33], src1.xyz = temp[34], src1.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.111, src1.xyz MAD temp[35].w, src0.w, src0.1, src1.w 31: src0.xyz = temp[35], src0.w = temp[35], src1.xyz = const[2], src1.w = temp[30], src2.xyz = temp[30] MAD temp[36].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[36].w, src0.w, src1.z, src1.w 32: src0.xyz = input[0], src1.xyz = const[5] MAD temp[37].xy, src0.xy_, src0.111, src1.xy_ 33: src0.xyz = input[0], src1.xyz = const[5] MAD temp[38].xy, src0.xy_, src0.111, -src1.xy_ 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: src0.xyz = temp[39], src0.w = 
temp[39], src1.xyz = temp[40], src1.w = temp[40] MAD temp[41].xyz, src0.xyz, src0.111, src1.xyz MAD temp[41].w, src0.w, src0.1, src1.w 37: src0.xyz = temp[41], src0.w = temp[41], src1.xyz = temp[36], src1.w = const[2], src2.w = temp[36] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src1.xyz = const[1] MAD temp[19].xy, src0.xy_, src0.111, src1.xy_ 1: BEGIN_TEX; 2: TEX temp[5], input[0].xy__, 2D[0]; 3: TEX temp[7], input[1].xy__, 2D[0]; 4: TEX temp[8], input[2].xy__, 2D[0]; 5: TEX temp[11], input[3].xy__, 2D[0]; 6: TEX temp[12], input[4].xy__, 2D[0]; 7: TEX temp[15], input[5].xy__, 2D[0]; 8: TEX temp[16], input[6].xy__, 2D[0]; 9: TEX temp[21], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 10: src0.xyz = input[0], src1.xyz = const[1] MAD temp[20].xy, src0.xy_, src0.111, -src1.xy_ 11: src0.xyz = input[0], src1.xyz = const[3] MAD temp[25].xy, src0.xy_, src0.111, src1.xy_ 12: src0.xyz = input[0], src1.xyz = const[3] MAD temp[26].xy, src0.xy_, src0.111, -src1.xy_ 13: src0.xyz = input[0], src1.xyz = const[4] MAD temp[31].xy, src0.xy_, src0.111, src1.xy_ 14: src0.xyz = input[0], src1.xyz = const[4] MAD temp[32].xy, src0.xy_, src0.111, -src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[5] MAD temp[37].xy, src0.xy_, src0.111, src1.xy_ 16: src0.xyz = input[0], src1.xyz = const[5] MAD temp[38].xy, src0.xy_, src0.111, -src1.xy_ 17: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[8], src1.w = temp[8] SEM_WAIT MAD temp[9].xyz, src0.xyz, src0.111, src1.xyz MAD temp[9].w, src0.w, src0.1, src1.w 18: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = temp[12], src1.w = temp[12] MAD temp[13].xyz, src0.xyz, src0.111, src1.xyz MAD temp[13].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[15], src0.w = temp[15], src1.xyz = temp[16], src1.w = temp[16] MAD temp[17].xyz, src0.xyz, src0.111, src1.xyz MAD temp[17].w, src0.w, src0.1, src1.w 20: src0.xyz = temp[5], 
src0.w = temp[5], src1.xyz = const[0] MAD temp[6].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[6].w, src0.w, src1.x, src0.0 21: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = const[0], src1.w = temp[6], src2.xyz = temp[6] MAD temp[10].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[10].w, src0.w, src1.y, src1.w 22: src0.xyz = temp[13], src0.w = temp[13], src1.xyz = const[0], src1.w = temp[10], src2.xyz = temp[10] MAD temp[14].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[14].w, src0.w, src1.z, src1.w 23: src0.xyz = temp[17], src0.w = temp[17], src1.xyz = temp[14], src1.w = const[0], src2.w = temp[14] MAD temp[18].xyz, src0.xyz, src1.www, src1.xyz MAD temp[18].w, src0.w, src1.w, src2.w 24: BEGIN_TEX; 25: TEX temp[22], temp[20].xy__, 2D[0]; 26: TEX temp[27], temp[25].xy__, 2D[0]; 27: TEX temp[28], temp[26].xy__, 2D[0]; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: TEX temp[39], temp[37].xy__, 2D[0]; 31: TEX temp[40], temp[38].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 32: src0.xyz = temp[21], src0.w = temp[21], src1.xyz = temp[22], src1.w = temp[22] SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.111, src1.xyz MAD temp[23].w, src0.w, src0.1, src1.w 33: src0.xyz = temp[27], src0.w = temp[27], src1.xyz = temp[28], src1.w = temp[28] MAD temp[29].xyz, src0.xyz, src0.111, src1.xyz MAD temp[29].w, src0.w, src0.1, src1.w 34: src0.xyz = temp[33], src0.w = temp[33], src1.xyz = temp[34], src1.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.111, src1.xyz MAD temp[35].w, src0.w, src0.1, src1.w 35: src0.xyz = temp[39], src0.w = temp[39], src1.xyz = temp[40], src1.w = temp[40] MAD temp[41].xyz, src0.xyz, src0.111, src1.xyz MAD temp[41].w, src0.w, src0.1, src1.w 36: src0.xyz = temp[23], src0.w = temp[23], src1.xyz = const[2], src1.w = temp[18], src2.xyz = temp[18] MAD temp[24].xyz, src0.xyz, src1.xxx, src2.xyz MAD temp[24].w, src0.w, src1.x, src1.w 37: src0.xyz = temp[29], src0.w = temp[29], src1.xyz = const[2], src1.w = temp[24], src2.xyz = temp[24] MAD temp[30].xyz, 
src0.xyz, src1.yyy, src2.xyz MAD temp[30].w, src0.w, src1.y, src1.w 38: src0.xyz = temp[35], src0.w = temp[35], src1.xyz = const[2], src1.w = temp[30], src2.xyz = temp[30] MAD temp[36].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[36].w, src0.w, src1.z, src1.w 39: src0.xyz = temp[41], src0.w = temp[41], src1.xyz = temp[36], src1.w = const[2], src2.w = temp[36] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src1.xyz = const[1] MAD temp[19].xy, src0.xy_, src0.111, src1.xy_ 1: BEGIN_TEX; 2: TEX temp[5], input[0].xy__, 2D[0]; 3: TEX temp[7], input[1].xy__, 2D[0]; 4: TEX temp[8], input[2].xy__, 2D[0]; 5: TEX temp[11], input[3].xy__, 2D[0]; 6: TEX temp[12], input[4].xy__, 2D[0]; 7: TEX temp[15], input[5].xy__, 2D[0]; 8: TEX temp[16], input[6].xy__, 2D[0]; 9: TEX temp[21], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 10: src0.xyz = input[0], src1.xyz = const[1] MAD temp[20].xy, src0.xy_, src0.111, -src1.xy_ 11: src0.xyz = input[0], src1.xyz = const[3] MAD temp[25].xy, src0.xy_, src0.111, src1.xy_ 12: src0.xyz = input[0], src1.xyz = const[3] MAD temp[26].xy, src0.xy_, src0.111, -src1.xy_ 13: src0.xyz = input[0], src1.xyz = const[4] MAD temp[31].xy, src0.xy_, src0.111, src1.xy_ 14: src0.xyz = input[0], src1.xyz = const[4] MAD temp[32].xy, src0.xy_, src0.111, -src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[5] MAD temp[37].xy, src0.xy_, src0.111, src1.xy_ 16: src0.xyz = input[0], src1.xyz = const[5] MAD temp[38].xy, src0.xy_, src0.111, -src1.xy_ 17: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[8], src1.w = temp[8] SEM_WAIT MAD temp[9].xyz, src0.xyz, src0.111, src1.xyz MAD temp[9].w, src0.w, src0.1, src1.w 18: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = temp[12], src1.w = temp[12] MAD temp[13].xyz, src0.xyz, src0.111, src1.xyz MAD temp[13].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[15], src0.w = temp[15], src1.xyz = temp[16], src1.w 
= temp[16] MAD temp[17].xyz, src0.xyz, src0.111, src1.xyz MAD temp[17].w, src0.w, src0.1, src1.w 20: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0] MAD temp[6].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[6].w, src0.w, src1.x, src0.0 21: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = const[0], src1.w = temp[6], src2.xyz = temp[6] MAD temp[10].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[10].w, src0.w, src1.y, src1.w 22: src0.xyz = temp[13], src0.w = temp[13], src1.xyz = const[0], src1.w = temp[10], src2.xyz = temp[10] MAD temp[14].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[14].w, src0.w, src1.z, src1.w 23: src0.xyz = temp[17], src0.w = temp[17], src1.xyz = temp[14], src1.w = const[0], src2.w = temp[14] MAD temp[18].xyz, src0.xyz, src1.www, src1.xyz MAD temp[18].w, src0.w, src1.w, src2.w 24: BEGIN_TEX; 25: TEX temp[22], temp[20].xy__, 2D[0]; 26: TEX temp[27], temp[25].xy__, 2D[0]; 27: TEX temp[28], temp[26].xy__, 2D[0]; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: TEX temp[39], temp[37].xy__, 2D[0]; 31: TEX temp[40], temp[38].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 32: src0.xyz = temp[21], src0.w = temp[21], src1.xyz = temp[22], src1.w = temp[22] SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.111, src1.xyz MAD temp[23].w, src0.w, src0.1, src1.w 33: src0.xyz = temp[27], src0.w = temp[27], src1.xyz = temp[28], src1.w = temp[28] MAD temp[29].xyz, src0.xyz, src0.111, src1.xyz MAD temp[29].w, src0.w, src0.1, src1.w 34: src0.xyz = temp[33], src0.w = temp[33], src1.xyz = temp[34], src1.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.111, src1.xyz MAD temp[35].w, src0.w, src0.1, src1.w 35: src0.xyz = temp[39], src0.w = temp[39], src1.xyz = temp[40], src1.w = temp[40] MAD temp[41].xyz, src0.xyz, src0.111, src1.xyz MAD temp[41].w, src0.w, src0.1, src1.w 36: src0.xyz = temp[23], src0.w = temp[23], src1.xyz = const[2], src1.w = temp[18], src2.xyz = temp[18] MAD temp[24].xyz, src0.xyz, src1.xxx, src2.xyz MAD temp[24].w, src0.w, src1.x, src1.w 
37: src0.xyz = temp[29], src0.w = temp[29], src1.xyz = const[2], src1.w = temp[24], src2.xyz = temp[24] MAD temp[30].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[30].w, src0.w, src1.y, src1.w 38: src0.xyz = temp[35], src0.w = temp[35], src1.xyz = const[2], src1.w = temp[30], src2.xyz = temp[30] MAD temp[36].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[36].w, src0.w, src1.z, src1.w 39: src0.xyz = temp[41], src0.w = temp[41], src1.xyz = temp[36], src1.w = const[2], src2.w = temp[36] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src1.xyz = const[1] MAD temp[7].xy, src0.xy_, src0.11_, src1.xy_ 1: BEGIN_TEX; 2: TEX temp[8], input[0].xy__, 2D[0]; 3: TEX temp[1], input[1].xy__, 2D[0]; 4: TEX temp[2], input[2].xy__, 2D[0]; 5: TEX temp[3], input[3].xy__, 2D[0]; 6: TEX temp[4], input[4].xy__, 2D[0]; 7: TEX temp[5], input[5].xy__, 2D[0]; 8: TEX temp[6], input[6].xy__, 2D[0]; 9: TEX temp[7], temp[7].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 10: src0.xyz = input[0], src1.xyz = const[1] MAD temp[9].xy, src0.xy_, src0.11_, -src1.xy_ 11: src0.xyz = input[0], src1.xyz = const[3] MAD temp[10].xy, src0.xy_, src0.11_, src1.xy_ 12: src0.xyz = input[0], src1.xyz = const[3] MAD temp[11].xy, src0.xy_, src0.11_, -src1.xy_ 13: src0.xyz = input[0], src1.xyz = const[4] MAD temp[12].xy, src0.xy_, src0.11_, src1.xy_ 14: src0.xyz = input[0], src1.xyz = const[4] MAD temp[13].xy, src0.xy_, src0.11_, -src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[5] MAD temp[14].xy, src0.xy_, src0.11_, src1.xy_ 16: src0.xyz = input[0], src1.xyz = const[5] MAD temp[0].xy, src0.xy_, src0.11_, -src1.xy_ 17: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = temp[2], src1.w = temp[2] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src1.xyz MAD temp[0].w, src0.w, src0.1, src1.w 18: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = temp[4], src1.w = temp[4] MAD temp[2].xyz, src0.xyz, src0.111, 
src1.xyz MAD temp[1].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] MAD temp[3].xyz, src0.xyz, src0.111, src1.xyz MAD temp[2].w, src0.w, src0.1, src1.w 20: src0.xyz = temp[8], src0.w = temp[8], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[3].w, src0.w, src1.x, src0.0 21: src0.xyz = temp[1], src0.w = temp[0], src1.xyz = const[0], src1.w = temp[3], src2.xyz = temp[4] MAD temp[1].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[0].w, src0.w, src1.y, src1.w 22: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[0], src1.w = temp[0], src2.xyz = temp[1] MAD temp[1].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[0].w, src0.w, src1.z, src1.w 23: src0.xyz = temp[3], src0.w = temp[2], src1.xyz = temp[1], src1.w = const[0], src2.w = temp[0] MAD temp[1].xyz, src0.xyz, src1.www, src1.xyz MAD temp[0].w, src0.w, src1.w, src2.w 24: BEGIN_TEX; 25: TEX temp[2], temp[9].xy__, 2D[0]; 26: TEX temp[3], temp[10].xy__, 2D[0]; 27: TEX temp[4], temp[11].xy__, 2D[0]; 28: TEX temp[5], temp[12].xy__, 2D[0]; 29: TEX temp[6], temp[13].xy__, 2D[0]; 30: TEX temp[8], temp[14].xy__, 2D[0]; 31: TEX temp[9], temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 32: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[2], src1.w = temp[2] SEM_WAIT MAD temp[0].xyz, src0.xyz, src0.111, src1.xyz MAD temp[1].w, src0.w, src0.1, src1.w 33: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = temp[4], src1.w = temp[4] MAD temp[2].xyz, src0.xyz, src0.111, src1.xyz MAD temp[2].w, src0.w, src0.1, src1.w 34: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] MAD temp[3].xyz, src0.xyz, src0.111, src1.xyz MAD temp[3].w, src0.w, src0.1, src1.w 35: src0.xyz = temp[8], src0.w = temp[8], src1.xyz = temp[9], src1.w = temp[9] MAD temp[4].xyz, src0.xyz, src0.111, src1.xyz MAD temp[4].w, src0.w, src0.1, src1.w 36: src0.xyz = temp[0], src0.w = temp[1], src1.xyz = const[2], src1.w = temp[0], src2.xyz = temp[1] MAD temp[0].xyz, 
src0.xyz, src1.xxx, src2.xyz MAD temp[0].w, src0.w, src1.x, src1.w 37: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[2], src1.w = temp[0], src2.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[0].w, src0.w, src1.y, src1.w 38: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[2], src1.w = temp[0], src2.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[0].w, src0.w, src1.z, src1.w 39: src0.xyz = temp[4], src0.w = temp[4], src1.xyz = temp[0], src1.w = const[2], src2.w = temp[0] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w R500 Fragment Program: -------- 0 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040400:Addr0: 0t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00421070:MAD dest:7 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe408f400: src: 0 R/G/A/A dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe402f402: src: 2 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe403f403: src: 3 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe404f404: src: 4 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 
0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f405: src: 5 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 7 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe406f406: src: 6 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 8 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe407f407: src: 7 R/G/A/A dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 9 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040400:Addr0: 0t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c21090:MAD dest:9 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 10 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040c00:Addr0: 0t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004210a0:MAD dest:10 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040c00:Addr0: 0t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c210b0:MAD dest:11 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 12 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 
alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004210c0:MAD dest:12 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c210d0:MAD dest:13 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 14 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041400:Addr0: 0t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004210e0:MAD dest:14 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041400:Addr0: 0t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c21000:MAD dest:0 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 16 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 17 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 
ALPHA_INST:0x00c0c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 18 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221030:MAD dest:3 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 19 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040008:Addr0: 8t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x0008c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:1 R 0 targ 0 w:0 5 RGBA_INST: 0x20490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 20 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00440001:Addr0: 1t, Addr1: 0c, Addr2: 4t, srcp:0 2:ALPHA_ADDR 0x08000c00:Addr0: 0t, Addr1: 3t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0028c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x1a222010:MAD dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 21 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00140002:Addr0: 2t, Addr1: 0c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x0048c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x1a222010:MAD dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 22 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000403:Addr0: 3t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x00040002:Addr0: 2t, Addr1: 0c, Addr2: 0t, srcp:0 3 RGB_INST: 0x006da220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 A/A/A 
0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:2 A 0 23 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe402f409: src: 9 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 24 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe403f40a: src: 10 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 25 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe404f40b: src: 11 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 26 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f40c: src: 12 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 27 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe406f40d: src: 13 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 28 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe408f40e: src: 14 R/G/A/A dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 29 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe409f400: src: 0 R/G/A/A dst: 9 R/G/B/A 3:TEX_DXDY: 0x00000000 30 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000807:Addr0: 7t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000807:Addr0: 7t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221000:MAD dest:0 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 31 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, 
srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 32 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221030:MAD dest:3 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 33 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08002408:Addr0: 8t, Addr1: 9t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08002408:Addr0: 8t, Addr1: 9t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c040:MAD dest:4 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221040:MAD dest:4 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 34 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00140800:Addr0: 0t, Addr1: 2c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x0008c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 R 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 35 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00040802:Addr0: 2t, Addr1: 2c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0028c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 36 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00040803:Addr0: 3t, Addr1: 2c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000003:Addr0: 3t, Addr1: 0t, 
Addr2: 128t, srcp:0 3 RGB_INST: 0x00492220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x0048c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 37 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08000004:Addr0: 4t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x00040804:Addr0: 4t, Addr1: 2c, Addr2: 0t, srcp:0 3 RGB_INST: 0x006da220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 A/A/A 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c221000:MAD dest:0 rgb_C_src:1 R/G/B 0 alp_C_src:2 A 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 38 Instructions ~ 23 Vector Instructions (RGB) ~ 15 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 15 Texture Instructions ~ 0 Presub Operations ~ 0 OMOD Operations ~ 15 Temporary Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL OUT[2], GENERIC[1] DCL OUT[3], GENERIC[2] DCL OUT[4], GENERIC[3] DCL OUT[5], GENERIC[4] DCL OUT[6], GENERIC[5] DCL OUT[7], GENERIC[6] DCL CONST[0] DCL TEMP[0..1] IMM[0] FLT32 { 1.0000, 0.0000, -1.0000, 0.0000} 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: MAD TEMP[0], CONST[0], IMM[0].xxyy, IN[1] 3: MAD TEMP[1], CONST[0], IMM[0].zzyy, IN[1] 4: MOV OUT[2], TEMP[0] 5: MOV OUT[3], TEMP[1] 6: ADD TEMP[0].y, TEMP[0], CONST[0].zzzz 7: SUB TEMP[1].y, TEMP[1], CONST[0].zzzz 8: MOV OUT[4], TEMP[0] 9: MOV OUT[5], TEMP[1] 10: ADD TEMP[0].y, TEMP[0], CONST[0].wwww 11: SUB TEMP[1].y, TEMP[1], CONST[0].wwww 12: MOV OUT[6], TEMP[0] 13: MOV OUT[7], TEMP[1] 14: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0], 
const[0].zzzz; 7: SUB temp[1].y, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].y, temp[0], const[0].wwww; 11: SUB temp[1].y, temp[1], const[0].wwww; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0], const[0].zzzz; 7: SUB temp[1].y, temp[1], const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].y, temp[0], const[0].wwww; 11: SUB temp[1].y, temp[1], const[0].wwww; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0], const[0].zzzz; 7: ADD temp[1].y, temp[1], -const[0].zzzz; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].y, temp[0], const[0].wwww; 11: ADD temp[1].y, temp[1], -const[0].wwww; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[2], input[0]; 1: MOV output[1], input[1]; 2: MAD temp[0], const[0], const[1].xxyy, input[1]; 3: MAD temp[1], const[0], const[1].zzyy, input[1]; 4: MOV output[2], temp[0]; 5: MOV output[3], temp[1]; 6: ADD temp[0].y, temp[0]._y__, const[0]._z__; 7: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 8: MOV output[4], temp[0]; 9: MOV output[5], temp[1]; 10: ADD temp[0].y, temp[0]._y__, 
const[0]._w__; 11: ADD temp[1].y, temp[1]._y__, -const[0]._w__; 12: MOV output[6], temp[0]; 13: MOV output[7], temp[1]; 14: MOV output[0], temp[2]; 15: MOV output[8], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].y, temp[0]._y__, const[0]._w__; 10: ADD temp[1].y, temp[1]._y__, -const[0]._w__; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].y, temp[0]._y__, const[0]._w__; 10: ADD temp[1].y, temp[1]._y__, -const[0]._w__; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].y, temp[0]._y__, const[0]._w__; 10: ADD temp[1].y, temp[1]._y__, -const[0]._w__; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; 
Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].y, temp[0]._y__, const[0]._w__; 10: ADD temp[1].y, temp[1]._y__, -const[0]._w__; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MAD temp[0], const[0], none.1100, input[1]; 2: MAD temp[1], const[0], none.-1-100, input[1]; 3: MOV output[2], temp[0]; 4: MOV output[3], temp[1]; 5: ADD temp[0].y, temp[0]._y__, const[0]._z__; 6: ADD temp[1].y, temp[1]._y__, -const[0]._z__; 7: MOV output[4], temp[0]; 8: MOV output[5], temp[1]; 9: ADD temp[0].y, temp[0]._y__, const[0]._w__; 10: ADD temp[1].y, temp[1]._y__, -const[0]._w__; 11: MOV output[6], temp[0]; 12: MOV output[7], temp[1]; 13: MOV output[0], input[0]; 14: MOV output[8], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0125a000 reg: 0t swiz: 1/ 1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 2: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x0725a000 reg: 0t swiz: -1/-1/ 0/ 0 src2: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W 3: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 
0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 5: op: 0x00200003 dst: 0t op: VE_ADD src0: 0x01f9e000 reg: 0t swiz: U/ Y/ U/ U src1: 0x01fae002 reg: 0c swiz: U/ Z/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 6: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x01f9e020 reg: 1t swiz: U/ Y/ U/ U src1: 0x1ffae002 reg: 0c swiz: -U/-Z/-U/-U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 7: op: 0x00f08203 dst: 4o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00200003 dst: 0t op: VE_ADD src0: 0x01f9e000 reg: 0t swiz: U/ Y/ U/ U src1: 0x01fbe002 reg: 0c swiz: U/ W/ U/ U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 10: op: 0x00202003 dst: 1t op: VE_ADD src0: 0x01f9e020 reg: 1t swiz: U/ Y/ U/ U src1: 0x1ffbe002 reg: 0c swiz: -U/-W/-U/-U src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 11: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 12: op: 0x00f0e203 dst: 7o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 13: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 14: op: 0x00f10203 dst: 8o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 15 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL 
IN[1], GENERIC[1], PERSPECTIVE DCL IN[2], GENERIC[2], PERSPECTIVE DCL IN[3], GENERIC[3], PERSPECTIVE DCL IN[4], GENERIC[4], PERSPECTIVE DCL IN[5], GENERIC[5], PERSPECTIVE DCL IN[6], GENERIC[6], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0..5] DCL TEMP[0..4] 0: TEX TEMP[0], IN[0], SAMP[0], 2D 1: MUL TEMP[0], TEMP[0], CONST[0].xxxx 2: TEX TEMP[1], IN[1], SAMP[0], 2D 3: TEX TEMP[2], IN[2], SAMP[0], 2D 4: ADD TEMP[1], TEMP[1], TEMP[2] 5: MAD TEMP[0], TEMP[1], CONST[0].yyyy, TEMP[0] 6: TEX TEMP[1], IN[3], SAMP[0], 2D 7: TEX TEMP[2], IN[4], SAMP[0], 2D 8: ADD TEMP[1], TEMP[1], TEMP[2] 9: MAD TEMP[0], TEMP[1], CONST[0].zzzz, TEMP[0] 10: TEX TEMP[1], IN[5], SAMP[0], 2D 11: TEX TEMP[2], IN[6], SAMP[0], 2D 12: ADD TEMP[1], TEMP[1], TEMP[2] 13: MAD TEMP[0], TEMP[1], CONST[0].wwww, TEMP[0] 14: ADD TEMP[3], IN[0], CONST[1] 15: SUB TEMP[4], IN[0], CONST[1] 16: TEX TEMP[1], TEMP[3], SAMP[0], 2D 17: TEX TEMP[2], TEMP[4], SAMP[0], 2D 18: ADD TEMP[1], TEMP[1], TEMP[2] 19: MAD TEMP[0], TEMP[1], CONST[2].xxxx, TEMP[0] 20: ADD TEMP[3], IN[0], CONST[3] 21: SUB TEMP[4], IN[0], CONST[3] 22: TEX TEMP[1], TEMP[3], SAMP[0], 2D 23: TEX TEMP[2], TEMP[4], SAMP[0], 2D 24: ADD TEMP[1], TEMP[1], TEMP[2] 25: MAD TEMP[0], TEMP[1], CONST[2].yyyy, TEMP[0] 26: ADD TEMP[3], IN[0], CONST[4] 27: SUB TEMP[4], IN[0], CONST[4] 28: TEX TEMP[1], TEMP[3], SAMP[0], 2D 29: TEX TEMP[2], TEMP[4], SAMP[0], 2D 30: ADD TEMP[1], TEMP[1], TEMP[2] 31: MAD TEMP[0], TEMP[1], CONST[2].zzzz, TEMP[0] 32: ADD TEMP[3], IN[0], CONST[5] 33: SUB TEMP[4], IN[0], CONST[5] 34: TEX TEMP[1], TEMP[3], SAMP[0], 2D 35: TEX TEMP[2], TEMP[4], SAMP[0], 2D 36: ADD TEMP[1], TEMP[1], TEMP[2] 37: MAD OUT[0], TEMP[1], CONST[2].wwww, TEMP[0] 38: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX 
temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], 
const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], 
const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], 
input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: SUB temp[4], input[0], const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, 
temp[0]; 20: ADD temp[3], input[0], const[3]; 21: SUB temp[4], input[0], const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: SUB temp[4], input[0], const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: SUB temp[4], input[0], const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[2], 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3], 2D[0]; 7: TEX temp[2], input[4], 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5], 2D[0]; 11: TEX temp[2], input[6], 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3], input[0], const[1]; 15: ADD temp[4], input[0], -const[1]; 16: TEX temp[1], temp[3], 2D[0]; 17: TEX temp[2], temp[4], 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3], input[0], const[3]; 21: ADD temp[4], input[0], -const[3]; 22: TEX temp[1], temp[3], 2D[0]; 23: TEX temp[2], temp[4], 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3], input[0], const[4]; 27: ADD temp[4], input[0], -const[4]; 28: TEX temp[1], temp[3], 2D[0]; 29: TEX temp[2], temp[4], 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3], input[0], const[5]; 33: ADD 
temp[4], input[0], -const[5]; 34: TEX temp[1], temp[3], 2D[0]; 35: TEX temp[2], temp[4], 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MUL temp[0], temp[0], const[0].xxxx; 2: TEX temp[1], input[1].xy__, 2D[0]; 3: TEX temp[2], input[2].xy__, 2D[0]; 4: ADD temp[1], temp[1], temp[2]; 5: MAD temp[0], temp[1], const[0].yyyy, temp[0]; 6: TEX temp[1], input[3].xy__, 2D[0]; 7: TEX temp[2], input[4].xy__, 2D[0]; 8: ADD temp[1], temp[1], temp[2]; 9: MAD temp[0], temp[1], const[0].zzzz, temp[0]; 10: TEX temp[1], input[5].xy__, 2D[0]; 11: TEX temp[2], input[6].xy__, 2D[0]; 12: ADD temp[1], temp[1], temp[2]; 13: MAD temp[0], temp[1], const[0].wwww, temp[0]; 14: ADD temp[3].xy, input[0].xy__, const[1].xy__; 15: ADD temp[4].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[1], temp[3].xy__, 2D[0]; 17: TEX temp[2], temp[4].xy__, 2D[0]; 18: ADD temp[1], temp[1], temp[2]; 19: MAD temp[0], temp[1], const[2].xxxx, temp[0]; 20: ADD temp[3].xy, input[0].xy__, const[3].xy__; 21: ADD temp[4].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[1], temp[3].xy__, 2D[0]; 23: TEX temp[2], temp[4].xy__, 2D[0]; 24: ADD temp[1], temp[1], temp[2]; 25: MAD temp[0], temp[1], const[2].yyyy, temp[0]; 26: ADD temp[3].xy, input[0].xy__, const[4].xy__; 27: ADD temp[4].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[1], temp[3].xy__, 2D[0]; 29: TEX temp[2], temp[4].xy__, 2D[0]; 30: ADD temp[1], temp[1], temp[2]; 31: MAD temp[0], temp[1], const[2].zzzz, temp[0]; 32: ADD temp[3].xy, input[0].xy__, const[5].xy__; 33: ADD temp[4].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[1], temp[3].xy__, 2D[0]; 35: TEX temp[2], temp[4].xy__, 2D[0]; 36: ADD temp[1], temp[1], temp[2]; 37: MAD output[0], temp[1], const[2].wwww, temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 
2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], 
temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD 
temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, 
temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: MUL temp[6], temp[5], const[0].xxxx; 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: ADD temp[9], temp[7], temp[8]; 5: MAD temp[10], temp[9], const[0].yyyy, temp[6]; 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: ADD temp[13], temp[11], temp[12]; 9: MAD temp[14], temp[13], const[0].zzzz, temp[10]; 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: ADD temp[17], temp[15], temp[16]; 13: MAD temp[18], temp[17], const[0].wwww, temp[14]; 14: ADD temp[19].xy, input[0].xy__, const[1].xy__; 15: ADD temp[20].xy, input[0].xy__, -const[1].xy__; 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: ADD temp[23], temp[21], temp[22]; 19: MAD temp[24], temp[23], const[2].xxxx, temp[18]; 20: ADD temp[25].xy, input[0].xy__, const[3].xy__; 21: ADD temp[26].xy, input[0].xy__, -const[3].xy__; 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: ADD temp[29], temp[27], temp[28]; 25: MAD temp[30], temp[29], 
const[2].yyyy, temp[24]; 26: ADD temp[31].xy, input[0].xy__, const[4].xy__; 27: ADD temp[32].xy, input[0].xy__, -const[4].xy__; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: ADD temp[35], temp[33], temp[34]; 31: MAD temp[36], temp[35], const[2].zzzz, temp[30]; 32: ADD temp[37].xy, input[0].xy__, const[5].xy__; 33: ADD temp[38].xy, input[0].xy__, -const[5].xy__; 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: ADD temp[41], temp[39], temp[40]; 37: MAD output[0], temp[41], const[2].wwww, temp[36]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[5], input[0].xy__, 2D[0]; 1: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0] MAD temp[6].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[6].w, src0.w, src1.x, src0.0 2: TEX temp[7], input[1].xy__, 2D[0]; 3: TEX temp[8], input[2].xy__, 2D[0]; 4: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[8], src1.w = temp[8] MAD temp[9].xyz, src0.xyz, src0.111, src1.xyz MAD temp[9].w, src0.w, src0.1, src1.w 5: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = const[0], src1.w = temp[6], src2.xyz = temp[6] MAD temp[10].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[10].w, src0.w, src1.y, src1.w 6: TEX temp[11], input[3].xy__, 2D[0]; 7: TEX temp[12], input[4].xy__, 2D[0]; 8: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = temp[12], src1.w = temp[12] MAD temp[13].xyz, src0.xyz, src0.111, src1.xyz MAD temp[13].w, src0.w, src0.1, src1.w 9: src0.xyz = temp[13], src0.w = temp[13], src1.xyz = const[0], src1.w = temp[10], src2.xyz = temp[10] MAD temp[14].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[14].w, src0.w, src1.z, src1.w 10: TEX temp[15], input[5].xy__, 2D[0]; 11: TEX temp[16], input[6].xy__, 2D[0]; 12: src0.xyz = temp[15], src0.w = temp[15], src1.xyz = temp[16], src1.w = temp[16] MAD temp[17].xyz, src0.xyz, src0.111, src1.xyz MAD temp[17].w, src0.w, src0.1, src1.w 13: src0.xyz = temp[17], src0.w = temp[17], src1.xyz = temp[14], 
src1.w = const[0], src2.w = temp[14] MAD temp[18].xyz, src0.xyz, src1.www, src1.xyz MAD temp[18].w, src0.w, src1.w, src2.w 14: src0.xyz = input[0], src1.xyz = const[1] MAD temp[19].xy, src0.xy_, src0.111, src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[1] MAD temp[20].xy, src0.xy_, src0.111, -src1.xy_ 16: TEX temp[21], temp[19].xy__, 2D[0]; 17: TEX temp[22], temp[20].xy__, 2D[0]; 18: src0.xyz = temp[21], src0.w = temp[21], src1.xyz = temp[22], src1.w = temp[22] MAD temp[23].xyz, src0.xyz, src0.111, src1.xyz MAD temp[23].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[23], src0.w = temp[23], src1.xyz = const[2], src1.w = temp[18], src2.xyz = temp[18] MAD temp[24].xyz, src0.xyz, src1.xxx, src2.xyz MAD temp[24].w, src0.w, src1.x, src1.w 20: src0.xyz = input[0], src1.xyz = const[3] MAD temp[25].xy, src0.xy_, src0.111, src1.xy_ 21: src0.xyz = input[0], src1.xyz = const[3] MAD temp[26].xy, src0.xy_, src0.111, -src1.xy_ 22: TEX temp[27], temp[25].xy__, 2D[0]; 23: TEX temp[28], temp[26].xy__, 2D[0]; 24: src0.xyz = temp[27], src0.w = temp[27], src1.xyz = temp[28], src1.w = temp[28] MAD temp[29].xyz, src0.xyz, src0.111, src1.xyz MAD temp[29].w, src0.w, src0.1, src1.w 25: src0.xyz = temp[29], src0.w = temp[29], src1.xyz = const[2], src1.w = temp[24], src2.xyz = temp[24] MAD temp[30].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[30].w, src0.w, src1.y, src1.w 26: src0.xyz = input[0], src1.xyz = const[4] MAD temp[31].xy, src0.xy_, src0.111, src1.xy_ 27: src0.xyz = input[0], src1.xyz = const[4] MAD temp[32].xy, src0.xy_, src0.111, -src1.xy_ 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: src0.xyz = temp[33], src0.w = temp[33], src1.xyz = temp[34], src1.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.111, src1.xyz MAD temp[35].w, src0.w, src0.1, src1.w 31: src0.xyz = temp[35], src0.w = temp[35], src1.xyz = const[2], src1.w = temp[30], src2.xyz = temp[30] MAD temp[36].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[36].w, src0.w, src1.z, src1.w 32: 
src0.xyz = input[0], src1.xyz = const[5] MAD temp[37].xy, src0.xy_, src0.111, src1.xy_ 33: src0.xyz = input[0], src1.xyz = const[5] MAD temp[38].xy, src0.xy_, src0.111, -src1.xy_ 34: TEX temp[39], temp[37].xy__, 2D[0]; 35: TEX temp[40], temp[38].xy__, 2D[0]; 36: src0.xyz = temp[39], src0.w = temp[39], src1.xyz = temp[40], src1.w = temp[40] MAD temp[41].xyz, src0.xyz, src0.111, src1.xyz MAD temp[41].w, src0.w, src0.1, src1.w 37: src0.xyz = temp[41], src0.w = temp[41], src1.xyz = temp[36], src1.w = const[2], src2.w = temp[36] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src1.xyz = const[1] MAD temp[19].xy, src0.xy_, src0.111, src1.xy_ 1: BEGIN_TEX; 2: TEX temp[5], input[0].xy__, 2D[0]; 3: TEX temp[7], input[1].xy__, 2D[0]; 4: TEX temp[8], input[2].xy__, 2D[0]; 5: TEX temp[11], input[3].xy__, 2D[0]; 6: TEX temp[12], input[4].xy__, 2D[0]; 7: TEX temp[15], input[5].xy__, 2D[0]; 8: TEX temp[16], input[6].xy__, 2D[0]; 9: TEX temp[21], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 10: src0.xyz = input[0], src1.xyz = const[1] MAD temp[20].xy, src0.xy_, src0.111, -src1.xy_ 11: src0.xyz = input[0], src1.xyz = const[3] MAD temp[25].xy, src0.xy_, src0.111, src1.xy_ 12: src0.xyz = input[0], src1.xyz = const[3] MAD temp[26].xy, src0.xy_, src0.111, -src1.xy_ 13: src0.xyz = input[0], src1.xyz = const[4] MAD temp[31].xy, src0.xy_, src0.111, src1.xy_ 14: src0.xyz = input[0], src1.xyz = const[4] MAD temp[32].xy, src0.xy_, src0.111, -src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[5] MAD temp[37].xy, src0.xy_, src0.111, src1.xy_ 16: src0.xyz = input[0], src1.xyz = const[5] MAD temp[38].xy, src0.xy_, src0.111, -src1.xy_ 17: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[8], src1.w = temp[8] SEM_WAIT MAD temp[9].xyz, src0.xyz, src0.111, src1.xyz MAD temp[9].w, src0.w, src0.1, src1.w 18: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = temp[12], 
src1.w = temp[12] MAD temp[13].xyz, src0.xyz, src0.111, src1.xyz MAD temp[13].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[15], src0.w = temp[15], src1.xyz = temp[16], src1.w = temp[16] MAD temp[17].xyz, src0.xyz, src0.111, src1.xyz MAD temp[17].w, src0.w, src0.1, src1.w 20: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0] MAD temp[6].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[6].w, src0.w, src1.x, src0.0 21: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = const[0], src1.w = temp[6], src2.xyz = temp[6] MAD temp[10].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[10].w, src0.w, src1.y, src1.w 22: src0.xyz = temp[13], src0.w = temp[13], src1.xyz = const[0], src1.w = temp[10], src2.xyz = temp[10] MAD temp[14].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[14].w, src0.w, src1.z, src1.w 23: src0.xyz = temp[17], src0.w = temp[17], src1.xyz = temp[14], src1.w = const[0], src2.w = temp[14] MAD temp[18].xyz, src0.xyz, src1.www, src1.xyz MAD temp[18].w, src0.w, src1.w, src2.w 24: BEGIN_TEX; 25: TEX temp[22], temp[20].xy__, 2D[0]; 26: TEX temp[27], temp[25].xy__, 2D[0]; 27: TEX temp[28], temp[26].xy__, 2D[0]; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: TEX temp[39], temp[37].xy__, 2D[0]; 31: TEX temp[40], temp[38].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 32: src0.xyz = temp[21], src0.w = temp[21], src1.xyz = temp[22], src1.w = temp[22] SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.111, src1.xyz MAD temp[23].w, src0.w, src0.1, src1.w 33: src0.xyz = temp[27], src0.w = temp[27], src1.xyz = temp[28], src1.w = temp[28] MAD temp[29].xyz, src0.xyz, src0.111, src1.xyz MAD temp[29].w, src0.w, src0.1, src1.w 34: src0.xyz = temp[33], src0.w = temp[33], src1.xyz = temp[34], src1.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.111, src1.xyz MAD temp[35].w, src0.w, src0.1, src1.w 35: src0.xyz = temp[39], src0.w = temp[39], src1.xyz = temp[40], src1.w = temp[40] MAD temp[41].xyz, src0.xyz, src0.111, src1.xyz MAD temp[41].w, src0.w, src0.1, src1.w 36: src0.xyz = 
temp[23], src0.w = temp[23], src1.xyz = const[2], src1.w = temp[18], src2.xyz = temp[18] MAD temp[24].xyz, src0.xyz, src1.xxx, src2.xyz MAD temp[24].w, src0.w, src1.x, src1.w 37: src0.xyz = temp[29], src0.w = temp[29], src1.xyz = const[2], src1.w = temp[24], src2.xyz = temp[24] MAD temp[30].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[30].w, src0.w, src1.y, src1.w 38: src0.xyz = temp[35], src0.w = temp[35], src1.xyz = const[2], src1.w = temp[30], src2.xyz = temp[30] MAD temp[36].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[36].w, src0.w, src1.z, src1.w 39: src0.xyz = temp[41], src0.w = temp[41], src1.xyz = temp[36], src1.w = const[2], src2.w = temp[36] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w Fragment Program: after 'dead sources' # Radeon Compiler Program 0: src0.xyz = input[0], src1.xyz = const[1] MAD temp[19].xy, src0.xy_, src0.111, src1.xy_ 1: BEGIN_TEX; 2: TEX temp[5], input[0].xy__, 2D[0]; 3: TEX temp[7], input[1].xy__, 2D[0]; 4: TEX temp[8], input[2].xy__, 2D[0]; 5: TEX temp[11], input[3].xy__, 2D[0]; 6: TEX temp[12], input[4].xy__, 2D[0]; 7: TEX temp[15], input[5].xy__, 2D[0]; 8: TEX temp[16], input[6].xy__, 2D[0]; 9: TEX temp[21], temp[19].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 10: src0.xyz = input[0], src1.xyz = const[1] MAD temp[20].xy, src0.xy_, src0.111, -src1.xy_ 11: src0.xyz = input[0], src1.xyz = const[3] MAD temp[25].xy, src0.xy_, src0.111, src1.xy_ 12: src0.xyz = input[0], src1.xyz = const[3] MAD temp[26].xy, src0.xy_, src0.111, -src1.xy_ 13: src0.xyz = input[0], src1.xyz = const[4] MAD temp[31].xy, src0.xy_, src0.111, src1.xy_ 14: src0.xyz = input[0], src1.xyz = const[4] MAD temp[32].xy, src0.xy_, src0.111, -src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[5] MAD temp[37].xy, src0.xy_, src0.111, src1.xy_ 16: src0.xyz = input[0], src1.xyz = const[5] MAD temp[38].xy, src0.xy_, src0.111, -src1.xy_ 17: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[8], src1.w = temp[8] SEM_WAIT MAD temp[9].xyz, src0.xyz, 
src0.111, src1.xyz MAD temp[9].w, src0.w, src0.1, src1.w 18: src0.xyz = temp[11], src0.w = temp[11], src1.xyz = temp[12], src1.w = temp[12] MAD temp[13].xyz, src0.xyz, src0.111, src1.xyz MAD temp[13].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[15], src0.w = temp[15], src1.xyz = temp[16], src1.w = temp[16] MAD temp[17].xyz, src0.xyz, src0.111, src1.xyz MAD temp[17].w, src0.w, src0.1, src1.w 20: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = const[0] MAD temp[6].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[6].w, src0.w, src1.x, src0.0 21: src0.xyz = temp[9], src0.w = temp[9], src1.xyz = const[0], src1.w = temp[6], src2.xyz = temp[6] MAD temp[10].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[10].w, src0.w, src1.y, src1.w 22: src0.xyz = temp[13], src0.w = temp[13], src1.xyz = const[0], src1.w = temp[10], src2.xyz = temp[10] MAD temp[14].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[14].w, src0.w, src1.z, src1.w 23: src0.xyz = temp[17], src0.w = temp[17], src1.xyz = temp[14], src1.w = const[0], src2.w = temp[14] MAD temp[18].xyz, src0.xyz, src1.www, src1.xyz MAD temp[18].w, src0.w, src1.w, src2.w 24: BEGIN_TEX; 25: TEX temp[22], temp[20].xy__, 2D[0]; 26: TEX temp[27], temp[25].xy__, 2D[0]; 27: TEX temp[28], temp[26].xy__, 2D[0]; 28: TEX temp[33], temp[31].xy__, 2D[0]; 29: TEX temp[34], temp[32].xy__, 2D[0]; 30: TEX temp[39], temp[37].xy__, 2D[0]; 31: TEX temp[40], temp[38].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 32: src0.xyz = temp[21], src0.w = temp[21], src1.xyz = temp[22], src1.w = temp[22] SEM_WAIT MAD temp[23].xyz, src0.xyz, src0.111, src1.xyz MAD temp[23].w, src0.w, src0.1, src1.w 33: src0.xyz = temp[27], src0.w = temp[27], src1.xyz = temp[28], src1.w = temp[28] MAD temp[29].xyz, src0.xyz, src0.111, src1.xyz MAD temp[29].w, src0.w, src0.1, src1.w 34: src0.xyz = temp[33], src0.w = temp[33], src1.xyz = temp[34], src1.w = temp[34] MAD temp[35].xyz, src0.xyz, src0.111, src1.xyz MAD temp[35].w, src0.w, src0.1, src1.w 35: src0.xyz = temp[39], src0.w = temp[39], src1.xyz = 
temp[40], src1.w = temp[40] MAD temp[41].xyz, src0.xyz, src0.111, src1.xyz MAD temp[41].w, src0.w, src0.1, src1.w 36: src0.xyz = temp[23], src0.w = temp[23], src1.xyz = const[2], src1.w = temp[18], src2.xyz = temp[18] MAD temp[24].xyz, src0.xyz, src1.xxx, src2.xyz MAD temp[24].w, src0.w, src1.x, src1.w 37: src0.xyz = temp[29], src0.w = temp[29], src1.xyz = const[2], src1.w = temp[24], src2.xyz = temp[24] MAD temp[30].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[30].w, src0.w, src1.y, src1.w 38: src0.xyz = temp[35], src0.w = temp[35], src1.xyz = const[2], src1.w = temp[30], src2.xyz = temp[30] MAD temp[36].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[36].w, src0.w, src1.z, src1.w 39: src0.xyz = temp[41], src0.w = temp[41], src1.xyz = temp[36], src1.w = const[2], src2.w = temp[36] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = input[0], src1.xyz = const[1] MAD temp[7].xy, src0.xy_, src0.11_, src1.xy_ 1: BEGIN_TEX; 2: TEX temp[8], input[0].xy__, 2D[0]; 3: TEX temp[1], input[1].xy__, 2D[0]; 4: TEX temp[2], input[2].xy__, 2D[0]; 5: TEX temp[3], input[3].xy__, 2D[0]; 6: TEX temp[4], input[4].xy__, 2D[0]; 7: TEX temp[5], input[5].xy__, 2D[0]; 8: TEX temp[6], input[6].xy__, 2D[0]; 9: TEX temp[7], temp[7].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 10: src0.xyz = input[0], src1.xyz = const[1] MAD temp[9].xy, src0.xy_, src0.11_, -src1.xy_ 11: src0.xyz = input[0], src1.xyz = const[3] MAD temp[10].xy, src0.xy_, src0.11_, src1.xy_ 12: src0.xyz = input[0], src1.xyz = const[3] MAD temp[11].xy, src0.xy_, src0.11_, -src1.xy_ 13: src0.xyz = input[0], src1.xyz = const[4] MAD temp[12].xy, src0.xy_, src0.11_, src1.xy_ 14: src0.xyz = input[0], src1.xyz = const[4] MAD temp[13].xy, src0.xy_, src0.11_, -src1.xy_ 15: src0.xyz = input[0], src1.xyz = const[5] MAD temp[14].xy, src0.xy_, src0.11_, src1.xy_ 16: src0.xyz = input[0], src1.xyz = const[5] MAD temp[0].xy, src0.xy_, src0.11_, 
-src1.xy_ 17: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = temp[2], src1.w = temp[2] SEM_WAIT MAD temp[1].xyz, src0.xyz, src0.111, src1.xyz MAD temp[0].w, src0.w, src0.1, src1.w 18: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = temp[4], src1.w = temp[4] MAD temp[2].xyz, src0.xyz, src0.111, src1.xyz MAD temp[1].w, src0.w, src0.1, src1.w 19: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] MAD temp[3].xyz, src0.xyz, src0.111, src1.xyz MAD temp[2].w, src0.w, src0.1, src1.w 20: src0.xyz = temp[8], src0.w = temp[8], src1.xyz = const[0] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD temp[3].w, src0.w, src1.x, src0.0 21: src0.xyz = temp[1], src0.w = temp[0], src1.xyz = const[0], src1.w = temp[3], src2.xyz = temp[4] MAD temp[1].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[0].w, src0.w, src1.y, src1.w 22: src0.xyz = temp[2], src0.w = temp[1], src1.xyz = const[0], src1.w = temp[0], src2.xyz = temp[1] MAD temp[1].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[0].w, src0.w, src1.z, src1.w 23: src0.xyz = temp[3], src0.w = temp[2], src1.xyz = temp[1], src1.w = const[0], src2.w = temp[0] MAD temp[1].xyz, src0.xyz, src1.www, src1.xyz MAD temp[0].w, src0.w, src1.w, src2.w 24: BEGIN_TEX; 25: TEX temp[2], temp[9].xy__, 2D[0]; 26: TEX temp[3], temp[10].xy__, 2D[0]; 27: TEX temp[4], temp[11].xy__, 2D[0]; 28: TEX temp[5], temp[12].xy__, 2D[0]; 29: TEX temp[6], temp[13].xy__, 2D[0]; 30: TEX temp[8], temp[14].xy__, 2D[0]; 31: TEX temp[9], temp[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 32: src0.xyz = temp[7], src0.w = temp[7], src1.xyz = temp[2], src1.w = temp[2] SEM_WAIT MAD temp[0].xyz, src0.xyz, src0.111, src1.xyz MAD temp[1].w, src0.w, src0.1, src1.w 33: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = temp[4], src1.w = temp[4] MAD temp[2].xyz, src0.xyz, src0.111, src1.xyz MAD temp[2].w, src0.w, src0.1, src1.w 34: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6], src1.w = temp[6] MAD temp[3].xyz, src0.xyz, src0.111, src1.xyz MAD temp[3].w, src0.w, 
src0.1, src1.w 35: src0.xyz = temp[8], src0.w = temp[8], src1.xyz = temp[9], src1.w = temp[9] MAD temp[4].xyz, src0.xyz, src0.111, src1.xyz MAD temp[4].w, src0.w, src0.1, src1.w 36: src0.xyz = temp[0], src0.w = temp[1], src1.xyz = const[2], src1.w = temp[0], src2.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xxx, src2.xyz MAD temp[0].w, src0.w, src1.x, src1.w 37: src0.xyz = temp[2], src0.w = temp[2], src1.xyz = const[2], src1.w = temp[0], src2.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.yyy, src2.xyz MAD temp[0].w, src0.w, src1.y, src1.w 38: src0.xyz = temp[3], src0.w = temp[3], src1.xyz = const[2], src1.w = temp[0], src2.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.zzz, src2.xyz MAD temp[0].w, src0.w, src1.z, src1.w 39: src0.xyz = temp[4], src0.w = temp[4], src1.xyz = temp[0], src1.w = const[2], src2.w = temp[0] MAD color[0].xyz, src0.xyz, src1.www, src1.xyz MAD color[0].w, src0.w, src1.w, src2.w R500 Fragment Program: -------- 0 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040400:Addr0: 0t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00421070:MAD dest:7 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 1 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe408f400: src: 0 R/G/A/A dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe402f402: src: 2 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 4 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe403f403: 
src: 3 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 5 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe404f404: src: 4 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 6 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f405: src: 5 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 7 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe406f406: src: 6 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 8 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe407f407: src: 7 R/G/A/A dst: 7 R/G/B/A 3:TEX_DXDY: 0x00000000 9 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040400:Addr0: 0t, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c21090:MAD dest:9 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 10 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040c00:Addr0: 0t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004210a0:MAD dest:10 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08040c00:Addr0: 0t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c210b0:MAD dest:11 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 12 0:CMN_INST 
0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004210c0:MAD dest:12 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c210d0:MAD dest:13 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 14 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041400:Addr0: 0t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004210e0:MAD dest:14 rgb_C_src:1 R/G/0 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00001800:ALU wmask: RG omask: NONE 1:RGB_ADDR 0x08041400:Addr0: 0t, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009b0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c21000:MAD dest:0 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 16 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000801:Addr0: 1t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221010:MAD dest:1 rgb_C_src:1 
R/G/B 0 alp_C_src:1 A 0 17 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 18 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221030:MAD dest:3 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 19 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08040008:Addr0: 8t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x0008c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:1 R 0 targ 0 w:0 5 RGBA_INST: 0x20490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 20 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00440001:Addr0: 1t, Addr1: 0c, Addr2: 4t, srcp:0 2:ALPHA_ADDR 0x08000c00:Addr0: 0t, Addr1: 3t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x0028c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x1a222010:MAD dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 21 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00140002:Addr0: 2t, Addr1: 0c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x0048c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x1a222010:MAD 
dest:1 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 22 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000403:Addr0: 3t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x00040002:Addr0: 2t, Addr1: 0c, Addr2: 0t, srcp:0 3 RGB_INST: 0x006da220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 A/A/A 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c221010:MAD dest:1 rgb_C_src:1 R/G/B 0 alp_C_src:2 A 0 23 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe402f409: src: 9 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 24 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe403f40a: src: 10 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 25 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe404f40b: src: 11 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 26 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe405f40c: src: 12 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 27 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe406f40d: src: 13 R/G/A/A dst: 6 R/G/B/A 3:TEX_DXDY: 0x00000000 28 0:CMN_INST 0x00007803:TEX wmask: ARGB omask: NONE 1:TEX_INST: 0x00400000: id: 0 op:LD, , SCALED 2:TEX_ADDR: 0xe408f40e: src: 14 R/G/A/A dst: 8 R/G/B/A 3:TEX_DXDY: 0x00000000 29 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe409f400: src: 0 R/G/A/A dst: 9 R/G/B/A 3:TEX_DXDY: 0x00000000 30 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000807:Addr0: 7t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000807:Addr0: 7t, Addr1: 2t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c010:MAD dest:1 
alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221000:MAD dest:0 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 31 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001003:Addr0: 3t, Addr1: 4t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c020:MAD dest:2 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221020:MAD dest:2 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 32 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08001805:Addr0: 5t, Addr1: 6t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c030:MAD dest:3 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221030:MAD dest:3 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 33 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x08002408:Addr0: 8t, Addr1: 9t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08002408:Addr0: 8t, Addr1: 9t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c040:MAD dest:4 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x1a221040:MAD dest:4 rgb_C_src:1 R/G/B 0 alp_C_src:1 A 0 34 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00140800:Addr0: 0t, Addr1: 2c, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x0008c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 R 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 35 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00040802:Addr0: 2t, Addr1: 2c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000002:Addr0: 2t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 
ALPHA_INST:0x0028c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 G 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 36 0:CMN_INST 0x00007800:ALU wmask: ARGB omask: NONE 1:RGB_ADDR 0x00040803:Addr0: 3t, Addr1: 2c, Addr2: 0t, srcp:0 2:ALPHA_ADDR 0x08000003:Addr0: 3t, Addr1: 0t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00492220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 B/B/B 0 targ: 0 4 ALPHA_INST:0x0048c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 B 0 targ 0 w:0 5 RGBA_INST: 0x1a222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:1 A 0 37 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08000004:Addr0: 4t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x00040804:Addr0: 4t, Addr1: 2c, Addr2: 0t, srcp:0 3 RGB_INST: 0x006da220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 A/A/A 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x1c221000:MAD dest:0 rgb_C_src:1 R/G/B 0 alp_C_src:2 A 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 38 Instructions ~ 23 Vector Instructions (RGB) ~ 15 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 15 Texture Instructions ~ 0 Presub Operations ~ 0 OMOD Operations ~ 15 Temporary Registers ~ 0 Inline Literals ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[2], IN[2] 5: MOV_SAT OUT[1], IN[1] 6: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; 
Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[1]; 7: MOV output[3], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], 
input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[2], input[2]; 5: MOV_SAT output[1], input[1]; 6: MOV output[0], temp[0]; 7: MOV output[3], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 5: op: 0x01f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 
src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 6: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 8 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, COLOR DCL IN[1], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0] 0: TEX TEMP[0], IN[1], SAMP[0], RECT 1: MUL OUT[0], IN[0], TEMP[0] 2: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[1], RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, RECT[0]; 1: MUL output[0], input[0], temp[0]; Fragment Program: after 'register rename' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, RECT[0]; 
1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, RECT[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, RECT[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, RECT[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, RECT[0]; 1: MUL output[0], input[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[1].xy__, RECT[0]; 1: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, RECT[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, RECT[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[1].xy__, RECT[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = input[0], src0.w = input[0], src1.xyz = temp[1], src1.w = temp[1] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x0a400000: id: 0 op:LD, ACQ, UNSCALED 
2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08000400:Addr0: 0t, Addr1: 1t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'lower control flow opcodes' # 
Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0] 0: TEX TEMP[0], IN[0], SAMP[0], 2D 1: MUL OUT[0], TEMP[0], CONST[0] 2: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'transform IF' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[0].xy__, 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: 
after 'register rename' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MUL output[0], temp[1], const[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MUL output[0], temp[1], const[0]; Fragment Program: after 'inline literals' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MUL output[0], temp[1], const[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MUL output[0], temp[1], const[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MUL output[0], temp[1], const[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = const[0], src1.w = const[0] MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = const[0], src1.w = const[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'dead sources' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[1], src0.w = temp[1], src1.xyz = const[0], src1.w = const[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[0].xy__, 2D[0] SEM_WAIT SEM_ACQUIRE; 2: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[0], src1.w = const[0] SEM_WAIT MAD color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD color[0].w, src0.w, src1.w, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: 
ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0..1] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MUL TEMP[1], IN[0].xxxx, CONST[0] 5: MAD TEMP[1], IN[0].yyyy, CONST[1], TEMP[1] 6: MAD TEMP[1], IN[0].zzzz, CONST[2], TEMP[1] 7: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[1] 8: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[2], input[0].wwww, const[3], temp[0]; 4: MUL temp[1], input[0].xxxx, const[0]; 5: MAD temp[1], input[0].yyyy, const[1], temp[1]; 6: MAD temp[1], input[0].zzzz, const[2], temp[1]; 7: MAD temp[2], input[0].wwww, const[3], temp[1]; 8: MOV output[0], temp[2]; 9: MOV output[1], temp[2]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[2], input[0].wwww, const[3], temp[0]; 4: MUL temp[1], input[0].xxxx, const[0]; 5: MAD temp[1], input[0].yyyy, const[1], temp[1]; 6: MAD temp[1], input[0].zzzz, const[2], temp[1]; 7: MAD temp[2], input[0].wwww, const[3], temp[1]; 8: MOV output[0], temp[2]; 9: MOV 
output[1], temp[2]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[2], input[0].wwww, const[3], temp[0]; 4: MUL temp[1], input[0].xxxx, const[0]; 5: MAD temp[1], input[0].yyyy, const[1], temp[1]; 6: MAD temp[1], input[0].zzzz, const[2], temp[1]; 7: MAD temp[2], input[0].wwww, const[3], temp[1]; 8: MOV output[0], temp[2]; 9: MOV output[1], temp[2]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[0]; 1: MAD temp[1], input[0].yyyy, const[1], temp[1]; 2: MAD temp[1], input[0].zzzz, const[2], temp[1]; 3: MAD temp[2], input[0].wwww, const[3], temp[1]; 4: MOV output[0], temp[2]; 5: MOV output[1], temp[2]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[0]; 1: MAD temp[1], input[0].yyyy, const[1], temp[1]; 2: MAD temp[1], input[0].zzzz, const[2], temp[1]; 3: MAD temp[2], input[0].wwww, const[3], temp[1]; 4: MOV output[0], temp[2]; 5: MOV output[1], temp[2]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[0]; 1: MAD temp[1], input[0].yyyy, const[1], temp[1]; 2: MAD temp[1], input[0].zzzz, const[2], temp[1]; 3: MAD temp[2], input[0].wwww, const[3], temp[1]; 4: MOV output[0], temp[2]; 5: MOV output[1], temp[2]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], 
input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0] DCL CONST[2..6] DCL TEMP[0..2] IMM[0] FLT32 { 1.0000, 0.0000, 0.0000, 0.0000} 0: MUL TEMP[0], IN[0].xxxx, CONST[3] 1: MAD TEMP[0], IN[0].yyyy, CONST[4], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[5], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[6], TEMP[0] 4: MUL TEMP[1], IN[0].xxxx, 
CONST[3] 5: MAD TEMP[1], IN[0].yyyy, CONST[4], TEMP[1] 6: MAD TEMP[1], IN[0].zzzz, CONST[5], TEMP[1] 7: MAD OUT[0], IN[0].wwww, CONST[6], TEMP[1] 8: DP4 TEMP[2].x, IN[0], CONST[0] 9: SUB OUT[1].x, IMM[0].xxxx, TEMP[2].xxxx 10: MOV OUT[1].y, IMM[0].xxxx 11: MOV OUT[1].z, IMM[0].yyyy 12: MOV OUT[1].w, CONST[2].xxxx 13: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[3], input[0].wwww, const[6], temp[0]; 4: MUL temp[1], input[0].xxxx, const[3]; 5: MAD temp[1], input[0].yyyy, const[4], temp[1]; 6: MAD temp[1], input[0].zzzz, const[5], temp[1]; 7: MAD temp[3], input[0].wwww, const[6], temp[1]; 8: DP4 temp[2].x, input[0], const[0]; 9: SUB output[1].x, temp[0].1111, temp[2].xxxx; 10: MOV output[1].y, temp[0].1111; 11: MOV output[1].z, temp[0].0000; 12: MOV output[1].w, const[2].xxxx; 13: MOV output[0], temp[3]; 14: MOV output[2], temp[3]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[3], input[0].wwww, const[6], temp[0]; 4: MUL temp[1], input[0].xxxx, const[3]; 5: MAD temp[1], input[0].yyyy, const[4], temp[1]; 6: MAD temp[1], input[0].zzzz, const[5], temp[1]; 7: MAD temp[3], input[0].wwww, const[6], temp[1]; 8: DP4 temp[2].x, input[0], const[0]; 9: SUB output[1].x, temp[0].1111, temp[2].xxxx; 10: MOV output[1].y, temp[0].1111; 11: MOV output[1].z, temp[0].0000; 12: MOV output[1].w, const[2].xxxx; 13: MOV output[0], temp[3]; 14: MOV output[2], temp[3]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[3], input[0].wwww, const[6], temp[0]; 4: MUL temp[1], 
input[0].xxxx, const[3]; 5: MAD temp[1], input[0].yyyy, const[4], temp[1]; 6: MAD temp[1], input[0].zzzz, const[5], temp[1]; 7: MAD temp[3], input[0].wwww, const[6], temp[1]; 8: DP4 temp[2].x, input[0], const[0]; 9: ADD output[1].x, temp[0].1111, -temp[2].xxxx; 10: MOV output[1].y, temp[0].1111; 11: MOV output[1].z, temp[0].0000; 12: MOV output[1].w, const[2].xxxx; 13: MOV output[0], temp[3]; 14: MOV output[2], temp[3]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[3]; 1: MAD temp[1], input[0].yyyy, const[4], temp[1]; 2: MAD temp[1], input[0].zzzz, const[5], temp[1]; 3: MAD temp[3], input[0].wwww, const[6], temp[1]; 4: DP4 temp[2].x, input[0], const[0]; 5: ADD output[1].x, temp[0].1___, -temp[2].x___; 6: MOV output[1].y, temp[0]._1__; 7: MOV output[1].z, temp[0].__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[3]; 10: MOV output[2], temp[3]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[3]; 1: MAD temp[1], input[0].yyyy, const[4], temp[1]; 2: MAD temp[1], input[0].zzzz, const[5], temp[1]; 3: MAD temp[3], input[0].wwww, const[6], temp[1]; 4: DP4 temp[2].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[2].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[3]; 10: MOV output[2], temp[3]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[3]; 1: MAD temp[1], input[0].yyyy, const[4], temp[1]; 2: MAD temp[1], input[0].zzzz, const[5], temp[1]; 3: MAD temp[3], input[0].wwww, const[6], temp[1]; 4: DP4 temp[2].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[2].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[3]; 10: MOV output[2], temp[3]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL 
temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[0], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[0]; 10: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[0], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[0]; 10: MOV output[2], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[3]; 1: MAD temp[0], input[0].yyyy, const[4], temp[0]; 2: MAD temp[0], input[0].zzzz, const[5], temp[0]; 3: MAD temp[0], input[0].wwww, const[6], temp[0]; 4: DP4 temp[1].x, input[0], const[0]; 5: ADD output[1].x, none.1___, -temp[1].x___; 6: MOV output[1].y, none._1__; 7: MOV output[1].z, none.__0_; 8: MOV output[1].w, const[2].___x; 9: MOV output[0], temp[0]; 10: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x01248062 reg: 3c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: 
VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 5: op: 0x00102203 dst: 1o op: VE_ADD src0: 0x01ffa000 reg: 0t swiz: 1/ U/ U/ U src1: 0x1fff0020 reg: 1t swiz: -X/-U/-U/-U src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 6: op: 0x00202203 dst: 1o op: VE_ADD src0: 0x01fde000 reg: 0t swiz: U/ 1/ U/ U src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00402203 dst: 1o op: VE_ADD src0: 0x01e7e000 reg: 0t swiz: U/ U/ 0/ U src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00802203 dst: 1o op: VE_ADD src0: 0x003fe042 reg: 2c swiz: U/ U/ U/ X src1: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 9: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 10: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 11 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..5] DCL TEMP[0..2] 0: MUL TEMP[0], IN[0].xxxx, CONST[2] 1: MAD TEMP[0], IN[0].yyyy, CONST[3], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[4], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[5], TEMP[0] 4: MUL TEMP[1], IN[0].xxxx, CONST[2] 5: MAD TEMP[1], IN[0].yyyy, CONST[3], TEMP[1] 6: MAD TEMP[1], IN[0].zzzz, CONST[4], TEMP[1] 7: MAD OUT[0], IN[0].wwww, CONST[5], TEMP[1] 8: DP4 TEMP[2], CONST[0], IN[0] 9: SUB OUT[1], CONST[1].yyyy, TEMP[2] 
10: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[3], input[0].wwww, const[5], temp[0]; 4: MUL temp[1], input[0].xxxx, const[2]; 5: MAD temp[1], input[0].yyyy, const[3], temp[1]; 6: MAD temp[1], input[0].zzzz, const[4], temp[1]; 7: MAD temp[3], input[0].wwww, const[5], temp[1]; 8: DP4 temp[2], const[0], input[0]; 9: SUB output[1], const[1].yyyy, temp[2]; 10: MOV output[0], temp[3]; 11: MOV output[2], temp[3]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[3], input[0].wwww, const[5], temp[0]; 4: MUL temp[1], input[0].xxxx, const[2]; 5: MAD temp[1], input[0].yyyy, const[3], temp[1]; 6: MAD temp[1], input[0].zzzz, const[4], temp[1]; 7: MAD temp[3], input[0].wwww, const[5], temp[1]; 8: DP4 temp[2], const[0], input[0]; 9: SUB output[1], const[1].yyyy, temp[2]; 10: MOV output[0], temp[3]; 11: MOV output[2], temp[3]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[3], input[0].wwww, const[5], temp[0]; 4: MUL temp[1], input[0].xxxx, const[2]; 5: MAD temp[1], input[0].yyyy, const[3], temp[1]; 6: MAD temp[1], input[0].zzzz, const[4], temp[1]; 7: MAD temp[3], input[0].wwww, const[5], temp[1]; 8: DP4 temp[2], const[0], input[0]; 9: ADD output[1], const[1].yyyy, -temp[2]; 10: MOV output[0], temp[3]; 11: MOV output[2], temp[3]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[2]; 1: MAD temp[1], input[0].yyyy, const[3], temp[1]; 2: MAD temp[1], input[0].zzzz, const[4], temp[1]; 3: MAD temp[3], input[0].wwww, 
const[5], temp[1]; 4: DP4 temp[2], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[2]; 6: MOV output[0], temp[3]; 7: MOV output[2], temp[3]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[2]; 1: MAD temp[1], input[0].yyyy, const[3], temp[1]; 2: MAD temp[1], input[0].zzzz, const[4], temp[1]; 3: MAD temp[3], input[0].wwww, const[5], temp[1]; 4: DP4 temp[2], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[2]; 6: MOV output[0], temp[3]; 7: MOV output[2], temp[3]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[1], input[0].xxxx, const[2]; 1: MAD temp[1], input[0].yyyy, const[3], temp[1]; 2: MAD temp[1], input[0].zzzz, const[4], temp[1]; 3: MAD temp[3], input[0].wwww, const[5], temp[1]; 4: DP4 temp[2], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[2]; 6: MOV output[0], temp[3]; 7: MOV output[2], temp[3]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[0], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[0]; 7: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[0], input[0].wwww, const[5], temp[0]; 4: DP4 temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[0]; 7: MOV output[2], temp[0]; Vertex Program: after 'lower control flow opcodes' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[2]; 1: MAD temp[0], input[0].yyyy, const[3], temp[0]; 2: MAD temp[0], input[0].zzzz, const[4], temp[0]; 3: MAD temp[0], input[0].wwww, const[5], temp[0]; 4: DP4 
temp[1], const[0], input[0]; 5: ADD output[1], const[1].yyyy, -temp[1]; 6: MOV output[0], temp[0]; 7: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x01248042 reg: 2c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00492022 reg: 1c swiz: Y/ Y/ Y/ Y src1: 0x1ed10020 reg: 1t swiz: -X/-Y/-Z/-W src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 6: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 8 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~