Summary: | Failed to build shader (translation from TGSI) | ||
---|---|---|---|
Product: | Mesa | Reporter: | Enver Balalic <balalic.enver> |
Component: | Drivers/Gallium/r600 | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | normal | ||
Priority: | medium | CC: | balalic.enver, elia.argentieri, gw.fossdev, mirh, xavier.giannakopoulos |
Version: | 13.0 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
glxinfo
TGSI for failing shader Patch proposed on mesa-dev to work around too many temporaries |
With the R600_DEBUG=sbsafemath flag the game starts, but it still spams the console with the error. The skybox is not rendered and a pink box flickers on the screen.

Yes, I've got this issue too, on the new Godot engine (3.0 git). More info here: https://github.com/godotengine/godot/issues/8774

I've had the same error output when I was playing with the Unreal Editor on an HD6850 (BARTS) using the latest mesa-git. In order to track it down I added some debugging output to mesa and got the following trace that leads to the error:

r600/r600_asm.c:615 check_and_set_bank_swizzle - Couldn't find a working swizzle
drivers/r600/r600_shader.c:3975 tgsi_op2_s - Error in tgsi_op2_s, i = 3, r600_bytecode_add_alu returned -1
r600/r600_shader.c:3332 r600_shader_from_tgsi - Failed to build shader at ctx.inst_info->process, chip class: 6, opcode: 7 result: -1

Since I added output for every "return -1" in r600/r600_asm.c I also get lots of other messages, but I guess that these are normal. While debugging the code I also found that check_and_set_bank_swizzle exiting with "Couldn't find a working swizzle" is not necessarily an error, which means that I'm now a bit lost, because I'm not sure how to continue debugging this problem.

The same problem also seems to be discussed in these mails: http://mesa-dev.freedesktop.narkive.com/cHAXj1eT/bug-50338-radeon-tgsi-takes-more-than-two-cfiles-from-r600-shader
It is very likely that this is actually a duplicate of #50338.

I added yet more debugging output, and so far it seems that there is only one operation failing: a multiplication of two operands with a write mask of 0xF (see log below). I also tested gzdoom like the poster I referenced in my last comment, but on my machine it works fine.

r600_asm.c:1297 r600_bytecode_add_alu_type - check_and_set_bank_swizzle returned -1
r600_asm.c:1303 r600_bytecode_add_alu_type - slot[0]: op: 2, bank_swizzle:0 bank_swizzle_force: 0
r600_asm.c:1303 r600_bytecode_add_alu_type - slot[1]: op: 2, bank_swizzle:0 bank_swizzle_force: 0
r600_asm.c:1303 r600_bytecode_add_alu_type - slot[2]: op: 2, bank_swizzle:0 bank_swizzle_force: 0
r600_asm.c:1303 r600_bytecode_add_alu_type - slot[3]: op: 2, bank_swizzle:0 bank_swizzle_force: 0
r600_asm.c:1305 r600_bytecode_add_alu_type - slot[4] = 0: 0
r600_shader.c:3979 tgsi_op2_s - Error in tgsi_op2_s, i = 3, lasti = 3, r600_bytecode_add_alu returned -1
r600_shader.c:3981 tgsi_op2_s - op=2, num src registers: 2, write_mask=15
r600_shader.c:3983 tgsi_op2_s - alu.dst: {sel: 15, chan: 3, clamp: 0, write: 1, rel: 0
r600_shader.c:3986 tgsi_op2_s - alu.src0: {sel: 160, chan: 3, kc_bank:0
r600_shader.c:3988 tgsi_op2_s - alu.src1: {sel: 535, chan: 3, kc_bank:0
r600_shader.c:3332 r600_shader_from_tgsi - Failed to build shader at ctx.inst_info->process, chip class: 6, opcode: 7 result: -1
r600_shader.c:183 r600_pipe_shader_create - translation from TGSI failed !
r600_state_common.c:787 r600_shader_select - Failed to build shader variant (type=1) -1

The actual instruction failing is

MUL TEMP[11], CONST[26], CONST[23]

i.e. the multiplication of two constants. Now, just multiplying two constants/uniforms does not necessarily trigger the bug. With a simple shader program like

    uniform vec4 base_color;
    uniform vec4 test;
    uniform vec4 test2;
    uniform vec4 test3;
    void main() {
        vec4 h1 = base_color * test;
        vec4 h2 = test2 * test3;
        gl_FragColor = h1 * h2;
    }

for both const-const multiplications one constant is always addressed via a GPR, i.e. I get

1: MUL TEMP[0], CONST[0], CONST[1]
r600_shader.c:3986 tgsi_op2_s - About to multiply two constants
r600_shader.c:4000 tgsi_op2_s - ctx->src[0]:
  sel:7 // this is a GPR address
  swizzle:0 1 2 3 neg:0 abs:0 rel:0 kc_bank:0 kc_rel:0 value:0 0 0 0
r600_shader.c:4000 tgsi_op2_s - ctx->src[1]:
  sel:513 // this is a cfile address
  swizzle:0 1 2 3 neg:0 abs:0 rel:0 kc_bank:0 kc_rel:0 value:0 0 0 0

and then check_vector/reserve_cfile can successfully assign the read ports via cfile because only 4 values need to be read. However, for a more complicated shader I get the following:

250: MUL TEMP[11], CONST[26], CONST[23]
r600_shader.c:3986 tgsi_op2_s - About to multiply two constants
r600_shader.c:4000 tgsi_op2_s - ctx->src[0]:
  sel:160 // cfile kcache after translation
  swizzle:0 1 2 3 neg:0 abs:0 rel:0 kc_bank:0 kc_rel:0 value:0 0 0 0
r600_shader.c:4000 tgsi_op2_s - ctx->src[1]:
  sel:535 // cfile kcache before translation
  swizzle:0 1 2 3 neg:0 abs:0 rel:0 kc_bank:0 kc_rel:0 value:0 0 0 0
r600_asm.c:472 check_vector - bs->hw_cfile_addr:[-1 -1] bs->hw_cfile_elem: [-1 -1] bank_swizzle:0 num_src:2
r600_asm.c:494 check_vector - src 0: sel:160 elem:0
r600_asm.c:423 reserve_cfile - res=0: bs->hw_cfile_addr:-1 bs->hw_cfile_elem:-1 sel:160 chan:0
r600_asm.c:494 check_vector - src 1: sel:535 elem:0
r600_asm.c:423 reserve_cfile - res=0: bs->hw_cfile_addr:160 bs->hw_cfile_elem:0 sel:535 chan:0
r600_asm.c:423 reserve_cfile - res=1: bs->hw_cfile_addr:-1 bs->hw_cfile_elem:-1 sel:535 chan:0
r600_asm.c:472 check_vector - bs->hw_cfile_addr:[160 535] bs->hw_cfile_elem: [0 0] bank_swizzle:0 num_src:2
r600_asm.c:494 check_vector - src 0: sel:160 elem:1
r600_asm.c:423 reserve_cfile - res=0: bs->hw_cfile_addr:160 bs->hw_cfile_elem:0 sel:160 chan:0
r600_asm.c:494 check_vector - src 1: sel:535 elem:1
r600_asm.c:423 reserve_cfile - res=0: bs->hw_cfile_addr:160 bs->hw_cfile_elem:0 sel:535 chan:0
r600_asm.c:423 reserve_cfile - res=1: bs->hw_cfile_addr:535 bs->hw_cfile_elem:0 sel:535 chan:0
r600_asm.c:472 check_vector - bs->hw_cfile_addr:[160 535] bs->hw_cfile_elem: [0 0] bank_swizzle:0 num_src:2
r600_asm.c:494 check_vector - src 0: sel:160 elem:2
r600_asm.c:423 reserve_cfile - res=0: bs->hw_cfile_addr:160 bs->hw_cfile_elem:0 sel:160 chan:1
r600_asm.c:423 reserve_cfile - res=1: bs->hw_cfile_addr:535 bs->hw_cfile_elem:0 sel:160 chan:1
r600_asm.c:436 reserve_cfile - All cfile read ports are used, cannot reference vector element.

In summary, allocating a read port for elem >= 2 fails, because it would mean reading more than four values in one instruction group, and this is not possible according to section 4.7.5 of the AMD Evergreen-Family instruction set manual. The mesa code with the added debugging output can be found at: https://github.com/gerddie/mesa

It turns out that in r600_shader.c:tgsi_split_constant the constants should be moved to the GPR range, but for large shaders this is not sufficient, since the temporary registers used there may be beyond 127, which is the limit for GPRs. tgsi_split_constant doesn't move all constants, and if an operator uses the same constant as a source more than once, then one of the instances of the constant is moved to a new address, which may even be counterproductive.

A partial workaround for this bug is in the repo I've given above. The patch changes the register handling to reserve a few registers in the low range and moves constants there if necessary. Note, however, that it also contains additional debugging output. Also, for the instruction

LRP TEMP[0].xyz, CONST[31].wwww, CONST[31].xyzz, TEMP[0].xyzz

check_and_set_bank_swizzle still fails. (This is, by the way, one such case where tgsi_split_constant moves one of the instances of CONST[31] to another place.) I will try to correct tgsi_split_constant to not move the values around if they originally come from the same source and see whether this fixes the problem.
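
The read-port failure in the log above can be reproduced with a small stand-alone model. This is only a sketch inferred from the debugging output in this report, not the Mesa code itself: it assumes two cfile read ports, each holding one constant address and one channel pair (xy or zw), which is what the two-entry hw_cfile_addr/hw_cfile_elem arrays and the chan values in the log suggest. The names cfile_ports, reserve_port and first_failing_channel are hypothetical.

```c
#include <assert.h>

/* Assumption: two read ports, as the two-entry arrays in the log suggest. */
#define NUM_PORTS 2

struct cfile_ports {
    int addr[NUM_PORTS]; /* constant-file address held by the port, -1 if free */
    int pair[NUM_PORTS]; /* channel pair held by the port: 0 = xy, 1 = zw */
};

/* Reserve a port for reading channel `chan` of constant `sel`.
 * Returns 0 on success, -1 if every port already holds something else. */
static int reserve_port(struct cfile_ports *p, int sel, int chan)
{
    assert(chan >= 0 && chan < 4);
    int pair = chan / 2; /* the log shows elem 2 being reported as chan 1 */

    for (int i = 0; i < NUM_PORTS; i++) {
        if (p->addr[i] == -1) {          /* free port: take it */
            p->addr[i] = sel;
            p->pair[i] = pair;
            return 0;
        }
        if (p->addr[i] == sel && p->pair[i] == pair)
            return 0;                    /* this read is already reserved */
    }
    return -1;                           /* all read ports are used */
}

/* Try to reserve ports for a four-channel operation whose constant-file
 * sources are listed in `sels`. Returns the first channel that cannot be
 * served, or -1 if all four channels fit. */
static int first_failing_channel(const int *sels, int num_src)
{
    struct cfile_ports p = { { -1, -1 }, { -1, -1 } };

    for (int chan = 0; chan < 4; chan++)
        for (int s = 0; s < num_src; s++)
            if (reserve_port(&p, sels[s], chan) != 0)
                return chan;
    return -1;
}
```

In this model, the source pair {160, 535} from the failing MUL runs out of ports at channel 2, matching the "elem:2" failure in the log, while a single cfile source such as {513} (the case where the other operand already lives in a GPR) fits in all four channels.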
Well, it turns out that the shader simply uses too many registers. Since this is only tested at the end, at some point the indices of the temporaries used to store constants are beyond the GPR range, which makes translation from TGSI fail because it tries to use two or more cfile addresses in one instruction, and this is not allowed.

Created attachment 131567 [details]
TGSI for failing shader
This is the failing shader. For some reason 151 GPRs are allocated as TEMP, but only 40 actually appear as the source of an operation; the remaining ones are only ever targets and are discarded.
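
The observation above (temporaries that are written but never read) can be checked mechanically. The following is a minimal sketch under stated assumptions, not how Mesa analyses TGSI: the struct inst layout, MAX_TEMPS, and the function name are hypothetical, and each instruction is reduced to one destination temporary and up to three source temporaries.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TEMPS 256 /* hypothetical upper bound on temporary indices */
#define MAX_SRCS  3

/* Hypothetical, simplified instruction: indices refer to TEMP registers,
 * -1 means "not a temporary" (e.g. a constant or an input). */
struct inst {
    int dst;
    int src[MAX_SRCS];
    int num_src;
};

/* Count temporaries that are written at least once but never used as a
 * source anywhere in the program. */
static int count_write_only_temps(const struct inst *insts, size_t n)
{
    bool written[MAX_TEMPS] = { false };
    bool is_read[MAX_TEMPS] = { false };

    for (size_t i = 0; i < n; i++) {
        for (int s = 0; s < insts[i].num_src; s++) {
            int t = insts[i].src[s];
            if (t >= 0 && t < MAX_TEMPS)
                is_read[t] = true;
        }
        int d = insts[i].dst;
        if (d >= 0 && d < MAX_TEMPS)
            written[d] = true;
    }

    int count = 0;
    for (int t = 0; t < MAX_TEMPS; t++)
        if (written[t] && !is_read[t])
            count++;
    return count;
}
```

A temporary counted here never feeds another operation, so in principle it does not need a GPR of its own; for the attached shader the report gives 151 allocated temporaries against 40 that are actually read.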
Created attachment 131683 [details] [review]
Patch proposed on mesa-dev to work around too many temporaries

I tried your patch, but unfortunately it didn't solve my problem with the Godot engine. I don't get any error about GPR limits, just this:

EE r600_shader.c:190 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:816 r600_shader_select - Failed to build shader variant (type=1) -1

There must be another bug. Thank you for your effort.

Actually that patch was more of a bad hack. Try this new patch set that goes to the source of the problem: https://patchwork.freedesktop.org/series/26330/
For me it solved this "translation from TGSI" problem with large shaders in the GpuTest 0.7.0 piano and voloplosion benchmarks. Please note that after applying these patches you will have to set the environment variable MESA_GLSL_TO_TGSI_NEW_MERGE to activate the new code.

Now it works! I also had to set MESA_GLSL_CACHE_DISABLE to make it work; maybe it was picking up the old shader. Thank you very much.

So if the patch works, will it be merged? Or is it already merged, and if yes, in what version?

An updated version of the patch set is currently under review: https://patchwork.freedesktop.org/series/25594/

Fix applied and consolidated with c4741bbb6fb98f78551f9e42ae570dcc924e0031

Note that the series adds an extra optimisation pass. As such it's not suitable for stable. You'll have to use mesa from git until 17.3 is out.

http://www.graphicsfuzz.com/benchmark/android-v1.html

I'm still having the same problem with mesa-git and Firefox Nightly (HD 6310):

r600_shader_from_tgsi - GPR limit exceeded - shader requires 130 registers
EE r600_shader.c:183 r600_pipe_shader_create - translation from TGSI failed !
EE r600_state_common.c:872 r600_shader_select - Failed to build shader variant (type=1) -12

The specific issue was reported in bug 105371; follow-up there. Sorry for the bother. |
Created attachment 128864 [details]
glxinfo

I'm using a Radeon HD6950. It fails to build a shader when playing "War Thunder": it renders the login screen fine, starts to build shaders, then fails and spams the following in the console:

EE r600_state_common.c:799 r600_shader_select - Failed to build shader variant (type=1) -1
EE r600_shader.c:183 r600_pipe_shader_create - translation from TGSI failed !

The same thing happens with a lot of shaders from shadertoy.com. OS: OpenSUSE Tumbleweed, glxinfo attached.