Created attachment 143355 [details] reduced shader spir-v Hello The spir-v shader attached triggers an assert ../src/compiler/nir/nir.h:1040: nir_instr_as_deref: Assertion `parent && parent->type == nir_instr_type_deref' failed. (the shader is reduced from a larger shader.) => parent->type is "nir_instr_type_phi" bisected to commit 639c236e74e99524245c22f1fa0758603f558cf2 Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Thu Oct 25 10:34:34 2018 -0500 spirv: Emit NIR deref instructions on-the-fly This simplifies our deref handling by emitting the actual NIR deref instructions on-the-fly instead of of building up a deref chain and then emitting them at the last moment. In order for this to work with the parts of the compiler that assume they can chase deref chains, we have to run nir_rematerialize_derefs_in_use_blocks_impl to put the derefs back in the right places. Otherwise, in cases such as loop continues where the SPIR-V blocks are not in the same order as the NIR blocks, we may end up with a deref chain with a parent that does not dominate it's child and nir_repair_ssa_impl will insert phis in the deref chain. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> - backtrace: #0 0x00007ffff7c74d7f in raise () from /usr/lib/libc.so.6 #1 0x00007ffff7c5f672 in abort () from /usr/lib/libc.so.6 #2 0x00007ffff7c5f548 in __assert_fail_base.cold.0 () from /usr/lib/libc.so.6 #3 0x00007ffff7c6d396 in __assert_fail () from /usr/lib/libc.so.6 #4 0x00007ffff07bf416 in nir_instr_as_deref (parent=0x5555557c7040) at ../src/compiler/nir/nir.h:1039 #5 0x00007ffff07c0fa4 in validate_deref_instr (instr=0x5555557bd8a0, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:436 #6 0x00007ffff07c1b36 in validate_instr (instr=0x5555557bd8a0, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:623 #7 0x00007ffff07c1f21 in validate_block (block=0x5555557bc400, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:714 #8 0x00007ffff07c2a43 in validate_cf_node (node=0x5555557bc400, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:877 #9 0x00007ffff07c27a1 in validate_if (if_stmt=0x5555557bc330, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:831 #10 0x00007ffff07c2a63 in validate_cf_node (node=0x5555557bc330, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:881 #11 0x00007ffff07c2998 in validate_loop (loop=0x5555557b6ae0, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:863 #12 0x00007ffff07c2a83 in validate_cf_node (node=0x5555557b6ae0, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:885 #13 0x00007ffff07c2998 in validate_loop (loop=0x5555557b5240, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:863 #14 0x00007ffff07c2a83 in validate_cf_node (node=0x5555557b5240, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:885 #15 0x00007ffff07c381e in validate_function_impl (impl=0x5555556c75b0, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:1090 #16 0x00007ffff07c393a in validate_function (func=0x5555556ced40, state=0x7fffffffd8b0) at ../src/compiler/nir/nir_validate.c:1108 #17 0x00007ffff07c3fb9 in nir_validate_shader (shader=0x5555557a6430, when=0x7ffff084791d "after spirv_to_nir") at ../src/compiler/nir/nir_validate.c:1224 #18 0x00007ffff06c726f in radv_shader_compile_to_nir (device=0x55555578f0f0, module=0x55555578afe0, entrypoint_name=0x555555574443 "main", stage=MESA_SHADER_COMPUTE, spec_info=0x0, flags=0) at ../src/amd/vulkan/radv_shader.c:257 #19 0x00007ffff06bdd11 in radv_create_shaders (pipeline=0x55555578d890, device=0x55555578f0f0, cache=0x5555557994c0, key=0x7fffffffe250, pStages=0x7fffffffe220, flags=0) at ../src/amd/vulkan/radv_pipeline.c:2068 #20 0x00007ffff06c24fc in radv_compute_pipeline_create (_device=0x55555578f0f0, _cache=0x5555557994c0, pCreateInfo=0x7fffffffe3a0, pAllocator=0x0, pPipeline=0x7fffffffe380) at ../src/amd/vulkan/radv_pipeline.c:3791 #21 0x00007ffff06c267a in radv_CreateComputePipelines (_device=0x55555578f0f0, pipelineCache=0x5555557994c0, count=1, pCreateInfos=0x7fffffffe3a0, pAllocator=0x0, pPipelines=0x7fffffffe380) at ../src/amd/vulkan/radv_pipeline.c:3821 #22 0x0000555555565401 in create_compute_pipeline ()
the hlsl code it was compiled from: RWBuffer<uint> Weights : register(u0); struct SCtx { uint iList ; uint LightId ; float LightW ; }; void LoadLight(in out SCtx C) { const uint LightIdW = Weights[C.iList]; C.LightId = LightIdW & 0xFFFFFF; C.LightW = float(LightIdW >> 24)/255.; } void GoToNextLight(in out SCtx C, uint LightId) { if (LightId != C.LightId) return; C.iList++; LoadLight(C); } groupshared uint Group_DstLightIdWs[16 * 32]; [numthreads(32,1,1)] void main(uint iMerge : SV_DispatchThreadID) { const uint iListSrc0 = 4 * iMerge; const uint iDstLightIdW_Start = 16 * iMerge; SCtx Ctxs[4]; [loop] for (uint iDstLightIdW = 0; iDstLightIdW<16; iDstLightIdW++) { uint LightId = Ctxs[0].LightId; Group_DstLightIdWs[iDstLightIdW_Start + iDstLightIdW] = LightId; for (uint iListSrc = 0; iListSrc < 4; iListSrc++) { GoToNextLight(Ctxs[iListSrc], LightId); } } if (iMerge < 123) { for (uint iDstLightIdW = 0; iDstLightIdW<16; iDstLightIdW++) { Weights[iDstLightIdW_Start+iDstLightIdW] = Group_DstLightIdWs[iDstLightIdW_Start+iDstLightIdW]; } } }
I've got a patch on the mailing list which fixes this: https://patchwork.freedesktop.org/patch/285338/
(In reply to Jason Ekstrand from comment #2) > I've got a patch on the mailing list which fixes this: > > https://patchwork.freedesktop.org/patch/285338/ indeed, thanks. tested-by: Xavier Bouchoux <xavierb@gmail.com>
This has been fixed by the following commit in master: commit 9e6a6ef0d45a5bb61a541c495fe12e54e646ecfe (public/master) Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Sun Feb 10 22:23:01 2019 -0600 nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks When nir_rematerialize_derefs_in_use_blocks_impl was first written, I attempted to optimize things a bit by not bothering to re-materialize the sources of deref instructions figuring that the final caller would take care of that. However, in the case of more complex deref chains where the first link or two lives in block A and then another link and the load/store_deref intrinsic live in block B it doesn't work. The code in rematerialize_deref_in_block looks at the tail of the chain, sees that it's already in block B and skips it, not realizing that part of the chain also lives in block A. The easy solution here is to just rematerialize deref sources of deref instructions as well. This may potentially lead to a few more deref instructions being created by the conditions required for that to actually happen are fairly unlikely and, thanks to the caching, it's all linear time regardless. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603 Fixes: 7d1d1208c2b "nir: Add a small pass to rematerialize derefs per-block" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Sorry, but I forgot to add your tested-by before pushing. :-(
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.