Created attachment 145070 [details]
reproducer

While I was trying to create a shader_test for https://bugs.freedesktop.org/show_bug.cgi?id=111069, I created a test which causes the latest Mesa master to compile indefinitely. As in the mentioned issue, there is a loop without any exits, and the compiler gets stuck in an optimization loop because 'progress' is always true. One of the optimizations keeps generating new constants. Here is a part of the NIR:

> vec1 32 ssa_5 = load_const (0x3fc00000 /* 1.500000 */)
> vec1 32 ssa_6 = load_const (0x40100000 /* 2.250000 */)
> vec1 32 ssa_7 = load_const (0x40580000 /* 3.375000 */)
> ...
> ...
> vec1 32 ssa_221 = load_const (0x7ef509be /* 162855721648427948063735027106695872512.000000 */)
> vec1 32 ssa_222 = load_const (0x7f37c74e /* 244283572331437120269767328686418165760.000000 */)
> vec1 32 ssa_223 = load_const (0x7f800000 /* inf */)

It is possible that both this and the mentioned issue could be fixed by a single solution, which was suggested by Jason Ekstrand in bug 111069:

> 1. Detect when we've deleted the last exit of a loop and simply delete the
> entire loop immediately in nir_opt_dead_cf.

But that is just an assumption; I am investigating this issue.
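The constants in the dump above appear to be successive powers of 1.5 (1.5, 2.25, 3.375, ...) produced by repeated constant folding in 32-bit float arithmetic. Single precision overflows to inf after finitely many such multiplications, which is why the sequence terminates at `ssa_223`. A minimal sketch of the blow-up (plain Python emulating float32 via `struct`, not Mesa code):

```python
import math
import struct

def f32(x):
    """Round a Python float to the nearest 32-bit float, mimicking
    single-precision arithmetic as used by NIR's 32-bit fmul."""
    return struct.unpack('f', struct.pack('f', x))[0]

# Repeatedly fold fmul(v, 1.5) the way nir_opt_constant_folding would:
# 1.5, 2.25, 3.375, ... until single precision overflows to inf.
v = f32(1.5)
steps = 0
while not math.isinf(v):
    v = f32(v * 1.5)
    steps += 1

print(steps)  # multiplications before hitting inf
print(v)      # inf
```

This takes 218 multiplications, which matches the dump: `ssa_5` holds 1.5 and `ssa_223`, exactly 218 values later, is inf.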
08bfd710a25c14df5f690cce9604617536d7c560 is the first bad commit

commit 08bfd710a25c14df5f690cce9604617536d7c560
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Wed Feb 13 21:42:39 2019 -0600

    nir/dead_cf: Stop relying on liveness analysis

    The liveness analysis pass is fairly expensive because it has to build
    large bit-sets and run a fix-point algorithm on them.  Instead of
    requiring liveness for detecting if values escape a CF node, just take
    advantage of the structured nature of NIR and use block indices
    instead.  This only requires the block index metadata which is the
    fastest we have metadata to generate.

    No shader-db changes on Kaby Lake

    Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

I'm continuing my investigation.
Created attachment 145071 [details] [review]
simple fix

Actually, I checked this solution and it helped for this bug but not for bug 111069, because there the last break is removed in nir_opt_if, not in nir_opt_dead_cf. So for bug 111069 we need another solution, or we need to move nir_opt_dead_cf after nir_opt_if. The second option helps for both issues; I have already checked that, but haven't checked for side effects using the Intel CI yet.
(In reply to andrii simiklit from comment #2)
> Created attachment 145071 [details] [review]
> simple fix
>
> Actually, I checked this solution and it helped for this bug but didn't for
> bug111069 because there the last break is removed in nir_opt_if isn't in
> nir_opt_dead_cf. So for bug111069 we need another solution or we need to
> move nir_opt_dead_cf after nir_opt_if. The second option helps for both
> issues I already checked that but didn't check side effects using Intel CI
> yet.

Never mind, I found why it didn't work for bug 111069; I will fix it a bit later.
Well, it looks like I found the root cause. There is an infinite loop of the following optimizations: nir_opt_if -> nir_opt_constant_folding -> nir_opt_if -> nir_opt_constant_folding -> ...

For instance, the original loop is:

> vec1 32 ssa_1 = load_const (0x40000000 /* 2.000000 */)
> vec1 32 ssa_2 = load_const (0x41200000 /* 10.000000 */)
> ...
> loop {
>    vec1 32 ssa_10 = phi block_2: ssa_1, block_3: ssa_12
>    vec1 32 ssa_12 = fmul ssa_10, ssa_2
> }

nir_opt_if optimizes the first loop iteration by extracting the fmul out of the loop:

> vec1 32 ssa_0 = load_const (0x40000000 /* 2.000000 */)
> vec1 32 ssa_1 = load_const (0x41200000 /* 10.000000 */)
> ...
> vec1 32 ssa_6 = fmul ssa_0, ssa_1
> loop {
>    vec1 32 ssa_8 = phi block_1: ssa_6, block_2: ssa_10
>    vec1 32 ssa_10 = fmul ssa_8, ssa_1
> }

then nir_opt_constant_folding replaces the fmul with a constant:

> vec1 32 ssa_0 = load_const (0x40000000 /* 2.000000 */)
> vec1 32 ssa_1 = load_const (0x41200000 /* 10.000000 */)
> ...
> vec1 32 ssa_16 = load_const (0x41a00000 /* 20.000000 */)
> loop {
>    vec1 32 ssa_8 = phi block_1: ssa_16, block_2: ssa_10
>    vec1 32 ssa_10 = fmul ssa_8, ssa_1
> }

and nir_opt_if does it again, extracting the new fmul out of the loop:

> vec1 32 ssa_0 = load_const (0x40000000 /* 2.000000 */)
> vec1 32 ssa_1 = load_const (0x41200000 /* 10.000000 */)
> ...
> vec1 32 ssa_16 = load_const (0x41a00000 /* 20.000000 */)
> vec1 32 ssa_17 = fmul ssa_16, ssa_1
> loop {
>    vec1 32 ssa_19 = phi block_1: ssa_17, block_2: ssa_18
>    vec1 32 ssa_18 = fmul ssa_19, ssa_1
> }

and again nir_opt_constant_folding replaces the fmul with a constant, and so on.

A simple solution is to disallow `opt_split_alu_of_phi` for loops with no exit, but I can't say for sure that it is the best one, at least for now. So I am investigating nir_opt_if.
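The ping-pong above can be modeled with a toy fixed-point driver (plain Python with hypothetical names; a simplified model, not real NIR data structures): a split-alu-of-phi step that peels the fmul whenever the phi's preheader source is a constant, and a folding step that immediately turns the peeled fmul back into a new constant. Both steps report progress every round, so a driver that loops until no pass makes progress never terminates on an exitless loop:

```python
def optimize(max_rounds=100):
    """Toy model of the nir_opt_if <-> nir_opt_constant_folding cycle
    on a loop that computes phi = phi * 10.0 and never exits."""
    consts = [2.0]    # load_const values seen so far (ssa_0 = 2.0)
    phi_src = 0       # which constant currently feeds the loop's phi
    rounds = 0
    progress = True
    while progress and rounds < max_rounds:
        # opt_split_alu_of_phi: the phi's preheader source is a
        # constant, so peel fmul(const, 10.0) out of the loop.
        peeled = ("fmul", consts[phi_src], 10.0)
        # nir_opt_constant_folding: both operands are constants, so the
        # peeled fmul folds to a brand-new load_const...
        consts.append(peeled[1] * peeled[2])
        phi_src = len(consts) - 1  # ...which feeds the phi again
        progress = True            # both passes reported progress
        rounds += 1
    return rounds, consts

rounds, consts = optimize()
print(rounds)       # 100: only the artificial cap stops the driver
print(consts[:4])   # [2.0, 20.0, 200.0, 2000.0]
```

In real NIR each round leaves both the new load_const and a fresh fmul in the shader, which is exactly the ever-growing constant list shown in comment 0.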
Andrii: please use the bisected and regression tags when you find a bug like this. We rely on those tags to block releases for unfixed regressions.
Fixed by the following commit, now in master:

commit c832820ce959ae7c2b4971befceae634d800330f (HEAD -> master, origin/master, origin/HEAD)
Author: Jason Ekstrand <jason@jlekstrand.net>
Date:   Fri Aug 30 11:35:26 2019 -0500

    nir/dead_cf: Repair SSA if the pass makes progress

    The dead_cf pass calls into the CF manipulation helpers which attempt
    to keep NIR's SSA form sane.  However, when the only break is removed
    from a loop, dominance gets messed up anyway because the CF SSA
    clean-up code only looks at phis and doesn't consider the case of code
    becoming unreachable.  One solution to this would be to put the loop
    into LCSSA form before we modify any of its contents.  Another (and
    the approach taken by this pass) is to just run the repair_ssa pass
    afterwards because the CF manipulation helpers are smart enough to
    keep all the use/def stuff sane; they just don't always preserve
    dominance properties.  While we're here, we clean up some bogus
    indentation.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
    Cc: mesa-stable@lists.freedesktop.org
    Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>