I recently encountered some generated NIR like: vec1 ssa_0 = load_const (0x00000000 /* 0.000000 */) ... /* succs: block_3 block_4 */ if ssa_122 { block block_3: /* preds: block_2 */ /* succs: block_5 */ } else { block block_4: /* preds: block_2 */ vec1 ssa_139 = fmul ssa_138, ssa_1 /* succs: block_5 */ } block block_5: /* preds: block_3 block_4 */ vec1 ssa_140 = phi block_3: ssa_0, block_4: ssa_139 vec1 ssa_141 = fadd ssa_17, ssa_140 /* succs: block_6 block_7 */ if ssa_122 { block block_6: /* preds: block_5 */ vec1 ssa_142 = fmul ssa_138, ssa_1 /* succs: block_8 */ } else { block block_7: /* preds: block_5 */ /* succs: block_8 */ } block block_8: /* preds: block_6 block_7 */ vec1 ssa_143 = phi block_6: ssa_142, block_7: ssa_0 The i965 backend generates the obvious, terrible code for this. There are two levels of optimization that could occur on just the NIR. First, merge the two if-statements. /* succs: block_3 block_4 */ if ssa_122 { block block_3: /* preds: block_2 */ vec1 ssa_142 = fmul ssa_138, ssa_1 /* succs: block_5 */ } else { block block_4: /* preds: block_2 */ vec1 ssa_139 = fmul ssa_138, ssa_1 /* succs: block_5 */ } block block_5: /* preds: block_3 block_4 */ vec1 ssa_140 = phi block_3: ssa_0, block_4: ssa_139 vec1 ssa_143 = phi block_6: ssa_142, block_7: ssa_0 vec1 ssa_141 = fadd ssa_17, ssa_140 Second, notice that the calculations in both branches is the same, and pull it out. vec1 ssa_999 = fmul ssa_138, ssa_1 /* succs: block_3 block_4 */ if ssa_122 { block block_3: /* preds: block_2 */ /* succs: block_5 */ } else { block block_4: /* preds: block_2 */ /* succs: block_5 */ } block block_5: /* preds: block_3 block_4 */ vec1 ssa_140 = phi block_3: ssa_0, block_4: ssa_999 vec1 ssa_143 = phi block_6: ssa_999, block_7: ssa_0 vec1 ssa_141 = fadd ssa_17, ssa_140 Existing optimization passes should turn this into a pair of bcsel instructions.
> Existing optimization passes should turn this into a pair of bcsel instructions. Ian, is this already done or do you have patches for it?
(In reply to Eero Tamminen from comment #1) > > Existing optimization passes should turn this into a pair of bcsel instructions. > > Ian, is this already done or do you have patches for it? This is done however i965 doesn't make use of it. If you pass a non-zero limit to nir_opt_peephole_select() (i965 currently has a limit of zero, all other nir drivers seem to pass a value) these if-statments should get turned into bcsel.
(In reply to Timothy Arceri from comment #2) > (In reply to Eero Tamminen from comment #1) > > > Existing optimization passes should turn this into a pair of bcsel instructions. > > > > Ian, is this already done or do you have patches for it? > > This is done however i965 doesn't make use of it. If you pass a non-zero > limit to nir_opt_peephole_select() (i965 currently has a limit of zero, all > other nir drivers seem to pass a value) these if-statments should get turned > into bcsel. nip_opt_peephole_select would help with this specific case, but I have seen other cases where the bodies of the then-blocks and the else-blocks are quite large. I noticed one or two cases of this recently, but I don't recall which shaders. I will probably work on this fairly soon (sometime this year) if someone doesn't beat me to it.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/809.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.