Bug 94744

Summary: Merge similar if-statements
Product: Mesa Reporter: Ian Romanick <idr>
Component: glsl-compilerAssignee: Ian Romanick <idr>
Status: RESOLVED MOVED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: enhancement    
Priority: medium    
Version: git   
Hardware: All   
OS: All   
Whiteboard:
i915 platform: i915 features:

Description Ian Romanick 2016-03-29 18:55:27 UTC
I recently encountered some generated NIR like:

        vec1 ssa_0 = load_const (0x00000000 /* 0.000000 */)

        ...

                /* succs: block_3 block_4 */
                if ssa_122 {
                        block block_3:
                        /* preds: block_2 */
                        /* succs: block_5 */
                } else {
                        block block_4:
                        /* preds: block_2 */
                        vec1 ssa_139 = fmul ssa_138, ssa_1
                        /* succs: block_5 */
                }
                block block_5:
                /* preds: block_3 block_4 */
                vec1 ssa_140 = phi block_3: ssa_0, block_4: ssa_139
                vec1 ssa_141 = fadd ssa_17, ssa_140
                /* succs: block_6 block_7 */
                if ssa_122 {
                        block block_6:
                        /* preds: block_5 */
                        vec1 ssa_142 = fmul ssa_138, ssa_1
                        /* succs: block_8 */
                } else {
                        block block_7:
                        /* preds: block_5 */
                        /* succs: block_8 */
                }
                block block_8:
                /* preds: block_6 block_7 */
                vec1 ssa_143 = phi block_6: ssa_142, block_7: ssa_0

The i965 backend generates the obvious, terrible code for this.  There are two levels of optimization that could occur on just the NIR.  First, merge the two if-statements.

                /* succs: block_3 block_4 */
                if ssa_122 {
                        block block_3:
                        /* preds: block_2 */
                        vec1 ssa_142 = fmul ssa_138, ssa_1
                        /* succs: block_5 */
                } else {
                        block block_4:
                        /* preds: block_2 */
                        vec1 ssa_139 = fmul ssa_138, ssa_1
                        /* succs: block_5 */
                }
                block block_5:
                /* preds: block_3 block_4 */
                vec1 ssa_140 = phi block_3: ssa_0, block_4: ssa_139
                vec1 ssa_143 = phi block_6: ssa_142, block_7: ssa_0
                vec1 ssa_141 = fadd ssa_17, ssa_140

Second, notice that the calculations in both branches is the same, and pull it out.

                vec1 ssa_999 = fmul ssa_138, ssa_1
                /* succs: block_3 block_4 */
                if ssa_122 {
                        block block_3:
                        /* preds: block_2 */
                        /* succs: block_5 */
                } else {
                        block block_4:
                        /* preds: block_2 */
                        /* succs: block_5 */
                }
                block block_5:
                /* preds: block_3 block_4 */
                vec1 ssa_140 = phi block_3: ssa_0, block_4: ssa_999
                vec1 ssa_143 = phi block_6: ssa_999, block_7: ssa_0
                vec1 ssa_141 = fadd ssa_17, ssa_140

Existing optimization passes should turn this into a pair of bcsel instructions.
Comment 1 Eero Tamminen 2018-07-11 08:13:59 UTC
> Existing optimization passes should turn this into a pair of bcsel instructions.

Ian, is this already done or do you have patches for it?
Comment 2 Timothy Arceri 2018-07-18 06:29:17 UTC
(In reply to Eero Tamminen from comment #1)
> > Existing optimization passes should turn this into a pair of bcsel instructions.
> 
> Ian, is this already done or do you have patches for it?

This is done however i965 doesn't make use of it. If you pass a non-zero limit to nir_opt_peephole_select() (i965 currently has a limit of zero, all other nir drivers seem to pass a value) these if-statments should get turned into bcsel.
Comment 3 Ian Romanick 2018-07-18 21:49:34 UTC
(In reply to Timothy Arceri from comment #2)
> (In reply to Eero Tamminen from comment #1)
> > > Existing optimization passes should turn this into a pair of bcsel instructions.
> > 
> > Ian, is this already done or do you have patches for it?
> 
> This is done however i965 doesn't make use of it. If you pass a non-zero
> limit to nir_opt_peephole_select() (i965 currently has a limit of zero, all
> other nir drivers seem to pass a value) these if-statments should get turned
> into bcsel.

nip_opt_peephole_select would help with this specific case, but I have seen other cases where the bodies of the then-blocks and the else-blocks are quite large.  I noticed one or two cases of this recently, but I don't recall which shaders.  I will probably work on this fairly soon (sometime this year) if someone doesn't beat me to it.
Comment 4 GitLab Migration User 2019-09-18 19:45:44 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/809.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.