Bug 94747 - Convert phi nodes to logical operations
Summary: Convert phi nodes to logical operations
Status: NEW
Alias: None
Product: Mesa
Classification: Unclassified
Component: glsl-compiler (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: i965-perf
  Show dependency treegraph
 
Reported: 2016-03-29 19:25 UTC by Ian Romanick
Modified: 2016-07-12 11:18 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ian Romanick 2016-03-29 19:25:59 UTC
I recently encountered some generated NIR like:

        vec1 ssa_3 = load_const (0xffffffff /* -nan */)

        ...

                        /* succs: block_22 block_23 */
                        if ssa_153 {
                                block block_22:
                                /* preds: block_21 */
                                /* succs: block_24 */
                        } else {
                                block block_23:
                                /* preds: block_21 */
                                vec1 ssa_175 = fadd ssa_171, -ssa_135
                                vec1 ssa_176 = fge abs(ssa_175), ssa_145
                                /* succs: block_24 */
                        }
                        block block_24:
                        /* preds: block_22 block_23 */
                        vec1 ssa_177 = phi block_22: ssa_3, block_23: ssa_176

This is just a bloated way of saying (ssa_153 || (abs(ssa_171 - ssa_135) >= ssa_145)).  Some part of the compiler should be able to recognize this and generate the more compact code.

The out-of-SSA pass generates:

                        /* succs: block_19 block_20 */
                        if r12 {
                                block block_19:
                                /* preds: block_18 */
                                r12 = imov ssa_3
                                /* succs: block_21 */
                        } else {
                                block block_20:
                                /* preds: block_18 */
                                vec1 ssa_172 = fadd r14, -r3
                                r12 = fge abs(ssa_172), ssa_145
                                /* succs: block_21 */
                        }
                        block block_21:
                        /* preds: block_19 block_20 */

And the i965 backend generates (for BDW):

mov.nz.f0(8)    null<1>D        g39<8,8,1>D                     { align1 1Q };
(+f0) if(8)     JIP: 5          UIP: 7                          { align1 1Q };
   END B14 ->B15 ->B16
   START B15 <-B14
mov(8)          g39<1>UD        0xffffffffUD                    { align1 1Q compacted };
else(8)         JIP: 4                                          { align1 1Q };
   END B15 ->B17
   START B16 <-B14
add(8)          g2<1>F          g41<8,8,1>F     -g34<8,8,1>F    { align1 1Q compacted };
cmp.ge.f0(8)    g39<1>F         (abs)g2<8,8,1>F g111<8,8,1>F    { align1 1Q compacted };
   END B16 ->B17

There are a bunch of improvements that could be made to the generated code.  At the very least, the spurious assignment to g39 in the then-statement should be removed.

Then, depending on the surrounding code any one of

add(8)          g2<1>F          g41<8,8,1>F     -g34<8,8,1>F    { align1 1Q compacted };
cmp.ge.f0(8)    g40<1>F         (abs)g2<8,8,1>F g111<8,8,1>F    { align1 1Q compacted };
or(8)           g39<1>D         g39<8,8,1>D     g40<8,8,1>D     { align1 1Q compacted };

or

add(8)          g2<1>F          g41<8,8,1>F     -g34<8,8,1>F    { align1 1Q compacted };
cmp.ge.f0(8)    null<1>F        (abs)g2<8,8,1>F g111<8,8,1>F    { align1 1Q compacted };
(+f0) mov(8)    g39<1>UD        0xffffffffUD                    { align1 1Q compacted };

or

mov.nz.f0(8)    null<1>D        g39<8,8,1>D                     { align1 1Q };
(-f0) add(8)          g2<1>F          g41<8,8,1>F     -g34<8,8,1>F    { align1 1Q compacted };
(-f0) cmp.ge.f0(8)    g39<1>F         (abs)g2<8,8,1>F g111<8,8,1>F    { align1 1Q compacted };
Comment 1 Jason Ekstrand 2016-03-31 00:33:57 UTC
The code for this already exists.  It's called nir_opt_peephole_select.  The only problem is that it only triggers if both sides of the if are empty.  I've been wanting to add some sort of heuristic to it for some time now.  The only problem is that it's really back-end specific.


bug/show.html.tmpl processed on Feb 19, 2017 at 16:40:41.
(provided by the Example extension).