Summary: | i965/fs: copy propagate should propagate bitwise not into logic operations | ||
---|---|---|---|
Product: | Mesa | Reporter: | Ian Romanick <idr> |
Component: | Drivers/DRI/i965 | Assignee: | Ian Romanick <idr> |
Status: | RESOLVED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | enhancement | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Bug Depends on: | |||
Bug Blocks: | 77547 |
Description
Ian Romanick
2015-04-09 19:49:20 UTC
Presumably you're talking about BDW+ where the negate source modifier on logical operations performs a bitwise-NOT? This doesn't seem like a task for CSE. I think you meant copy propagation. Okay, talked to idr. The problem seems to be that the not(8) in his example can't be propagated into the and(8) because the types are signed...? I don't see any code in brw_fs_copy_propagate.cpp to handle this, so I don't think it's related to the argument types. But yeah, we should be propagating NOT into logic operations. Ian, is this still an issue, and if yes, could you attach an example shader? At the time, we generated intBitsToFloat(b & floatBitsToInt(1.0)) to do b2f conversions. Since b could only be 0 or -1, ~b & 0x38f00000 produces the same possible values as -b & 0x37f00000. However, we now emit float(-b) instead, so I don't think this can occur anymore. Maybe generating ~b & 0x3f800000 (BDW+) or -b & 0x38f00000 (SNB - HSW) for float(!b) is better? Right now there's a late optimization in nir_opt_algebraic that does this: (('b2f(is_used_more_than_once)', ('inot', a)), ('bcsel', a, 0.0, 1.0)), (('fneg(is_used_more_than_once)', ('b2f', ('inot', a))), ('bcsel', a, -0.0, -1.0)), In at least some cases, this results in some extra load-immediate instructions mov(16) g16<1>D 1065353216D { align1 1H }; mov.nz.f0(16) null<1>D g8<8,8,1>D { align1 1H }; (-f0) sel(16) g126<1>UD g16<8,8,1>UD 0x00000000UD { align1 1H }; Which also seems dumb. This should be mov.z.f0(16) null<1>D g8<8,8,1>D (-f0) sel(16) g126<1>F g8<8,8,1>F 1.0F (In reply to Ian Romanick from comment #4) > At the time, we generated intBitsToFloat(b & floatBitsToInt(1.0)) to do b2f > conversions. Since b could only be 0 or -1, ~b & 0x38f00000 produces the > same possible values as -b & 0x37f00000. However, we now emit float(-b) > instead, so I don't think this can occur anymore. > > Maybe generating ~b & 0x3f800000 (BDW+) or -b & 0x38f00000 (SNB - HSW) for > float(!b) is better? Right now there's a late optimization in > nir_opt_algebraic that does this: > > (('b2f(is_used_more_than_once)', ('inot', a)), ('bcsel', a, 0.0, 1.0)), > (('fneg(is_used_more_than_once)', ('b2f', ('inot', a))), ('bcsel', a, > -0.0, -1.0)), > > In at least some cases, this results in some extra load-immediate > instructions > > mov(16) g16<1>D 1065353216D > mov.nz.f0(16) null<1>D g8<8,8,1>D > (-f0) sel(16) g126<1>UD g16<8,8,1>UD 0x00000000UD > > Which also seems dumb. This should be > > mov.z.f0(16) null<1>D g8<8,8,1>D > (-f0) sel(16) g126<1>F g8<8,8,1>F 1.0F Ignore this last part. That's rubbish. I believe this patch series resolves all the issues that I wanted this bug to record: 1edf67fc3f6 intel/fs: Generate if instructions with inverted conditions d40640efe8a nir/algebraic: Replace a bcsel of a b2f sources with a b2f(!(a || b)) 7725d609387 intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a)) cb3e21cd192 intel/fs: Use De Morgan's laws to avoid logical-not of a logic result on Gen8+ 8eb36c91295 intel/fs: Emit logical-not of operands on Gen8+ 06eaaf2de94 intel/fs: Refactor ALU source and destination handling to a separate function fb3ca9109cb intel/fs: Handle OR source modifiers in algebraic optimization c9d5bd050c3 intel/fs: Relax type matching rules in cmod propagation from MOV instructions eae19f5f19f nir/algebraic: Replace i2b used by bcsel or if-statement with comparison |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.