Summary: | 4.5% perf drop in CSDof with "nir: Optimize integer division and modulus with 1" | ||
---|---|---|---|
Product: | Mesa | Reporter: | Eero Tamminen <eero.t.tamminen> |
Component: | Drivers/DRI/i965 | Assignee: | Ian Romanick <idr> |
Status: | CLOSED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | Other | ||
OS: | All | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=98299 | ||
Whiteboard: | |||
i915 platform: | i915 features: |
Description
Eero Tamminen
2016-10-24 11:24:25 UTC
I see the same perf regression in my results. Strange. I don't see shader-db differences, and I just recaptured the CSDof shaders and don't see differences there either. None of the shaders seem to use intdiv and only one uses intmod (by 0xc0).

Jordan noted that I likely needed to revert

i965/cs: Use udiv/umod for local IDs
i965/cs: Don't use a thread channel ID for small local sizes

as well in order to see a difference. Indeed, I do, though it looks like the three patches helped instruction and cycle counts in four compute shaders by small amounts. Notably, no differences in spills or fills.

Confirming Matt's observation.

Reverting the indicated change from Mesa HEAD, doesn't affect the performance, although going back to version before this commit (or commits before it), shows 4-5% better perf than the version built from the commit. I.e. additional changes after the indicated commit also have effect on CSDof test.

(In reply to Eero Tamminen from comment #4)
> Confirming Matt's observation.
>
> Reverting the indicated change from Mesa HEAD, doesn't affect the
> performance, although going back to version before this commit (or commits
> before it), shows 4-5% better perf than the version built from the commit.
> I.e. additional changes after the indicated commit also have effect on CSDof
> test.

When reverting:

4d35683 nir: Optimize integer division and modulus with 1

Did you also revert these?

64c3d73 i965/cs: Don't use a thread channel ID for small local sizes
1fa000a i965/cs: Use udiv/umod for local IDs

It would be good to know if taking the current master, and reverting these 3 patches regains the 4-5%.

(In reply to Jordan Justen from comment #5)
> When reverting:
>
> 4d35683 nir: Optimize integer division and modulus with 1
>
> Did you also revert these?

No.

> 64c3d73 i965/cs: Don't use a thread channel ID for small local sizes
> 1fa000a i965/cs: Use udiv/umod for local IDs
>
> It would be good to know if taking the current master, and
> reverting these 3 patches regains the 4-5%.

Tried that now.

If one builds commit 4d35683, reverting just that commit fixes the drop. With HEAD, that's not enough, but reverting all 3 patches does fix the drop there too.

Note: performance-wise I'm less worried in this test about perf drop than these INTEL_DEBUG=perf warnings (and resulting spilling):

-----------------
Unsupported form of variable indexing in CS; falling back to very inefficient code generation
-----------------

Probably not worth tracking this old regression - we've since improved performance a lot. Closing.

verified/closed
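The thread does not spell out why the three commits interact. The Python sketch below is an illustration only, not Mesa code. It assumes the usual GLSL layout, where the flat local invocation index is `z * size_x * size_y + y * size_x + x`, and recovers the per-axis local ID with unsigned division and modulus. The names `local_id_from_index` and `fold_div_mod_by_one` are hypothetical. When a dimension of the local size is 1, the division and modulus reduce to the identities `a / 1 == a` and `a % 1 == 0`, which is the kind of expression the "divide/modulus with 1" optimization would fold.

```python
def local_id_from_index(index, size):
    """Split a flat local invocation index into (x, y, z).

    Assumes index = z * size_x * size_y + y * size_x + x (GLSL layout).
    """
    size_x, size_y, _size_z = size
    x = index % size_x                 # umod by the X size
    y = (index // size_x) % size_y     # udiv by X, then umod by the Y size
    z = index // (size_x * size_y)     # udiv by the X*Y slice size
    return (x, y, z)


def fold_div_mod_by_one(op, a, b):
    """Hypothetical constant fold: a / 1 -> a and a % 1 -> 0.

    Any other divisor leaves the expression as an (op, a, b) tuple.
    """
    if b == 1:
        return a if op == "udiv" else 0
    return (op, a, b)


if __name__ == "__main__":
    # An 8-wide Y-only workgroup: the X math is all "by 1".
    print(local_id_from_index(5, (1, 8, 1)))
    print(fold_div_mod_by_one("udiv", "idx", 1))
    # A modulus by 0xc0, as mentioned in the thread, is left alone.
    print(fold_div_mod_by_one("umod", "idx", 0xC0))
```

Under this reading, a local size with unit dimensions produces many div/mod-by-1 operations once local IDs are computed this way, which would be consistent with needing all three reverts on HEAD to see a difference.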