Summary: | 2-3% performance drop in GpuTest v0.7 FurMark with "nir/copy_prop: Respect the source's number of components" | ||
---|---|---|---|
Product: | Mesa | Reporter: | Eero Tamminen <eero.t.tamminen> |
Component: | Drivers/DRI/i965 | Assignee: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Status: | VERIFIED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | Other | ||
OS: | All | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=100913 | ||
Whiteboard: | |||
i915 platform: | i915 features: |
Description
Eero Tamminen
2017-05-16 16:22:54 UTC
(In reply to Eero Tamminen from comment #0) > This commit regressed GpuTest v0.7 FurMark performance by 2-3% on GEN9+: > -------------------------------------------------------------------- > commit 3c312be7b3e95ec7540e98abed1b6f3cc8d31b2a > Author: Jason Ekstrand <jason.ekstrand@intel.com> > AuthorDate: Thu Mar 2 17:10:24 2017 -0800 > Commit: Jason Ekstrand <jason.ekstrand@intel.com> > CommitDate: Tue Mar 14 07:36:20 2017 -0700 > > nir/copy_prop: Respect the source's number of components > > In the near future we are going to require that the num_components in a > src dereference match the num_components of the SSA value being > dereferenced. To do that, we need copy_prop to not remove our MOVs from > a larger SSA value into an instruction that uses fewer channels. > > Because we suddenly have to know how many components each source has, > this makes the pass a bit more complicated. Fortunately, copy > propagation is the only pass that cares about the number of components > are read by any given source so it's fairly contained. > > Shader-db results on Sky Lake: > > total instructions in shared programs: 13318947 -> 13320265 (0.01%) > instructions in affected programs: 260633 -> 261951 (0.51%) > helped: 324 > HURT: 1027 > -------------------------------------------------------------------- > > On few of the GEN8+ platforms, that commit also: > * Regressed SynMark v7 PSBump2 performance by ~1% > * Increased SynMark v7 PSBump8 performance by ~1% > > > Timothy's commits 2 months later: > * i965: remove GLSL IR optimisation loop > * nir: shuffle constants to the top > > Compensated for the FurMark drop (SynMark tests were not affected by them). > > If those commits fixed what this regressed, this bug should be closed. If > they improved something else, it may be better to keep this open. There has been lately further improvements to FurMark perf, so current performance is higher than the originally regressed one. Is the original regression still valid or should this be closed as fixed? (In reply to Eero Tamminen from comment #1) > (In reply to Eero Tamminen from comment #0) > > This commit regressed GpuTest v0.7 FurMark performance by 2-3% on GEN9+: ... > > Timothy's commits 2 months later: > > * i965: remove GLSL IR optimisation loop > > * nir: shuffle constants to the top > > > > Compensated for the FurMark drop (SynMark tests were not affected by them). > > > > If those commits fixed what this regressed, this bug should be closed. If > > they improved something else, it may be better to keep this open. > > There has been lately further improvements to FurMark perf, so current > performance is higher than the originally regressed one. > > Is the original regression still valid or should this be closed as fixed? Marking this 2017 bug as FIXED. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.