Bug 109517 - [GEN9+] 14-24% perf drop in SynMark2 v7 CSDof
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Jason Ekstrand
QA Contact: Intel 3D Bugs Mailing List
Duplicates: 110344 110412
 
Reported: 2019-01-31 11:34 UTC by Eero Tamminen
Modified: 2019-09-25 20:29 UTC (History)

Description Eero Tamminen 2019-01-31 11:34:53 UTC
Mesa performance in the SynMark2 v7 CSDof test dropped 24% on SKL GT4e and 14-20% on other GEN9+ platforms.

Setup:
- Ubuntu 18.04
- Unity / compiz desktop
- Mesa from git
- X server and drm-tip kernel from git within last 1-2 months, with modifier support enabled
- FullHD monitor

Test-case:
- synmark2 OglCSDof
  (configured to run in fullscreen FullHD resolution.)

The regression happened between the following commits:
* 2019-01-28 17:50:08 41a0acd6a1: Switch imx to kmsro and remove the imx winsys
* 2019-01-30 17:49:45 f4eb746ef7: r600: add -Wstrict-overflow=0 to meson to silence the warning
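
A range like the one above is typically narrowed down by bisection. The following is only a minimal sketch of that idea, not the tooling actually used for this bug: the intermediate commit names (`c1` to `c4`), the FPS numbers, and the 10% threshold are hypothetical, and a real run would build Mesa and execute the benchmark at each step instead of looking up a table.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and the regression does not come and go in between.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history: only the two endpoints are taken from the report.
history = ["41a0acd6a1", "c1", "c2", "c3", "c4", "f4eb746ef7"]
# Hypothetical FPS per commit, standing in for a real benchmark run.
fps = {"41a0acd6a1": 100.0, "c1": 100.0, "c2": 99.5,
       "c3": 80.0, "c4": 80.5, "f4eb746ef7": 80.0}
baseline = fps[history[0]]

# Treat anything more than 10% below the known-good result as "bad".
culprit = first_bad_commit(history, lambda c: fps[c] < 0.9 * baseline)
print(culprit)
```

The threshold has to sit well outside the run-to-run variance of the benchmark, otherwise a noisy result can send the search into the wrong half of the range.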

The CSDof compute shaders spill registers, so a change in register usage (e.g. SENDS support, which is GEN9+ specific) is the likely cause of the perf regression.
Comment 1 Eero Tamminen 2019-02-01 09:09:37 UTC
FYI: None of the few other compute shader tests for which I have data are impacted, but they don't spill either.
Comment 2 Mark Janes 2019-02-02 01:32:32 UTC
bisected to series ending in:

a920979d4f30a48a23f8ff375ce05fa8a947dd96
Author:     Jason Ekstrand <jason@jlekstrand.net>
intel/fs: Use split sends for surface writes on gen9+

Surface reads don't need them because they just have the one address
payload.  With surface writes, on the other hand, we can put the address
and the data in the different halves and avoid building the payload all
together.

The decrease in register pressure and added freedom in register
allocation resulting from this change reduces spilling enough to improve
the performance of one customer benchmark by about 2x.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Comment 3 Eero Tamminen 2019-02-11 16:59:35 UTC
There was also a very small (<1%) drop in Sascha Willems' Vulkan Raytracing demo.
Comment 4 Jason Ekstrand 2019-02-11 17:30:02 UTC
Ugh...  I'm not really sure what we should do about this one.  Mark's bisect is exactly correct.  I've looked at the shaders, and there seem to be two issues:

 1) There's one SIMD8 shader which schedules massively differently for no apparent reason.

 2) There's a SIMD16 shader which starts spilling way more than it was before.

In both cases, I have no idea why it's happening beyond the fact that our current RA and scheduling have rather random behaviour at times.  Using SENDS should only ever decrease register pressure and increase RA freedom, because it no longer has to build the message into a single hunk and can just send the two bits (address and data) separately.
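
The "increase RA freedom" argument can be illustrated with a toy model. This is only a sketch under made-up assumptions: the 8-entry register file, its fragmentation pattern, and the 2+2 register payload sizes are hypothetical, and the first-fit search below is not how Mesa's register allocator works. It only shows why two small independent payloads are easier to place than one contiguous block.

```python
from typing import List, Optional


def find_free_run(free: List[bool], length: int) -> Optional[int]:
    """Return the start index of the first run of `length` free registers, or None."""
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == length:
            return i - length + 1
    return None


def allocate(free: List[bool], length: int) -> Optional[int]:
    """Reserve a contiguous run of registers in place and return its start, or None."""
    start = find_free_run(free, length)
    if start is not None:
        for i in range(start, start + length):
            free[i] = False
    return start


# Toy register file: True means free. Two 2-register holes, no 4-register hole.
register_file = [False, True, True, False, False, True, True, False]

# Single payload: address (2 regs) and data (2 regs) must sit in one 4-register block.
single = allocate(list(register_file), 4)

# Split payload: address and data are placed independently.
regs = list(register_file)
split_addr = allocate(regs, 2)
split_data = allocate(regs, 2)

print(single, split_addr, split_data)
```

In this model the single 4-register payload cannot be placed at all, which in a real allocator would mean spilling, while the split payloads fit into the existing holes. It says nothing about why the CSDof shaders get worse.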

As I said in the commit message, I have another (unfortunately not public yet) customer workload where the opposite happens: using SENDS decreases spilling and improves performance by 2x.

Ken, Matt, any thoughts?
Comment 5 Eero Tamminen 2019-02-12 09:30:00 UTC
(In reply to Jason Ekstrand from comment #4)
> Ugh...  I'm not really sure what we should do about this one.  Mark's bisect
> is exactly correct.  I've looked at the shaders, and there seem to be two
> issues:
> 
>  1) There's one SIMD8 shader which schedules massively differently for no
> apparent reason.
> 
>  2) There's a SIMD16 shader which starts spilling way more than it was before.

Based on your comment below, I assume that the SIMD8 shader also got worse, but does it also spill?  I.e. is the bad behavior limited to spilling shaders?

 
> In both cases, I have no idea why it's happening beyond the fact that our
> current RA and scheduling have rather random behaviour at times.  Using SENDS
> should only ever decrease register pressure and increase RA freedom because
> it no longer has to build the message into a single hunk and can just send
> the two bits (address and data) separately.
> 
> As I said in the commit message, I have another (unfortunately not public
> yet) customer workload where the opposite happens and using SENDS decreases
> spilling and improves performance by 2x.

SENDS support is a needed performance feature.  If the current implementation improves things more than it regresses them, and especially if the improving cases are the more important ones, as they are here, I think letting the regression in for the release is fine.

There could be a meta-bug for the RA / scheduler related issues, though, which this bug (and e.g. the bugs about bad sampler fetch scheduling) would link to.
Comment 6 Jason Ekstrand 2019-02-14 17:20:19 UTC
We had a discussion about this today and determined that we'd leave the bug open but drop it from the 19.0 release tracker.
Comment 7 Eero Tamminen 2019-04-17 12:45:43 UTC
Perf improved 5-10% (depending on platform) between the following commits:
2019-02-28 17:30:48 df5cd51259: gitlab-ci: install xmllint to validate 00-mesa-defaults.conf
2019-03-01 16:46:32 fc82ea1350: Revert "swr/rast: Archrast codegen updates"

I assume the improvement comes from the nir/copy_prop_vars series.
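
How much of the original regression this recovers is not stated above. As a rough check, assuming the improvement is measured relative to the regressed level, the net change against the pre-regression baseline can be sketched as below. Which drop pairs with which gain on a given platform is not known, so all combinations of the reported extremes are shown.

```python
def net_change(drop: float, gain: float) -> float:
    """Net relative change after a fractional drop followed by a fractional gain."""
    return (1.0 - drop) * (1.0 + gain) - 1.0


# Reported extremes: 14-24% drop, later 5-10% improvement.
for drop in (0.14, 0.24):
    for gain in (0.05, 0.10):
        print(f"drop {drop:.0%}, gain {gain:.0%} -> net {net_change(drop, gain):+.1%}")
```

Under that assumption even the most favourable pairing (14% drop, 10% gain) stays below the pre-regression level.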
Comment 8 Jason Ekstrand 2019-04-17 14:35:33 UTC
(In reply to Eero Tamminen from comment #7)
> I assume the improvement comes from the nir/copy_prop_vars series.

Very likely.
Comment 9 Jason Ekstrand 2019-04-17 14:36:08 UTC
*** Bug 110344 has been marked as a duplicate of this bug. ***
Comment 10 Jason Ekstrand 2019-04-17 14:36:59 UTC
*** Bug 110412 has been marked as a duplicate of this bug. ***
Comment 11 GitLab Migration User 2019-09-25 20:29:32 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1786.

