Bug 100899 - [SKL] Enable UMD to control GPGPU EUs slice number in static/dynamic way
Summary: [SKL] Enable UMD to control GPGPU EUs slice number in static/dynamic way
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: medium enhancement
Assignee: Dmitry Rogozhkin
QA Contact: Intel GFX Bugs mailing list
Whiteboard: ReadyForDev
 
Reported: 2017-05-02 00:29 UTC by Dmitry Rogozhkin
Modified: 2019-11-29 17:22 UTC

i915 platform: BDW, SKL
i915 features: display/Other



Description Dmitry Rogozhkin 2017-05-02 00:29:27 UTC
Media workload performance may suffer because of the cost of managing GPGPU EU slices that the workloads do not need (a media workload can use no more than a certain number of EUs at the same time because of a wavefront limitation). The usual rule is: the lower the stream resolution, the fewer EU slices are needed. According to experiments, performance may be 5-20% below the optimal operating point. This underperformance is one of the key reasons why SKL is slower than BDW in certain use cases, assuming all other system settings except the EU configuration are aligned.

Here is a link to the feature request for a UMD-level API that lets the user control the number of slices selected, which could potentially improve performance: https://github.com/01org/intel-vaapi-driver/issues/152

Here are links to the RFC i915 drm-tip patches to enable static slice shutdown for BDW and SKL:
* https://patchwork.kernel.org/patch/9670509/ [RFC,2/2] drm/i915/bdw: permit make_rpcs execution on BDW to enable slice shutdown
* https://patchwork.kernel.org/patch/9670507/ [RFC,1/2] drm/i915/skl: add slice shutdown debugfs interface

These patches can be used to demonstrate the influence of the number of EU slices on media workloads. Refer to https://github.com/01org/intel-vaapi-driver/issues/152 for details on reproducing this.
Comment 1 Chris Wilson 2017-05-02 09:55:20 UTC
Hmm, since the slice/EU masks are stored in the context, is userspace unable to adjust them via LRI? If it can, job done. If not, then we need a context param to allow unprivileged adjustment for the user's own contexts.
Comment 2 Chris Wilson 2017-05-02 13:57:22 UTC
Is it a good recommendation for all non-GPGPU workloads to set subslice_mask = 0?
Comment 3 Oscar Mateo 2017-05-02 17:21:44 UTC
Hi Chris,

Userspace could adjust it via LRI, but only if i915 whitelists GEN8_R_PWR_CLK_STATE (configuring this from userspace via LRI is the preferred way on newer platforms, and I was suggesting to Dmitry that the same could be done from BDW onwards).
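
For readers unfamiliar with the LRI approach discussed above, here is a minimal sketch of how a slice-count request could be encoded as an MI_LOAD_REGISTER_IMM write to GEN8_R_PWR_CLK_STATE. The register offset, the bit positions, and the LRI header value are assumptions recalled from the i915 register headers, not taken from this thread, and should be checked against the kernel tree in use. The helper names `rpcs_for_slice_mask` and `lri_dwords` are hypothetical. As noted in the comment above, such a write only takes effect from userspace if i915 whitelists the register.

```python
# Assumed constants (recalled from i915 headers, not stated in this thread).
GEN8_R_PWR_CLK_STATE = 0x20C8            # register offset (assumption)
GEN8_RPCS_ENABLE = 1 << 31               # request is valid (assumption)
GEN8_RPCS_S_CNT_ENABLE = 1 << 18         # slice-count field is valid (assumption)
GEN8_RPCS_S_CNT_SHIFT = 15               # slice-count field position (assumption)
GEN8_RPCS_S_CNT_MASK = 0x7 << GEN8_RPCS_S_CNT_SHIFT
MI_LOAD_REGISTER_IMM_1 = (0x22 << 23) | 1  # LRI header for one register (assumption)


def rpcs_for_slice_mask(slice_mask):
    """Return an RPCS value requesting as many slices as bits set in slice_mask."""
    if slice_mask <= 0:
        raise ValueError("slice_mask must have at least one bit set")
    slices = bin(slice_mask).count("1")
    field = slices << GEN8_RPCS_S_CNT_SHIFT
    if field & ~GEN8_RPCS_S_CNT_MASK:
        raise ValueError("slice count does not fit the 3-bit field")
    return GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE | field


def lri_dwords(slice_mask):
    """Return the three dwords of an LRI that writes the RPCS value."""
    return [
        MI_LOAD_REGISTER_IMM_1,
        GEN8_R_PWR_CLK_STATE,
        rpcs_for_slice_mask(slice_mask),
    ]


if __name__ == "__main__":
    # Request a single slice and show the resulting command dwords in hex.
    print([hex(dw) for dw in lri_dwords(0b001)])
```

The sketch covers only the slice count; subslice and EU min/max fields are left out to keep it short.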
Comment 4 Lionel Landwerlin 2017-05-11 13:21:47 UTC
Just letting you know that letting userspace directly change this configuration through LRI might corrupt the data coming from the OA unit.
I think Chris' patchset to let userspace control this through a context ioctl is the best solution. That way we can do something sensible in the kernel when the OA unit is enabled.
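
To illustrate what "something sensible in the kernel" might look like, the sketch below arbitrates between a per-context slice request and the OA unit. The policy is purely hypothetical and is not taken from Chris' patchset: it assumes that while OA is active the kernel ignores per-context requests and keeps every available slice powered, and that it otherwise honours the request limited to slices that exist. The function name `effective_slice_mask` is invented for illustration.

```python
def effective_slice_mask(requested, available, oa_active):
    """Pick the slice mask to program for a context (hypothetical policy).

    requested: slice mask asked for by userspace for its own context
    available: slice mask of slices present on the device
    oa_active: True while the OA unit is collecting data
    """
    if oa_active:
        # Keep the configuration fixed so OA data stays comparable.
        return available
    granted = requested & available
    # Fall back to the full configuration if the request names no valid slice.
    return granted if granted else available


if __name__ == "__main__":
    print(bin(effective_slice_mask(0b001, 0b111, oa_active=False)))
    print(bin(effective_slice_mask(0b001, 0b111, oa_active=True)))
```

The point of routing the request through the kernel is that this decision lives in one place, which an inline LRI from userspace would bypass.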
Comment 5 Elizabeth 2017-06-20 19:35:55 UTC
(In reply to Lionel Landwerlin from comment #4)
> Just letting you know that letting userspace directly change this
> configuration through LRI might corrupt the data coming from the OA unit.
> I think Chris' patchset to let userspace control this through a context
> ioctl is the best solution. That way we can do something sensible in the
> kernel when the OA unit is enabled.

Good afternoon, is the problem still present, or did the patchset fix it?
Comment 6 Dmitry Rogozhkin 2017-09-01 17:35:19 UTC
With inline GEN8_R_PWR_CLK_STATE programming, the per-slice registers, including MOCS, get reset, which can lead to functional and performance regressions. With i915 KMD-level programming, on the other hand, we can reprogram the state we lost. Thus, we will probably go with KMD-level slice programming. Here are the related patches:

* https://patchwork.freedesktop.org/series/29715/ - patch series to enable slice programming for Gen8+ (most recent respin)
* https://patchwork.freedesktop.org/series/29564/ - patch series to fix conflict with OA NOA programming
Comment 7 Dmitry Rogozhkin 2017-09-01 17:37:32 UTC
And slice programming IGT test: https://patchwork.freedesktop.org/patch/174839/
Comment 8 Jani Saarinen 2018-03-29 07:12:01 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but with this we are trying to understand whether the bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this is no longer valid, please comment on the bug so that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
Comment 9 Jani Saarinen 2018-04-23 09:54:29 UTC
Dmitry, Chris, is this still a valid issue?
Comment 10 Dmitry Rogozhkin 2018-04-23 14:06:44 UTC
Yes, it is still valid.
Comment 11 Tvrtko Ursulin 2019-06-10 10:37:13 UTC
Is the media stack still interested in having this for performance reasons? Are there patches available against some userspace component?
Comment 12 Martin Peres 2019-11-29 17:22:32 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/36.

