Bug 108700 - 2x Media CPU power usage, or 15% perf drop when CPU bound GPU use-case is TDP limited
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: Triaged
Keywords:
Depends on:
Blocks:
 
Reported: 2018-11-09 14:06 UTC by Eero Tamminen
Modified: 2018-12-05 16:28 UTC

See Also:
i915 platform: CFL, KBL, SKL
i915 features:


Attachments
parameter file for sample_media_transcode (7.47 KB, text/plain)
2018-11-09 14:06 UTC, Eero Tamminen

Description Eero Tamminen 2018-11-09 14:06:55 UTC
Created attachment 142419
parameter file for sample_media_transcode

Test setup:
* Ubuntu 18.04
* git head build of drm-tip kernel
* git head build of Mesa & X and their main deps
* git head build of Intel MediaSDK and their main dependencies

Good drm-tip version:
a4e9f377a9: 2018-11-03 01:29:29: drm-tip: 2018y-11m-03d-01h-28m-29s UTC integration manifest

Bad drm-tip version:
1a4a6dafa1: 2018-11-05 16:07:52: drm-tip: 2018y-11m-05d-16h-07m-05s UTC integration manifest

Test-case 1:
* Run (mostly) CPU bound GfxBench v4 Driver2 test

Test-case 2:
* Run the MediaSDK-provided tool with the attached parameter file (it transcodes 50 streams, lowering the H264 video frame rate, bit rate and size, and adding filtering):
  sample_multi_transcode -par inputs.par
* Sum FPS of all streams together
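
A minimal sketch of the "sum FPS of all streams" step. The output format of sample_multi_transcode is not shown in this report, so the line shape matched here (a number followed by "fps") is purely an assumption, and total_fps is a hypothetical helper name:

```python
import re

# Assumed line shape, e.g. "stream 3: 29.7 fps"; the real tool output may differ.
FPS_RE = re.compile(r"(\d+(?:\.\d+)?)\s*fps", re.IGNORECASE)

def total_fps(lines):
    """Sum the per-stream FPS values found in an iterable of text lines."""
    total = 0.0
    for line in lines:
        match = FPS_RE.search(line)
        if match:  # lines without an FPS value are ignored
            total += float(match.group(1))
    return total
```

Comparing this total between the good and bad kernel gives the relative change reported for test-case 2.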

Outcome on HW that is TDP limited:
* Test-case 1 performance drops 15%
* Test-case 2 performance drops 5%
* Performance of other CPU-bound GPU tests also regresses, but less

Outcome on HW that isn't TDP limited:
* RAPL reports marginally larger CPU power consumption for test-case 1
* RAPL reports 1.5-2.5x higher CPU power consumption for test-case 2
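
For reference, a rough sketch of how average power can be derived from the RAPL energy counter that Linux exposes through the powercap sysfs interface. Which RAPL domain was used for the numbers above is not stated in this report; the package-0 path below is an assumption, the function names are hypothetical, and reading energy_uj may require root on newer kernels:

```python
import time
from pathlib import Path

# Package-0 RAPL domain from the Linux powercap framework.
# The domain choice is an assumption; systems can expose several domains.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")

def average_power_watts(start_uj, end_uj, seconds, max_range_uj):
    """Average power between two energy readings, tolerating one counter wrap."""
    delta_uj = end_uj - start_uj
    if delta_uj < 0:  # counter wrapped around once during the interval
        delta_uj += max_range_uj
    return delta_uj / 1e6 / seconds  # microjoules -> joules -> watts

def measure(seconds=10.0, rapl_dir=RAPL_DIR):
    """Sample the energy counter around a sleep and return average watts."""
    max_range = int((rapl_dir / "max_energy_range_uj").read_text())
    start = int((rapl_dir / "energy_uj").read_text())
    t0 = time.monotonic()
    time.sleep(seconds)
    end = int((rapl_dir / "energy_uj").read_text())
    return average_power_watts(start, end, time.monotonic() - t0, max_range)
```

Running measure() while the test-case is active on the good and on the bad kernel gives two averages whose ratio can be compared with the figures above.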

There were no performance improvements in other tests we run on these devices.

Large CPU power usage increase without perf change is visible on:
* SKL i5-6600K (GT2)
* KBL i7-7500U (GT2)
* KBL i7-8809G (GT2)
(And on a pre-production CFL-S device we had)

TDP limit caused performance to drop (with increased CPU usage) on:
* KBL i7-7567U (GT3e)
* SKL i7-6770HQ (GT4e)

There was one device where performance increases with the much higher CPU power usage, but it's only by 1-2% and only in test-case 2:
* SKL i5-6260U (GT3e)

Neither perf nor power usage changed on BXT devices, so I guess this change concerns only Core devices.

On BDW GT2 the CPU usage increase was clearly smaller than on GEN9 Core devices (and there was no noticeable performance change).  MediaSDK doesn't support older devices, so I don't have data from them.
Comment 1 Eero Tamminen 2018-11-09 14:10:52 UTC
Drm-tip seems to have rebased from v4.19 to v4.20-rc1 during that 1 day interval.
Comment 2 Jani Saarinen 2018-11-09 14:14:07 UTC
Yeah, can you bisect Eero ;)
Comment 3 Eero Tamminen 2018-11-09 14:27:35 UTC
(In reply to Jani Saarinen from comment #2)
> Yeah, can you bisect Eero ;)

I don't have anything set up that would automate kernel bisecting well enough (reboots, boot failures, handling drm-tip rebases etc).

However, if you have a few commits in that range in mind, I could manually check whether they give good or bad performance.

And I can of course (internally) provide a ready-made SW setup and reserve suitable HW for whoever is going to look into this.
Comment 4 Chris Wilson 2018-11-13 16:26:35 UTC
I haven't yet tried the exact tests as cited here, all I've found so far is a remarkable improvement from 08e3e21a24d23db6a4adca90f7cb40d69e09d35c ("drm/i915: kill resource streamer support") in the -rc1 merge.

The report would suggest we were looking for a pstate or scheduler change.
Comment 5 Eero Tamminen 2018-11-14 11:57:09 UTC
(In reply to Chris Wilson from comment #4)
> I haven't yet tried the exact tests as cited here, all I've found so far is
> a remarkable improvement from 08e3e21a24d23db6a4adca90f7cb40d69e09d35c
> ("drm/i915: kill resource streamer support") in the -rc1 merge.

In CPU bound Driver2 GL tests?  On which device?
Comment 6 Chris Wilson 2018-11-19 14:20:26 UTC
(In reply to Eero Tamminen from comment #5)
> (In reply to Chris Wilson from comment #4)
> > I haven't yet tried the exact tests as cited here, all I've found so far is
> > a remarkable improvement from 08e3e21a24d23db6a4adca90f7cb40d69e09d35c
> > ("drm/i915: kill resource streamer support") in the -rc1 merge.
> 
> In CPU bound Driver2 GL tests?  On which device?

kbl + glxgears; basic context switch exercise.

In light of the rc1 controversy, do you have spectre/meltdown mitigations enabled on your test systems?
Comment 7 Eero Tamminen 2018-11-19 17:36:06 UTC
We don't specifically enable any mitigations, just use drm-tip kernel defaults.
It seems to have enabled an additional one when it was rebased to 4.20-rc1:
  Spectre V2 : Spectre v2 cross-process SMT mitigation: Enabling STIBP


Threading in the listed test-cases:

* MediaSDK 50 stream transcode case has 250 threads

* I thought GfxBench Driver2 doesn't thread, as only a single CPU is busy, but it actually uses 3 threads, of which 2 use as much CPU as they can. Apparently the kernel just keeps them on the same core, so they seem to run as hyperthreads

-> I think the SMT mitigation, rather than i915, is very likely the cause of the drop.
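
As a hedged aside, the STIBP state can also be read at runtime from the kernel's spectre_v2 vulnerability file in sysfs instead of from the boot log. The sketch below assumes a comma-separated status string; the exact wording differs between kernel versions, and stibp_state is a hypothetical helper name:

```python
from pathlib import Path

# Runtime status file for Spectre v2 mitigations.
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")

def stibp_state(status):
    """Return the STIBP field of a spectre_v2 status string, or None if absent."""
    for field in status.split(","):
        name, _, value = field.strip().partition(":")
        if name.strip() == "STIBP":
            # Some kernels print a bare "STIBP" without a value.
            return value.strip() or "enabled"
    return None

if __name__ == "__main__":
    print(stibp_state(SPECTRE_V2.read_text()))
```

Checking this on both the good and the bad kernel would show whether the mitigation state actually differs between the two boots.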


Could you point out suitable drm-tip commit IDs before and after enabling the mitigation so that I could verify it?
Comment 8 Eero Tamminen 2018-12-05 16:28:04 UTC
STIBP fixes in drm-tip v4.20-rc5 fix the performance of the CPU-bound 3D cases (test-case 1).

However, neither those fixes nor disabling Spectre mitigations completely from the kernel command line (checked by David) has ANY impact on the Media performance regression (test-case 2).

David will try to bisect the Media perf regression.

