This is on an Intel + Radeon laptop, so I need to run encoding with GStreamer with DRI_PRIME=1.

Here is an example video: http://www.sample-videos.com/video/mp4/720/big_buck_bunny_720p_1mb.mp4

DRI_PRIME is doing a good job of waking up the GPU from runpm when needed for encoding via VAAPI and OMX, but for comparison I'll run glxgears both times.

I'm encoding the mentioned example video with VAAPI with this exact command:

$ time DRI_PRIME=1 LIBVA_DRIVER_NAME=radeonsi gst-launch-1.0 -e filesrc location=big_buck_bunny_720p_1mb.mp4 ! qtdemux ! h264parse ! avdec_h264 ! queue ! videoconvert ! queue ! video/x-raw,format=NV12 ! vaapih264enc ! h264parse ! matroskamux ! filesink location=output.mkv

For low GPU stress I run the gst pipeline while glxgears with vsync is running:

$ DRI_PRIME=1 glxgears

Result: 0.75s user 0.33s system 2% cpu 52.779 total

For higher GPU stress I run the gst pipeline while glxgears without vsync is running:

$ DRI_PRIME=1 vblank_mode=0 glxgears

Result: 0.99s user 0.28s system 43% cpu 2.928 total

I also tried a very similar pipeline with OMX:

$ time DRI_PRIME=1 gst-launch-1.0 -e filesrc location=big_buck_bunny_720p_1mb.mp4 ! qtdemux ! h264parse ! avdec_h264 ! queue ! videoconvert ! queue ! video/x-raw,format=NV12 ! omxh264enc ! h264parse ! matroskamux ! filesink location=output.mkv

Low GPU stress: 0.96s user 0.24s system 19% cpu 6.298 total
High GPU stress: 1.10s user 0.24s system 141% cpu 0.949 total

Overall OMX encoding does a lot better, but the gap is still large, and it's still below "real time" for the 5-second video.
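For reference, the wall-clock times above translate into encode speed factors relative to the clip length (arithmetic only; the ~5 s duration is the reported length of the sample video):

```shell
# Speed factor = clip duration / wall-clock encode time (clip assumed ~5 s).
for t in 52.779 2.928 6.298 0.949; do
    awk -v d=5 -v t="$t" 'BEGIN { printf "wall %ss -> %.2fx real time\n", t, d/t }'
done
```

So VAAPI under low GPU stress runs at roughly 0.09x real time, while OMX under high GPU stress exceeds 5x real time.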
Yeah, that is a known issue. The current VA-API implementation waits for the result after sending a single frame to the hardware. The OpenMAX implementation pipelines the whole thing and waits for a result after sending multiple frames to the hardware to chew on. So with OpenMAX the hardware is always busy, while with VA-API it constantly turns on/off.
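A toy model of why that matters (illustrative numbers only, not actual driver code): if each frame costs SUBMIT ms of CPU work and ENCODE ms of hardware time, waiting for every frame serializes both costs, while keeping the hardware fed hides the encode time behind the next submission:

```shell
# Toy model (assumed numbers): 150 frames, 1 ms to submit, 5 ms to encode.
FRAMES=150 SUBMIT=1 ENCODE=5
# Per-frame wait (current VA-API path): submit, then block until the result.
sync_ms=$(( FRAMES * (SUBMIT + ENCODE) ))
# Pipelined (OpenMAX path): hardware stays busy; only the last encode is exposed.
pipe_ms=$(( FRAMES * SUBMIT + ENCODE ))
echo "sync: ${sync_ms} ms, pipelined: ${pipe_ms} ms"
```

The per-frame version also gives power management a chance to clock the hardware down between every frame, which compounds the slowdown.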
So maybe there is also some DPM-type issue on your system. Rather than running gears, maybe you can force the GPU clocks to high somewhere. My setup is very different, but I would do:

echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level
I put the issue in DRM/radeon instead of Mesa/radeonsi because I thought it would be related to power management. I tried

echo high > /sys/class/drm/card1/device/power_dpm_force_performance_level
echo performance > /sys/class/drm/card1/device/power_dpm_state

but it makes no difference; it's still just as slow.
Good point, but no the problem is clearly in the VA-API state tracker.
Well, his OMX test is 6x slower as well without load (though the test video is very short). So I think that in addition to the VA-API issue he is seeing some PRIME + HD 7970M DPM problem. Though maybe forcing the CPUs to high and re-testing would help rule out cpufreq messing things up.
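To rule out cpufreq, something like the following prints the writes one would run as root to pin every core's governor to "performance" (a sketch assuming the standard Linux cpufreq sysfs layout and a 4-core machine; adjust the CPU list to yours):

```shell
# Print (rather than execute) the sysfs writes to pin cpufreq governors.
# Run the printed commands as root, re-run the encode test, then restore
# the previous governor (often "ondemand" or "schedutil").
for cpu in 0 1 2 3; do
    echo "echo performance > /sys/devices/system/cpu/cpu${cpu}/cpufreq/scaling_governor"
done
```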
I should open my eyes while reading. Indeed, that is way too much to be explained by the VA-API problems.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1235.