102683 – mpv confuses the frequency scaling, leading to frequency flapping and missed vsyncs

Status:	RESOLVED INVALID

Product:	DRI
Classification:	Unclassified
Component:	DRM/AMDgpu
Version:	unspecified
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium normal
Assignee:	Default DRI bug account

Reported:	2017-09-12 16:52 UTC by Niklas Haas
Modified:	2017-09-18 01:13 UTC
CC List:	0 users

Description Niklas Haas 2017-09-12 16:52:04 UTC

When rendering e.g. 24 Hz video on a 60 Hz display, `mpv`'s usage pattern consists of one “fresh” frame (e.g. 20 ms rendering time) followed by two “redraw” frames, each of which is essentially just a blit/mix of already-rendered frames. This results in a high-low-low-high-low GPU activity pattern.
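
To make the activity pattern concrete, here is a minimal sketch that labels each display refresh as “fresh” or “redraw”. It is a simplified model and not mpv's actual scheduler: the function name `vsync_pattern` is made up for illustration, and it assumes a new video frame is shown at the first refresh on or after its presentation time.

```python
def vsync_pattern(video_fps, display_hz, n_vsyncs):
    """Label each display refresh 'fresh' if a new video frame becomes due, else 'redraw'."""
    pattern = []
    last_frame = -1
    for v in range(n_vsyncs):
        # Index of the video frame that should be on screen at refresh v.
        # Integer arithmetic avoids floating-point drift.
        frame = (v * video_fps) // display_hz
        pattern.append("fresh" if frame != last_frame else "redraw")
        last_frame = frame
    return pattern


if __name__ == "__main__":
    # 24 Hz video on a 60 Hz display, first five refreshes
    print("-".join(vsync_pattern(24, 60, 5)))
```

Mapping “fresh” to high and “redraw” to low GPU activity, the first five refreshes reproduce the high-low-low-high-low sequence described above.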

Apparently this confuses the GPU frequency scaling quite heavily, which leads to bad performance (under-scaling), inconsistent performance (“flapping” SCLK) and missed vblanks (delayed/dropped frames and vsync jitter as measured by mpv). If I `watch -n0.1 cat /sys/kernel/debug/dri/*/amdgpu_pm_info`, I can see it varying wildly between `SCLK: 500 MHz` and `SCLK: 1000 MHz`, as the reported `GPU Load:` varies between 0% and 100% from frame to frame. I can consistently solve the issue by setting `power_dpm_force_performance_level` to `high`.
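
The flapping can also be quantified instead of eyeballed with `watch`. The following is a rough sketch that extracts the clock and load from snapshots of `amdgpu_pm_info` text. The regular expressions assume lines shaped like the `SCLK: 500 MHz` and `GPU Load:` strings quoted above; the real file layout differs between kernel versions, and the sample snapshots here are invented for illustration. The helper names `parse_pm_info` and `sclk_spread` are hypothetical.

```python
import re

# Assumed line shapes, based only on the strings quoted in this report.
SCLK_RE = re.compile(r"SCLK:\s*(\d+)\s*MHz")
LOAD_RE = re.compile(r"GPU Load:\s*(\d+)\s*%")


def parse_pm_info(text):
    """Return (sclk_mhz, load_percent); a field is None if it is not found."""
    sclk = SCLK_RE.search(text)
    load = LOAD_RE.search(text)
    return (
        int(sclk.group(1)) if sclk else None,
        int(load.group(1)) if load else None,
    )


def sclk_spread(snapshots):
    """Return (min, max) SCLK in MHz over all snapshots, or None if none parsed."""
    clocks = [c for c, _ in map(parse_pm_info, snapshots) if c is not None]
    return (min(clocks), max(clocks)) if clocks else None


if __name__ == "__main__":
    # Invented sample snapshots, not real driver output.
    snapshots = [
        "SCLK: 500 MHz\nGPU Load: 0 %\n",
        "SCLK: 1000 MHz\nGPU Load: 100 %\n",
    ]
    print(sclk_spread(snapshots))
```

A wide spread over a short sampling window would correspond to the flapping described above, although the sampling itself may not be reliable enough to draw firm conclusions.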

A graphical explanation of the issue, with mpv performance graphs:
`auto`: https://0x0.st/7qz.jpg
`high`: https://0x0.st/7qi.jpg

This is not just cosmetic: the occasional spikes increase the number of missed vsyncs. Another user has also reported significantly worse performance with `auto`, even more extreme than my example: https://0x0.st/7Aw.jpg (Note: the third step in this image, going from 10k ms to 5k ms, is due to switching from mesa 17.2 to 17.1, but that's an unrelated, cosmetic bug.) Using `auto` makes mpv completely unusable for this user.

I expect the solution would be to add a small amount of (top-weighted) smoothing to the performance state / GPU load estimation across frames. The nvidia driver, for example, gets this right: if my workload alternates between 'high' performance and 'low' performance states, it recognizes that and sticks to 'high' performance mode, instead of varying the frequency wildly like amdgpu does. This results in very flat graphs, much like the `high` screenshot I uploaded.
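
One way to read “top-weighted smoothing” is a peak-hold filter with slow decay: the estimate jumps up immediately when load rises, but falls back only gradually. The sketch below illustrates that idea on a list of per-frame load samples. It is not how amdgpu or the nvidia driver actually estimates load; the function name `smooth_load` and the `decay` value are assumptions chosen for illustration.

```python
def smooth_load(loads, decay=0.9):
    """Peak-hold smoothing: follow increases instantly, decay slowly on decreases."""
    estimate = 0.0
    smoothed = []
    for load in loads:
        # Take the new sample if it is higher; otherwise let the old estimate fade.
        estimate = max(load, estimate * decay)
        smoothed.append(estimate)
    return smoothed


if __name__ == "__main__":
    # Per-frame GPU load in percent for the high-low-low-high-low pattern
    raw = [100, 0, 0, 100, 0]
    print(smooth_load(raw))
```

With the alternating input above, the smoothed estimate stays near the top of the range instead of dropping to 0% between fresh frames, so a governor driven by it would have no reason to clock down in between.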

Kernel version is 4.12.4, mesa version is 17.2.0, device is a Sapphire RX 560.

Comment 1 Niklas Haas 2017-09-18 01:13:31 UTC

Re-testing this, I noticed that it's possible `*/amdgpu_pm_info` just delivers more unreliable data or something; with `GALLIUM_HUD_PERIOD=0 GALLIUM_HUD=shader-clock,gpu-load` I noticed it working much better.

As for the “performance issues”, I think they may be caused by timers getting confused due to the change in frequency? Re-testing, I can't notice more dropped frames than usual except at the very beginning, probably due to the very fact that it does frequency scaling.

I'll close this issue unless I have something more concrete to report.
