Bug 107111 - "Avoid waitboosting" (e9af4ea2b9e7) now prevents GPU boosting
Summary: "Avoid waitboosting" (e9af4ea2b9e7) now prevents GPU boosting
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Lakshmi
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: Triaged, ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-07-04 07:02 UTC by freedesktop
Modified: 2018-10-16 08:36 UTC
CC List: 4 users

See Also:
i915 platform: KBL, SKL
i915 features: power/runtime PM


Attachments
Monitoring script and results (58.44 KB, text/plain)
2018-07-04 07:02 UTC, freedesktop
nudge (4.47 KB, patch)
2018-07-04 08:15 UTC, Chris Wilson
nudge2 (1.04 KB, patch)
2018-07-04 08:57 UTC, Chris Wilson
Tidier nudge (9.69 KB, patch)
2018-07-08 19:11 UTC, Chris Wilson

Description freedesktop 2018-07-04 07:02:17 UTC
Created attachment 140459 [details]
Monitoring script and results

We've had reports of the Kodi GUI "stuttering" since LibreELEC switched from kernel 4.14.49 to 4.17.3.

I ran a script that monitors /sys/kernel/debug/dri/0/i915_rps_boost_info and noticed that with 4.17.3 the GPU never boosts out of the "low power" window. It's the same when watching a video - the GPU is now almost always stuck in the "low power" window.

git bisecting the kernel between 4.14.49 and 4.17.3 reveals e9af4ea2b9e7e5d3caa6354be14de06b678ed0fa[1] ("drm/i915: Avoid waitboosting on the active request") as the bad commit.

In the attached file you can see the script, and also the results of testing many different LibreELEC/Kodi builds (corresponding to the builds here[2]).

The script plays the first 30 seconds of a 4K video (Elysium trailer, on a Skylake NUC6i5SYH outputting at 1080p) from local storage in Kodi. When it finishes, it prints a summary that includes the percentage of "LOW" (i.e. low power) and "!LOW" (i.e. high + mixed) window samples. The script is executed 5 times per build.
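For readers without the attachment, the LOW / !LOW summary described above could be computed roughly as follows. This is a minimal sketch, not the attached script: it assumes the window name has already been extracted from each i915_rps_boost_info sample, and that the low power window is reported as "low power". The function name summarise_windows and the other labels used in the usage note are hypothetical.

```python
from collections import Counter


def summarise_windows(samples):
    """Return the percentage of LOW and !LOW samples.

    `samples` is a list of window names, one per sample.
    Assumption: the low power window is labelled "low power";
    every other label (e.g. mixed or high) counts as !LOW.
    """
    if not samples:
        return {"LOW": 0.0, "!LOW": 0.0}
    counts = Counter("LOW" if s == "low power" else "!LOW" for s in samples)
    total = len(samples)
    # Counter returns 0 for a key that never occurred.
    return {key: 100.0 * counts[key] / total for key in ("LOW", "!LOW")}
```

With three "low power" samples and one "mixed" sample, this reports 75% LOW and 25% !LOW. A build affected by this bug would be expected to report close to 100% LOW.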

What is very clear is that from build 180616 (ie. #0616[3]) onwards - which is when the LibreELEC kernel switched from 4.14.49 to 4.17.3 - the number of "!LOW" samples is reduced to almost zero - ie. the GPU no longer boosts, remaining locked at the "idle" frequency.

To rule out any changes in Kodi that might also be responsible I rebuilt build 180702 replacing only the 4.17.3 kernel with 4.14.49, creating build 180702a, where now the GPU boosts as expected.

Reverting the bad commit from kernel 4.17.3[4] resulted in build 180702b, and again the GPU is now boosting as expected.

Kernel 4.18-rc1 also has this non-boosting issue (same as 4.17.3).

With this commit reverted, the reports of "stutter" in Kodi with 4.17.3 (and now also 4.17.4) are resolved[5, 6, 7].

Can this commit be reverted, or is there a better solution that doesn't hobble the GPU?

Thanks
Neil

1. https://github.com/torvalds/linux/commit/e9af4ea2b9e7e5d3caa6354be14de06b678ed0fa
2. https://forum.kodi.tv/showthread.php?tid=298462
3. http://forum.kodi.tv/showthread.php?tid=298462&pid=2744031#pid2744031
4. https://github.com/LibreELEC/LibreELEC.tv/pull/2798/commits/914cadbafd09861f05244d4b985476414760823b
5. https://forum.kodi.tv/showthread.php?tid=298462&pid=2748603#pid2748603
6. https://forum.kodi.tv/showthread.php?tid=298462&pid=2748688#pid2748688
7. https://forum.kodi.tv/showthread.php?tid=298462&pid=2748696#pid2748696
Comment 1 Chris Wilson 2018-07-04 07:53:38 UTC
What it is telling us is that the GPU is never utilised enough for it to upclock by itself -- Kodi should not need more power to deliver the frames on time, and boosting is overkill.

One thing to confirm is which boost Kodi is depending on. So try just reverting the intel_display chunk so that it always boosts after the vblank miss.

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 681e0710a467..0d4c61e53a6c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12913,7 +12913,8 @@ static int do_rps_boost(struct wait_queue_entry *_wait,
         * is reasonable to assume that it will complete before the next
         * vblank without our intervention, so leave RPS alone.
         */
-       if (!i915_request_started(rq))
+       if (!i915_request_started(rq) ||
+           rq != i915_gem_active_peek(&rq->timeline->last_request))
                gen6_rps_boost(rq, NULL);
        i915_request_put(rq);
 
might be an interesting compromise. But what I think we want is more of a nudge.
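The condition in the diff above can be read as a small truth table. The sketch below only models the boolean logic of the two variants; the function and parameter names are hypothetical, and the real decision is made in do_rps_boost() in the kernel.

```python
def boost_after_e9af4ea2b9e7(started):
    """Behaviour with e9af4ea2b9e7: boost only if the request has not started."""
    return not started


def boost_with_compromise(started, is_last_on_timeline):
    """Behaviour with the diff above.

    Boost if the request has not started, or if it is not the last
    request on its timeline (i.e. more work is queued behind it).
    """
    return (not started) or (not is_last_on_timeline)
```

The only case that changes is a request that has already started while more work is queued behind it: the compromise boosts there, whereas e9af4ea2b9e7 alone does not.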
Comment 2 Chris Wilson 2018-07-04 08:15:46 UTC
Created attachment 140462 [details] [review]
nudge
Comment 3 Chris Wilson 2018-07-04 08:57:36 UTC
Created attachment 140463 [details] [review]
nudge2

Since it's not the vblank waitboosting in effect here, it must be the i915_gem_object_wait (used by glFinish et al): try the second nudge on top of the first.
Comment 4 freedesktop 2018-07-07 23:31:42 UTC
Apologies for the delay in responding, I've been collecting information.

I produced 4 test builds for users that are experiencing the visual stutter. These builds are based on 4.17.4 where e9af4ea2b9e7 is not reverted:

#0703b, includes patch "reverting the intel_display chunk so that it always boost after the vblank miss."[1]
#0703c, includes patch "might be an interesting compromise"[2]
#0703d, includes patch "nudge"[3]
#0703e, includes patches "nudge"[3] + "nudge2"[4]

Results (from 2 users)[5]:

#0703b: visual stutter still present
#0703c: no visual stutter, but audio issues (both users) - delayed and/or stuttering audio
#0703d: no visual stutter and no audio issues
#0703e: no visual stutter and no audio issues

Regarding the audio issues, one user hasn't been able to reproduce[6] and the other won't be able to test for a while[7], so this could be a red herring.

Hopefully this information is useful when deciding in which direction to go!


1. http://ix.io/1fYp
2. http://ix.io/1fYI
3. https://bugs.freedesktop.org/attachment.cgi?id=140462
4. http://ix.io/1fYX (4.17.4 compatible version of id=140463)
5. https://forum.kodi.tv/showthread.php?tid=298462&pid=2749048#pid2749048
6. https://forum.kodi.tv/showthread.php?tid=298462&pid=2749440#pid2749440
7. https://forum.kodi.tv/showthread.php?tid=298462&pid=2749557#pid2749557
Comment 5 Chris Wilson 2018-07-08 19:11:39 UTC
Created attachment 140513 [details] [review]
Tidier nudge

Same nudge as before (v1), but with a bugfix so worth double checking that fixing the bug didn't bring the problem back.
Comment 6 freedesktop 2018-07-08 22:25:36 UTC
Thanks Chris. I've posted a new #0708b build[1] with "Tidier nudge" and will post again once I have feedback.

With #0708b on a Skylake NUC when playing 30 seconds of the Elysium 4K video, I see the GPU boosting frequently throughout playback[2] (sampling i915_rps_boost_info every 0.05 seconds). Other less "stressful" videos (eg. 1080p HEVC) barely boost at all.
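A fixed-interval sampler along these lines would be enough to reproduce that measurement. This is a sketch under assumptions rather than the script actually used: the debugfs path is the one named in the description, reading it normally requires root, and the names read_boost_info and sample are hypothetical.

```python
import time

# Path as given in the bug description; debugfs must be mounted.
BOOST_INFO = "/sys/kernel/debug/dri/0/i915_rps_boost_info"


def read_boost_info(path=BOOST_INFO):
    """Return the raw text of one i915_rps_boost_info snapshot."""
    with open(path) as f:
        return f.read()


def sample(read_fn, count, interval=0.05, sleep=time.sleep):
    """Call read_fn `count` times, pausing `interval` seconds in between.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    samples = []
    for i in range(count):
        samples.append(read_fn())
        if i + 1 < count:
            sleep(interval)
    return samples
```

At 0.05 seconds per sample, 30 seconds of playback corresponds to roughly sample(read_boost_info, 600). The time spent reading the file is not compensated for, so the real run takes slightly longer.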

The GPU also boosts while navigating the Kodi GUI, so fingers crossed.

1. https://forum.kodi.tv/showthread.php?tid=298462&pid=2749894#pid2749894
2. http://ix.io/1gwH
Comment 7 Chris Wilson 2018-07-09 09:15:29 UTC
If anyone also has a wattsup/killawatt or other power meter, I would be interested in knowing the total power consumed with waitboost-vs-nudge.

Kodi can measure the GPU power used with perf + RAPL, e.g. https://cgit.freedesktop.org/drm/igt-gpu-tools/tree/overlay/power.c, which I think might be of interest.
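As a rough illustration of the arithmetic involved, average power can be derived from two readings of a RAPL energy counter. Note that this sketch swaps in a different interface from the perf-based one linked above: it assumes the counters are read from the Linux powercap sysfs files (energy_uj and max_energy_range_uj under /sys/class/powercap/intel-rapl*), and which zone covers the GPU depends on the platform. The function name is hypothetical.

```python
def average_power_watts(energy_start_uj, energy_end_uj, elapsed_s, max_range_uj):
    """Average power between two energy counter readings.

    energy_*_uj  : counter values in microjoules
    elapsed_s    : wall-clock seconds between the two readings
    max_range_uj : counter range, used to undo a single wraparound
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    delta_uj = energy_end_uj - energy_start_uj
    if delta_uj < 0:
        # The counter wrapped once between the readings.
        delta_uj += max_range_uj
    return delta_uj / 1e6 / elapsed_s
```

Running the same 30-second playback once with waitboost and once with the nudge, and comparing the two averages, would give the kind of comparison requested here. A wall-socket meter additionally captures everything outside the RAPL domains.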
Comment 8 freedesktop 2018-07-11 20:50:30 UTC
> Same nudge as before (v1), but with a bugfix so worth double checking that fixing the bug didn't bring the problem back.

I added "Tidier nudge" in test build #0708b and received confirmation[1,2] that it is working well.

1. https://forum.kodi.tv/showthread.php?tid=298462&pid=2750248#pid2750248
2. https://forum.kodi.tv/showthread.php?tid=298462&pid=2750649#pid2750649
Comment 9 Chris Wilson 2018-07-31 16:27:43 UTC
commit 60548c554be2830d29d2533dad0ac8133347ee51 (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jul 31 14:26:29 2018 +0100

    drm/i915: Interactive RPS mode
    
    RPS provides a feedback loop where we use the load during the previous
    evaluation interval to decide whether to up or down clock the GPU
    frequency. Our responsiveness is split into 3 regimes, a high and low
    plateau with the intent to keep the gpu clocked high to cover occasional
    stalls under high load, and low despite occasional glitches under steady
    low load, and in between. However, we run into situations like kodi where
    we want to stay at low power (video decoding is done efficiently
    inside the fixed function HW and doesn't need high clocks even for high
    bitrate streams), but just occasionally the pipeline is more complex
    than a video decode and we need a smidgen of extra GPU power to present
    on time. In the high power regime, we sample at sub frame intervals with
    a bias to upclocking, and conversely at low power we sample over a few
    frames worth to provide what we consider to be the right levels of
    responsiveness respectively. At low power, we more or less expect to be
    kicked out to high power at the start of a busy sequence by waitboosting.
    
    Prior to commit e9af4ea2b9e7 ("drm/i915: Avoid waitboosting on the active
    request") whenever we missed the frame or stalled, we would immediately go
    full throttle and upclock the GPU to max. But in commit e9af4ea2b9e7, we
    relaxed the waitboosting to only apply if the pipeline was deep to avoid
    over-committing resources for a near miss. Sadly though, a near miss is
    still a miss, and perceptible as jitter in the frame delivery.
    
    To try and prevent the near miss before having to resort to boosting
    after the fact, we use the pageflip queue as an indication that we are
    in an "interactive" regime and so should sample the load more frequently
    to provide power before the frame misses its vblank. This will make us
    more favorable to providing a small power increase (one or two bins) as
    required rather than going all the way to maximum and then having to
    work back down again. (We still keep the waitboosting mechanism around
    just in case a dramatic change in system load requires urgent upclocking,
    faster than we can provide in a few evaluation intervals.)
    
    v2: Reduce rps_set_interactive to a boolean parameter to avoid the
    confusion of what if they wanted a new power mode after pinning to a
    different mode (which to choose?)
    v3: Only reprogram RPS while the GT is awake, it will be set when we
    wake the GT, and while off warns about being used outside of rpm.
    v4: Fix deferred application of interactive mode
    v5: s/state/interactive/
    v6: Group the mutex with its principal in a substruct
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107111
    Fixes: e9af4ea2b9e7 ("drm/i915: Avoid waitboosting on the active request")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180731132629.3381-1-chris@chris-wilson.co.uk
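The effect of the evaluation interval described in the commit message can be illustrated with a toy model. This is not the i915 implementation: the thresholds, the bin range and the function names below are all hypothetical, chosen only to show why a shorter sampling interval reacts to a brief burst that a longer one averages away.

```python
def evaluate(freq_bin, busy, up=0.8, down=0.05, min_bin=0, max_bin=10):
    """One feedback step: move one bin up or down based on the load.

    `busy` is the fraction of the evaluation interval the GPU was busy.
    The thresholds are made-up values for illustration only.
    """
    if busy > up:
        return min(freq_bin + 1, max_bin)
    if busy < down:
        return max(freq_bin - 1, min_bin)
    return freq_bin


def simulate(load, interval, start_bin=0):
    """Run the loop over per-tick busy fractions.

    The load is averaged over `interval` ticks before each evaluation,
    so a long interval smooths out short bursts.
    """
    freq_bin = start_bin
    for i in range(0, len(load) - interval + 1, interval):
        window = load[i:i + interval]
        freq_bin = evaluate(freq_bin, sum(window) / len(window))
    return freq_bin


# Mostly idle, with one fully busy tick at the end.
burst = [0.1] * 7 + [1.0]
slow = simulate(burst, interval=8)   # burst is averaged away
fast = simulate(burst, interval=1)   # burst triggers a one-bin upclock
```

In this model the long interval leaves the frequency at its starting bin, while the short interval raises it by a single bin, which mirrors the "one or two bins" behaviour the commit message aims for in the interactive regime.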
Comment 10 Lakshmi 2018-08-22 06:01:59 UTC
Reporter, can you verify that the issue is resolved for you on latest drm-tip?
Comment 11 freedesktop 2018-08-22 21:35:03 UTC
I'll have to get back to you on that, as I've just today had a report of stutter from a user after they upgraded from 4.17.14 to 4.18.4 (both kernels include this fix). I'm trying to get more information.

I don't normally build with drm-tip, is testing with 4.18.y sufficient?
Comment 12 Lakshmi 2018-08-23 09:57:47 UTC
(In reply to freedesktop from comment #11)

> I don't normally build with drm-tip, is testing with 4.18.y sufficient?

Currently this commit is not available in 4.18.4 yet, but you can get it from Linus Torvalds' mainline tree. Most likely this change would be available from 4.19.
Comment 13 freedesktop 2018-08-27 22:26:54 UTC
> Currently this commit is not available in 4.18.4 yet

Yes, we've been including the v4 commit on top of the stable kernel releases in the LibreELEC build system[1] since it was initially proposed - we're currently creating test builds based on 4.18.4 (now 4.18.5, actually)[2].

> Most likely this change would be available from 4.19.

It looks like the commit (or a variation of it[3]) is now included in 4.19-rc1. I'll produce a 4.19-rc1 test build of LibreELEC and Kodi later this week.

So far I've only had one user refer to ongoing "stutter" issues, which makes me suspect it may not be related to this issue, and I'm happy for this issue to be closed as "resolved". I'll reopen at a later date should it be necessary (hopefully not!)

Many thanks to everyone involved.

1. https://github.com/LibreELEC/LibreELEC.tv/blob/master/packages/linux/patches/default/linux-999-drm_i915-interactive-rps-mode.patch
2. https://github.com/LibreELEC/LibreELEC.tv/pull/2908
3. https://github.com/torvalds/linux/commit/027063b1606fea6df15c270e5f2a072d1dfa8fef
Comment 14 Lakshmi 2018-09-07 14:41:58 UTC
Reporter, were you able to reproduce this? Can you confirm if I can close this bug?
Comment 15 Lakshmi 2018-10-16 08:36:51 UTC
No feedback for more than a month.
I assume this bug has been fixed. Closing this bug.

