Bug 68716

Summary: Dragging windows in Gnome3 overview is less smooth with double buffering
Product: xorg Reporter: Paul Neumann <paul104x>
Component: Driver/intel Assignee: Chris Wilson <chris>
Status: RESOLVED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: gordon.jin
Version: git   
Hardware: Other   
OS: All   
See Also: https://bugzilla.gnome.org/show_bug.cgi?id=707537
https://bugzilla.gnome.org/show_bug.cgi?id=707712
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Xorg.0.log using UXA none
Xorg.0.log using SNA none
Screencast with using UXA none
Screencast with using SNA none
.config causing build failure none

Description Paul Neumann 2013-08-29 17:32:38 UTC
Created attachment 84868 [details]
Xorg.0.log using UXA

As mentioned in the title, dragging windows in the overview of Gnome3 is quite jerky with SNA. With UXA, doing the very same thing is as smooth as you would expect (see the screencasts).
In both cases, the X config just contained the AccelMethod.

This is on a
$ lspci -vnn | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller [8086:0412] (rev 06) (prog-if 00 [VGA controller])

using

xf86-video-intel 2.21.15
mesa 9.2
linux from danvet's drm-intel-next-queued branch from around 20th August (the hash is bogus since I have applied some patches locally)

and standard Gnome 3.8, so using mutter as the WM.
Comment 1 Paul Neumann 2013-08-29 17:33:01 UTC
Created attachment 84869 [details]
Xorg.0.log using SNA
Comment 2 Paul Neumann 2013-08-29 17:33:40 UTC
Created attachment 84870 [details]
Screencast with using UXA
Comment 3 Paul Neumann 2013-08-29 17:34:10 UTC
Created attachment 84871 [details]
Screencast with using SNA
Comment 4 Chris Wilson 2013-08-29 18:06:51 UTC
For starters, try the fixes from http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-bug68716 which include, but are not limited to, the rcs flips.
Comment 5 Paul Neumann 2013-08-29 18:30:33 UTC
Created attachment 84879 [details]
.config causing build failure

This branch at 1d99222b0939be3decca54be4651437037179b36 does not build for me.
It fails to link (I guess) at the end with

Kernel: arch/x86/boot/bzImage is ready  (#1)
ERROR: "__hrtimer_start_range_ns" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "vma_merge" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "find_vma_prev" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "split_vma" [drivers/gpu/drm/i915/i915.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
Comment 6 Chris Wilson 2013-08-29 18:37:35 UTC
Haha, found it. It's just that gnome-shell doesn't like being double-buffered! Owen Taylor was wrong...
Comment 7 Chris Wilson 2013-08-29 18:37:56 UTC
(In reply to comment #5)
> Created attachment 84879 [details]
> .config causing build failure
> 
> This branch at 1d99222b0939be3decca54be4651437037179b36 does not build for
> me.
> It fails to link (I guess) at the end with
> 
> Kernel: arch/x86/boot/bzImage is ready  (#1)
> ERROR: "__hrtimer_start_range_ns" [drivers/gpu/drm/i915/i915.ko] undefined!
> ERROR: "vma_merge" [drivers/gpu/drm/i915/i915.ko] undefined!
> ERROR: "find_vma_prev" [drivers/gpu/drm/i915/i915.ko] undefined!
> ERROR: "split_vma" [drivers/gpu/drm/i915/i915.ko] undefined!
> make[1]: *** [__modpost] Error 1
> make: *** [modules] Error 2

Meh, who uses modules?
Comment 8 Paul Neumann 2013-08-29 18:57:40 UTC
(In reply to comment #7)
> Meh, who uses modules?
Just for you: CONFIG_DRM_I915=y ;).

The jerkiness is gone now with SNA.

Apart from that, the only thing I noticed is some black flicker at the top of the screen with UXA which might interest you.
Comment 9 Paul Neumann 2013-08-29 19:08:13 UTC
One more thing. As already said, with the new kernel image, moving the windows in the overview is smooth. However, the animation that plays when you grab a window and it is scaled down is still not as smooth with SNA as with UXA (this is my impression at least), so this issue is half-solved.
Comment 10 Chris Wilson 2013-08-29 20:35:49 UTC
The issue appears to be that the overview panel redraw is slow and it misses the 60Hz refresh when double-buffered. UXA does triple buffering by default, and I just reverted SNA back to double-buffering for the compositor (at the behest of the gnome-shell devs - because it should allow them to do smoother animations!).

Option "TripleBuffer" "false"

should make UXA behave the same. I'm digging deeper into exactly why this misses. Based on this SNB machine and your results, I think the semaphore interlock following the flip stalls the GPU, which would also explain the regression with updating mesa. (But that seems an awfully long stall.)
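
To illustrate why a slow redraw hurts more when double-buffered, here is a minimal Python sketch of frame pacing. It is only a toy model: the 18 ms render time is a made-up figure, the names `present_times` and `average_fps` are hypothetical, and GPU frequency scaling and power states are ignored entirely.

```python
VBLANK_US = 16667  # one 60 Hz refresh interval in microseconds (rounded)


def present_times(render_us, n_buffers, n_frames, vblank_us=VBLANK_US):
    """Return the vblank time (in us) at which each frame reaches the screen.

    Frame i may start once frame i-1 has finished rendering and a buffer is
    free; with n buffers that means frame i-n+1 must already be on screen.
    A finished frame is shown at the next vblank, one flip per vblank.
    """
    if n_buffers < 2:
        raise ValueError("need at least a front and a back buffer")
    done = []     # render-complete time of each frame
    shown = []    # vblank at which each frame is flipped to the front
    for i in range(n_frames):
        start = done[i - 1] if i > 0 else 0
        blocker = i - n_buffers + 1          # frame whose flip frees a buffer
        if blocker >= 0:
            start = max(start, shown[blocker])
        finish = start + render_us
        flip = -(-finish // vblank_us) * vblank_us   # round up to a vblank
        if shown:
            flip = max(flip, shown[-1] + vblank_us)
        done.append(finish)
        shown.append(flip)
    return shown


def average_fps(shown):
    """Average presented frames per second over the simulated run."""
    return (len(shown) - 1) * 1_000_000 / (shown[-1] - shown[0])


if __name__ == "__main__":
    for n in (2, 3):
        fps = average_fps(present_times(18_000, n, 60))
        print(f"{n} buffers: {fps:.1f} fps")
```

With the assumed 18 ms render time, the two-buffer case settles at about 30 fps because every frame waits for the previous flip before it can start, while the three-buffer case stays at roughly 56 fps. A render time below one refresh interval gives 60 fps in both cases.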
Comment 11 Chris Wilson 2013-08-29 23:15:55 UTC
Reading the traces, the biggest difference is that in the triple-buffer case, the GPU is completing operations much faster (about 3x). So this feels like rc6 wakeup latency (when triple buffering the GPU never sleeps, but with double buffering we have periods of over 10ms idle whilst waiting for the flip-completion.)

This "fixes" it for me:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dad0777..80f8730 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2704,6 +2704,10 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
 
        ret = 0;
        if (!i915_seqno_passed(from->get_seqno(from, false), seqno)) {
+               struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+               if (dev_priv->info->gen >= 6)
+                       gen6_rps_boost(dev_priv);
+
                ret = i915_gem_check_olr(from, seqno);
                if (ret)
                        return ret;

can you please test this on top of the for-bug68716 series?

I'm not happy with that approach just yet, it reeks of overkill.
Comment 12 Paul Neumann 2013-08-30 09:08:03 UTC
Hmm, even with your patch from c11 on top of for-bug68716, default SNA and non-triple-buffering UXA feel a little bit more laggy than default UXA.

Is there any way I can measure the frametimes or framerate so we don't have to rely on my gut feeling?
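
As a rough sketch of what such a measurement could compute, the following Python helper turns a list of frame presentation timestamps into a few summary numbers. How the timestamps are captured is left open here; the function name `frame_stats`, the input format (milliseconds), and the "late" threshold of 1.5 refresh intervals are all assumptions made for illustration.

```python
from statistics import mean


def frame_stats(timestamps_ms, refresh_hz=60.0):
    """Summarise frame pacing from presentation timestamps in milliseconds."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two timestamps")
    # Time between consecutive frames.
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    budget_ms = 1000.0 / refresh_hz
    # Count frames that took clearly longer than one refresh interval.
    late = sum(1 for dt in intervals if dt > 1.5 * budget_ms)
    return {
        "mean_fps": 1000.0 / mean(intervals),
        "worst_ms": max(intervals),
        "late_frames": late,
    }


if __name__ == "__main__":
    # Hypothetical capture: one frame in the middle misses a refresh.
    print(frame_stats([0, 16, 32, 66, 82]))
```

Comparing `late_frames` and `worst_ms` between two runs would give a less subjective basis than the mean frame rate alone, since a handful of missed refreshes is what reads as jerkiness.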
Comment 13 Chris Wilson 2013-08-30 09:24:40 UTC
Slightly less nasty is:

commit c32b7fa40bac264f847ece54ef1fab69179950b9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Aug 30 02:04:55 2013 +0100

    always boost

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2f9ff14..2f72420 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1126,8 +1126,13 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
 
 	timeout_jiffies = timeout ? timespec_to_jiffies_timeout(timeout) : 1;
 
-	if (dev_priv->info->gen >= 6 && can_wait_boost(file_priv))
+	if (dev_priv->info->gen >= 6 && can_wait_boost(file_priv)) {
 		gen6_rps_boost(dev_priv);
+		if (file_priv)
+			mod_delayed_work(dev_priv->wq,
+					 &file_priv->mm.idle_work,
+					 msecs_to_jiffies(100));
+	}
 
 	if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)) &&
 	    WARN_ON(!ring->irq_get(ring)))
@@ -2226,8 +2231,6 @@ int __i915_add_request(struct intel_ring_buffer *ring,
 	if (file) {
 		struct drm_i915_file_private *file_priv = file->driver_priv;
 
-		cancel_delayed_work_sync(&file_priv->mm.idle_work);
-
 		spin_lock(&file_priv->mm.lock);
 		request->file_priv = file_priv;
 		list_add_tail(&request->client_list,
@@ -2265,10 +2268,6 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 
 	spin_lock(&file_priv->mm.lock);
 	list_del(&request->client_list);
-	if (list_empty(&file_priv->mm.request_list))
-		mod_delayed_work(to_i915(request->ring->dev)->wq,
-				 &file_priv->mm.idle_work,
-				 msecs_to_jiffies(100));
 	request->file_priv = NULL;
 	spin_unlock(&file_priv->mm.lock);
 }
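
A toy Python model of the throttling this diff appears to set up, for readers who do not want to trace the kernel code. It rests on an assumption the diff does not show: that can_wait_boost() grants a client one boost and then refuses further ones until the delayed idle work runs and resets that state. The class and attribute names are hypothetical, and only the 100 ms delay is taken from the patch.

```python
class BoostModel:
    """Toy model of a per-client wait boost that is re-enabled by a timer."""

    IDLE_DELAY_MS = 100  # delay passed to mod_delayed_work() in the diff

    def __init__(self):
        self.boost_used = False    # assumed state tested by can_wait_boost()
        self.idle_deadline = None  # when the delayed idle work would run
        self.boosts = 0            # number of boosts granted so far

    def _run_timers(self, now_ms):
        # The idle work fires once its deadline has passed and
        # (by assumption) lets the client boost again.
        if self.idle_deadline is not None and now_ms >= self.idle_deadline:
            self.idle_deadline = None
            self.boost_used = False

    def wait(self, now_ms):
        """Model one wait on the GPU; return True if it triggered a boost."""
        self._run_timers(now_ms)
        if self.boost_used:
            return False
        self.boost_used = True
        self.boosts += 1
        # Arm the idle work at boost time, as the patched __wait_seqno() does.
        self.idle_deadline = now_ms + self.IDLE_DELAY_MS
        return True


if __name__ == "__main__":
    model = BoostModel()
    for t in range(0, 1000, 33):   # a client waiting every 33 ms for 1 s
        model.wait(t)
    print(model.boosts)
```

Under these assumptions a client that keeps waiting gets one boost per 100 ms window, whether or not its request list ever drains. Before the patch the timer was armed only when the request list became empty and cancelled by each new request, so in this model a continuously busy client would not have been able to boost again.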
Comment 14 Chris Wilson 2013-08-30 09:26:58 UTC
I have intel-gpu-tools/overlay/intel-gpu-overlay which may help. (You'll have issues if you try to use that locally with UXA as it will interact with the system.)
Comment 15 Chris Wilson 2013-09-05 14:38:38 UTC
Paul, did you get the intel-gpu-overlay up and running? Did you notice any difference with the always-boost patch?
Comment 16 Paul Neumann 2013-09-07 02:56:11 UTC
(In reply to comment #15)
> Paul, did you get the intel-gpu-overlay up and running? Did you notice any
> difference with the always-boost patch?

Hey Chris, sorry for not replying earlier. I did get intel-gpu-overlay up and running but did not perform any further investigation.
Also, I won't have access to my machine until the end of September so I won't be able to test things for now.
Comment 17 Chris Wilson 2013-09-08 10:53:25 UTC
Note that gnome-shell includes a benchmark, gnome-shell-perf-tool. Unfortunately it doesn't capture the click'n'move slowdown, but it does indicate at least one area of concern.
Comment 18 Paul Neumann 2013-09-25 15:29:26 UTC
So I got back to do some more testing and indeed, your always-boost patch on top of for-bug68716 greatly improves things. In my opinion, the jerkiness is completely gone now.

Note that I upgraded to Gnome 3.10 in the meantime, and I don't know whether mutter has improved in that area regardless of the driver.
Therefore, I also quickly tested a vanilla 3.11 kernel build and double buffering is still causing some jerkiness (albeit not as much as with 3.8 when I originally reported this bug) when dragging windows in the overview. So the always-boost patch does help with this problem.
Comment 19 Chris Wilson 2013-09-25 16:19:03 UTC
My fedora/gnome-shell system still only has 3.8 (F19). Which distro are you using? I wonder if switching to F20 is worth it to test gnome-shell-3.10 - but gnome-shell-3.8 is still going to be widely used for the next 6+ months.
Comment 20 Paul Neumann 2013-09-25 17:47:22 UTC
I am using Arch Linux with the [gnome-unstable] repository enabled. I assume it will still take some time until Gnome 3.10 lands in the standard repos.
Comment 21 Chris Wilson 2013-10-04 10:11:01 UTC
We've applied the manual RPS boosting - hopefully it sticks (I fear that QA will start screaming about excess power consumption, but I think it should be ok...)

commit b29c19b645287f7062e17d70fa4e9781a01a5d88
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Sep 25 17:34:56 2013 +0100

    drm/i915: Boost RPS frequency for CPU stalls

commit dd75fdc8c69587c91bd68a6ed7c726b5e70f9399
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Sep 25 17:34:57 2013 +0100

    drm/i915: Tweak RPS thresholds to more aggressively downclock
