Created attachment 84868 [details] Xorg.0.log using UXA

As mentioned in the title, dragging windows in the overview of Gnome3 is quite jerky with SNA. With UXA, doing the very same things is as smooth as you would expect (see the screencasts). In both cases, the X config contained only the AccelMethod option.

This is on a

$ lspci -vnn | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller [8086:0412] (rev 06) (prog-if 00 [VGA controller])

using

xf86-video-intel 2.21.15
mesa 9.2
linux from danvet's drm-intel-next-queued branch from around 20th August (the hash is bogus since I have applied some patches locally)

and standard Gnome 3.8, so using mutter as the WM.
Created attachment 84869 [details] Xorg.0.log using SNA
Created attachment 84870 [details] Screencast with using UXA
Created attachment 84871 [details] Screencast with using SNA
For starters, try the fixes from http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-bug68716 which include, but are not limited to, the rcs flips.
Created attachment 84879 [details] .config causing build failure

This branch at 1d99222b0939be3decca54be4651437037179b36 does not build for me. It fails to link (I guess) at the end with

Kernel: arch/x86/boot/bzImage is ready (#1)
ERROR: "__hrtimer_start_range_ns" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "vma_merge" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "find_vma_prev" [drivers/gpu/drm/i915/i915.ko] undefined!
ERROR: "split_vma" [drivers/gpu/drm/i915/i915.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
Haha, found it. It's just gnome-shell doesn't like being double-buffered! Owen Taylor was wrong...
(In reply to comment #5)
> Created attachment 84879 [details]
> .config causing build failure
>
> This branch at 1d99222b0939be3decca54be4651437037179b36 does not build for
> me.
> It fails to link (I guess) at the end with
>
> Kernel: arch/x86/boot/bzImage is ready (#1)
> ERROR: "__hrtimer_start_range_ns" [drivers/gpu/drm/i915/i915.ko] undefined!
> ERROR: "vma_merge" [drivers/gpu/drm/i915/i915.ko] undefined!
> ERROR: "find_vma_prev" [drivers/gpu/drm/i915/i915.ko] undefined!
> ERROR: "split_vma" [drivers/gpu/drm/i915/i915.ko] undefined!
> make[1]: *** [__modpost] Error 1
> make: *** [modules] Error 2

Meh, who uses modules?
(In reply to comment #7)
> Meh, who uses modules?

Just for you: CONFIG_DRM_I915=y ;).

The jerkiness is gone now with SNA. Apart from that, the only thing I noticed is some black flicker at the top of the screen with UXA which might interest you.
One more thing. As already said, with the new kernel image, moving the windows in the overview is smooth. However, the animation that plays when you grab a window and it is scaled down is still not as smooth with SNA as with UXA (this is my impression at least), so this issue is half-solved.
The issue appears to be that the overview panel redraw is slow and it misses the 60Hz refresh when double-buffered. UXA does triple buffering by default, and I just reverted SNA back to double-buffering for the compositor (at the behest of the gnome-shell devs - because it should allow them to do smoother animations!). Option "TripleBuffer" "false" should make UXA behave the same.

I'm digging deeper into exactly why this misses. Based on this SNB machine and your results, I think the semaphore interlock following the flip stalls the GPU, which would also explain the regression with updating mesa. (But that seems an awfully long stall.)
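To illustrate why a redraw that only slightly overruns the refresh period hurts so much more when double-buffered, here is a minimal sketch in C. It is a simplified steady-state model, not driver code: the function names are hypothetical, it assumes the renderer must wait for the flip to complete before drawing again when double-buffered, and it assumes a third buffer lets rendering continue immediately. It does not model the GPU wakeup or frequency effects discussed later in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Frame interval (microseconds) with double buffering: the renderer waits
 * for the flip, so each frame occupies a whole number of refresh periods. */
uint32_t frame_interval_double_us(uint32_t render_us, uint32_t period_us)
{
    assert(period_us > 0);
    /* Ceiling division: refresh periods needed to finish one redraw. */
    uint32_t periods = (render_us + period_us - 1) / period_us;
    if (periods == 0)
        periods = 1; /* even an instant redraw is shown for one period */
    return periods * period_us;
}

/* Average frame interval (microseconds) with triple buffering: rendering
 * never waits for the flip, so only the slower of the redraw time and the
 * refresh period limits throughput. */
uint32_t frame_interval_triple_us(uint32_t render_us, uint32_t period_us)
{
    assert(period_us > 0);
    return render_us > period_us ? render_us : period_us;
}
```

With a 16667 us refresh period (roughly 60Hz), a 17000 us redraw gives 33334 us per frame when double-buffered (about 30 fps), but about 17000 us per frame on average when triple-buffered.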
Reading the traces, the biggest difference is that in the triple-buffer case, the GPU is completing operations much faster (about 3x). So this feels like rc6 wakeup latency (when triple buffering the GPU never sleeps, but with double buffering we have periods of over 10ms idle whilst waiting for the flip-completion.)

This "fixes" it for me:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dad0777..80f8730 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2704,6 +2704,10 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,

        ret = 0;
        if (!i915_seqno_passed(from->get_seqno(from, false), seqno)) {
+               struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+               if (dev_priv->info->gen >= 6)
+                       gen6_rps_boost(dev_priv);
+
                ret = i915_gem_check_olr(from, seqno);
                if (ret)
                        return ret;

Can you please test this on top of the for-bug68716 series? I'm not happy with that approach just yet, it reeks of overkill.
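The patch above only boosts when i915_seqno_passed() reports that the other ring has not yet reached the requested seqno. As a rough user-space sketch of that kind of check: the helper names below are hypothetical stand-ins, and the signed-difference comparison is an assumption about how a wraparound-safe seqno test is usually written, not something quoted from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if seq1 is at or past seq2, even if the 32-bit counter has
 * wrapped around in between. The unsigned subtraction wraps modulo 2^32 and
 * the cast reinterprets the result as a signed distance. (Converting an
 * out-of-range value to int32_t is implementation-defined in C; common
 * compilers such as GCC and Clang wrap as expected.) */
bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq1 - seq2) >= 0;
}

/* Mirrors the condition guarding the boost in the patch: synchronising
 * would stall only if the signalling ring has not reached the target yet. */
bool sync_would_stall(uint32_t current_seqno, uint32_t target_seqno)
{
    return !seqno_passed(current_seqno, target_seqno);
}
```

For example, seqno_passed(2, 0xfffffff0) is true because the counter has wrapped past the target, whereas a plain `>=` comparison would report false.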
Hmm, even with your patch from c11 on top of for-bug68716, default SNA and non-triple-buffered UXA feel a little bit more laggy than default UXA. Is there any way I can measure the frametimes or framerate so we don't have to rely on my gut feeling?
Slightly less nasty is:

commit c32b7fa40bac264f847ece54ef1fab69179950b9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Aug 30 02:04:55 2013 +0100

    always boost

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2f9ff14..2f72420 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1126,8 +1126,13 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,

        timeout_jiffies = timeout ? timespec_to_jiffies_timeout(timeout) : 1;

-       if (dev_priv->info->gen >= 6 && can_wait_boost(file_priv))
+       if (dev_priv->info->gen >= 6 && can_wait_boost(file_priv)) {
                gen6_rps_boost(dev_priv);
+               if (file_priv)
+                       mod_delayed_work(dev_priv->wq,
+                                        &file_priv->mm.idle_work,
+                                        msecs_to_jiffies(100));
+       }

        if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)) &&
            WARN_ON(!ring->irq_get(ring)))
@@ -2226,8 +2231,6 @@ int __i915_add_request(struct intel_ring_buffer *ring,
        if (file) {
                struct drm_i915_file_private *file_priv = file->driver_priv;

-               cancel_delayed_work_sync(&file_priv->mm.idle_work);
-
                spin_lock(&file_priv->mm.lock);
                request->file_priv = file_priv;
                list_add_tail(&request->client_list,
@@ -2265,10 +2268,6 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)

        spin_lock(&file_priv->mm.lock);
        list_del(&request->client_list);
-       if (list_empty(&file_priv->mm.request_list))
-               mod_delayed_work(to_i915(request->ring->dev)->wq,
-                                &file_priv->mm.idle_work,
-                                msecs_to_jiffies(100));
        request->file_priv = NULL;
        spin_unlock(&file_priv->mm.lock);
 }
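The effect of the always-boost patch can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel logic. It assumes that can_wait_boost() grants each client one boost until the delayed idle work runs, and that the idle work re-enables boosting. Neither behaviour is shown in the diff above. The struct and function names are hypothetical, and time is passed in explicitly instead of using a real timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IDLE_DELAY_MS 100 /* mirrors msecs_to_jiffies(100) in the patch */

/* Hypothetical per-client state, loosely standing in for file_priv. */
struct client {
    bool boosted;              /* assumed: set once a wait has boosted */
    bool timer_armed;          /* stands in for the pending idle_work */
    uint64_t idle_deadline_ms; /* when the stand-in idle work fires */
};

/* Called when the client blocks waiting for the GPU. Returns true if this
 * wait triggers a frequency boost. As in the patch, the idle timer is armed
 * from the wait path, and only when a boost is granted. */
bool client_wait(struct client *c, uint64_t now_ms)
{
    assert(c != NULL);
    if (c->boosted)
        return false; /* already boosted during this busy period */
    c->boosted = true;
    c->timer_armed = true;
    c->idle_deadline_ms = now_ms + IDLE_DELAY_MS;
    return true;
}

/* Called periodically; models the delayed work firing and re-enabling
 * boosts for this client (assumed behaviour of the idle handler). */
void client_tick(struct client *c, uint64_t now_ms)
{
    assert(c != NULL);
    if (c->timer_armed && now_ms >= c->idle_deadline_ms) {
        c->timer_armed = false;
        c->boosted = false;
    }
}
```

Under these assumptions a client that stalls repeatedly gets at most one boost per 100 ms window, rather than waiting until its request list empties before it can boost again.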
I have intel-gpu-tools/overlay/intel-gpu-overlay which may help. (You'll have issues if you try to use that locally with UXA as it will interact with the system.)
Paul, did you get the intel-gpu-overlay up and running? Did you notice any difference with the always-boost patch?
(In reply to comment #15)
> Paul, did you get the intel-gpu-overlay up and running? Did you notice any
> difference with the always-boost patch?

Hey Chris, sorry for not replying earlier. I did get intel-gpu-overlay up and running but did not perform any further investigation. Also, I won't have access to my machine until the end of September so I won't be able to test things for now.
Note that gnome-shell includes a benchmark, gnome-shell-perf-tool. Unfortunately it doesn't capture the click'n'move slowdown, but it does indicate at least one area of concern.
So I got back to do some more testing and indeed, your always-boost patch on top of for-bug68716 greatly improves things. In my opinion, the jerkiness is completely gone now.

Note that I upgraded to Gnome 3.10 in the meantime, and I don't know whether mutter has improved in that area regardless of the driver. Therefore, I also quickly tested a vanilla 3.11 kernel build, and double buffering is still causing some jerkiness when dragging windows in the overview (albeit not as much as with Gnome 3.8, when I originally reported this bug). So the always-boost patch does help with this problem.
My fedora/gnome-shell system still only has 3.8 (F19). Which distro are you using? I wonder if switching to F20 is worth it to test gnome-shell-3.10 - but gnome-shell-3.8 is still going to be widely used for the next 6+ months.
I am using Arch Linux with the [gnome-unstable] repository enabled. I assume it will still take some time until Gnome 3.10 lands in the standard repos.
We've applied the manual RPS boosting - hopefully it sticks (I fear that QA will start screaming about excess power consumption, but I think it should be ok...)

commit b29c19b645287f7062e17d70fa4e9781a01a5d88
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Sep 25 17:34:56 2013 +0100

    drm/i915: Boost RPS frequency for CPU stalls

commit dd75fdc8c69587c91bd68a6ed7c726b5e70f9399
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Sep 25 17:34:57 2013 +0100

    drm/i915: Tweak RPS thresholds to more aggressively downclock