Bug 73064 - [ilk Regression] Generalized slowness after upgrade to kernel 3.10
Summary: [ilk Regression] Generalized slowness after upgrade to kernel 3.10
Status: NEEDINFO
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel (show other bugs)
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-12-27 12:16 UTC by Jonathan Protzenko
Modified: 2014-07-01 07:31 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
glxinfo with the working kernel (9.70 KB, text/plain)
2013-12-27 13:07 UTC, Jonathan Protzenko
glxinfo with a kernel that exhibits the issue (9.70 KB, text/plain)
2013-12-27 13:07 UTC, Jonathan Protzenko
xorg with a working kernel (37.78 KB, text/plain)
2013-12-27 13:07 UTC, Jonathan Protzenko
xorg log with a kernel that exhibits the issue (36.40 KB, text/plain)
2013-12-27 13:08 UTC, Jonathan Protzenko
Series of x11perf tests with the (working) 3.9 kernel (2.52 KB, text/plain)
2014-01-29 10:47 UTC, Jonathan Protzenko
Series of x11perf tests with the (faulty) 3.12 kernel (2.53 KB, text/plain)
2014-01-29 10:49 UTC, Jonathan Protzenko

Description Jonathan Protzenko 2013-12-27 12:16:58 UTC
Bug description:

When running the 3.9 kernel everything is fine. However, starting with kernel 3.10, I have experienced generalized slowness in graphics operations. Drawing an entire page in Firefox and switching desktops are both slow. There's a delay (I would say 0.2s) whenever a large portion of the screen needs to be repainted. If I switch desktops repeatedly, things get progressively slower, until there's a one-second delay before each desktop switch completes.

Gratuitous hypothesis: there used to be a warning about the turbo mode being disabled which disappeared in 3.10. I somehow feel like the frequency of the graphics chip is too low for it to perform operations in a timely manner.

I'm not a regular at bugs.freedesktop.org, so please bear with me if the information below is incomplete. In any case, I'd be happy to provide more information if you tell me how to do it!

Thanks,

~ jonathan

Information about my setup:

computer: dell laptop e6410 (without the nvidia card, just the integrated intel chipset)
product: Core Processor Integrated Graphics Controller [8086:46]
vendor: Intel Corporation [8086]
xserver and intel driver versions:
  [    37.721] xorg-server 2:1.14.3-5 (Maarten Lankhorst <maarten.lankhorst@ubuntu.com>) 
  [    38.415] (II) Module intel: vendor="X.Org Foundation"
  [    38.415]    compiled for 1.14.3, module version = 2.21.15
  [    38.415]    Module class: X.Org Video Driver
  [    38.415]    ABI class: X.Org Video Driver, version 14.1
machine: debian testing x86_64
glxinfo:
  client glx vendor string: Mesa Project and SGI
  client glx version string: 1.4
  OpenGL version string: 2.1 Mesa 9.2.2
  OpenGL shading language version string: 1.20
libdrm:
  jonathan@ramona:~ $ pkg-config --modversion libdrm
  2.4.49
Comment 1 Chris Wilson 2013-12-27 12:52:13 UTC
Your complete Xorg.0.log and glxinfo will contain some essential information.
Comment 2 Jonathan Protzenko 2013-12-27 13:07:16 UTC
Created attachment 91217
glxinfo with the working kernel
Comment 3 Jonathan Protzenko 2013-12-27 13:07:39 UTC
Created attachment 91218
glxinfo with a kernel that exhibits the issue
Comment 4 Jonathan Protzenko 2013-12-27 13:07:51 UTC
Created attachment 91219
xorg with a working kernel
Comment 5 Jonathan Protzenko 2013-12-27 13:08:50 UTC
Created attachment 91220
xorg log with a kernel that exhibits the issue

Here's the requested information. I'm afraid both the logs and the glxinfo outputs are similar across kernel versions.

Hope that helps,

~ jonathan
Comment 6 Chris Wilson 2013-12-27 17:36:34 UTC
Hmm, Ironlake. Not likely to be directly related to i915.ko. The first thing I would suggest checking is that intel_ips is running (CONFIG_INTEL_IPS).
Comment 7 Jonathan Protzenko 2013-12-27 18:13:44 UTC
Yes, the module is loaded. Unloading it / reloading it has no effect.
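
For anyone repeating the check from comment 6, here is a minimal sketch of how "is the module loaded" could be scripted. It assumes the usual /proc/modules layout (module name in the first whitespace-separated column) and the helper names are made up for illustration. It only reports whether the module is loaded, not whether IPS is actually doing anything.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Assumption: the module name is the first whitespace-separated field
    on each line.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def is_module_loaded(name, proc_modules_text):
    """True if `name` appears as a loaded module in the given text."""
    return name in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        # Not on Linux, or /proc is unavailable: treat as "nothing loaded".
        text = ""
    print("intel_ips loaded:", is_module_loaded("intel_ips", text))
```

Running it once per kernel makes it easy to confirm that both the working and the faulty boot have the module present.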
Comment 8 Chris Wilson 2013-12-30 12:15:56 UTC
Did you enable vtd or iommu? Does intel_iommu=off make any difference?

i.e. does this log anything?

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 998f9a0b322a..2adfef0fa6ac 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1727,6 +1727,9 @@ static int i915_gmch_probe(struct drm_device *dev,
        dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
        dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
 
+       if (unlikely(dev_priv->gtt.do_idle_maps))
+               DRM_INFO("applying Ironlake quirks for intel_iommu\n");
+
        return 0;
 }
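
To answer the two questions above without guessing, something like the following sketch may help. It assumes the kernel command line is readable from /proc/cmdline and that the dmesg output has been saved to text beforehand; the function names are hypothetical. The message string is the one added by the patch above, so it can only show up on a kernel built with that patch.

```python
QUIRK_MESSAGE = "applying Ironlake quirks for intel_iommu"


def intel_iommu_setting(cmdline):
    """Return the last intel_iommu= value on a kernel command line.

    Returns None when the option is absent, in which case the kernel's
    build-time default applies (not something this sketch can determine).
    """
    value = None
    for token in cmdline.split():
        if token.startswith("intel_iommu="):
            value = token.split("=", 1)[1]
    return value


def quirk_logged(dmesg_text):
    """True if the debug message from the patch appears in dmesg output."""
    return QUIRK_MESSAGE in dmesg_text


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""
    print("intel_iommu option:", intel_iommu_setting(cmdline))
```

A saved log can be checked with quirk_logged(open("dmesg.txt").read()), where dmesg.txt is whatever file the dmesg output was redirected to.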
Comment 9 Jonathan Protzenko 2013-12-30 20:55:01 UTC
Hi, and first of all thanks for the quick feedback and help :-).

(In reply to comment #8)
> Did you enable vtd or iommu?
I, unfortunately, have no idea what this even means. I'm just following Debian's default setup. If you can tell me how to determine this, I'd be happy to provide you with the information.
> Does intel_iommu=off make any difference?
Yes, booting with 3.11 and intel_iommu=off _does_ fix the problem. Fantastic!
> 
> i.e. does this log anything?
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
> b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 998f9a0b322a..2adfef0fa6ac 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1727,6 +1727,9 @@ static int i915_gmch_probe(struct drm_device *dev,
>         dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
>         dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
>  
> +       if (unlikely(dev_priv->gtt.do_idle_maps))
> +               DRM_INFO("applying Ironlake quirks for intel_iommu\n");
> +
>         return 0;
>  }
Booting 3.11 with intel_iommu=off, I do not see the message above in the output of dmesg.

Let me know if I can help you further!

Cheers and a happy new year,

~ jonathan
Comment 10 Jonathan Protzenko 2014-01-02 12:24:30 UTC
It turns out that the bug re-surfaced with intel_iommu=off. The absence of the bug may have been correlated with my using my external screen rather than the laptop's internal screen. I'll investigate further.
Comment 11 Chris Wilson 2014-01-17 11:00:29 UTC
Hmm. Any news or fresh ideas?
Comment 12 Chris Wilson 2014-01-17 18:46:21 UTC
Also see if x11perf (perhaps -aa10text, -putimage10, -shmput500, -copywinwin500) detects any discrepancies.
Comment 13 Jonathan Protzenko 2014-01-29 10:38:43 UTC
I've done some more testing, and I can confirm that the issue:
- does NOT occur when booting 3.12 with the external screen attached;
- occurs on 3.12 without the external screen attached.

I'm running the benchmarks you suggested on 3.12 (faulty kernel) and soon on 3.9 (working kernel) to compare the figures.
Comment 14 Jonathan Protzenko 2014-01-29 10:47:39 UTC
Created attachment 92985
Series of x11perf tests with the (working) 3.9 kernel
Comment 15 Jonathan Protzenko 2014-01-29 10:49:38 UTC
Created attachment 92986
Series of x11perf tests with the (faulty) 3.12 kernel

Here's the requested series of tests. 3.9 seems to behave better for aatext and _much_ better for copy window to window.

FWIW, the "slowness" occurs mainly when switching desktops in my window manager. Whenever I do that, the x11perf tests are slowed down almost to a halt, and then after about a second, start running at full speed again.
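
In case it is useful for comparing the two attachments, here is a rough sketch of how the per-test rates could be lined up. It assumes x11perf result lines of the form "N reps @ T msec (R/sec): test name" (with "trep" also accepted), which is an assumption about the output format rather than something checked against these exact logs. It takes the best rate per test in each file, since repeated runs may warm up, and reports candidate/baseline ratios.

```python
import re
from collections import defaultdict

# Assumed x11perf result line: "<count> reps @ <msec> msec (<rate>/sec): <name>"
LINE_RE = re.compile(
    r"^\s*\d+\s+t?reps?\s+@\s+[\d.]+\s+msec\s+\(\s*([\d.]+)/sec\):\s+(.+?)\s*$"
)


def parse_rates(text):
    """Map each test name to the list of rates (operations/sec) seen for it."""
    rates = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rates[match.group(2)].append(float(match.group(1)))
    return dict(rates)


def compare(baseline_text, candidate_text):
    """Return {test name: best candidate rate / best baseline rate}.

    Only tests present in both logs are reported; a ratio below 1.0 means
    the candidate kernel was slower on that test.
    """
    base = parse_rates(baseline_text)
    cand = parse_rates(candidate_text)
    ratios = {}
    for name, base_rates in base.items():
        best_base = max(base_rates)
        if name in cand and best_base > 0:
            ratios[name] = max(cand[name]) / best_base
    return ratios
```

With the 3.9 log as the baseline and the 3.12 log as the candidate, compare() gives one number per test, which is easier to scan than two raw logs side by side.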
Comment 16 Chris Wilson 2014-01-29 11:50:38 UTC
I feel fairly confident that the x11perf differences are an artifact of IPS. (The image tests were for a reference point to check that CPU/memory performance was the same between kernels. But you can notice that the tests gradually got quicker in 3.12, which is a sign of IPS - if you keep on running -copywinwin500 does it plateau at the 3.9 levels?)

I don't think those measurements give me the explanation I was looking for though - I had hoped that the GPU copy performance would mirror the dramatic performance difference in workspace switching between 3.9 and 3.12.

There is one other thing to quickly test, add

Section "Device"
 Identifier "Device0"
 Driver "intel"
 Option "AccelMethod" "sna"
EndSection

to your /etc/X11/xorg.conf

That also won't explain what changed in the kernel, but I am curious to know how it performs on your upset machine.
Comment 17 Chris Wilson 2014-03-25 10:57:39 UTC
Any recent updates?
Comment 18 Jonathan Protzenko 2014-04-01 12:21:25 UTC
Sorry for the delay.

I tried using sna, and my /var/log/Xorg.0.log now shows:

[ 55982.994] (**) intel(0): Option "AccelMethod" "sna"

but this does not solve the problem.

The slowdown only happens when switching desktops with my window manager (e17), so I guess there's some special x11 call that happens there and that is responsible for the slowdown. If only I could figure out which X11 call... probably one that redraws the entire screen?
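
As a side note, checking the log for the AccelMethod line quoted above can be scripted. This is a minimal sketch that assumes the option is reported in the same format as that line; the function name is made up. It only shows that the option was read from the configuration, not that SNA is what the driver ended up using.

```python
import re

# Matches lines such as:
#   [ 55982.994] (**) intel(0): Option "AccelMethod" "sna"
OPTION_RE = re.compile(
    r'\(\*\*\)\s+intel\(\d+\):\s+Option\s+"AccelMethod"\s+"([^"]+)"'
)


def accel_method_from_log(log_text):
    """Return the last AccelMethod value reported in an Xorg log, or None."""
    found = None
    for match in OPTION_RE.finditer(log_text):
        found = match.group(1)
    return found
```

Calling accel_method_from_log(open("/var/log/Xorg.0.log").read()) should return "sna" for the log above, and None if the xorg.conf section was not picked up.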
Comment 19 Jonathan Protzenko 2014-04-01 12:23:38 UTC
I can also confirm that today I was using 3.12 with my external screen plugged in, with no issues. When I woke up the computer after unplugging the screen, the problem surfaced immediately.
Comment 20 Daniel Vetter 2014-04-11 14:34:20 UTC
If you've found a more reliable way to reproduce this it might be interesting to attempt a bisect. Otherwise I don't really see a handle for tackling this bug here.
Comment 21 Chris Wilson 2014-07-01 07:30:56 UTC
It's a long shot, but if you compile SNA with ./configure --enable-debug=full and reproduce the slowdown, I might be able to see what causes it in the humongous haystack of a logfile.
Comment 22 Chris Wilson 2014-07-01 07:31:47 UTC
Oh, in the meantime try SNA with the latest kernel, and redo the x11perf testing before/after slowdown.

