Bug 29536

Summary: Connection probing causes stalls
Product: DRI
Reporter: Bruno <bonbons>
Component: DRM/Intel
Assignee: Jesse Barnes <jbarnes>
Status: CLOSED INVALID
QA Contact:
Severity: normal
Priority: medium
CC: brian, eugeni, fragabr, mikopp, mjulien.m, nbowler, patrakov, randrik, swt, tim, vasyl.demin
Version: XOrg git
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
  Description                                        Flags
  dmesg (including 0xc for drm.ko debug parameter)   none
  Workaround, disable polling                        none
  Workaround, disable polling                        none
  Second, automatic workaround.                      none

Description Bruno 2010-08-12 12:19:25 UTC
Created attachment 37821 [details]
dmesg (including  0xc for drm.ko debug parameter)

System is Acer TM660 laptop with i855 chipset:
  00:02.0 VGA compatible controller [0300]: Intel Corporation 82852/855GM
          Integrated Graphics Device [8086:3582] (rev 02)
  00:02.1 Display controller [0380]: Intel Corporation 82852/855GM Integrated
          Graphics Device [8086:3582] (rev 02)

Software (Gentoo distribution):
  linux-2.6.35 (with a few patches,
                all unrelated to GPU except for Daniel Vetter's cache-coherency
                patch)
  x11-drivers/xf86-video-intel-2.12.0
  x11-libs/libdrm-2.4.21
  x11-base/xorg-server-1.7.7-r1
  media-libs/mesa-7.8.2


Debugging on IRC with ickle produced following insight:

latencytop hit: drm_mode_cursor_ioctl drm_ioctl vfs_ioctl do_vfs_i (rest clipped)

perf top: intel_crt_load_detect gets high-score

quoting ickle:
   hotplug storm!
   that explains what's holding the mode lock

   what's happening is that every 10s the non-hotplug capable outputs are
   polled, and for VGA this involves reading back the border colour [which
   is quite laborious]
Comment 1 Chris Wilson 2010-08-12 12:32:57 UTC
Created attachment 37822 [details] [review]
Workaround, disable polling
Comment 2 Chris Wilson 2010-08-12 12:56:29 UTC
Created attachment 37823 [details] [review]
Workaround, disable polling
Comment 3 Nick Bowler 2010-08-12 13:24:55 UTC
I suspect this is related to https://bugzilla.kernel.org/show_bug.cgi?id=16265
Comment 4 Alex Deucher 2010-08-12 15:08:39 UTC
*** Bug 29433 has been marked as a duplicate of this bug. ***
Comment 5 Chris Wilson 2010-09-09 12:18:10 UTC
Created attachment 38582 [details] [review]
Second, automatic workaround.

Sweep the issue under the carpet by simply not polling if it requires load-detection.
Comment 6 Alexander E. Patrakov 2010-11-14 09:37:23 UTC
I see that the patch is in 2.6.36. However, for me on 965g, it only fixes the bug if the kernel is compiled as fully preemptible. If the kernel is compiled with voluntary preemption, there is still a 30-100 ms latency introduced every 10 seconds. drm_kms_helper.poll=0 does help.

shining on #intel-gfx asked me to play with latencytrace. Unfortunately, this isn't useful: irqsoff and sched_switch yield no smoking guns, and preemptoff is only available with the fully preemptible kernel.
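
For anyone wanting to check whether the polling workaround is active, here is a minimal Python sketch. It assumes the drm_kms_helper.poll option is also exposed as a boolean module parameter at /sys/module/drm_kms_helper/parameters/poll and that the file reads "Y" or "N"; the path and the helper name polling_enabled are assumptions, not taken from this report, and the file may only be readable by root.

```python
from pathlib import Path

# Assumed sysfs location of the drm_kms_helper "poll" module parameter.
POLL_PARAM = Path("/sys/module/drm_kms_helper/parameters/poll")


def polling_enabled(param_path=POLL_PARAM):
    """Return True/False for the poll setting, or None if it cannot be read."""
    try:
        raw = Path(param_path).read_text().strip()
    except OSError:
        # Module not loaded, parameter missing, or insufficient permissions.
        return None
    # Boolean module parameters are normally reported as "Y" or "N".
    return raw in ("Y", "1")


if __name__ == "__main__":
    state = polling_enabled()
    if state is None:
        print("could not read", POLL_PARAM)
    else:
        print("output polling is", "enabled" if state else "disabled")
```

If this prints "enabled", booting with drm_kms_helper.poll=0 on the kernel command line, as mentioned above, should change the result.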
Comment 7 Chris Wilson 2010-11-14 10:39:52 UTC
If I thought I had resolved the bug, I would have closed the report. This is still open because we hold the mode mutex around the output polling and cursor manipulation, causing the stalls. The two workarounds (a) optionally disable the output polling and (b) remove the worst offender. The locking design remains less than optimal.
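
The stall mechanism described in this comment can be illustrated with a small user-space simulation. This is only a sketch: a Python threading.Lock stands in for the mode mutex, a sleep stands in for the slow output probe, and the names simulate_stall and probe_seconds are made up for the example. It is not the kernel code.

```python
import threading
import time


def simulate_stall(probe_seconds=0.2):
    """Return how long a 'cursor update' waits while a 'probe' holds the lock."""
    mode_lock = threading.Lock()          # stands in for the mode mutex
    probe_started = threading.Event()

    def output_poll():
        with mode_lock:
            probe_started.set()
            time.sleep(probe_seconds)     # stands in for slow load detection

    poller = threading.Thread(target=output_poll)
    poller.start()
    probe_started.wait()                  # ensure the probe owns the lock first

    t0 = time.monotonic()
    with mode_lock:                       # the cursor update needs the same lock
        waited = time.monotonic() - t0
    poller.join()
    return waited


if __name__ == "__main__":
    print(f"cursor update waited {simulate_stall() * 1000:.0f} ms")
```

The cursor path does no slow work itself; its latency comes entirely from waiting on the lock, which is why disabling the poll or removing the slowest probe only hides the problem.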
Comment 8 Scott Talbert 2011-01-09 15:55:05 UTC
Hi Chris, does your "automatic workaround" patch address this issue for ATI cards, or only Intel?
Comment 9 tim blechmann 2011-04-02 02:20:35 UTC
after switching from the proprietary nvidia driver to nouveau, i experience the same issue. latencytop reports latencies of 100 to 200 ms in drm_mode_cursor_ioctl.
Comment 10 Thomas Lindroth 2011-05-18 05:33:41 UTC
I started using a radeon card on my desktop a few days ago and started experiencing this problem again but I wasn't able to get any of the workarounds to work this time.

I tried to deactivate polling with the drm_kms_helper.poll=0 trick but it didn't work. I even went as far as removing all calls to queue_delayed_work() from drivers/gpu/drm/drm_crtc_helper.c but it didn't work. After digging a while I noticed that a daemon called upowerd reads from /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0/card0-VGA-1/status every 30 seconds and that causes the stall.

upowerd is a dependency of KDE but I can't find any setting in KDE to disable this polling. Killing the upowerd process works but I assume that has some negative consequences.
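
To see which connector is expensive to probe, the status reads can be timed from user space. The following Python sketch assumes connector status files are reachable under /sys/class/drm/card*-*/status (the usual symlinked view of the device path quoted above; treat the glob pattern as an assumption) and uses a hypothetical helper name, time_status_reads. Note that running it triggers the same probe, so it may itself cause a brief stall.

```python
import glob
import time


def time_status_reads(paths):
    """Read each status file once; return (path, seconds, status) tuples."""
    results = []
    for path in paths:
        t0 = time.monotonic()
        try:
            with open(path) as f:
                status = f.read().strip()
        except OSError as exc:
            status = f"error: {exc.strerror}"
        results.append((path, time.monotonic() - t0, status))
    return results


if __name__ == "__main__":
    # Assumed location of the per-connector status files.
    candidates = sorted(glob.glob("/sys/class/drm/card*-*/status"))
    for path, seconds, status in time_status_reads(candidates):
        print(f"{seconds * 1000:8.1f} ms  {status:<14} {path}")
```

A connector whose read takes tens or hundreds of milliseconds is a likely source of the stall whenever something like upowerd reads it periodically.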
Comment 11 Chris Wilson 2012-03-07 10:04:48 UTC
*** Bug 47059 has been marked as a duplicate of this bug. ***
Comment 12 mikopp 2012-03-07 23:12:01 UTC
The patch for 2.6.36 does not seem to be in 3.1 anymore.
Comment 13 mikopp 2012-03-12 02:07:16 UTC
I tested 3.2.1. While xrandr is much faster (and I get fewer blocks when starting wine applications, which call it a lot), I now get intermittent blocks of X every 10-30 seconds. The blocks last several seconds and make working completely impossible.

I switched back to a 3.1 kernel and the problem is gone, with xrandr being slow again.

xf86-video-intel-2.17.0
libdrm-2.4.27
mesa-7.11.2 

not blocking: linux 3.1.10
blocking: linux 3.2.1
Comment 14 Eugeni Dodonov 2012-04-02 11:08:38 UTC
Could you test the 3.4-rc1 kernel please? It includes a patch which speeds up the EDID retrievals, so I think it could help here.
Comment 15 Jesse Barnes 2012-06-21 12:34:46 UTC
timeout, but I think this is either fixed or will be shortly by Daniel's hotplug handling fixes.
