Created attachment 31523 [details] [review] proposed driver patch

Symptom: when X starts up (even during the Fedora 12 DVD install), system performance is terrible, and the UI is unresponsive for seconds at a time. Xorg usually consumes 100% of the CPU.

Analysis: even with X dead, running 'udevadm monitor --property' shows hundreds of 'change' events per second on /devices/pci0000:00/0000:00:02.0/drm/card0, with SEQNUM incrementing by 1 each time. The device is a component of the VGA chipset. With X running, the server reconfigures itself several times per second.

Using the Fedora 12 kernel (2.6.31.5-127) and a hint from Mr. Packard, I narrowed the change events to two bits in the hotplug mask, the HDMI B and D int status. If these are removed from the mask, the system appears to work normally. I don't know if this is the final solution or something that only works with this Foxconn board with a TV plugged into the HDMI port; i.e., works for me.

System: motherboard is Foxconn G45M-S LGA 775 Intel G45 HDMI Micro ATX Intel; CPU is E5200; 2GB mem. The display is a Samsung 46" TV connected to the HDMI port.

Other details attached:
- lspci -vv output
- the repeated udev change event
- the common X stack trace when the change events are flooding the system
- the bios dump

The patch that works for me is also attached.
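For anyone who wants to put a number on the event flood described above, here is a minimal sketch that counts 'change' events per second in a saved 'udevadm monitor --property' log. It assumes each event starts with a header line shaped like `KERNEL[seconds.micros] action devpath (subsystem)`; the exact prefix varies between udev versions, so the pattern accepts any word there. The function name, its parameters, and the sample log are hypothetical and are not taken from the attachments.

```python
import re
from collections import Counter

# Assumed header shape (may differ between udev versions):
#   KERNEL[100.000100] change   /devices/.../drm/card0 (drm)
HEADER = re.compile(r"^(\w+)\s*\[(\d+)\.\d+\]\s+(\w+)\s+(\S+)")


def change_events_per_second(lines, devpath_suffix="/drm/card0", source=None):
    """Count 'change' events per whole second for matching device paths.

    `source` optionally restricts counting to one header prefix (for
    example "KERNEL"), since the monitor may print each event twice:
    once from the kernel and once after udev has processed it.
    """
    per_second = Counter()
    for line in lines:
        match = HEADER.match(line)
        if not match:
            continue  # property lines such as ACTION=change are skipped
        src, second, action, devpath = match.groups()
        if source is not None and src != source:
            continue
        if action == "change" and devpath.endswith(devpath_suffix):
            per_second[int(second)] += 1
    return per_second


# Made-up sample log, for illustration only.
SAMPLE = """\
KERNEL[100.000100] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:02.0/drm/card0
SEQNUM=1501

KERNEL[100.004200] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:02.0/drm/card0
SEQNUM=1502

KERNEL[101.000300] add      /devices/virtual/misc/example (misc)
ACTION=add
SEQNUM=1503
""".splitlines()

counts = change_events_per_second(SAMPLE)
for second, n in sorted(counts.items()):
    print(f"t={second}s: {n} change events")
```

To use it on a real capture, read the saved log with `open(path).read().splitlines()` and pass the result in; a healthy system should show few or no change events on the card0 path while idle.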
Created attachment 31524 [details] lspci -vv
Created attachment 31525 [details] frequent udev change event
Created attachment 31526 [details] bios dump
Created attachment 31527 [details] usual stack trace during change event flood
*** This bug has been marked as a duplicate of bug 25259 ***
With all due respect, the symptom may be a duplicate, but the fix does not work for my motherboard. This patch (http://cvs.fedoraproject.org/viewvc/F-12/xorg-x11-drv-intel/uevent.patch?revision=1.1) addresses the i830 driver. This patch (http://cvs.fedoraproject.org/viewvc/F-12/kernel/drm-intel-no-tv-hotplug.patch?revision=1.1) removes the TV_HOTPLUG_INT_EN from i915_reg.h, and the kernel/driver I have includes that patch. Hence, it is not applicable to whatever my hardware does.
It seems to me that we need to disable hotplug detect across modesets, as load-based detection probably triggers the hotplug bits. Has anyone tried that?
Please note that after applying the patch listed (commenting out the HDMI_ lines in i915_irq.c) for my HDMI/udevd issue, I get the following every second in my syslog:

Dec 31 15:19:40 pcsca65 kernel: DRHD: handling fault status reg 3
Dec 31 15:19:40 pcsca65 kernel: DMAR:[DMA Write] Request device [00:02.0] fault addr b08003000
Dec 31 15:19:40 pcsca65 kernel: DMAR:[fault reason 05] PTE Write access is not set
Dec 31 15:19:40 pcsca65 kernel: DRHD: handling fault status reg 3
Dec 31 15:19:40 pcsca65 kernel: DMAR:[DMA Write] Request device [00:02.0] fault addr b08003000
Dec 31 15:19:40 pcsca65 kernel: DMAR:[fault reason 05] PTE Write access is not set

I tried commenting out just two bits and then three bits, but got the same error message.
Eric seems to have a theory for a fix here, so assigning to him. -Carl
Could you test with a v2.6.33rc7 or newer kernel? It might be fixed now.
I can try this weekend, I think. I did my initial patch on 2.6.31.5-127 based on some notes I found on building the kernel from the current fedora release (fc12 rawhide, at the time). Looks like kernel.org has http://www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.33-rc8.tar.bz2. We'll see how that goes.
I still have this problem in Fedora Rawhide, kernel-2.6.33-0.48.rc8.git1.fc14.x86_64. On every startup and on every resume from suspend, the system partially freezes for several seconds, sometimes more than a minute. Right after resume from suspend (from ram), the system seems normal, but after ~30 seconds, or after some graphics events (typically pressing alt-F2 in KDE 4), the system locks up, and when I move the mouse, the pointer jumps non-continuously. I'd be happy to test anything from Fedora/koji.

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])

libdrm-2.4.18-1.fc14.i686
libdrm-2.4.18-1.fc14.x86_64
xorg-x11-server-Xorg-1.7.99.901-6.20100215.fc14.x86_64
xorg-x11-drv-intel-2.10.0-4.fc13.x86_64
xorg-x11-drv-intel-devel-2.10.0-4.fc13.x86_64
intel-gpu-tools-2.10.0-4.fc13.x86_64
libdrm-devel-2.4.18-1.fc14.x86_64
kernel-2.6.33-0.48.rc8.git1.fc14.x86_64
Created attachment 33612 [details] dmesg after coming out of suspend from ram
Martin, thanks for testing this (I'm not finding the time to build/test). Could you run 'udevadm monitor --property' as described above? It is the driver transitions that cause the performance problems. Thanks.
Created attachment 33613 [details] Output of udevadm monitor --property of a *working* suspend/resume cycle

Very strange - after a reboot (making recent updates active), I could not reproduce this bug. Attached the output of udevadm monitor --property of a suspend to ram/resume cycle that went fine without lockup. I will report back if the bug comes back.

Current setup is:
libdrm-2.4.18-1.fc14.i686
libdrm-2.4.18-1.fc14.x86_64
xorg-x11-drv-intel-2.10.0-4.fc13.x86_64
xorg-x11-drv-intel-devel-2.10.0-4.fc13.x86_64
intel-gpu-tools-2.10.0-4.fc13.x86_64
libdrm-devel-2.4.18-1.fc14.x86_64
kernel-2.6.33-0.48.rc8.git1.fc14.x86_64
Created attachment 33614 [details] udevadm monitor --property while lockup for about one minute

Now it happened again. I recorded "udevadm monitor --property > file" for a while, including a couple of working suspend/resume cycles, but this time, when coming out of suspend (to ram), the system was unresponsive for about one minute.
Created attachment 33615 [details] Corresponding dmesg of previous attachment
*** Bug 25259 has been marked as a duplicate of this bug. ***
There have been a few recent patches to prevent interrupt/hotplug storms:

commit 2d1c9752eaa4c0b38f6fb1ab79a6addc146cd64e
Author: Andy Lutomirski <luto@MIT.EDU>
Date:   Sat Jun 12 05:21:18 2010 -0400

    drm/i915: Fix CRT hotplug regression in 2.6.35-rc1

    Commit 7a772c492fcfffae812ffca78a628e76fa57fe58 has two bugs which
    made the hotplug problems on my laptop worse instead of better.

    First, it did not, in fact, disable the CRT plug interrupt -- it
    disabled all the other hotplug interrupts. It seems rather doubtful
    that that bit of the patch fixed anything, so let's just remove it.
    (If you want to add it back, you probably meant ~CRT_HOTPLUG_INT_EN.)

    Second, on at least my GM45, setting CRT_HOTPLUG_ACTIVATION_PERIOD_64
    and CRT_HOTPLUG_VOLTAGE_COMPARE_50 (when they were previously unset)
    causes a hotplug interrupt about three seconds later. The old code
    never restored PORT_HOTPLUG_EN so this could only happen once, but
    they new code restores those registers. So just set those bits when
    we set up the interrupt in the first place.

    Signed-off-by: Andy Lutomirski <luto@mit.edu>
    Signed-off-by: Eric Anholt <eric@anholt.net>

commit 7a772c492fcfffae812ffca78a628e76fa57fe58
Author: Adam Jackson <ajax@redhat.com>
Date:   Mon May 24 16:46:29 2010 -0400

    drm/i915/gen4: Extra CRT hotplug paranoia

    Disable the CRT plug interrupt while doing the force cycle, explicitly
    clear any CRT interrupt we may have generated, and restore when done.
    Should mitigate interrupt storms from hotplug detection.

    Signed-off-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Eric Anholt <eric@anholt.net>
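The first bug named in the commit message above reads like a missing complement when masking the hotplug-enable register. The sketch below shows that difference with plain integers, and also the kind of mask change described in the original report (dropping the HDMI B and D bits). It is only an illustration of the bit arithmetic: the bit positions are hypothetical placeholders rather than the definitions from i915_reg.h, the helper names are made up, and the assumption that the faulty code ANDed with the uncomplemented bit is a reading of the commit message, not of the diff.

```python
# Hypothetical bit positions, chosen only for illustration; the real
# definitions live in the kernel's i915_reg.h and are not reproduced here.
CRT_HOTPLUG_INT_EN = 1 << 9
HDMIB_HOTPLUG_INT_EN = 1 << 29
HDMID_HOTPLUG_INT_EN = 1 << 27
U32 = 0xFFFFFFFF  # keep results within a 32-bit register width


def clear_bits(reg, bits):
    """Disable the given bits and leave every other bit untouched."""
    return reg & ~bits & U32


def keep_only_bits(reg, bits):
    """The mistaken form: AND without the complement keeps only `bits`."""
    return reg & bits


# A register value with all three illustrative interrupts enabled.
hotplug_en = CRT_HOTPLUG_INT_EN | HDMIB_HOTPLUG_INT_EN | HDMID_HOTPLUG_INT_EN

# Intended: CRT interrupt off, HDMI B and D still on.
intended = clear_bits(hotplug_en, CRT_HOTPLUG_INT_EN)

# Mistaken: CRT interrupt still on, everything else off.
mistaken = keep_only_bits(hotplug_en, CRT_HOTPLUG_INT_EN)

# The workaround from the original report: drop HDMI B and D from the mask.
without_hdmi = clear_bits(hotplug_en, HDMIB_HOTPLUG_INT_EN | HDMID_HOTPLUG_INT_EN)

print(f"intended     = {intended:#010x}")
print(f"mistaken     = {mistaken:#010x}")
print(f"without_hdmi = {without_hdmi:#010x}")
```

In C, the two helpers correspond to `reg &= ~bits` and `reg &= bits` respectively, which is why the commit message suggests `~CRT_HOTPLUG_INT_EN`.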
2.6.35 has the required patches, I think the hotplug storm has now passed.
(In reply to comment #20)
> 2.6.35 has the required patches, I think the hotplug storm has now passed.

Can you send these on to stable, please?