Summary: | [g45] driver floods udev change events: system unusable | ||
---|---|---|---|
Product: | xorg | Reporter: | Todd Brunhoff <toddb> |
Component: | Driver/intel | Assignee: | Eric Anholt <eric> |
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | critical | ||
Priority: | medium | CC: | cbm, erecio, gronslet, james.ausmus, john.ruemker, mcepl, William.Hanlon |
Version: | unspecified | ||
Hardware: | IA64 (Itanium) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Description
Todd Brunhoff
2009-11-27 22:28:42 UTC
Created attachment 31524 [details]
lspci -vv
Created attachment 31525 [details]
frequent udev change event
Created attachment 31526 [details]
bios dump
Created attachment 31527 [details]
usual stack trace during change event flood
*** This bug has been marked as a duplicate of bug 25259 ***
With all due respect, the symptom may be a duplicate, but the fix does not work for my motherboard. This patch (http://cvs.fedoraproject.org/viewvc/F-12/xorg-x11-drv-intel/uevent.patch?revision=1.1) addresses the i830 driver. This patch (http://cvs.fedoraproject.org/viewvc/F-12/kernel/drm-intel-no-tv-hotplug.patch?revision=1.1) removes TV_HOTPLUG_INT_EN from i915_reg.h, and the kernel/driver I have already includes that patch. Hence, it is not applicable to whatever my hardware does.
It seems to me that we need to disable hotplug detect across modesets, as load-based detection probably triggers the hotplug bits. Has anyone tried that?
Please note that after applying the patch listed (commenting out the HDMI_ lines in i915_irq.c) for my HDMI/udevd issue, I get the following every second in my syslog:
Dec 31 15:19:40 pcsca65 kernel: DRHD: handling fault status reg 3
Dec 31 15:19:40 pcsca65 kernel: DMAR:[DMA Write] Request device [00:02.0] fault addr b08003000
Dec 31 15:19:40 pcsca65 kernel: DMAR:[fault reason 05] PTE Write access is not set
Dec 31 15:19:40 pcsca65 kernel: DRHD: handling fault status reg 3
Dec 31 15:19:40 pcsca65 kernel: DMAR:[DMA Write] Request device [00:02.0] fault addr b08003000
Dec 31 15:19:40 pcsca65 kernel: DMAR:[fault reason 05] PTE Write access is not set
I tried commenting out just two bits, and then just three bits, but got the same error message.
Eric seems to have a theory for a fix here, so assigning to him. -Carl
Could you test with a v2.6.33rc7 or newer kernel? It might be fixed now.
I can try this weekend, I think. I did my initial patch on 2.6.31.5-127, based on some notes I found on building the kernel from the current Fedora release (fc12 rawhide, at the time). Looks like kernel.org has http://www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.33-rc8.tar.bz2. We'll see how that goes.
I still have this problem in Fedora Rawhide, kernel-2.6.33-0.48.rc8.git1.fc14.x86_64.
On every startup and on every resume from suspend, the system partially freezes for several seconds, sometimes more than a minute. Right after resume from suspend (to RAM), the system seems normal, but after ~30 seconds, or after some graphics events (typically pressing Alt-F2 in KDE 4), the system locks up, and when I move the mouse, the pointer jumps discontinuously. I'd be happy to test anything from Fedora/koji.
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])
libdrm-2.4.18-1.fc14.i686
libdrm-2.4.18-1.fc14.x86_64
xorg-x11-server-Xorg-1.7.99.901-6.20100215.fc14.x86_64
xorg-x11-drv-intel-2.10.0-4.fc13.x86_64
xorg-x11-drv-intel-devel-2.10.0-4.fc13.x86_64
intel-gpu-tools-2.10.0-4.fc13.x86_64
libdrm-devel-2.4.18-1.fc14.x86_64
kernel-2.6.33-0.48.rc8.git1.fc14.x86_64
Created attachment 33612 [details]
dmesg after coming out of suspend from ram
Martin, thanks for testing this (I'm not finding the time to build/test). Could you run 'udevadm monitor --property' as described above? It is the driver transitions that cause the performance problems. Thanks.
Created attachment 33613 [details]
Output of udevadm monitor --property of a *working* suspend/resume cycle
Very strange - after a reboot (making recent updates active), I could not reproduce this bug. Attached the output of udevadm monitor --property of a suspend to ram/resume cycle that went fine without lockup.
I will report back if the bug comes back. Current setup is:
libdrm-2.4.18-1.fc14.i686
libdrm-2.4.18-1.fc14.x86_64
xorg-x11-drv-intel-2.10.0-4.fc13.x86_64
xorg-x11-drv-intel-devel-2.10.0-4.fc13.x86_64
intel-gpu-tools-2.10.0-4.fc13.x86_64
libdrm-devel-2.4.18-1.fc14.x86_64
kernel-2.6.33-0.48.rc8.git1.fc14.x86_64
Created attachment 33614 [details]
udevadm monitor --property while lockup for about one minute
Now it happened again. I recorded "udevadm monitor --property > file" for a while, including a couple of working suspend/resume cycles, but this time, when coming out of suspend (to RAM), the system was unresponsive for about one minute.
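A recording like the one described above can be summarized by counting how often each device emits each action, which makes a flood of change events easy to spot. The following is only a minimal sketch: the header-line layout it parses (`KERNEL[timestamp] action devpath (subsystem)`) is assumed from typical `udevadm monitor` output, the sample data is invented for illustration and not taken from the attachments, and `count_events` is a hypothetical helper name.

```python
import re
from collections import Counter

# Header line of one event as typically printed by `udevadm monitor`, e.g.
#   KERNEL[100.000001] change   /devices/.../drm/card0 (drm)
# This layout is an assumption; it is not copied from the attachments.
EVENT_HEADER = re.compile(
    r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\S+)\s+(\S+)\s+\((\S+)\)\s*$"
)


def count_events(lines):
    """Count (source, action, devpath) triples found in monitor output."""
    counts = Counter()
    for line in lines:
        match = EVENT_HEADER.match(line)
        if match:
            source, _timestamp, action, devpath, _subsystem = match.groups()
            counts[(source, action, devpath)] += 1
    return counts


# Invented sample in the assumed format (not real data from this bug).
SAMPLE = """\
KERNEL[100.000001] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:02.0/drm/card0
SUBSYSTEM=drm
HOTPLUG=1

UDEV  [100.000500] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
ACTION=change

KERNEL[100.100001] change   /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
ACTION=change
"""

if __name__ == "__main__":
    # Print the most frequent events first.
    for (source, action, devpath), n in count_events(SAMPLE.splitlines()).most_common():
        print(f"{n:6d}  {source:6s} {action:8s} {devpath}")
```

To use it on a real capture, pass the lines of the recorded file to `count_events` instead of `SAMPLE.splitlines()`.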
Created attachment 33615 [details]
Corresponding dmesg of previous attachment
*** Bug 25259 has been marked as a duplicate of this bug. ***
There have been a few recent patches to prevent interrupt/hotplug storms:
commit 2d1c9752eaa4c0b38f6fb1ab79a6addc146cd64e
Author: Andy Lutomirski <luto@MIT.EDU>
Date: Sat Jun 12 05:21:18 2010 -0400
    drm/i915: Fix CRT hotplug regression in 2.6.35-rc1
    Commit 7a772c492fcfffae812ffca78a628e76fa57fe58 has two bugs which made the hotplug problems on my laptop worse instead of better.
    First, it did not, in fact, disable the CRT plug interrupt -- it disabled all the other hotplug interrupts. It seems rather doubtful that that bit of the patch fixed anything, so let's just remove it. (If you want to add it back, you probably meant ~CRT_HOTPLUG_INT_EN.)
    Second, on at least my GM45, setting CRT_HOTPLUG_ACTIVATION_PERIOD_64 and CRT_HOTPLUG_VOLTAGE_COMPARE_50 (when they were previously unset) causes a hotplug interrupt about three seconds later. The old code never restored PORT_HOTPLUG_EN so this could only happen once, but the new code restores those registers. So just set those bits when we set up the interrupt in the first place.
    Signed-off-by: Andy Lutomirski <luto@mit.edu>
    Signed-off-by: Eric Anholt <eric@anholt.net>
commit 7a772c492fcfffae812ffca78a628e76fa57fe58
Author: Adam Jackson <ajax@redhat.com>
Date: Mon May 24 16:46:29 2010 -0400
    drm/i915/gen4: Extra CRT hotplug paranoia
    Disable the CRT plug interrupt while doing the force cycle, explicitly clear any CRT interrupt we may have generated, and restore when done. Should mitigate interrupt storms from hotplug detection.
    Signed-off-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Eric Anholt <eric@anholt.net>
2.6.35 has the required patches, I think the hotplug storm has now passed.
(In reply to comment #20)
> 2.6.35 has the required patches, I think the hotplug storm has now passed.
Can you send these on to stable, please?
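For readers following the first bug described in Andy Lutomirski's commit message above, the difference between masking with a bit and masking with its complement can be sketched as follows. This is an illustration under stated assumptions, not the kernel code: it assumes the mistake was a missing `~`, as the parenthetical in the commit message suggests. The bit positions are placeholders rather than values copied from i915_reg.h, and `OTHER_HOTPLUG_INT_EN` and both function names are hypothetical.

```python
# Placeholder bit positions for illustration only; the real definitions
# live in the kernel's i915_reg.h and are not reproduced in this report.
CRT_HOTPLUG_INT_EN = 1 << 9
TV_HOTPLUG_INT_EN = 1 << 18
OTHER_HOTPLUG_INT_EN = 1 << 29  # hypothetical stand-in for another port's bit

MASK_32 = 0xFFFFFFFF  # keep results within a 32-bit register width


def mask_without_complement(hotplug_en):
    """AND with the bit itself: keeps CRT and clears every other enable bit."""
    return hotplug_en & CRT_HOTPLUG_INT_EN


def mask_with_complement(hotplug_en):
    """AND with the complement: clears only CRT and leaves the rest enabled."""
    return hotplug_en & ~CRT_HOTPLUG_INT_EN & MASK_32


if __name__ == "__main__":
    enabled = CRT_HOTPLUG_INT_EN | TV_HOTPLUG_INT_EN | OTHER_HOTPLUG_INT_EN
    print(f"start:              {enabled:#010x}")
    print(f"without complement: {mask_without_complement(enabled):#010x}")
    print(f"with complement:    {mask_with_complement(enabled):#010x}")
```

With the complement missing, the CRT interrupt stays enabled while all the other hotplug interrupts are switched off, which matches the behaviour the commit message complains about.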