Created attachment 43142 [details]
dmesg showing the failure

With 2.6.38-rc3+, I'm seeing a GPU hang and a total loss of graphics very soon after Gnome starts (within 2~3 seconds). It did not happen with 2.6.37.

lspci:
00:00.0 Host bridge: Intel Corporation Device 0100 (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Device 0112 (rev 09)
00:16.0 Communication controller: Intel Corporation Device 1c3a (rev 04)
00:1a.0 USB Controller: Intel Corporation Device 1c2d (rev 04)
00:1b.0 Audio device: Intel Corporation Device 1c20 (rev 04)
00:1c.0 PCI bridge: Intel Corporation Device 1c10 (rev b4)
00:1c.2 PCI bridge: Intel Corporation Device 1c14 (rev b4)
00:1c.3 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b4)
00:1c.4 PCI bridge: Intel Corporation Device 1c18 (rev b4)
00:1d.0 USB Controller: Intel Corporation Device 1c26 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Device 1c4a (rev 04)
00:1f.2 SATA controller: Intel Corporation Device 1c02 (rev 04)
00:1f.3 SMBus: Intel Corporation Device 1c22 (rev 04)
02:00.0 USB Controller: Device 1b6f:7023 (rev 01)
03:00.0 PCI bridge: Device 1b21:1080 (rev 01)
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)

# lspci -vv -s 0:2.0
00:02.0 VGA compatible controller: Intel Corporation Device 0112 (rev 09) (prog-if 00 [VGA controller])
	Subsystem: ASRock Incorporation Device 0112
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 43
	Region 0: Memory at fe000000 (64-bit, non-prefetchable) [size=4M]
	Region 2: Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Region 4: I/O ports at f000 [size=64]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 4171
	Capabilities: [d0] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a4] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: i915
Created attachment 43143 [details] Xorg.0.log
Created attachment 43144 [details] kernel config
Can you please merge drm-intel-next [git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel.git] and attach the /sys/kernel/debug/dri/0/i915_error_state?
Created attachment 43861 [details]
i915_error_state.gz

i915_error_state with the current linus-2.6 + drm-intel-next.

Same thing. Maybe a bit worse. GDM works fine, but after login when the desktop begins to appear, the GPU goes into a restart cycle.
Created attachment 44102 [details]
i915_error_state.gz

Still present as of b65a0e0c84cf489bfa00d6aa6c48abc5a237100f.

I updated libdrm/mesa/xserver/xf86-video-intel to the latest from git a few minutes ago; no change with .38. It still dies. 2.6.37, on the other hand, is fine.

Do I need to be reporting this on the kernel bugzilla?
Created attachment 44106 [details] dmesg from working 2.6.37
Same IPEHR, hmm. What's the lspci for this chip?
Ok, not the same. Just forgot to look for the renamed file, d'oh.

Hmm. It looks like the write to the tail went astray, judging by the IPEHR; I don't see any other reason for it not to have advanced.

Try this to see if it makes the hang disappear:

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index bdf4ceb..5e26b5e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -264,9 +264,11 @@ void __gen6_force_wake_get(struct drm_i915_private *dev_priv)
 {
 	int count;
 
+#if 0
 	count = 0;
 	while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_ACK) & 1))
 		udelay(10);
+#endif
 
 	I915_WRITE_NOTRACE(FORCEWAKE, 1);
 	POSTING_READ(FORCEWAKE);
@@ -278,8 +280,10 @@ void __gen6_force_wake_get(struct drm_i915_private *dev_priv)
 
 void __gen6_force_wake_put(struct drm_i915_private *dev_priv)
 {
+#if 0
 	I915_WRITE_NOTRACE(FORCEWAKE, 0);
 	POSTING_READ(FORCEWAKE);
+#endif
 }
 
 static int i915_drm_freeze(struct drm_device *dev)
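For readers unfamiliar with the pattern, the loop that the debugging patch above disables is a bounded poll: read the ACK bit up to 50 times and stop early once it clears. The following is only a user-space sketch of that idea against a simulated register. `sim_ack_reg`, `sim_read_ack`, and `sim_wait_ack_clear` are hypothetical names, not driver functions, and unlike the original loop this version reports whether the wait timed out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poll limit taken from the loop in the patch (count++ < 50). */
#define ACK_POLL_LIMIT 50

/* Hypothetical simulated register: bit 0 stays set for a number of reads. */
struct sim_ack_reg {
    int reads_until_clear; /* reads that still return bit 0 set */
    int reads_done;        /* reads performed so far */
};

/* Simulated register read; stands in for an MMIO read of the ACK register. */
static uint32_t sim_read_ack(struct sim_ack_reg *r)
{
    r->reads_done++;
    return (r->reads_done > r->reads_until_clear) ? 0u : 1u;
}

/*
 * Bounded poll: returns true if bit 0 cleared within ACK_POLL_LIMIT reads,
 * false if the limit was reached. A real driver would delay between reads
 * (the patch uses udelay(10)); the simulation skips the delay.
 */
static bool sim_wait_ack_clear(struct sim_ack_reg *r)
{
    int count;

    for (count = 0; count < ACK_POLL_LIMIT; count++) {
        if ((sim_read_ack(r) & 1u) == 0)
            return true;
    }
    return false;
}
```

Returning a status makes a timeout visible to the caller; the loop in the patch simply falls through to the register write either way.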
(In reply to comment #8)
> Try this to see if it makes the hang disappear:

This does indeed make the hang/reboot-cycle go away.

Also, dmesg from .37 and .38 differ as to stolen memory. I didn't modify any BIOS settings to make this happen.
Created attachment 44137 [details] [review] Poll the FIFO for free entries before writing the register
(In reply to comment #10)
> Created an attachment (id=44137) [details]
> Poll the FIFO for free entries before writing the register

That fixes it! Please have a
Reported-and-Tested-by: Matt Turner <mattst88@gmail.com>

Thanks a ton, Chris. :)
Matt, thanks a lot for that quick testing. I'll send it onwards shortly (I'm just waiting to hear if it fixes another issue as well).
*** Bug 34421 has been marked as a duplicate of this bug. ***
Pushed to -fixes:

commit 91355834646328e7edc6bd25176ae44bcd7386c7
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Mar 4 19:22:40 2011 +0000

    drm/i915: Do not overflow the MMADDR write FIFO

    Whilst the GT is powered down (rc6), writes to MMADDR are placed in a
    FIFO by the System Agent. This is a limited resource, only 64 entries, of
    which 20 are reserved for Display and PCH writes, and so we must take
    care not to queue up too many writes. To avoid this, there is counter
    which we can poll to ensure there are sufficient free entries in the
    fifo.

    "Issuing a write to a full FIFO is not supported; at worst it could
    result in corruption or a system hang."

    Reported-and-Tested-by: Matt Turner <mattst88@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34056
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
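To illustrate the idea in the commit message, here is a minimal user-space sketch of "poll the free-entry counter before queuing a write". It is a simulation under stated assumptions, not the code from the commit: `sim_fifo`, `sim_read_free_entries`, `sim_wait_for_fifo`, `sim_mmio_write`, and the poll limit of 500 are all hypothetical. Only the 64-entry size and the 20 reserved entries come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sizes quoted in the commit message. */
#define FIFO_TOTAL_ENTRIES    64u
#define FIFO_RESERVED_ENTRIES 20u
/* Hypothetical bound on how long we are willing to poll. */
#define FIFO_POLL_LIMIT       500u

/* Hypothetical simulated FIFO; stands in for the hardware counter. */
struct sim_fifo {
    uint32_t free_entries;   /* value the free-entry counter would report */
    uint32_t drain_per_poll; /* entries the simulated hardware retires per read */
    uint32_t writes_queued;  /* writes accepted so far */
};

/* Simulated counter read: the "hardware" drains a little, then reports. */
static uint32_t sim_read_free_entries(struct sim_fifo *f)
{
    f->free_entries += f->drain_per_poll;
    if (f->free_entries > FIFO_TOTAL_ENTRIES)
        f->free_entries = FIFO_TOTAL_ENTRIES;
    return f->free_entries;
}

/* Poll until more than the reserved number of entries are free, or give up. */
static bool sim_wait_for_fifo(struct sim_fifo *f)
{
    uint32_t polls = 0;

    while (sim_read_free_entries(f) <= FIFO_RESERVED_ENTRIES) {
        if (++polls >= FIFO_POLL_LIMIT)
            return false; /* FIFO never drained far enough */
    }
    return true;
}

/* Queue one write only after the poll confirms there is room. */
static bool sim_mmio_write(struct sim_fifo *f)
{
    if (!sim_wait_for_fifo(f))
        return false;
    assert(f->free_entries > FIFO_RESERVED_ENTRIES);
    f->free_entries--;
    f->writes_queued++;
    return true;
}
```

With a FIFO that starts empty and never drains, this sketch accepts 44 writes and then refuses further ones, leaving the 20 reserved entries untouched.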
(In reply to comment #13)
> *** Bug 34421 has been marked as a duplicate of this bug. ***

Tested with commit 913558; it works fine when opening and closing Terminal in the GNOME desktop.