Summary: | [4X bisected] 2.6.35: BSD ring buffer implementation makes suspend to ram unreliable | | |
---|---|---|---|
Product: | DRI | Reporter: | Thomas Meyer <thomas.mey> |
Component: | DRM/Intel | Assignee: | Zou Nan hai <nanhai.zou> |
Status: | CLOSED FIXED | QA Contact: | |
Severity: | major | | |
Priority: | high | CC: | cllccl, florian, haihao.xiang, hege, north, yuanhan.liu, zhenyu.z.wang |
Version: | unspecified | | |
Hardware: | x86 (IA32) | | |
OS: | All | | |
Whiteboard: | | | |
i915 platform: | | i915 features: | |
Attachments: | | | |

Description
Thomas Meyer
2010-08-05 05:28:47 UTC
My GM45 and Arrandale work fine, but my G45 is impacted by this issue.

Same problem on a DG41TY desktop board with GMA X4500. The HAS_BSD (0) hack seems to work here also.

(In reply to comment #1)
> My GM45 and Arrandale work fine, but my G45 is impacted by this issue.
Correction: my G45 has working suspend-to-ram but fails to suspend-to-disk, so it should be a separate issue.

We can't reproduce this issue on our GM45 so far. Could any of you provide more detailed machine info so we can try to reproduce it?

Here's an elided dmidecode:

BIOS Information
  Vendor: Intel Corp.
  Version: TYG4110H.86A.0031.2009.0626.1405
  Release Date: 06/26/2009
  Address: 0xF0000
  Runtime Size: 64 kB
  ROM Size: 1024 kB
  Characteristics:
Base Board Information
  Manufacturer: Intel Corporation
  Product Name: DG41TY
  Version: AAE47335-300
  Serial Number: AZTY9300030R
  Asset Tag: To be filled by O.E.M.
  Features:
    Board is a hosting board
    Board is replaceable
  Location In Chassis: To be filled by O.E.M.
  Chassis Handle: 0x0003
  Type: Motherboard
  Contained Object Handles: 0
Processor Information
  Socket Designation: LGA775
  Type: Central Processor
  Family: Pentium
  Manufacturer: Intel(R) Corp.
  ID: 7A 06 01 00 FF FB EB BF
  Signature: Type 0, Family 6, Model 23, Stepping 10
  ...
  Version: Pentium(R) Dual-Core CPU E6300 @ 2.80GHz
  ...

which I'd think would be adequate. And here's ver_linux just for good measure:

Gnu C 4.4.3
Gnu make 3.81
binutils 2.20.1
util-linux 2.17.2
mount support
module-init-tools 3.11.1
e2fsprogs 1.41.11
reiserfsprogs 3.6.21
Linux C Library 2.11.1
Dynamic linker (ldd) 2.11.1
Procps 3.2.8
Net-tools 1.60
Kbd 1.15
Sh-utils 7.4

If there's anything else that would help, let me know. And thanks.
Dave

What OS are you using? Fedora 13? ubuntu?
Do you have compiz enabled? compiz fusion?
Are you running any application (3D, Media) before S3?
Do you have special configuration?
Did you connect external monitor when met this issue?
Do you have power supply connected?

(In reply to comment #6)
I encounter the described behaviour on an Acer Aspire 1810T.
> What OS are you using? Fedora 13? ubuntu?
Fedora 13
> Do you have compiz enabled?
No
> compiz fusion?
No
> Are you running any application (3D, Media) before S3?
No.
> Do you have special configuration?
No. Just the laptop itself.
> Did you connect external monitor when met this issue?
No.
> Do you have power supply connected?
Yes, and the battery is removed.
By the way: I use a UP kernel.

Sorry to take so long but I've been doing some testing.

First, I wasn't using compiz but I did have xcompmgr loaded. However, unloading it does not stop the problem. No 3D or media apps running. The configuration is special in the sense that it does not use modules or an initrd, if that's what you mean. This is a desktop unit, so it has only an external monitor and does not have a battery other than the cmos pill. Kernel is smp.

The truly interesting question turns out to be: what distribution? My initial problem was seen on Ubuntu Lucid (10.04). So I tested on Gentoo unstable with the same kernel (literally) and the same problem happened. Then I tested on Debian Squeeze and no, the problem simply would not happen.

Further, I tried all three systems running from ttys1 without X running, and none of them would fail (worked every time on all systems). I tried this at least 20 times each over a minimum of four reboots with zero failures. So it clearly relates to something about X.

Comment: I also ran several retests using 2.6.34.1, and none of the systems would fail running any variant of X.

Lucid is running X.Org version: 1.7.6 (usually fails)
Squeeze is running X.Org version: 1.7.7 (always works)
Gentoo is running X.Org version: 1.8.2 (fails more than half the time)

Unfortunately, that's not very helpful. Even worse, the problem turns out to be sporadic. On some boots, suspend worked fine from X running Gentoo, but on others it would fail.
And once, on both Lucid and Gentoo, the first suspend would succeed and the second one fail.

Failure mode gives me no information; the computer deadlocks and doesn't respond to external hookups (ssh, ping etc). I have no serial port. There is nothing out of the ordinary in /var/log/pm-suspend.log.

This is a very messy one. If I find anything more helpful, I'll let you know.

Created attachment 37926 [details]
Xorg log
I use the intel driver version 2.12.0.
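
The no-X test reported above (at least 20 suspend cycles per system with zero failures) can be weighed against the failure rates seen under X with a short calculation. The sketch below is only an illustration: it assumes each suspend attempt fails independently with a fixed probability, which the thread does not establish, and the function name `prob_all_succeed` and the sample rates other than 0.5 are hypothetical.

```python
def prob_all_succeed(failure_rate: float, attempts: int) -> float:
    """Probability that `attempts` independent suspend cycles all succeed.

    Assumes every attempt fails independently with the same probability;
    this is a simplification, not something measured in the report.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return (1.0 - failure_rate) ** attempts


# 0.5 stands in for "fails more than half the time" (Gentoo under X);
# 0.3 and 0.1 are illustrative lower rates, not figures from this test.
for rate in (0.5, 0.3, 0.1):
    p = prob_all_succeed(rate, 20)
    print(f"failure rate {rate:.1f}: P(20 clean suspends) = {p:.2e}")
```

Under these assumptions, a per-attempt failure rate of one half would produce 20 clean cycles in a row less than once in a million tries, so the clean runs without X are unlikely to be luck. At lower failure rates the same 20 cycles are much weaker evidence, which matters when judging whether a candidate fix really works.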
Since this is assigned to the ia32 platform, I feel obligated to mention that it also happens with the 64-bit version, if that is of any concern to you. Other than that, I experience exactly the same symptoms. My computer is a Lenovo T400s and I used the Debian/experimental kernel linux-image-2.6.35-trunk-amd64_2.6.35-1~experimental.2

Here's some lspci:

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])
  Subsystem: Lenovo Device 20e4
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
  Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Latency: 0
  Interrupt: pin A routed to IRQ 29
  Region 0: Memory at f2000000 (64-bit, non-prefetchable) [size=4M]
  Region 2: Memory at d0000000 (64-bit, prefetchable) [size=256M]
  Region 4: I/O ports at 1800 [size=8]
  Expansion ROM at <unassigned> [disabled]
  Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Address: fee0300c Data: 41a1
  Capabilities: [d0] Power Management version 3
    Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
    Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
  Kernel driver in use: i915

00:02.1 Display controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
  Subsystem: Lenovo Device 20e4
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
  Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Latency: 0
  Region 0: Memory at f2400000 (64-bit, non-prefetchable) [size=1M]
  Capabilities: [d0] Power Management version 3
    Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
    Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

Seems to be fixed in 2.6.36-rc2.
Probably by merge commit 4238a417a91643e1162a98770288f630e37f0484

Will there be an official fix in an upcoming 2.6.35.x kernel?

I guess the problem is that the exact commit id that fixes this bug is still unknown. I suspect one of the commits contained in the above merge commit. So if you like, you could test each commit in that merge and/or bisect to find the specific commit that fixed this bug. Once the commit id that fixes this bug is known, it can be forwarded to the stable kernel team, so they will hopefully pick it up and bundle it into the next stable kernel release.

My GM45 (Thinkpad X200) is also affected: approx. 3 out of 10 attempts end in a lockup accompanied by a blinking sleep LED, and the relevant logs are empty after a hard reboot.

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])
  Subsystem: Lenovo Device 20e4
  Flags: bus master, fast devsel, latency 0, IRQ 48
  Memory at f2000000 (64-bit, non-prefetchable) [size=4M]
  Memory at d0000000 (64-bit, prefetchable) [size=256M]
  I/O ports at 1800 [size=8]
  Expansion ROM at <unassigned> [disabled]
  Capabilities: <access denied>
  Kernel driver in use: i915
  Kernel modules: i915

I'm using the latest 2.6.35 kernel, libdrm-2.4.21-2, mesa-7.8.2-1, xf86-video-intel-2.12.0-1 (latest Arch Linux packages). Suspend never failed with the 2.6.34 series.

Adam, can you grab

  git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel.git
  drm-intel-fixes

and confirm that s2ram is reliable again?

(In reply to comment #16)
> Adam, can you grab
>
> git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel.git
> drm-intel-fixes
>
> and confirm that s2ram is reliable again?
Sorry, I hadn't noticed this message before :(
Anyway, I see this bug was supposed to be fixed. Has the fix made its way into upstream (.36-RC)? If it's fixed there, I can wait until the next stable kernel release.

I think a backport for stable .35 is also needed.
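
The suggestion above to bisect for the commit that fixed the bug is a binary search run in the opposite direction from the usual hunt for a regression: the oldest commit is known broken, the newest is known fixed, and the goal is the first fixed one. The sketch below shows that search over a hypothetical linear list of commits; the commit names, `find_first_fixed`, and the `is_fixed` test are all assumptions for illustration, and a real merge is not a simple linear list. Current versions of git support this direction through custom terms (`git bisect start --term-old=broken --term-new=fixed`).

```python
from typing import Callable, List, Sequence


def find_first_fixed(commits: Sequence[str],
                     is_fixed: Callable[[str], bool]) -> str:
    """Return the first commit for which is_fixed() is True.

    Assumes commits are ordered oldest to newest, commits[0] is known
    broken, commits[-1] is known fixed, and every commit after the fix
    stays fixed.
    """
    if len(commits) < 2:
        raise ValueError("need at least one broken and one fixed commit")
    lo, hi = 0, len(commits) - 1  # commits[lo] is broken, commits[hi] is fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid  # the fix is after mid
    return commits[hi]


# Hypothetical history: 16 commits, with the fix landing in "c11".
commits = [f"c{i}" for i in range(16)]
tested: List[str] = []


def is_fixed(commit: str) -> bool:
    """Stand-in for 'build this kernel and check that suspend is reliable'."""
    tested.append(commit)
    return commits.index(commit) >= commits.index("c11")


result = find_first_fixed(commits, is_fixed)
print(f"first fixed commit: {result} (kernels tested: {len(tested)})")
```

Because the hang reported here is sporadic, each `is_fixed` check would in practice need many suspend cycles before a kernel can be called fixed, so keeping the number of kernels to test small is the main benefit of bisecting over testing every commit in the merge.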