Bug 17833 - [855GM S3] X gets corrupted after suspend/resume on R51
Summary: [855GM S3] X gets corrupted after suspend/resume on R51
Status: CLOSED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Jesse Barnes
QA Contact: Xorg Project Team
Reported: 2008-09-29 23:05 UTC by Yves Dorfsman
Modified: 2009-04-28 12:22 UTC


Attachments
gnome password prompt after resume, missing colours (283.65 KB, image/jpeg)
2008-09-29 23:07 UTC, Yves Dorfsman
final state of the screen (656.32 KB, image/jpeg)
2008-09-29 23:08 UTC, Yves Dorfsman
Xorg.0.log (24.49 KB, text/plain)
2008-09-29 23:11 UTC, Yves Dorfsman
pre-suspend.out just after the boot and ctrl-alt-F1 (6.98 KB, text/plain)
2008-09-30 19:46 UTC, Yves Dorfsman
post suspend / resume (in text) intel_reg_dumper dump (7.01 KB, text/plain)
2008-09-30 19:47 UTC, Yves Dorfsman
intel_reg_dump after a good suspend/resume (7.00 KB, text/plain)
2008-09-30 21:58 UTC, Yves Dorfsman
intel_reg_dump after a good suspen / resume but from an X terminal (6.99 KB, text/plain)
2008-09-30 22:00 UTC, Yves Dorfsman
save/restore fifo regs at suspend/resume time (9.34 KB, patch)
2009-03-30 17:12 UTC, Jesse Barnes

Description Yves Dorfsman 2008-09-29 23:05:51 UTC
I run Fedora Core 9 on an R51, model 28885RU, with an Intel 855GM video card:

$ lshal |grep hardware
  system.hardware.primary_video.product = 13698  (0x3582)  (int)
  system.hardware.primary_video.vendor = 32902  (0x8086)  (int)
  system.hardware.product = '28885RU'  (string)
  system.hardware.vendor = 'IBM'  (string)
  system.hardware.version = 'ThinkPad R51'  (string)
  laptop_panel.brightness_in_hardware = true  (bool)

I installed via Kickstart and let it choose as many automatic configurations as possible. I update (yum update) on a regular basis. The X server is X.Org 1.4.99, from the Fedora package xorg-x11-server-Xorg-1.4.99.905-2.20080702.fc9.i386, with the "intel" driver 2.2.1 from the Fedora package xorg-x11-drv-i810.

I have downloaded and compiled xf86-video-intel-2.4.2 and replaced my intel_drv.so with it, but the results stayed identical.


PROBLEM:
When I resume from suspend-to-RAM, most of the time the screen is missing colours, and then it displays a weird vertical pattern (see screenshots), after which the machine ends up locked. Pressing the power button once makes the machine shut down. A suspend from console/text mode resumes normally, but if I then switch to X (ctrl-alt-F7), the same problem happens.

Something happens to the OS that makes it work sometimes. But once it works, it keeps working: I can suspend and resume at will until I shut down. Temperature, which I initially suspected, does not seem to be the trigger, as I have seen both outcomes (working and failing) from a cold start and after using the laptop for over an hour.

It's as if something I do while using the computer changes a register somewhere that makes it work, but I have not been able to isolate what it is that I do that makes it work.

I have managed to isolate one case that always fails: Boot up from scratch, wait for the login menu from gdm to come up, then suspend. This always fails.

/etc/default/acpi-support contains one line:
DOUBLE_CONSOLE_SWITCH=true


I have added two options in xorg.conf:
        Option      "VBERestore" "true"
        Option      "ForceEnablePipeA" "true"


Somebody suggested that it could be a module. I ran lsmod and managed to capture its output before a crash and before a successful resume, and the only difference was the nfs modules. I have since had successful resumes with the same set of modules as in the failing case (no nfs).
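
For anyone repeating the module comparison described above, it could be sketched roughly as follows. This is only a minimal illustration, not the commands that were actually run: the helper name `module_diff` and the example file names are hypothetical, and it assumes each file holds saved `lsmod` output with the usual one-line header.

```shell
#!/bin/sh
# module_diff FILE_A FILE_B   (hypothetical helper name)
# Print the module names that appear in only one of two saved `lsmod` outputs.
module_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # Skip the "Module  Size  Used by" header line; keep only the name column.
    awk 'NR > 1 { print $1 }' "$1" | LC_ALL=C sort -u > "$a"
    awk 'NR > 1 { print $1 }' "$2" | LC_ALL=C sort -u > "$b"
    # comm needs sorted input; -23 keeps lines unique to the first file,
    # -13 keeps lines unique to the second.
    LC_ALL=C comm -23 "$a" "$b" | sed 's/^/only in first: /'
    LC_ALL=C comm -13 "$a" "$b" | sed 's/^/only in second: /'
    rm -f "$a" "$b"
}

# Example (file names are placeholders):
#   lsmod > lsmod-failing.txt     # saved before a resume that failed
#   lsmod > lsmod-working.txt     # saved before a resume that worked
#   module_diff lsmod-failing.txt lsmod-working.txt
```

An empty result means both states had the same modules loaded, which is what the later successful resumes without nfs suggest.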

I followed the steps at:
http://people.freedesktop.org/~hughsient/quirk/quirk-suspend-advanced.html

but I only got one match, in main, with no module name, which seems to indicate that the problem is not in a module:
drivers/base/power/main.c: 222
Comment 1 Yves Dorfsman 2008-09-29 23:07:28 UTC
Created attachment 19296
gnome password prompt after resume, missing colours
Comment 2 Yves Dorfsman 2008-09-29 23:08:48 UTC
Created attachment 19297
final state of the screen
Comment 3 Yves Dorfsman 2008-09-29 23:11:03 UTC
Created attachment 19298
Xorg.0.log
Comment 4 Yves Dorfsman 2008-09-29 23:13:19 UTC
As I mentioned, the machine regularly gets into a state where resume works, and I can reproduce the problem at any time by rebooting and suspending right away.

Let me know if you want me to capture data in those states for comparison purposes.
Comment 5 Gordon Jin 2008-09-30 06:45:30 UTC
what's the kernel version?
Comment 6 Yves Dorfsman 2008-09-30 07:14:20 UTC
It's the stock Fedora 9 (updated) kernel:

> cat /proc/version

Linux version 2.6.25.14-108.fc9.i686 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Mon Aug 4 14:08:11 EDT 2008

Comment 7 Jesse Barnes 2008-09-30 13:38:46 UTC
Can you suspend/resume from the console alone?  I.e.:
  1) boot up into text mode
  2) modprobe i915
  3) intel_reg_dumper > pre-suspend.out
  4) echo mem > /sys/power/state
  5) resume
  6) intel_reg_dumper > post-resume.out

Do those steps work?  Please attach the register dumps from the above (the intel_reg_dumper tool can be found in the xf86-video-intel tree under src/reg_dumper).
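
Once both dumps exist, one simple way to compare them is a plain unified diff. The following is a minimal sketch under stated assumptions: `reg_diff` is a hypothetical helper name (it is not part of the xf86-video-intel tree), and it assumes only that both files are plain text, without relying on any particular intel_reg_dumper output format.

```shell
#!/bin/sh
# reg_diff PRE POST   (hypothetical helper name)
# Show which lines changed between two register dumps.
reg_diff() {
    if diff -u "$1" "$2"; then
        echo "no register differences"
    else
        status=$?
        # diff exits 1 when the files differ (expected here);
        # anything higher is a real error, e.g. a missing file.
        [ "$status" -eq 1 ] || return "$status"
    fi
}

# Example, using the file names from the steps above:
#   reg_diff pre-suspend.out post-resume.out
```

Lines prefixed with `-` come from the first dump and lines prefixed with `+` from the second, so any register that was not restored across the suspend shows up as a `-`/`+` pair.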
Comment 8 Yves Dorfsman 2008-09-30 19:45:17 UTC
Here are the files. Note that I did not boot into text mode: I let it boot up in normal mode, then pressed ctrl-alt-F1 and ran your procedure.

As expected, because I had just booted up and had not logged in through gnome yet, when I pressed ctrl-alt-F7 to go back to the gdm screen in X, it seized up, as shown in the pictures uploaded earlier.

Let me know if you want me to re-do the procedure from a true boot to text only.

Also, when I manage to get the machine into a state where it suspends/resumes properly, is there any point in running intel_reg_dumper and uploading the result?
Comment 9 Yves Dorfsman 2008-09-30 19:46:29 UTC
Created attachment 19322
pre-suspend.out just after the boot and ctrl-alt-F1
Comment 10 Yves Dorfsman 2008-09-30 19:47:25 UTC
Created attachment 19323
post suspend / resume (in text) intel_reg_dumper dump
Comment 11 Yves Dorfsman 2008-09-30 21:58:48 UTC
Created attachment 19324
intel_reg_dump after a good suspend/resume

This is an intel_reg_dumper dump taken after a suspend / resume that worked, but after ctrl-alt-F1, so in text mode.
Comment 12 Yves Dorfsman 2008-09-30 22:00:03 UTC
Created attachment 19325
intel_reg_dump after a good suspen / resume but from an X terminal

An intel_reg_dumper dump taken after a good suspend / resume cycle, this time from within X, in an X terminal.
Comment 13 Yves Dorfsman 2008-10-03 16:25:54 UTC
I have upgraded with the latest Fedora Core 9 packages:

Kernel:
Linux version 2.6.26.5-45.fc9.i686 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Sat Sep 20 03:45:00 EDT 2008

X:
X.Org X Server 1.5.0

and the same problem is still there.
Comment 14 Yves Dorfsman 2008-10-13 09:18:30 UTC
I have been using hibernate / resume without any problem; it always comes back clean. I have done more testing with suspend / resume, but the results are still the same: it fails most of the time.
Comment 15 Yves Dorfsman 2008-10-13 09:20:10 UTC
Any progress on this ?

Any test I can do to help with this ?

Comment 16 Diego Escalante Urrelo 2008-10-21 04:09:39 UTC
I have a similar problem with my 855GM card in an R50e. I think we are actually suffering from bug #13609.
Comment 17 Michael Fu 2008-11-18 00:33:09 UTC
remove NEEDINFO and ping Jesse...
Comment 18 Yves Dorfsman 2008-11-21 07:34:02 UTC
Upgraded Fedora with latest packages:

> cat /proc/version 
Linux version 2.6.27.5-41.fc9.i686 (mockbuild@) (gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) ) #1 SMP Thu Nov 13 20:52:14 EST 2008


From Xorg.0.log:
.../...
(II) Loading /usr/lib/xorg/modules/drivers//intel_drv.so
(II) Module intel: vendor="X.Org Foundation"
        compiled for 1.5.0, module version = 2.2.1
        Module class: X.Org Video Driver
        ABI class: X.Org Video Driver, version 4.1
(II) Loading sub module "intel_master"
(II) LoadModule: "intel_master"

(II) Loading /usr/lib/xorg/modules/drivers//intel_master_drv.so
(II) Module intel: vendor="X.Org Foundation"
        compiled for 1.5.0, module version = 2.3.2
        Module class: X.Org Video Driver
        ABI class: X.Org Video Driver, version 4.1
.../...
(II) intel(0): Integrated Graphics Chipset: Intel(R) 855GME
(--) intel(0): Chipset: "852GM/855GM"
(--) intel(0): Linear framebuffer at 0xE0000000
(--) intel(0): IO registers at addr 0xD0000000
(II) intel(0): 2 display pipes available.
(==) intel(0): Using EXA for acceleration



The problem is still exactly the same.
Is there anything else I can provide to help ? Or tests you want me to run ?


Comment 19 Jesse Barnes 2008-12-18 13:55:10 UTC
Do you know if your distro is running vbetool post at resume time?  It sounds like suspend/resume is correctly restoring the console; could you try resuming to text mode, then running 'vbetool post' by hand before switching back to X?  
Comment 20 Gordon Jin 2009-01-22 23:20:14 UTC
ping Yves
Comment 21 Yves Dorfsman 2009-01-23 07:30:31 UTC
Sorry I missed the update from 2008/12/18.... If you update this bug and don't hear from me soon after, please don't hesitate to email me directly (yves the at sign zioup.com)

I use Fedora 10 now, and once you have logged in to your gnome session, you cannot switch back to text mode. Here is what I have done:

1) Boot up, log in to gnome, suspend.
   Resume.
   At this point I only get a cursor. I can move it, but there is nowhere to enter my password etc....

   I go to another machine, ssh into the R51, and run "vbetool post". The laptop then turns black, with just a cursor at the top left hand corner.

2) Added vbe_post to /usr/share/hal/fdi/information/10freedesktop/20-video-quirk-pm-ibm.fdi. Reboot, etc.... This time I get back to the "ghost" screen I used to get: I can see the rectangle where I'd enter my password, but it's all white.

3) Added vbemode_restore (under vbe_post), just because I noticed this is the config for another model. Reboot, etc... I get exactly the same behaviour as in #1.

Comment 22 Yves Dorfsman 2009-01-24 08:45:34 UTC
Removed "need info"
Comment 23 Jesse Barnes 2009-01-28 13:54:16 UTC
This sounds like it could be a dup of the compiz vt switch hang, where all you'd get back after VT switch or suspend/resume is a blank screen with a cursor.  Does updating your libdrm to 2.4.4 or better make the problem go away?
Comment 24 Yves Dorfsman 2009-02-28 06:46:08 UTC
I looked for a distro with libdrm 2.4.4 or later, but couldn't find one, so I'm going to download the source and build it, which might take a little while.
Comment 25 Yves Dorfsman 2009-02-28 21:34:59 UTC
Ok:

I've tried the "Intel 2008Q4 graphics package" (http://intellinuxgraphics.org/2008Q4.html): poor graphics (black lines around all objects), and exactly the same symptoms when it comes to resume.

Then, I tried the combination:
libdrm-2.4.5
dri2proto-1.99.3
mesa_7_3
xf86-video-intel-2.6.1 (2.6.2 wouldn't compile)

The graphics display is good now, but the resume symptoms are the same:
-hibernate / resume works 100% of the time

-If I boot the machine up in normal mode (gdm), log in, open an xterm, suspend, and resume, then I get a black screen and a stuck cursor. I have to do a hard reboot.

-If, on the other hand, I work for a while and use different apps before suspending, then it resumes fine. I have not been able to identify exactly which step makes it work.


Is there anything I can capture to find the difference between when it works and when it does not ?
Comment 26 Jesse Barnes 2009-03-30 17:12:18 UTC
Created attachment 24382
save/restore fifo regs at suspend/resume time

In another bug I found that we don't save/restore the FIFO regs, so on your machine things might go bad if we don't take care of it.  Can you try this kernel patch?
Comment 27 Yves Dorfsman 2009-04-07 07:47:59 UTC
I am unable to test this patch, but I think this problem is now solved, and probably has been solved for a while:

-I was on Ubuntu 8.10, kernel 2.6.27. I tried to apply the patch, but patching failed because it was missing a few files.

-rather than installing a newer kernel on an old version of Ubuntu, I decided to install the newer beta version of Ubuntu, 9.04, thinking I might be able to patch the kernel for that version. Once it was installed, though, I tried suspend/resume and it worked out of the box, without any patching or changes.

The one thing I realise now is that I kept trying newer versions of the Intel module and X itself, but had never tried a newer version of the drm kernel module (except when upgrading Fedora and Ubuntu), so I am assuming that the fix for this problem is somewhere in the drm module.

Here are the versions of the different elements:
kernel: Linux version 2.6.28-11-generic
drm kernel module: srcversion:     577B0D96244FB56F871FA81
i915 kernel module: srcversion:     342B6C179E8780AF28932E3
X: X.Org X Server 1.6.0
Intel module: intel_drv.so, compiled for 1.6.0, module version = 2.6.3

This is great news, Intel is back on the list for video cards for my next laptop !

I have only been using this version for four days now, so I am going to keep the bug open for a week or two, to make sure I test it under different scenarios.

Comment 28 Jesse Barnes 2009-04-07 09:37:14 UTC
Ok great, thanks for testing.  Maybe it's better to mark this one as 'fixed' for now, and you can change the resolution to 'resolved' or re-open if you see it again.  And please be sure to file new bugs for any other issues you discover.

Thanks!
Comment 29 Yves Dorfsman 2009-04-28 12:22:45 UTC
I have been using the machine for 20 days now with the above configuration (Ubuntu 9.04) and it suspends and resumes without a problem.

Closing this bug.

