Bug 32303

Summary: Massive screen corruption after screensaver stop
Product: xorg
Reporter: sergio.callegari
Component: Driver/intel
Assignee: Chris Wilson <chris>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: afoglia, seversbenjamin, skhilko, zeke
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                      Flags
dmesg                            none
Xorg.log                         none
Xorg.log when crash is invoked   none

Description sergio.callegari 2010-12-10 14:47:03 UTC
On Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)

DELL E6500 Laptop with kubuntu maverick 64 bit
Stock 2.6.35 ubuntu kernel
intel drivers 2.13.902+git20101210
xorg server 1.9 from Xorg 7.5
mesa 7.9~git20100924
libdrm 2.4.23

to reproduce

lock screen, wait for the screensaver to start, move mouse

About 50% of the time, recovery from the screensaver is messy:

only the password textbox of the unlock dialogue appears;
when the mouse moves over it, the dialogue flashes;
after unlocking, pieces of the screen suddenly black out and flash.

In case it can be useful I have a video
Comment 1 Chris Wilson 2010-12-11 09:30:13 UTC
Seeing the form of the corruption is a good indicator of the cause. From your description it sounds like flushing versus damage, but it is difficult to be sure. Do you see any graphical corruption elsewhere? If it's only affecting the transition after the screensaver, that is an important clue. (It would perhaps imply that the bug does not lie on the core code paths, but in only those bits of code triggered by the transition, though it may be a race condition on the core code paths that is only hit under this transition.) If you downgrade any of those components, does the bug disappear?
Comment 2 sergio.callegari 2010-12-11 10:26:25 UTC
Hi, as mentioned, I have a video of the corruption. I have not uploaded it yet because it is 15M and possibly too large to be an attachment on this site. I can share it by other means if needed.

The corruption is repeatable and always happens in the same way: almost 50% of the time when the screensaver is displayed after a suspend/resume cycle, less frequently otherwise.

When getting out of the screensaver, the screen should darken and the unlock dialogue should appear. Instead nothing appears, and when the mouse is moved only the textbox shows up. Moving the mouse further can make other parts of the dialog appear, apparently at random. After the screen is unlocked, moving the mouse over parts of the screen involving effects makes everything flash and parts of the screen black out.

I have just discovered that disabling and re-enabling the effects at this point seems to fix the issue (the display is normal again). However, the server does not completely recover, since using effects afterwards is likely to cause freezes.

I have experienced this bug since the installation of Maverick. At the beginning I was using it with the stock 2.12 intel driver, then I started using the glasen ppa in the hope to see some improvements and with this I have more or less tracked the git development (driver, libdrm, mesa) for the last month, keeping the standard kernel and xorg server infrastructure.
Comment 3 sergio.callegari 2011-01-10 03:20:17 UTC
Still experiencing the bug as of
xserver-xorg-video-intel 2.14

Currently, this has remained the most severe issue I am experiencing with the graphics card.

In fact, the server does not always recover cleanly, and at times there are freezes. This means that work done before the screensaver started or before a system sleep can easily be lost. This is most serious for the screensaver: putting the system to sleep is an explicit action, so one can save first, whereas the screensaver tends to start on its own.

I still have a video of the issue.  Please let me know if it can be useful to someone, if I should send it to someone or if it can be uploaded here in spite of being ~ 15 MB.
Comment 4 Chris Wilson 2011-01-10 03:29:05 UTC
The Xorg.log from a crash and the dmesg would be useful, along with /sys/kernel/debug/dri/0/i915_error_state if it is a hang. It's likely a GL bug and that would confirm it. There may also be debugging info captured in .xsession-errors.
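
For anyone gathering the files requested above, a minimal Python sketch of collecting them into one directory follows. The function name `collect_diagnostics` and the destination directory are hypothetical, and /var/log/Xorg.0.log is assumed to be the server log location (it is the path the server itself prints later in this report). Reading i915_error_state normally requires root and a mounted debugfs; unreadable files are simply skipped.

```python
import subprocess
from pathlib import Path

# Files named in this bug report; adjust the list for your own system.
DEFAULT_SOURCES = [
    Path("/var/log/Xorg.0.log"),
    Path("/sys/kernel/debug/dri/0/i915_error_state"),  # usually root-only
    Path.home() / ".xsession-errors",
]


def collect_diagnostics(dest, sources=DEFAULT_SOURCES, include_dmesg=True):
    """Copy every readable source into dest and return the names written."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    collected = []
    for src in sources:
        src = Path(src)
        try:
            # Read the content explicitly: debugfs files report a size of 0,
            # so the data is read and rewritten instead of relying on metadata.
            data = src.read_bytes()
        except OSError:
            continue  # missing or unreadable file: skip it
        name = src.name.lstrip(".")  # avoid creating hidden files in dest
        (dest / name).write_bytes(data)
        collected.append(name)
    if include_dmesg:
        try:
            result = subprocess.run(["dmesg"], capture_output=True, check=True)
            (dest / "dmesg.txt").write_bytes(result.stdout)
            collected.append("dmesg.txt")
        except (OSError, subprocess.CalledProcessError):
            pass  # dmesg unavailable or not permitted
    return collected
```

Calling `collect_diagnostics("bug32303-logs")` right after the corruption shows up, before restarting X, should leave a directory whose contents can be attached here.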
Comment 5 Chris Wilson 2011-01-11 16:28:12 UTC
"Stock ubuntu kernel", now there is a misnomer.

Having spotted one bug recently that might be related, but without even a dmesg I can't confirm, you might like to try drm-intel-fixes (2.6.38-rc1 once that is released).
Comment 6 sergio.callegari 2011-01-15 07:16:11 UTC
Hi, thanks!

Sorry for the misnomer; I am not a native speaker. I just meant the 2.6.35 kernel with the patches that are shipped by ubuntu, with no other change.

Unfortunately I cannot do too many tests on my machine with Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07), since it is my work one and I cannot risk making it unstable or corrupting its package management.

I can test the kernels in the ubuntu mainline ppa though. These should be mainline kernels packaged so that they are easy to install.

Tried the latest in the intel-drm-next section, which is a 2.6.37-997 (don't know what 997 stands for, dated 13 jan), namely:

http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-next/2011-01-13-natty/

Unfortunately, this creates extra severe rendering artifacts. Maybe it is due to too old pieces I am using for the rest of the video software stack.  I have:

- black squares in place of graphics events.
- gray regions instead of transparent shadows
- in animations objects that are drawn are not then erased properly.

Regards,

Sergio
Comment 7 Chris Wilson 2011-01-15 09:07:31 UTC
(In reply to comment #6)
> Can test the kernels in the ubuntu mainline ppa though. These should ship
> mainline kernels packaged in order to install them easily.
> 
> Tried the latest in the intel-drm-next section, which is a 2.6.37-997 (don't
> know what 997 stands for, dated 13 jan), namely:
> 
> http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-next/2011-01-13-natty/
> 
> Unfortunately, this creates extra severe rendering artifacts. Maybe it is due
> to too old pieces I am using for the rest of the video software stack.  I have:
> 
> - black squares in place of graphics events.
> - gray regions instead of transparent shadows
> - in animations objects that are drawn are not then erased properly.

Sorry, you just happened to pull at the wrong moment and got a bad kernel. The fixes for those were merged yesterday, so hopefully will have hit the mainline repo now.

Please do try again, and attach your dmesg + Xorg.log.
Comment 8 sergio.callegari 2011-01-17 01:44:21 UTC
Can this be related to 30509?
Screensaver is indeed an opengl fullscreen app.
Comment 9 Chris Wilson 2011-01-17 01:54:42 UTC
Less likely, unless you are experiencing the same crashes as well.
Comment 10 sergio.callegari 2011-01-17 06:09:17 UTC
Hi,

I was suggesting the connection, since reading that bug made me realize that in many occasions where I experience issues in getting out of the screensaver, I also get a kwin crash.

As for testing newer kernels, it looks like the ubuntu system for packaging mainline kernels is having some sort of issue right now (the latest correctly compiled debs are still those of Jan 13).

As soon as I have something newer to test I will provide feedback.

Thanks,

Sergio
Comment 11 sergio.callegari 2011-01-22 04:35:42 UTC
Hi,

I have installed libreoffice and I sometimes see the same issue after exiting a fullscreen impress presentation.
Comment 12 sergio.callegari 2011-01-22 04:59:07 UTC
Created attachment 42302 [details]
dmesg

obtained after issue, namely screen flicker happened, from which I recovered by disabling and re-enabling desktop effects
Comment 13 sergio.callegari 2011-01-22 05:03:06 UTC
Created attachment 42303 [details]
Xorg.log
Comment 14 sergio.callegari 2011-01-22 05:07:21 UTC
Hi,

I have finally tried 2.6.38-020638rc2-generic from the ubuntu mainline ppa.
No luck, the issue is still present (might happen a bit less frequently, but it is hard to judge).

The attachments above provide the dmesg and the xorg.log collected after the issue manifested.

I have seen the problem both at screensaver exit and after termination of a libreoffice presentation (openoffice does not seem to show the issue, maybe because it does not use hardware acceleration in presentations).

In both cases, after the screensaver/presentation termination:

- flicker of full screen at window move
- graphical objects not drawn or only partially drawn
- black squares on the screen
- occasional kwin crashes.
Comment 15 sergio.callegari 2011-01-22 05:13:16 UTC
Forgot to mention one thing, maybe it is important.

When I exit from the screensaver, the first thing I do is unlock the screen with a password dialog, which is the first element on which I start to see issues (typically only the textbox and not the whole dialog is displayed). After the screen unlock, I get the desktop with rendering corruption, and if I do anything on it I see flicker (the whole screen starts flashing, etc). The weird thing is that sometimes this flicker transiently shows what looks like former graphical elements (e.g. the previous unlock dialog).
Comment 16 sergio.callegari 2011-01-22 05:17:18 UTC
One more thing.

With the newer 2.6.38RC kernel the issue at the screensaver exit seems to happen only with the opengl screensavers. I am almost sure that before it was happening also with non opengl screensavers (asciiquarium, and the like).
Comment 17 sergio.callegari 2011-03-15 09:25:49 UTC
Tried with the just released 2.6.38 from the ubuntu mainline kernel repository, together with xorg 1.9 (with the page-flipping fix mentioned in https://bugs.freedesktop.org/show_bug.cgi?id=30509), intel driver 2.14-901. 

Problem persists.

On exit from an opengl screensaver I still get screen corruption (the unlock screen dialog does not appear or only partially appears), a kwin crash and/or further screen corruption (when moving the cursor or moving windows, black blocks appear on screen, parts of the screen get dimmed or start flashing, etc.). Only disabling and re-enabling compositing at this point appears to recover usable behavior.
Comment 18 sergio.callegari 2011-03-16 07:47:31 UTC
Hi,

I have uploaded to https://launchpad.net/~callegar/+archive/xorg an xorg 1.9 for ubuntu maverick patched to have page flipping always disabled (2:1.9.0-0ubuntu7.3+cllsrg3~maverick1).

With this, xorg is slower, but the issue is gone.

IMHO the issue is serious, since it results in a display corruption whenever compositing is enabled and an accelerated full screen application is started and then stopped. This includes
- screensaver with opengl screensaver
- libreoffice impress presentation (openoffice does not have acceleration)
- youtube full screen
- moonlight full screen
- and possibly much more.

What is worse is that the way out of the corruption (disabling and re-enabling compositing) is not obvious to many inexperienced users, which may scare people away from linux or from intel display cards on linux.

I would thus suggest that one of the following options is picked until the root cause of the problem is finally identified and fixed.

1) enable a page flip option in the xorg configuration, so that page flipping can be switched off when it is broken. If I am not wrong in the past there was this option.

2) disable page flip for Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07).

IMHO, it is much better to have a slow X than a misbehaving X.
Comment 19 Chris Wilson 2011-03-16 08:02:11 UTC
*** Bug 30509 has been marked as a duplicate of this bug. ***
Comment 20 anderlia 2011-05-03 02:47:30 UTC
@Sergio: do you plan to make a Natty version of the patched xorg (xserver-xorg-core 2:1.10.1 with page flipping disabled) in the ppa:callegar/xorg repository? I've used your patched version in Maverick with a very positive outcome: my Kwin no longer crashes upon fullscreen mode exit, and display performance remains substantially the same. Since I upgraded to Kubuntu Natty 11.04, the very same problem has reappeared. Thanks for your support on this case.
Comment 21 sergio.callegari 2011-05-03 06:41:09 UTC
I really hoped that there was no need for that!

My plan A (I should say my first hope) was to see the reason for this issue finally discovered and a proper fix devised.

My plan B was the bug I opened on launchpad, saying that the GM45 still needed page-flipping disabled https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/737602

Unfortunately natty shipped without any patch in this sense. Please be so kind to confirm the bug on launchpad.


To go back to your original question... since the bug is still there, I will definitely need to find a solution and when I do that, I will definitely share it via the PPA to all natty users!


My only problem is that I cannot do it right now. I use my laptop for work and I absolutely need it to be fully operative. Hence my upgrade path is to first try natty on a backup machine, and when that is ok I can upgrade my main machine too. Unfortunately, so far natty is giving me serious issues on the backup machine (no touchpad, no xorg restore after hibernation; both are regressions). Until I can get rid of those I need to stay on maverick. Just to give you some idea of the timing: for the hibernation an official ubuntu fix should be on the way; for the touchpad, let's keep our fingers crossed. So let's hope that in 10-15 days I'll be able to try building and testing an xorg with no page flipping for natty.
Comment 22 anderlia 2011-05-03 13:29:24 UTC
Thanks Sergio for your prompt answer. Meanwhile, I have confirmed your bug 737602 in Launchpad. FYI, I'm running Kubuntu on a T400 thinkpad with "Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)" listed as display controller by the lspci command. I have exactly the same symptoms as you describe, and I too am waiting for the intel driver code that causes this corruption to be fixed. I hope it will be done very soon, since the problem is rather critical.
Comment 23 sergio.callegari 2011-05-12 11:02:00 UTC
The bug is still there in distros based on xorg-server 1.10.1, such as (k)ubuntu 11.04.

At the usual place https://launchpad.net/~callegar/+archive/xorg I have uploaded an xorg server with page flipping disabled for ubuntu 11.04.  This covers up the issue. In the meantime let's hope the root cause can be found.
Comment 24 Eugene Markow 2011-05-14 12:59:14 UTC
I would like to confirm this bug using the following:

- KDE 4.6.3
- Arch Linux

kernel info (uname -a):
Linux Galicja 2.6.39-rc7-git4-ARCHMOD #1 PREEMPT Fri May 13 11:27:11 CEST 2011 x86_64 Genuine Intel(R) CPU 575 @ 2.00GHz GenuineIntel GNU/Linux

glxinfo:
OpenGL vendor string: Tungsten Graphics, Inc
OpenGL renderer string: Mesa DRI Mobile Intel® GM45 Express Chipset
OpenGL version string: 2.1 Mesa 7.10.2

Xorg Version:
X.Org X Server 1.10.1

Intel Driver:
xf86-video-intel 2.15.0


I am unable to find a way to disable 'Page Flipping' in xorg to test if this solution works.
Comment 25 Eugene Markow 2011-07-03 02:56:45 UTC
I'd like to add, not only does Xorg crash (KDE logs out and requires log in again) when using the OpenGL screensaver, it also occurs when using a full screen OpenGL application. Hope this issue gets resolved very soon.

uname-a: 
Linux Galicja 3.0.0-rc5-git5-ARCHMOD #1 PREEMPT Sat Jul 2 14:45:00 CEST 2011 x86_64 Genuine Intel(R) CPU 575 @ 2.00GHz GenuineIntel GNU/Linux

In case this additional information helps resolve the bug, a section of my Xorg.log from a crash is included below (the entire Xorg.log is also attached):


====================
.
.
.
[    26.602] (**) Option "xkb_rules" "evdev"
[    26.602] (**) Option "xkb_model" "evdev"
[    26.602] (**) Option "xkb_layout" "us"
[    45.358] (II) intel(0): EDID vendor "LPL", prod id 12545
[    45.358] (II) intel(0): Printing DDC gathered Modelines:
[    45.358] (II) intel(0): Modeline "1280x800"x0.0   69.30  1280 1328 1352 1416  800 803 809 816 -hsync -vsync (48.9 kHz)
[    62.272] (II) intel(0): EDID vendor "LPL", prod id 12545
[    62.272] (II) intel(0): Printing DDC gathered Modelines:
[    62.272] (II) intel(0): Modeline "1280x800"x0.0   69.30  1280 1328 1352 1416  800 803 809 816 -hsync -vsync (48.9 kHz)
[  3453.469] (II) intel(0): EDID vendor "LPL", prod id 12545
[  3453.469] (II) intel(0): Printing DDC gathered Modelines:
[  3453.469] (II) intel(0): Modeline "1280x800"x0.0   69.30  1280 1328 1352 1416  800 803 809 816 -hsync -vsync (48.9 kHz)
[  3560.729] 
Backtrace:
[  3560.730] 0: /usr/bin/X (xorg_backtrace+0x26) [0x49f556]
[  3560.730] 1: /usr/bin/X (0x400000+0x60cf9) [0x460cf9]
[  3560.730] 2: /lib/libpthread.so.0 (0x7f173b8d2000+0xf7e0) [0x7f173b8e17e0]
[  3560.730] 3: /usr/lib/xorg/modules/dri/i965_dri.so (0x7f1737c5e000+0x43730) [0x7f1737ca1730]
[  3560.730] 4: /usr/lib/xorg/modules/dri/i965_dri.so (0x7f1737c5e000+0x3ab1a) [0x7f1737c98b1a]
[  3560.730] 5: /usr/lib/xorg/modules/dri/i965_dri.so (0x7f1737c5e000+0x3d0fa) [0x7f1737c9b0fa]
[  3560.730] 6: /usr/lib/xorg/modules/extensions/libglx.so (0x7f17396f3000+0x228c7) [0x7f17397158c7]
[  3560.730] 7: /usr/lib/xorg/modules/extensions/libglx.so (0x7f17396f3000+0x25085) [0x7f1739718085]
[  3560.730] 8: /usr/bin/X (0x400000+0x2e8d9) [0x42e8d9]
[  3560.730] 9: /usr/bin/X (0x400000+0x22b7e) [0x422b7e]
[  3560.730] 10: /lib/libc.so.6 (__libc_start_main+0xed) [0x7f173a85017d]
[  3560.730] 11: /usr/bin/X (0x400000+0x22e6d) [0x422e6d]
[  3560.730] Segmentation fault at address 0x40
[  3560.730] 
Fatal server error:
[  3560.730] Caught signal 11 (Segmentation fault). Server aborting
[  3560.730] 
[  3560.730] 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[  3560.730] Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[  3560.730] 
[  3560.739] (II) Power Button: Close
[  3560.739] (II) UnloadModule: "evdev"
[  3560.739] (II) Unloading evdev
[  3560.753] (II) Video Bus: Close
[  3560.753] (II) UnloadModule: "evdev"
[  3560.753] (II) Unloading evdev
[  3560.766] (II) Sleep Button: Close
[  3560.766] (II) UnloadModule: "evdev"
[  3560.766] (II) Unloading evdev
[  3560.779] (II) AT Translated Set 2 keyboard: Close
[  3560.779] (II) UnloadModule: "evdev"
[  3560.779] (II) Unloading evdev
[  3560.807] (II) UnloadModule: "synaptics"
[  3560.807] (II) Unloading synaptics
[  3560.819] (II) HP WMI hotkeys: Close
[  3560.819] (II) UnloadModule: "evdev"
[  3560.819] (II) Unloading evdev
[  3560.820] (II) AIGLX: Suspending AIGLX clients for VT switch
====================
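
As a reading aid for the backtrace above, here is a minimal Python sketch that extracts the numbered frames from an Xorg.log excerpt and counts how often each module appears. The names `parse_backtrace` and `modules_in_backtrace` are hypothetical, and the regular expression only assumes the frame layout visible in this log (`[timestamp] N: module (location) [address]`); other server versions may format frames differently.

```python
import re
from collections import Counter

# Matches lines such as:
# [  3560.730] 3: /usr/lib/xorg/modules/dri/i965_dri.so (0x7f17...+0x43730) [0x7f17...]
FRAME_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+(\d+):\s+(\S+)\s+\((.*?)\)\s+\[(0x[0-9a-f]+)\]"
)


def parse_backtrace(log_text):
    """Return a list of (index, module, location, address) tuples."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            index, module, location, address = match.groups()
            frames.append((int(index), module, location, address))
    return frames


def modules_in_backtrace(frames):
    """Count how many frames belong to each module."""
    return Counter(module for _, module, _, _ in frames)
```

Applied to the excerpt above, this kind of summary makes it easier to see that the frames directly below the libpthread frame are in i965_dri.so, reached through libglx.so.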
Comment 26 Eugene Markow 2011-07-03 02:58:10 UTC
Created attachment 48705 [details]
Xorg.log when crash is invoked
Comment 27 Eugene Markow 2011-08-02 22:07:46 UTC
Full screen OpenGL applications and OpenGL KDE screensavers NO LONGER crash KDE / Kwin on my Arch Linux OS. This issue seems to be resolved upon the recent release of the Mesa OpenGL 3D Graphics Library 7.11 drivers for Linux. Awesome, it's working well so far!

Note, in KDE's "Desktop Effects" settings, when "Suspend Desktop Effects for Fullscreen Windows" is *NOT* Checked, this works perfectly. However, when it *IS* checked, the system crashes upon using any full screen OpenGL Apps / Screensavers.

As I experience it, this issue can be marked as SOLVED.

Below are my system specifications and KDE 4.7.0 configuration graphics settings. Hopefully, this will assist others having the same issue.



Linux System Specs:
===================


-------------------------
uname -a:

Linux Galicja 3.0.0-git17-ARCHMOD #1 PREEMPT Tue Aug 2 16:07:02 CEST 2011 x86_64 Genuine Intel(R) CPU 575 @ 2.00GHz GenuineIntel GNU/Linux


glxinfo:

OpenGL vendor string: Tungsten Graphics, Inc
OpenGL renderer string: Mesa DRI Mobile Intel® GM45 Express Chipset
OpenGL version string: 2.1 Mesa 7.11
OpenGL shading language version string: 1.20


Xorg.0.log:

Integrated Graphics Chipset: Intel(R) GM45
Chipset: "GM45"


Package Versions:

intel-dri 7.11
libgl 7.11
mesa 7.11
libdrm 2.4.26
xf86-video-intel 2.15.0
xorg-server 1.10.3


lsmod:

i915                  674472  4 
drm_kms_helper         24425  1 i915
drm                   178855  5 i915,drm_kms_helper
i2c_algo_bit            4951  1 i915
button                  4190  1 i915
i2c_core               18609  5 i2c_i801,i915,drm_kms_helper,drm,i2c_algo_bit
video                  10756  1 i915
intel_agp              10624  1 i915
intel_gtt              13897  3 i915,intel_agp
-------------------------




KDE Version 4.7.0 Graphics Settings:
====================================


-------------------------
Settings > System Settings > Desktop Effects >

> General Tab:

Enable Desktop Effects at Startup [CHECKED]
Improved Window Management [CHECKED]
Various Animations [CHECKED]
Effect for Window Switching: [Box Switch]
Effect for Desktop Switching: [Desktop Cube Animation]
Animation Speed: [Normal]

> All Effects Tab:

Blur [CHECKED]
Fade [CHECKED]
Translucency [CHECKED]
...and many various others are [CHECKED]

> Advanced Tab:

Compositing Type: [OpenGL]
Disable Functionality Checks [CHECKED]
Keep Window Thumbnails: [Only for Shown Windows]
Scale Method: [Accurate]
Suspend Desktop Effects for Fullscreen Windows [*NOT* CHECKED]
Enable Direct Rendering: [CHECKED]
Use OpenGL 2 Shaders: [CHECKED]
Use VSync: [*NOT* CHECKED]
-------------------------
Comment 28 sergio.callegari 2011-09-01 07:55:47 UTC
Since the bug is still there in ubuntu natty, and ubuntu has shipped a new version of xorg 1.10.1-1ubuntu1.2, I am once more uploading on https://launchpad.net/~callegar/+archive/xorg an xorg package with page flipping disabled, hoping that this can be useful.

It is currently building.

Note that I am getting (though rarely) the bug even with kde configured to suspend desktop effects on fullscreen windows.
Comment 29 sergio.callegari 2012-01-23 01:07:22 UTC
I think that this is fixed in the latest intel drivers.

On ubuntu 11.10 with the latest intel drivers from the git ppa I am no longer experiencing the issue, without needing to disable flipping by a hack.

If others are not experiencing any issue, this can probably be closed.
Comment 30 Chris Wilson 2012-01-23 01:59:07 UTC
Fwiw I still believe this was the DRI2 pageflip bug...
