Bug 34427

Summary: Graphical corruption when opening windows
Product: xorg Reporter: Harald Judt <h.judt>
Component: Server/General Assignee: Adam Jackson <ajax>
Status: RESOLVED DUPLICATE QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: arequipeno, bryce, bugs.xorg, jlp.bugs, realnc, tdroste
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description                                                              Flags
screenshot of the garbled frame                                          none
Xorg.0.log                                                               none
xorg.conf                                                                none
Patch which reverts commit 6dd775f57d2f94f0ddaee324aeec33b9b66ed5bc      none

Description Harald Judt 2011-02-18 00:27:48 UTC
Created attachment 43507 [details]
screenshot of the garbled frame

Although it is visible only for a very short moment, I was able to capture it; see the screenshot attached to this bug. It seems something no longer gets initialized correctly the way it used to: part of the garbled area looks like its previous screen content, just pretty messed up.

* I think this only happens when using compositing, but did not test it, so I'm not sure.
* Is noticeable with xfwm4 (the xfce window manager) and compiz.
* Is only noticeable sometimes (maybe because the other times it's just too fast, but I'll try to capture it with recordmydesktop).
* It can be seen on various window types, like the small firefox / thunderbird frame which appears before the normal window appears, or menus.
* It is not constrained to window size.
* This did not happen some time ago, but I don't know when it started. Before it started, the windows were simply grey.

Software and versions:
* git vanilla kernel (2.6.38-rc5), certainly happened with 2.6.37
* git mesa (r600g)
* git xf86-video-ati
* git libdrm
Comment 1 Harald Judt 2011-02-18 00:39:56 UTC
Update:

* It seems this is not related to classic mesa vs gallium, as it happens with both r600c and r600g.
* If I turn off Option "Composite" (set it to "Disabled"), then the corruption does not appear.

I've found this similar bug, https://bugs.freedesktop.org/show_bug.cgi?id=27529, but there the issue is more severe.

Here is lspci -s 01:00.0 -vvv output:
01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 3400 Series (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 210e
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 48
        Region 0: Memory at d0000000 (32-bit, prefetchable) [size=256M]
        Region 1: I/O ports at 2000 [size=256]
        Region 2: Memory at cfff0000 (32-bit, non-prefetchable) [size=64K]
        [virtual] Expansion ROM at cff00000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee0300c  Data: 4199
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Kernel driver in use: radeon
Comment 2 Alex Deucher 2011-02-18 09:01:21 UTC
Do you have tiling enabled in your configuration?  Option "ColorTiling" "True" in your xorg.conf?  If so, does disabling it fix the issues?
Comment 3 Harald Judt 2011-02-21 01:43:42 UTC
No, Option "ColorTiling" is not in xorg.conf. I've added it now, but setting it to true or false makes no difference at all.

Option "AccelMethod" "EXA"
Option "EnablePageFlip" "on"
Option "ColorTiling" "true"
Option "ClockGating" "1"
Option "DynamicPM" "on"

I'd say it can't be page flipping either because the problems started quite some time before it was introduced.
Comment 4 Alex Deucher 2011-02-21 07:43:29 UTC
Any chance you could track down which component is problematic (drm, ddx, or mesa) and bisect?
Comment 5 Michel Dänzer 2011-02-22 03:42:47 UTC
Please attach the full Xorg.0.log and xorg.conf files.
Comment 6 Harald Judt 2011-02-22 05:31:24 UTC
Created attachment 43655 [details]
Xorg.0.log

Xorg.0.log as requested.
Comment 7 Harald Judt 2011-02-22 05:32:19 UTC
Created attachment 43656 [details]
xorg.conf

xorg.conf as requested.
Comment 8 Harald Judt 2011-02-22 05:40:30 UTC
> Any chance you could track down which component is problematic
> (drm, ddx, or mesa) and bisect?

Chance yes, but bisecting will be /very/ time-consuming as it needs compiling different kernel versions, mesa, etc. Some of these packages will only compile with specific versions of other packages.

Can't we exclude some packages from the beginning? How about mesa, is it used for compositing when the window manager does not use OpenGL? I've already switched between gallium and classic mesa, and it did not change anything.
Comment 9 Harald Judt 2011-02-22 05:41:33 UTC
I wonder if anyone else is able to reproduce this issue...
Comment 10 Michel Dänzer 2011-02-23 08:38:46 UTC
I guess it's most likely an xserver issue. E.g. commit 6dd775f57d2f94f0ddaee324aeec33b9b66ed5bc ('composite: Don't backfill non-bg-None windows') looks possibly relevant.

Sounds like bug 27596 btw (though AFAICT that was reported before the change above was made).
Comment 11 Harald Judt 2011-02-25 01:43:48 UTC
Thanks for pointing me to this. The symptoms do indeed look very similar, though I do not start the X server with option -wm:

xinit /etc/X11/xinit/xinitrc -- /usr/bin/X -dpi 96 -auth /home/xxxxx/.serverauth.13388

Hopefully I will find some time next week to check this out.
Comment 12 Michel Dänzer 2011-02-26 00:49:05 UTC
*** Bug 34739 has been marked as a duplicate of this bug. ***
Comment 13 Ian Pilcher 2011-02-26 11:01:48 UTC
Created attachment 43859 [details] [review]
Patch which reverts commit 6dd775f57d2f94f0ddaee324aeec33b9b66ed5bc

I have rebuilt the Fedora 15 RPM with the attached patch (which reverts
commit 6dd775f57d2f94f0ddaee324aeec33b9b66ed5bc), and the problem appears
to be mostly gone.

I no longer see any corruption when opening menus and most windows.  I
did see a bit of corruption when opening up Thunderbird just now, but I'm
willing to put that down to KDE's "Glide" effect.
Comment 14 Michel Dänzer 2011-02-27 05:16:06 UTC
(In reply to comment #13)
> I have rebuilt the Fedora 15 RPM with the attached patch (which reverts
> commit 6dd775f57d2f94f0ddaee324aeec33b9b66ed5bc), and the problem appears
> to be mostly gone.

Thanks for confirming that. However, I'm not sure this is really a bug in the X server, or possibly rather in the clients. Adam?
Comment 15 Ian Pilcher 2011-02-27 12:22:15 UTC
(In reply to comment #14)
> Thanks for confirming that. However, I'm not sure this is really a bug in the X
> server, or possibly rather in the clients. Adam?

Note that if it is a bug in the clients, it appears to be shared by xfwm4,
compiz, and KWin.
Comment 16 Harald Judt 2011-03-08 03:25:04 UTC
I confirm the patch which reverts the commit solves the problem.

Except when resizing windows, then one can still see similar corruption; I didn't notice that before.
Comment 17 Adam Jackson 2011-03-15 07:21:18 UTC
Fine to revert from the server.  I'm not completely convinced that it's _wrong_, but I'm also not convinced it's completely right, so we might as well go with what doesn't break clients.
Comment 18 Michel Dänzer 2011-03-24 09:24:05 UTC
*** Bug 35519 has been marked as a duplicate of this bug. ***
Comment 19 Bryce Harrington 2011-03-24 18:42:52 UTC
Can confirm reversing the patch seems to resolve it for me.  I've added the patch to Ubuntu's xserver, which resolves dupe bug 35519.
Comment 20 Adam Jackson 2011-03-29 07:52:28 UTC
Revert request sent:

http://lists.x.org/archives/xorg-devel/2011-March/020882.html
Comment 21 Harald Judt 2011-03-29 09:43:50 UTC
(In reply to comment #14)
> Thanks for confirming that. However, I'm not sure this is really a bug in the X
> server, or possibly rather in the clients. Adam?

Another question is: Why do firefox and thunderbird and other applications open a small window showing nothing, then resize themselves to show the content. This is of no use and messes up compiz animation. Instead, the window should rather show when it has been rendered completely. Maybe we should file a bug on mozilla about this...
Comment 22 Harald Judt 2011-04-04 06:07:40 UTC
While reverting the commit fixes the issue when opening windows, I can still see some garbled output when switching workspaces in xfce. It's only visible for a fraction of a second and hard to capture, and not as annoying as the original issue, but nevertheless noticeable. Should I open another bug report for this?

When resizing windows to become larger, the empty space their contents are going to fill is now black. This is much better.
Comment 23 Nikos Chantziaras 2011-04-04 16:06:38 UTC
*** Bug 35888 has been marked as a duplicate of this bug. ***
Comment 24 Nikos Chantziaras 2011-04-04 19:16:07 UTC
I can confirm the problem on a Radeon HD4870 using xf86-video-ati Git master, KMS and Gallium on KDE 4.6.1.

Applying the above patch fixed the problem, but only partially. The really annoying glitch, where pop-up dialogs would not fade in smoothly but rather a piece of pixel garbage would pop in suddenly and then the correct contents would fade in, has been fixed.

What this patch doesn't fix for me, is the area of newly opened application windows. They still contain pixel garbage. Prime example here is Adobe Reader. When I first launch it, it takes about 2 seconds for it to start displaying the PDF document. For those 2 seconds, the whole window contains randomly colored horizontal lines.

This is not very annoying, since it only happens the first time after a reboot. When I close the program and then launch it again, the initial contents of the (yet) empty window is a big hunk of uniform gray; no pixel garbage.
Comment 25 Nikos Chantziaras 2011-04-09 11:31:10 UTC
1.10.0.902 got released, but the patch isn't included? You guys just happen to love pixel garbage? :-P
Comment 26 jpsinthemix 2011-04-22 21:07:19 UTC
This window-background corruption issue occurs on all PCs I've installed xorg-server-1.10.1 on. I have an (old/slow) P3/Savage laptop on which the problem is quite annoying; on faster PCs it is much less noticeable. It happens for all popup menus/dialogs under KDE.

It does not occur for xorg-server-1.9.5 or 1.9 master.

The problem occurs for these windows because they have pWin->backgroundState = None, so that bgNoneVisitWindow() always returns WT_WALKCHILDREN, and hence bits are never copied from the parent for these windows (composite/compalloc.c):

    /*
     * If there's no bg=None in the tree, we're done.
     *
     * We could optimize this more by collection the regions of all the
     * bg=None subwindows and feeding that in as the clip for the
     * CopyArea below, but since window trees are shallow these days it
     * might not be worth the effort.
     */
    if (TraverseTree(pWin, bgNoneVisitWindow, NULL) == WT_NOMATCH)
        return pPixmap; <--- for KDE popup menus/dialogs this return always
                             occurs <- ** NOTE **

    /*
     * Copy bits from the parent into the new pixmap so that it will
     * have "reasonable" contents in case for background None areas.
     */

To circumvent this issue I have used the following patch, which, due to my very limited understanding of X internals, may be inappropriate/incorrect. My hope is that it at least sheds some light on the problem.

--- xorg-server-1.10.1.old/composite/compalloc.c        2011-02-24 22:27:24.000000000 -0500
+++ xorg-server-1.10.1.new/composite/compalloc.c        2011-04-22 06:13:23.170986733 -0400
@@ -511,12 +511,13 @@
 static int
 bgNoneVisitWindow(WindowPtr pWin, void *null)
 {
-    if (pWin->backgroundState != BackgroundPixmap)
-       return WT_WALKCHILDREN;
-    if (pWin->background.pixmap != None)
+    if (pWin->drawable.class == InputOnly)
        return WT_WALKCHILDREN;

-    return WT_STOPWALKING;
+    if (pWin->backgroundState == None || pWin->background.pixmap == None)
+       return WT_STOPWALKING;
+
+    return WT_WALKCHILDREN;
 }

 static PixmapPtr

thanks for your time.
John
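
The difference between the two versions of the visitor in the patch above can be illustrated with a small standalone C sketch. It does not use the X server headers: FakeWindow, visit_before, visit_after and traverse are hypothetical stand-ins, the constant values are chosen only for illustration, and traverse is a simplified walk that mimics only the two TraverseTree outcomes discussed here (stop, or no match).

```c
#include <stddef.h>

/* Illustrative stand-ins for the constants named in the quoted code. */
enum { BG_NONE = 0, BG_PARENT_RELATIVE = 1, BG_PIXEL = 2, BG_PIXMAP = 3 };
enum { WT_STOPWALKING = 0, WT_WALKCHILDREN = 1, WT_NOMATCH = 3 };
enum { CLASS_INPUT_OUTPUT = 1, CLASS_INPUT_ONLY = 2 };

/* Hypothetical, heavily reduced model of a window. */
typedef struct FakeWindow {
    int window_class;
    int background_state;
    const void *background_pixmap;      /* NULL plays the role of None */
    const struct FakeWindow *children;  /* array of child windows, may be NULL */
    size_t n_children;
} FakeWindow;

/* Mirrors the removed ("-") lines of the patch. */
static int visit_before(const FakeWindow *w)
{
    if (w->background_state != BG_PIXMAP)
        return WT_WALKCHILDREN;
    if (w->background_pixmap != NULL)
        return WT_WALKCHILDREN;
    return WT_STOPWALKING;
}

/* Mirrors the added ("+") lines of the patch. */
static int visit_after(const FakeWindow *w)
{
    if (w->window_class == CLASS_INPUT_ONLY)
        return WT_WALKCHILDREN;
    if (w->background_state == BG_NONE || w->background_pixmap == NULL)
        return WT_STOPWALKING;
    return WT_WALKCHILDREN;
}

/* Simplified walk: stop as soon as the visitor says so, otherwise
 * report WT_NOMATCH once the whole subtree has been visited. */
static int traverse(const FakeWindow *w, int (*visit)(const FakeWindow *))
{
    if (visit(w) == WT_STOPWALKING)
        return WT_STOPWALKING;
    for (size_t i = 0; i < w->n_children; i++) {
        if (traverse(&w->children[i], visit) == WT_STOPWALKING)
            return WT_STOPWALKING;
    }
    return WT_NOMATCH;
}
```

In this model, for a single window whose background state is BG_NONE, traverse(&w, visit_before) yields WT_NOMATCH, which corresponds to the early "return pPixmap" without copying from the parent, while traverse(&w, visit_after) yields WT_STOPWALKING, so the copy path would be reached.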
Comment 27 jpsinthemix 2011-04-23 00:52:32 UTC
Spoke too soon. While Ian Pilcher's patch or the one in comment #26 eliminates the issue for KDE app popups/menus (at least on my PCs), the issue remains for GTK-X11-based apps such as Audacity and Mozilla Firefox/Thunderbird.

So perhaps this is not really an X bug, but rather sloppy X apps implicitly taking advantage of an X loophole closed as of xorg-server-1.10 ?

By the way the brief appearance of a small 'empty' upper-left corner rectangle in Mozilla Firefox and Thunderbird has been around for a long time (and unrelated to the issue here) -- and has actually finally been fixed in Firefox-4.0, but not yet in Thunderbird.
Comment 28 Adam Rak 2011-06-16 15:39:12 UTC
I can confirm it on an HD5970 too, with ddx git version cbcc57b0fa6f581be777bef648f2bf3efe7443ee.

I also have some corruption in the Kate editor, where it muddles up the screens, so very often I see different parts of the text I am editing (not memory junk).

They might be related. The Kate issue might also be due to some implicit assumptions in Qt/KDE. I can post a screenshot if you really want.
Comment 29 boris64 2011-06-27 06:59:37 UTC
*** Bug 38711 has been marked as a duplicate of this bug. ***
Comment 30 Jeremy Huddleston Sequoia 2011-07-02 23:17:40 UTC

*** This bug has been marked as a duplicate of bug 31017 ***
