Bug 74327

Summary: [snb] rendering corruption
Product: xorg
Reporter: Bas Nieuwenhuizen <bas>
Component: Driver/intel
Assignee: Chris Wilson <chris>
Status: RESOLVED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: andyrtr, fry.kun
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                      Flags
screenshot of corruption         none
Corruption in Gnome dialog box   none
Black boxes in Chromium          none
Corruption in Chromium           none

Description Bas Nieuwenhuizen 2014-02-01 16:27:05 UTC
Created attachment 93174 [details]
screenshot of corruption

After upgrading to 2.99.908, I got some rendering corruption. See the attached screenshot for an example.

As this is a regression for me, I bisected it and got

1f9a6156e9240a1efa8785ab5bca0a3b1757d08e is the first bad commit
commit 1f9a6156e9240a1efa8785ab5bca0a3b1757d08e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jan 31 20:02:44 2014 +0000

    sna: remove short-circuit for move-to-CPU when damage covers region
    
    The short-circuit path missed translating the damage from drawable space
    into the pixmap (for Composite setups) which may have resulted in
    corruption. The path was also failing to consider the impact of reusing
    an active CPU bo when it could be discarding the unwanted damage and
    reallocating.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
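
To illustrate the first point in that commit message: under Composite, a window can sit at an offset inside its backing pixmap, so damage recorded in drawable coordinates has to be shifted by that offset before it is compared with pixmap-space damage. The following is only a toy model in Python, not the driver's C code; `Box`, `drawable_to_pixmap` and the `dx`/`dy` offsets are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle, half-open: [x1, x2) x [y1, y2)."""
    x1: int
    y1: int
    x2: int
    y2: int


def drawable_to_pixmap(damage: List[Box], dx: int, dy: int) -> List[Box]:
    """Shift every damage box by the drawable's offset inside its pixmap.

    dx/dy are hypothetical: they stand for where the drawable's origin
    lies within the backing pixmap.
    """
    return [Box(b.x1 + dx, b.y1 + dy, b.x2 + dx, b.y2 + dy) for b in damage]


# A 10x10 damage box at the drawable origin, for a drawable placed at
# (100, 50) inside its pixmap.
pixmap_damage = drawable_to_pixmap([Box(0, 0, 10, 10)], dx=100, dy=50)
```

Skipping the shift leaves the box at the pixmap origin, so the "damage covers region" test would be made against the wrong area.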


Reproducing the problem:

As far as I can see, on Qt applications the corruption appears mostly in areas with some kind of shadow, or on buttons under the cursor. In gedit, however, it is the open file dialog that is almost completely black.

While I have not found a clear pattern to where it occurs, it occurs pretty reliably across restarts of the X server.

Applications I tested are kdevelop4, kate and gedit.

System environment:

xserver 1.15.0
mesa 10.0.2
kernel 3.12.9-1-ARCH on x86_64
window manager: xmonad

I configured xf86-video-intel with

./configure --prefix=/usr --enable-glamor
Comment 1 Chris Wilson 2014-02-01 16:34:55 UTC
I'll see if I can trigger it, but if you can run with ./configure --enable-debug (and I'm tempted to also check ./configure --enable-debug=pixmap) that would be useful. That will enable lots of asserts which may end up crashing the Xserver (having gdb handy would be invaluable) so be careful that you do not lose work.
Comment 2 Chris Wilson 2014-02-01 16:37:05 UTC
Note to self: this is usually a path that is too eager to discard CPU damage - which will have been exposed by 1f9a6156e9240a1efa8785ab5bca0a3b1757d08e.
Comment 3 Chris Wilson 2014-02-01 16:38:41 UTC
Can you also try reverting 4385724dc49dd090e0a5956e287f80b92ebd70e8? That adds a few more paths that try and discard CPU damage - which may have been hidden by the buggy path removed in 1f9a6156.
Comment 4 Chris Wilson 2014-02-01 17:06:08 UTC
xmonad requires 341M of diskspace! *gasp*
Comment 5 Chris Wilson 2014-02-01 17:20:34 UTC
Does not want to reproduce with trivial testing on F20. :(
Comment 6 Conley Moorhous 2014-02-01 17:22:27 UTC
Created attachment 93180 [details]
Corruption in Gnome dialog box
Comment 7 Conley Moorhous 2014-02-01 17:22:47 UTC
Created attachment 93181 [details]
Black boxes in Chromium
Comment 8 Conley Moorhous 2014-02-01 17:23:15 UTC
Created attachment 93182 [details]
Corruption in Chromium
Comment 9 Conley Moorhous 2014-02-01 17:26:10 UTC
I am also having this behavior with SNB (HD Graphics 3000) in Arch Linux (Gnome 3) after upgrading to 2.99.908. I added some pictures that might help.
Comment 10 Chris Wilson 2014-02-01 17:31:24 UTC
Ok, I get black areas in gtk widgets when starting xfce4 though, and reverting 1f9a6156e9240a1efa8785ab5bca0a3b1757d08e fixes it. Looks like it could be the same issue...
Comment 11 Chris Wilson 2014-02-01 17:47:38 UTC
I feel stupid. I should not have relied on the overnight tests before making the release.


commit 2814748b91c80c8935ea2f366e954a80bef69bb0
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Feb 1 17:37:42 2014 +0000

    sna: Only discard CPU damage for an replacing region
    
    When considering move-region-to-cpu, we need to take into account that
    the region may not replace the whole drawable, in which case we cannot
    simply dispose of an active CPU bo.
    
    Reported-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
    Reported-by: Conley Moorhous <conleymoorhous@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74327
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
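
The condition described in that commit message can be sketched as a containment test: the CPU damage may only be thrown away when the incoming region covers the whole drawable. This is a simplified Python model under that assumption, not the actual sna code; `may_discard_cpu_damage` and the tuple layout `(x1, y1, x2, y2)` are hypothetical.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), half-open


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def may_discard_cpu_damage(region: Rect, drawable: Rect) -> bool:
    """Old CPU contents are only disposable if the region replaces
    the entire drawable; a partial region must keep what is outside it."""
    return contains(region, drawable)


drawable = (0, 0, 100, 100)
full = may_discard_cpu_damage((0, 0, 100, 100), drawable)      # True
partial = may_discard_cpu_damage((10, 10, 20, 20), drawable)   # False
```

In the partial case the pixels outside the region are still needed, which matches the black areas seen when they were dropped.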
Comment 12 Mike C 2014-02-01 21:17:57 UTC
I just updated to 2.99.909. Unfortunately, I still get video corruption with this version in Thunderbird running under KDE on Arch Linux, so I have downgraded back to 2.99.907 again.

So unfortunately there is still a bug in version xf86-video-intel 2.99.909-1
Comment 13 Chris Wilson 2014-02-01 22:00:19 UTC
Yet another bug found:

commit 853588ad5be9407d2123f6055458ca84e72b8eb9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Feb 1 21:55:09 2014 +0000

    sna: If IGNORE_CPU is not set we must mark the move as MOVE_READ
    
    Logic reversal in discarding CPU damage. An old bug revealed by the more
    aggressive attempts to discard CPU damage.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
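
Going only by the commit title, the fixed rule can be modelled as below. The flag names `MOVE_READ` and `IGNORE_CPU` come from the commit message; the numeric values, `MOVE_WRITE`, and the function `move_flags` are assumptions made for this Python sketch and are not taken from the driver.

```python
# Hypothetical bit values; only the names MOVE_READ and IGNORE_CPU
# appear in the commit message.
MOVE_WRITE = 0x1
MOVE_READ = 0x2
IGNORE_CPU = 0x1  # lives in a separate "hint" bitfield in this model


def move_flags(hint: int) -> int:
    """Flags for moving a region to the GPU before overwriting it.

    If the caller did not say the CPU copy can be ignored, the existing
    contents must be preserved, so the move is also a read.
    """
    flags = MOVE_WRITE
    if not (hint & IGNORE_CPU):
        flags |= MOVE_READ
    return flags


keeps_contents = bool(move_flags(0) & MOVE_READ)            # True
drops_contents = not (move_flags(IGNORE_CPU) & MOVE_READ)   # True
```

A reversed test (adding `MOVE_READ` only when `IGNORE_CPU` is set) would discard CPU damage in exactly the case where it is still needed, which is the kind of logic reversal the commit describes.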

Can you please upload fresh images if the corruption persists? I may also pester you for reproduction instructions in that case as well.
Comment 14 Bas Nieuwenhuizen 2014-02-01 22:09:46 UTC
Commit 2814748b91c80c8935ea2f366e954a80bef69bb0 indeed fixes the corruption for me and then commit 2ba8d40bf7e4d3e8fa541c001f82aa65f26bed3a breaks it again, although there is somewhat less corruption than before.

Finally, pulling the latest changes up to and including the commit you just posted, it is fixed again for me.
Comment 15 Conley Moorhous 2014-02-02 00:14:10 UTC
My corruption was fixed as of 2.99.909, just FYI.
Comment 16 Brian 2014-02-02 14:40:34 UTC
The problem is not yet fixed for me. I get video corruption in Chromium on Arch Linux. Specifically, the X buttons on the tabs are corrupted when the mouse pointer moves over them. This occurs with both xf86-video-intel 2.99.908 and 2.99.909.
Comment 17 Chris Wilson 2014-02-02 17:49:55 UTC
(In reply to comment #16)
> The problem is not yet fixed for me. I get video corruption in chromium on
> Arch Linux. Specifically, the X button on the tabs are corrupted when the
> mouse pointer moves over them. This occurs with both xf86-video-intel
> 2.99.908 and 2.99.909.

Please note that the latest fixes in this bug report are post .909.
Comment 18 Chris Wilson 2014-02-02 19:52:35 UTC
*** Bug 74404 has been marked as a duplicate of this bug. ***
Comment 19 alium 2014-02-02 20:00:31 UTC
I have the same problem on G4500MHD (gen4). I am now testing the git version (post .909).
Comment 20 alium 2014-02-02 20:08:10 UTC
(In reply to comment #19)
> I have same problem, G4500MHD (gen4). I now test the git version (post .909)

Fixed for me in git; I no longer see any artifacts or corruption.

intel-dri 10.0.2
mesa 10.0.2
glamor-egl 0.6.0
xf86-video-intel-git 2.99.909.6.g7f08250-1
linux 3.13.1
Comment 21 Mike C 2014-02-02 21:15:55 UTC
I have tested Arch's newly released testing version 2.99.909-2, which includes the additional commits made today since .909, and have had no further instances of video corruption. So for me this latest package based on git looks good and appears to fix the problem completely. I also checked that I am using SNA:

[mike@home1 ~]$ grep -i sna /var/log/Xorg.0.log
[     4.665] (II) intel(0): SNA initialized with Ivybridge (gen7, gt1) backend
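
The same check can be scripted. This is a minimal Python sketch that assumes the log line always has the shape shown above (other driver versions may word it differently); `sna_backend` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the single log line quoted above; it is an
# assumption that other versions print the same format.
SNA_LINE = re.compile(
    r"\(II\) intel\(\d+\): SNA initialized with "
    r"(?P<name>.+?) \((?P<detail>[^)]*)\) backend"
)


def sna_backend(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (backend name, detail) from Xorg log text, or None."""
    for line in log_text.splitlines():
        match = SNA_LINE.search(line)
        if match:
            return match.group("name"), match.group("detail")
    return None


sample = ("[     4.665] (II) intel(0): SNA initialized with "
          "Ivybridge (gen7, gt1) backend")
backend = sna_backend(sample)
```

To run it against a real log, pass the contents of /var/log/Xorg.0.log as `log_text`; a `None` result means no matching SNA line was found.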

Thank you for the fine work in sorting this out, Chris.
