Bug 25136

Summary: libfb's copy_drawable does unsafe HW accelerated ops, reads back way too much data
Product: xorg Reporter: Aaron Plattner <aplattner>
Component: Server/GeneralAssignee: Søren Sandmann Pedersen <soren.sandmann>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: major    
Priority: medium CC: aplattner, clee, freedesktop, gblist, jan, keithp, randrik, remi, rmj, soren.sandmann, stuffcorpse, thomas
Version: 7.5 (2009.10)   
Hardware: x86 (IA32)   
OS: All   
URL: http://lists.x.org/archives/xorg-devel/2009-November/003569.html
Whiteboard:
i915 platform: i915 features:

Description Aaron Plattner 2009-11-16 17:58:02 UTC
Commit e9aa61e9f0d663d5b34a397b943b4d1df44e873d added code to copy the contents of source windows into temporary scratch pixmaps before performing software compositing, in case that rendering reads outside the bounds of the source window's backing storage.  However, it's not safe to call the top-level CreatePixmap and corresponding rendering routines from inside an fb fallback because the driver doesn't get the appropriate opportunity to sync the GPU or set up the appropriate software access.  There's a Fedora patch to resolve that particular issue by replacing the accelerated calls with their software equivalents:

http://cvs.fedora.redhat.com/viewvc/rpms/xorg-x11-server/F-12/xserver-1.7.1-window-pictures.patch?revision=1.4&view=markup

However, that patch causes a severe performance regression on KDE because kwin maps a 1x1 InputOutput at (0,0).  The backfill code does a 1x1 composite from the root window to fill that window's backing pixmap, but copy_drawable doesn't check the sizes at all and reads the entire 1600x1200 root window back to malloc memory in software.

libfb should not bother with the copy if the source window is entirely contained within its bounding pixmap.  It should probably also determine exactly which pixels it needs to read back, but that can be a separate optimization.
Comment 1 Aaron Plattner 2009-11-20 07:52:21 UTC
For those watching the bug but not the mailing list, I sent out a hotfix for this: http://lists.x.org/archives/xorg-devel/2009-November/003571.html

It has yet to be pulled into git.
Comment 2 Adam Williamson 2009-11-25 11:27:55 UTC
A reporter from one of the Fedora bugs on this topic suggests that the patch may cause a regression in a different area (much less serious, but still):

"After testing the build from comment 12 further it appears that it introduces
graphical corruption of the mouse pointer in some circumstances, the mouse
pointer turns into a random looking (32x32?) square on one monitor.  This
problem only seems to occur on one monitor at a time, ie when the problem is
present sometimes the pointer is fine on the left monitor but corrupt on the
right, other times the situation is reversed, I've never seen the pointer
corrupted on both monitors at any one time.

I've rolled back to the build excluding the window-pictures patch to try to
confirm that the problem is actually introduced by the new build."

https://bugzilla.redhat.com/show_bug.cgi?id=538836#c15
Comment 3 Adam Williamson 2009-11-25 18:28:48 UTC
false alarm: "The mouse pointer corruption eventually showed up with the old build, so I don't think it's related to this bug."
Comment 4 Aaron Plattner 2009-11-29 12:14:49 UTC
The commit in question was reverted out of server-1.7-branch, but debate is still ongoing as to what the right fix is for master.  Reassigning to Soeren.
Comment 5 Jan de Groot 2009-12-07 05:35:35 UTC
This "fix" in the server-1.7 branch causes crashes as soon as I login to GNOME. The crash is reported in the archlinux bugtracker by someone using xf86-video-ati, I have exactly the same crash with xf86-video-intel:
http://bugs.archlinux.org/task/17387

When I login to GNOME, I see the gdm background wallpaper fading into a grey color, after which X crashes or locks up. I think GNOME does some xrandr magic as I'm running dual screen.

Reverting the reverted commit makes it work again, though I applied the fedora patch and suggested band-aid to make it work for everyone.
Comment 6 Keith Packard 2009-12-07 07:33:43 UTC
I posted a patch for review that should fix this bug, but haven't heard anything back about it.

<yunhbs96i3g.fsf@aiko.keithp.com>

http://lists.x.org/archives/xorg-devel/2009-December/003859.html

This avoids all copying while also ensuring that access from pixman never steps outside the bounds of the pixmap
Comment 7 Søren Sandmann Pedersen 2009-12-07 14:41:27 UTC
I am going to comment on it, probably tomorrow.
Comment 8 Andrew Randrianasulu 2010-06-06 23:03:09 UTC
(In reply to comment #7)
> I am going to comment on it, probably tomorrow.

Half year later - patches seems to be in xserver master, but bug still here? How it manifest itself, visually? I have strange corruption with fbdev and xserver master, my bug show itself on unaccelerated operations, it seems (fallbacks, pure software rendering)


http://img96.imageshack.us/img96/3135/eterme16fbdevcorruption.jpg

i'll open another bug report if this bug is not about corruption like you see on above screenshot, in any case sorry for noise.
Comment 9 Andrew Randrianasulu 2010-06-07 22:57:48 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > I am going to comment on it, probably tomorrow.
> 
> Half year later - patches seems to be in xserver master, but bug still here?
> How it manifest itself, visually? I have strange corruption with fbdev and
> xserver master, my bug show itself on unaccelerated operations, it seems
> (fallbacks, pure software rendering)
> 
> 
> http://img96.imageshack.us/img96/3135/eterme16fbdevcorruption.jpg
> 
> i'll open another bug report if this bug is not about corruption like you see
> on above screenshot, in any case sorry for noise.

It seems to be separate problem, introduced much more recently.

Fedora and Arch bugs seems to be closed as FIXED, not sure what to do with this bug ..... Ping?

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.