Soren, does this look sane?
Created attachment 8422 [details] [review]
Yes, it looks sane to me. If it tests ok, it should be fine to commit.
patch tests out OK on a few OLPC machines around here
Soren; can you double-check that the patch is correct? We're not showing it
making any real improvement on the OLPC machines; still trying to track down
why... Maybe I'm doing something really stupid in the patch.
+ assert (pSrc->pDrawable == pMask->pDrawable);
combined with the fact that the function is only called when pMask is NULL means the function always crashes, so the reason you didn't see any improvement has to be that the function was never called.
Also, the checks for alpha==0xff and 0x00 make sense in operations that are mainly used for glyphs, where many pixels will be either fully transparent or fully opaque, but for this function I'm not sure that's often going to be the case.
I'll fix those things and commit. You will probably want to make sure that the function is actually getting in your setup.
After adding that function, we did see it in sysprof profiles; I'm not sure why it wasn't crashing though...
Created attachment 8625 [details]
sysprof with fbCompositeSrc_8888x0565mmx patch
I guess your X server was compiled with -DNDEBUG then.
As for why it doesn't make a difference, my experience has been that when a single compositing op dominates as much as is the case in your profile, what's going on is most likely that it is reading video memory.
If that's the case MMX vs. plain C is not going to make a huge difference since reading video memory is so slow.
I built and tested out some RPMs with the fixes you committed to Red Hat pkgcvs. It seems that the cairo+alpha testcase got somewhat _slower_ than before your fixes. The gtk-pixbuf test (without alpha) was slower too. Any idea why that might be? I can't believe the generic code is actually faster than the mmx stuff... I'll have to test more, but can you think of anything off the top of your head as to why your fix might slower than my broken code?
Yes, assuming that
- it is actually reading from video memory, and
- the source images in your test have substantial parts that are either
completely transparent or completely opaque
the alpha checks that I removed would actually have a big positive effect. We may want to put them back in the X code. (They really don't make much sense in the cairo version where framebuffer reads can't happen).
However, if you are really getting video memory reads, then that's probably the first thing that needs to be fixed. Maybe try setting
in xorg.conf and see if that makes a big difference.
Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.
A modified patch got committed to Cairo's pixman in Jan 2007:
and Soren merged it to Xorg's pixman in 2007 too before pixman got split into a standalone module.