Created attachment 28900
patch to dramatically reduce CPU usage
I recently noticed that X used far more CPU when displaying a small YV12 rectangle (say 704x480) out of a large YUV source buffer (say 2048x1088) using XvShmPutImage than it did when the source YUV buffer itself was 704x480.
Looking at the code in i830PutImage (i830_video.c), I realized that the destination buffer for I830CopyPlanarData and the like was far too large in the case I just described.
In fact, it was the same size (+/- pitch alignment) as the whole source YUV buffer.
Attached is a patch that sizes it as the source sub-rectangle within the whole source YUV buffer, adjusted for clipping and pitch alignment.
The performance improvement was dramatic: displaying 704x480 at 30 frames/second from a 2048x1088 source used to make X take 10% of the CPU. Now it takes only 3%.
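To give a feel for the sizes involved, here is a rough sketch of the arithmetic. It is not taken from the patch. The helper names (align_up, yv12_buffer_size) and the 64-byte pitch alignment are assumptions made for illustration only, and the driver's real alignment rules may differ.

```c
#include <stddef.h>

/* Hypothetical helper: round v up to a multiple of align
 * (align must be a power of two). */
static size_t align_up(size_t v, size_t align)
{
    return (v + align - 1) & ~(align - 1);
}

/* Hypothetical helper: bytes needed for a planar YV12 image of w x h
 * pixels. One full-resolution Y plane plus two chroma planes subsampled
 * by 2 in each direction, every row padded to pitch_align bytes. */
static size_t yv12_buffer_size(size_t w, size_t h, size_t pitch_align)
{
    size_t y_pitch = align_up(w, pitch_align);
    size_t c_pitch = align_up((w + 1) / 2, pitch_align);
    size_t c_lines = (h + 1) / 2;

    return y_pitch * h + 2 * c_pitch * c_lines;
}
```

Under these assumptions, yv12_buffer_size(2048, 1088, 64) is 3342336 bytes, while yv12_buffer_size(704, 480, 64) is 522240 bytes, so a buffer sized for the sub-rectangle is roughly 6.4 times smaller than one sized for the whole source.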
Please review (I'm not an Intel video driver guru and I might have overlooked some issues. Plus I could not test all the possible cases.) and commit for everybody's enjoyment :)
i810PutImage might eventually also benefit from this type of improvement.
haihao, can you review?
The current code is using nlines and npixels to trim the copied region.
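For readers unfamiliar with this kind of trimming, a minimal sketch of a copy bounded by nlines and npixels follows. It is not the driver's I830CopyPlanarData. The function name copy_plane_region, its parameters, and the single 8-bit plane are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: copy an npixels-wide, nlines-tall window of one
 * 8-bit plane. The window starts at (left, top) in the source and lands
 * at the start of the destination. Source and destination may use
 * different pitches (bytes per row). */
static void copy_plane_region(const unsigned char *src, size_t src_pitch,
                              unsigned char *dst, size_t dst_pitch,
                              size_t left, size_t top,
                              size_t npixels, size_t nlines)
{
    size_t i;

    /* Skip to the first byte of the window in the source plane. */
    src += top * src_pitch + left;

    for (i = 0; i < nlines; i++) {
        memcpy(dst, src, npixels);  /* only the trimmed width is copied */
        src += src_pitch;
        dst += dst_pitch;
    }
}
```

In a copy of this shape the number of bytes moved depends only on nlines and npixels, whereas the destination pitch and the overall buffer size are decided separately by whoever allocates the buffer.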