|Summary:||Speeding up render on Alpha with MVI instructions|
|Product:||pixman||Reporter:||Falk Hueffner <falk>|
|Component:||pixman||Assignee:||Xorg Project Team <xorg-team>|
|Status:||RESOLVED NOTABUG||QA Contact:||Xorg Project Team <xorg-team>|
|Priority:||high||CC:||ajax, mattst88, roland.mainz|
|Attachments:||Patch for using MVI instructions in rendering|
Description Falk Hueffner 2005-01-22 13:11:17 UTC
This is a patch to use special instructions on the Alpha architecture to speed up rendering operations. It is based on the similar work for i386 using MMX instructions (Bug 839). Instead of modifying fbmmx.c, I chose to reimplement mmintrin.h for Alpha. This should already give noticeably better code for most cases. The patch is not quite finished, but I was hoping to get some feedback. In particular, is there any documentation on the functions in fbmmx.c, for example about the expected alignment of arguments? I'm currently running into unaligned accesses, for example in fbCompositeSrcAdd_8888x8888mmx. Apparently i386 does not care, but on the Alpha architecture an unaligned access incurs a trap to the OS, which is why this patch doesn't really speed anything up yet...
Comment 1 Falk Hueffner 2005-01-22 13:12:58 UTC
Created attachment 1737 Patch for using MVI instructions in rendering
Comment 2 Søren Sandmann Pedersen 2005-01-22 16:43:22 UTC
It is not true that i386 does not care about unaligned accesses. They don't generate traps, but they are much slower than aligned accesses. The code in fbCompositeSrcAdd_8888x8888mmx() assumes that the drawables are aligned on 4-byte boundaries. Since the drawables are 32 bits per pixel, anything else would be insane. Are you seeing drawables that are aligned on a 16-bit boundary?
Comment 3 Falk Hueffner 2005-01-23 13:34:51 UTC
(In reply to comment #2)
> i386 does not "not care" about unaligned access. They don't generate traps, but
> they are much slower than aligned accesses. The code in
> fbCompositeSrcAdd_8888x8888mmx() assumes that the drawables are aligned on 4
> byte boundaries. Since the drawables are 32 bits per pixel, anything else would
> be insane.
>
> Are you seeing drawables that are aligned on a 16 bit boundary?

No. However, it happens frequently that src and dst aren't co-aligned, that is, while (w && (unsigned long)dst & 7) will make dst 8-byte-aligned, but (src % 8) == 4. Is it possible to somehow ensure co-alignment? Otherwise, I would have to compensate for this in the source (and that would probably also be helpful for i386, depending on how slow unaligned accesses there really are...)
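The co-alignment problem described in this comment can be illustrated with a minimal C sketch. It is not taken from the attached patch or from fbmmx.c: the function name `combine_row_or`, the helpers `load64_any` and `store64`, and the use of a bitwise OR in place of a real compositing operator are all assumptions made for illustration. The head loop aligns dst to 8 bytes as in the comment, and the body reads src through `memcpy`, so the compiler can emit a trap-free load sequence even when (src % 8) == 4.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 8 bytes from an address that may be only 4-byte aligned.
 * memcpy leaves it to the compiler to pick a sequence that does not trap. */
static uint64_t load64_any(const uint32_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store 8 bytes. In the body loop below, dst is always 8-byte aligned. */
static void store64(uint32_t *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Hypothetical row loop: dst[i] |= src[i] for w 32-bit pixels.
 * The OR is only a placeholder for a real compositing operator. */
void combine_row_or(uint32_t *dst, const uint32_t *src, size_t w)
{
    /* Head: one pixel at a time until dst is 8-byte aligned. */
    while (w && ((uintptr_t)dst & 7)) {
        *dst++ |= *src++;
        w--;
    }
    /* Body: two pixels per iteration. dst is aligned; src may be off by 4. */
    while (w >= 2) {
        store64(dst, load64_any(dst) | load64_any(src));
        dst += 2;
        src += 2;
        w -= 2;
    }
    /* Tail: at most one pixel remains. */
    if (w)
        *dst |= *src;
}
```

Whether this is faster than fixing up the source by hand depends on the code the compiler generates for the unaligned load, so it should be read as a starting point rather than a tuned fast path.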
Comment 4 Daniel Stone 2007-02-27 01:25:12 UTC
Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Comment 5 Matt Turner 2009-01-31 10:11:13 UTC
I've been playing with implementing Alpha fast paths for pixman, but writing code using MVI instructions is very difficult because MVI is very limited. Basic operations such as addition need to be simulated. To compound this, MVI instructions have a latency of 3 on EV6, which is awful (only a latency of 2 on PCA56), so to get good performance on both platforms, separate code paths need to be written for each to reduce stalling. Nothing to show yet. May not be worthwhile at all.
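As an illustration of what simulating addition can look like, here is a minimal C sketch of a per-byte saturating add built from a per-byte unsigned minimum, which MVI does provide as MINUB8. It is not code from this bug or from pixman. The function `minub8` below is a portable stand-in written as a loop so the example runs anywhere; on an Alpha target built with MVI support, GCC's `__builtin_alpha_minub8` could presumably take its place. The name `add_sat_u8x8` is hypothetical.

```c
#include <stdint.h>

/* Portable stand-in for MINUB8: unsigned minimum of each of the 8 bytes. */
static uint64_t minub8(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i += 8) {
        uint64_t x = (a >> i) & 0xff;
        uint64_t y = (b >> i) & 0xff;
        r |= (x < y ? x : y) << i;
    }
    return r;
}

/* Per-byte saturating add of two packed 8x8-bit values.
 * In each byte, min(b, 255 - a) + a never exceeds 255, so the ordinary
 * 64-bit add cannot carry from one byte lane into the next. */
uint64_t add_sat_u8x8(uint64_t a, uint64_t b)
{
    return a + minub8(b, ~a);
}
```

For example, `add_sat_u8x8(0x01020304050607f0, 0x0101010101010120)` yields `0x02030405060708ff`: the low byte saturates at 0xff while the other bytes add normally. The sequence costs a complement, a minimum, and an add, so the instruction latencies mentioned in the comment still apply to it.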