Summary: | Speeding up render on Alpha with MVI instructions | ||
---|---|---|---|
Product: | pixman | Reporter: | Falk Hueffner <falk> |
Component: | pixman | Assignee: | Xorg Project Team <xorg-team> |
Status: | RESOLVED NOTABUG | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | normal | ||
Priority: | high | CC: | ajax, mattst88, roland.mainz |
Version: | other | ||
Hardware: | Alpha | ||
OS: | Linux (All) | ||
Attachments: | Patch for using MVI instructions in rendering |
Description
Falk Hueffner
2005-01-22 13:11:17 UTC
Created attachment 1737
Patch for using MVI instructions in rendering

i386 does not "not care" about unaligned access. Unaligned accesses don't generate traps, but they are much slower than aligned accesses. The code in fbCompositeSrcAdd_8888x8888mmx() assumes that the drawables are aligned on 4-byte boundaries. Since the drawables are 32 bits per pixel, anything else would be insane.

Are you seeing drawables that are aligned on a 16-bit boundary?

(In reply to comment #2)
> i386 does not "not care" about unaligned access. They don't generate traps, but
> they are much slower than aligned accesses. The code in
> fbCompositeSrcAdd_8888x8888mmx() assumes that the drawables are aligned on 4
> byte boundaries. Since the drawables are 32 bits per pixel, anything else would
> be insane.
>
> Are you seeing drawables that are aligned on a 16 bit boundary?

No. However, it happens frequently that src and dst aren't co-aligned. That is, while (w && (unsigned long)dst & 7) will make dst 8-byte-aligned, but then (src % 8) == 4. Is it possible to somehow ensure co-alignment? Otherwise, I would have to compensate for this in the source (and that would probably also be helpful for i386, depending on how slow unaligned accesses there really are...)

Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.

I've been playing with implementing Alpha fast paths for pixman, but writing code using MVI instructions is very difficult, as MVI is very limited. Basic operations such as addition need to be simulated. To compound this, MVI instructions have a latency of 3 on EV6, which is awful (it is only 2 on PCA56), so to get good performance on both platforms, separate code paths need to be written for each to reduce stalling.

Nothing to show yet. May not be worthwhile at all.

If you do write fast paths for Alpha, please file a new bug against pixman, or send mail to cairo@cairographics.org.
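
For readers following the co-alignment question and the remark that addition has to be simulated with MVI, here is a minimal portable C sketch. It is not the attached patch and not pixman code. The names minub8_emul, add_sat_u8x8 and add_8888_row are hypothetical, and minub8_emul is only a slow stand-in for the MVI per-byte minimum instruction. It assumes the usual trick for a saturating byte add: clamp a to at most 255 - b per byte, after which a plain 64-bit add cannot carry from one byte into the next. The row loop aligns dst as in the while loop quoted above and lets src be 4 bytes off by loading it through memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in (assumption) for the per-byte unsigned minimum
 * that MVI provides as a single instruction. */
static uint64_t minub8_emul(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i += 8) {
        uint64_t x = (a >> i) & 0xff, y = (b >> i) & 0xff;
        r |= (x < y ? x : y) << i;
    }
    return r;
}

/* Saturating add of eight unsigned bytes. Each byte of min(a, ~b) is
 * at most 255 - b, so the ordinary add never carries between bytes. */
static uint64_t add_sat_u8x8(uint64_t a, uint64_t b)
{
    return minub8_emul(a, ~b) + b;
}

/* Hypothetical row helper: dst[i] = saturate(dst[i] + src[i]) per
 * channel, for w pixels of 32 bits each. */
static void add_8888_row(uint32_t *dst, const uint32_t *src, size_t w)
{
    /* Both pointers are assumed 4-byte aligned, as in the thread. */
    assert(((uintptr_t)dst & 3) == 0 && ((uintptr_t)src & 3) == 0);

    /* Head: single pixels until dst is 8-byte aligned. */
    while (w && ((uintptr_t)dst & 7)) {
        *dst = (uint32_t)add_sat_u8x8(*dst, *src);
        dst++; src++; w--;
    }

    /* Body: two pixels per step. dst is aligned here, but src may
     * have src % 8 == 4, so it is loaded through memcpy and the
     * compiler chooses a suitable load sequence. */
    while (w >= 2) {
        uint64_t d, s;
        memcpy(&d, dst, sizeof d);
        memcpy(&s, src, sizeof s);
        d = add_sat_u8x8(d, s);
        memcpy(dst, &d, sizeof d);
        dst += 2; src += 2; w -= 2;
    }

    /* Tail: at most one pixel left. */
    if (w)
        *dst = (uint32_t)add_sat_u8x8(*dst, *src);
}
```

Calling add_8888_row(dst, src, w) on two 32-bit pixel rows adds src into dst channel by channel, clamping each channel at 0xff. Whether handling the misaligned source this way beats a per-pixel loop on real hardware is an open question that this sketch does not answer.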