Bug 13445

Summary: Optimized fill routines for ARM
Product: pixman Reporter: Ilpo Ruotsalainen <ilpo.ruotsalainen>
Component: pixmanAssignee: Søren Sandmann Pedersen <soren.sandmann>
Status: RESOLVED FIXED QA Contact: Søren Sandmann Pedersen <soren.sandmann>
Severity: enhancement    
Priority: medium CC: adrian.bunk, jmuizelaar
Version: git masterKeywords: patch
Hardware: ARM   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on: 18106    
Bug Blocks:    
Attachments: ARM optimized fill routines
patch against pixman 0.11.8

Description Ilpo Ruotsalainen 2007-11-29 05:28:16 UTC
This patch implements a framework for platform-specific optimized versions of routines and ARM assembler optimized version of the pixman_fill() routine.
Comment 1 Ilpo Ruotsalainen 2007-11-29 05:29:07 UTC
Created attachment 12825 [details] [review]
ARM optimized fill routines
Comment 2 Søren Sandmann Pedersen 2007-11-29 09:48:54 UTC
This code itself looks fine (although I haven't written ARM code for a long time), but the build system integration should probably be done similarly to the MMX special-casing. Ie.:

- Add new pixman-arm.[ch] files with the ARM specific code named as pixman_fill_arm()

- in the existing pixman_fill() in pixman-utils.c, add another case like this:

#ifdef USE_ARM
  pixman_fill_arm (...);
#else
#ifdef USE_MMX
  if (...)
#endif
  {
      switch (bpp)
      ...
#endif

Alternatively, the MMX code could be changed to be done the way you have done the ARM code.
Comment 3 Ilpo Ruotsalainen 2007-11-30 04:53:51 UTC
(In reply to comment #2)
> This code itself looks fine (although I haven't written ARM code for a long
> time), but the build system integration should probably be done similarly to
> the MMX special-casing. Ie.:
> 
> - Add new pixman-arm.[ch] files with the ARM specific code named as
> pixman_fill_arm()
> 
> - in the existing pixman_fill() in pixman-utils.c, add another case like this:
> 
> #ifdef USE_ARM
>   pixman_fill_arm (...);
> #else
> #ifdef USE_MMX
>   if (...)
> #endif
>   {
>       switch (bpp)
>       ...
> #endif
> 
> Alternatively, the MMX code could be changed to be done the way you have done
> the ARM code.

The reason I didn't use the MMX-style integration is twofold; first of all, I don't like having #ifdef-blocks for every architechture that might eventually have optimized code in every function, and secondly to avoid function call overhead.

Changing the MMX code is unfortunately out of the scope of the project I am working on, and is somewhat troublesome because it wants to be able to fall back to the normal codepath if it doesn't like the arguments it's being passed.
Comment 4 Koen Kooi 2008-08-14 03:49:23 UTC
Created attachment 18279 [details] [review]
patch against pixman 0.11.8

Fix trivial patch conflict for 0.11.8
Comment 5 Jeff Muizelaar 2008-08-21 13:51:46 UTC
The existing C routines should likely be optimized before resorting to arm assembler. That way we can measure any real improvement that comes from this patch.

When I tried the patch only pixman_fill8 and pixman_fill16 were noticeably faster. I assume the reason for this is avoiding byte accesses which could just as easily be implemented in C.
Comment 6 Siarhei Siamashka 2009-07-19 07:34:41 UTC
Optimized fill for older ARM cores needs to use 16-byte aligned writes with STM instruction. This way it can utilize burst writes to SDRAM and roughly double the performance. This trick works fine on ARM9E (OMAP1710) and ARM11 (OMAP2420), not sure about other ARM microarchitectures line XScale or anything else. It is used in Xomap.

Naturally this is not possible to implement in C, so a bit of assembly is needed.
Comment 7 Corbin Simpson 2011-09-13 22:00:30 UTC
Tagging patch so it won't get lost in the future.
Comment 8 Siarhei Siamashka 2014-09-21 20:13:22 UTC
The fill operation has been optimized for ARMv6 in http://cgit.freedesktop.org/pixman/commit/?id=3cff56c5b091d2e584503e7887414e224876de37

Nowadays almost everyone is using ARMv7 and the move to 64-bit ARMv8 is underway. ARMv6 is already rather rare and primarily represented by just the Raspberry Pi (which is old and slow). So closing this ticket makes sense.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.