I'm not sure what gcc versions "Fortunately gcc has no problems compiling alternative implementation" is referring to, but with cs2009q1 commit 29c2ae4a breaks the build: <-- snip --> ... gcc -DHAVE_CONFIG_H -I. -I. -I.. -mfpu=neon -mcpu=cortex-a8 -O2 -mfloat-abi=softfp -mfpu=neon -Wall -fno-strict-aliasing -fvisibility=hidden -MT libpixman_arm_neon_la-pixman-arm-neon.lo -MD -MP -MF .deps/libpixman_arm_neon_la-pixman-arm-neon.Tpo -c pixman-arm-neon.c -fPIC -DPIC -o .libs/libpixman_arm_neon_la-pixman-arm-neon.o pixman-arm-neon.c: In function 'neon_composite_add_8000_8000': pixman-arm-neon.c:163: warning: unused variable 'temp' pixman-arm-neon.c:163: warning: unused variable 'dval' pixman-arm-neon.c:163: warning: unused variable 'sval' pixman-arm-neon.c: In function 'neon_composite_over_8888_8888': pixman-arm-neon.c:322: warning: unused variable 'temp' pixman-arm-neon.c:322: warning: unused variable 'dval' pixman-arm-neon.c:322: warning: unused variable 'sval' pixman-arm-neon.c: At top level: pixman-arm-neon.c:2319: warning: 'neon_composite_over_n_0565' defined but not used pixman-arm-neon.c:2477: warning: 'neon_composite_over_8888_0565' defined but not used pixman-arm-neon.c:2729: warning: 'arm_neon_blt' defined but not used pixman-arm-neon.c: In function 'neon_composite_over_n_8_0565': pixman-arm-neon.c:840: warning: 'dval' may be used uninitialized in this function pixman-arm-neon.c:839: warning: 'alpha' may be used uninitialized in this function pixman-arm-neon.c:982: error: constant out of range pixman-arm-neon.c:982: error: constant out of range pixman-arm-neon.c:982: error: constant out of range make[3]: *** [libpixman_arm_neon_la-pixman-arm-neon.lo] Error 1 ... <-- snip -->
Appears that I only tested it with cs2007q3. And additionally looks like like pixman git did not get much testing on ARM lately by anyone else. Intrinsics seem to be quite fragile :( As a workaround you can try to either revert this commit (and have a somewhat broken optimization). Or revert the commit and also adidtionally comment out all the entries with 'neon_composite_over_n_8_0565' the fastpath array.
Created attachment 28986 [details] [review] 0001-ARM-workaround-for-gcc-bug-in-vshll_n_u8-intrinsic.patch This looks like a regression in gcc (I'm going to file a bug to gcc bugzilla shortly). Adrian, could you please try the attached patch with your toolchain? After you (successfully) build pixman, also please try to run 'test/blitters-test' program to verify correctness of the resulting code.
A reference to gcc bugzilla: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41196
Adrian, could you please confirm that the attached patch works with cs2009q1 and there are no other problems? I only tested the fix with gcc 4.4.1 and also verified that it still works fine with cs2007q3...
(In reply to comment #2) > Created an attachment (id=28986) [details] > 0001-ARM-workaround-for-gcc-bug-in-vshll_n_u8-intrinsic.patch > > This looks like a regression in gcc (I'm going to file a bug to gcc bugzilla > shortly). > > Adrian, could you please try the attached patch with your toolchain? After you > (successfully) build pixman, also please try to run 'test/blitters-test' > program to verify correctness of the resulting code. Thanks, build worked, and blitters-test passed on an OMAP EVM board. But the patch is very ugly, and that is partially due to the fact that this is in pixman a workaround for a problem with a workaround for a workaround...
(In reply to comment #5) > Thanks, build worked, and blitters-test passed on an OMAP EVM board. Thanks, looks like this works fine with most versions of gcc that are in use. > But the patch is very ugly, and that is partially due to the fact that this is > in pixman a workaround for a problem with a workaround for a workaround... Do you have a better solution? The current code is not perfect, but it produces correct results in tests and shows no problems when using it on real device. Also it is faster than pure C or older ARMv6 optimizations. It should be sufficient for 0.16 version of pixman and sets some performance baseline. This code will be eventually (and quite likely soon) replaced with something better. The patch attached here is a workaround for a problem/regression in gcc. Hopefully it will be fixed soon and this non-perfect world will become a little bit better. If somebody is up for such task, it probably also makes sense to try compiling pixman NEON optimizations with gcc inline assembly disabled and just RVCT-style intrinsics, identify all the problems and also report them to gcc bugzilla.
(In reply to comment #6) > (In reply to comment #5) > > Thanks, build worked, and blitters-test passed on an OMAP EVM board. > > Thanks, looks like this works fine with most versions of gcc that are in use. > > > But the patch is very ugly, and that is partially due to the fact that this is > > in pixman a workaround for a problem with a workaround for a workaround... > > Do you have a better solution? The current code is not perfect, but it produces > correct results in tests and shows no problems when using it on real device. > Also it is faster than pure C or older ARMv6 optimizations. It should be > sufficient for 0.16 version of pixman and sets some performance baseline. This > code will be eventually (and quite likely soon) replaced with something better. >... I'm a guy who prefers solving problems properly over quick fixes, that should explain my comment about me being unhappy with 3 levels of workarounds (with a workaround fixing a problem in the previous workaround). BTW: I'll send a fix for a build error with the NEON code I ran into during a Thumb-2 build soon...
(In reply to comment #7) > I'm a guy who prefers solving problems properly over quick fixes, that should > explain my comment about me being unhappy with 3 levels of workarounds (with a > workaround fixing a problem in the previous workaround). 'properly' just needs proper definition :) I think that git head must be always in a good shape and mostly bug free. Fast and dirty fix which solves problems (of course if it really solves them, not just hides) is better than keeping code broken for a long time while working on a proper and clean solution. But from my point of view, current NEON code does not have a future just because it does not provide the best performance. So I don't feel like changing it for anything except minor bugfixes and workarounds. I have a work-in-progress branch, implementing optimizations which have been prototyped and tried in http://lists.cairographics.org/archives/cairo/2009-July/017799.html This branch is available at: http://cgit.freedesktop.org/~siamashka/pixman/log/?h=arm-neon-asm But it still needs a rebase to the current pixman head, final review, testing, more comments in the code. I expect to clean it up and submit to the mailing list within a few days. I hope that it could be merged for one of the future 0.17.x pixman releases. > BTW: I'll send a fix for a build error with the NEON code I ran into during a > Thumb-2 build soon... Nice. Also Soeren can probably comment regarding the plans for 0.16 version maintenance and next unstable versions development. I'm not quite sure about the next steps.
Created attachment 29109 [details] [review] 0002-ARM-Remove-fallback-to-ARMv6-implementation-from-NE.patch Would this patch be a good solution for thumb2 build problems?
(In reply to comment #9) > Created an attachment (id=29109) [details] > 0002-ARM-Remove-fallback-to-ARMv6-implementation-from-NE.patch > > Would this patch be a good solution for thumb2 build problems? In the normal case both NEON and SIMD code being included, this would result in _pixman_implementation_create_fast_path() being called directly by both files. Is this really correct? But let's not mix two different issues in one bug report. I've opened #23677 for this bug and attched my proposed patch there.
(In reply to comment #10) > (In reply to comment #9) > > Created an attachment (id=29109) [details] [details] > > 0002-ARM-Remove-fallback-to-ARMv6-implementation-from-NE.patch > > > > Would this patch be a good solution for thumb2 build problems? > > In the normal case both NEON and SIMD code being included, this would result in > _pixman_implementation_create_fast_path() being called directly by both files. > Is this really correct? Yes. Currently pixman-arm-simd.c does not provide any extra fast path functions in addition to what pixman-arm-neon.c already has. So this dependency is currently a bit redundant. PS. I committed vshll_n_u8 workaround as it works everywhere according to the reports and there were no further objections from you.
Marking this bug as resolved. All the NEON related build problems should be fixed now. Thanks a lot for reporting these problems, and also for discussing and proposing possible solutions.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.