[cairo] Pixman ARM Performance

Mon Jul 27 03:13:59 PDT 2009

Siarhei Siamashka wrote:
> On Friday 24 July 2009, Koen Kooi wrote:
>> On 24-07-09 18:15, Jacob Bramley wrote:
>>> As a side note, I found that the existing Neon switch in the
>>> configuration file is incorrect. If Neon is detected, it sets
>>> "-mcpu=cortex-a8 -mfpu=neon".
> 
> It's actually the other way around :-)
> 
> The option -mcpu=cortex-a8 is set in order to get neon actually 'detected'.
> See 
> http://cgit.freedesktop.org/pixman/commit/?id=767542cfb955ba22dad1259eff8a2fe16e7b8ba4
> http://cgit.freedesktop.org/pixman/commit/?id=9837465fd9a5d4e7280d4c79c41d2d9a9c8f71c0

Ah, OK. I'm not very familiar with the details of configure scripts and 
shell scripting, and ARM_NEON_CFLAGS appeared to be what the script also 
added to the compiler command line for the actual build. That could 
just have been a coincidence.

My next question was going to be why we set -mcpu=arm1136j-s for SIMD, 
but you've already answered that now! :-)

> In order to solve all this in a reliable way, probably configure script can do
> some extra 'gymnastics' trying to mix and match different gcc flags and trying
> to find a working combination. Something like this:
> 1. Try to compile neon code with just standard CFLAGS, if it's OK, then we are 
> fine
> 2. Try to add '-mfpu=neon -mfloat-abi=softfp' and then try to actually link it 
> with something else
> 3. Try to add '-mfpu=neon -mfloat-abi=hard' and then try to actually link it 
> with something else
> 4. Repeat 2. and 3. also adding -mcpu=cortex-a8 (for solving potential 
> problems with PLD)

This solution looks good to me.
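
As a rough illustration of the ordering in steps 1 to 4, here is a minimal 
sketch. It is written in Python only to keep the logic readable; a real 
configure script would do this in shell, for example with Autoconf's 
AC_COMPILE_IFELSE and AC_LINK_IFELSE. The names candidate_flag_sets, 
find_neon_flags and try_build are hypothetical, and try_build stands in 
for "compile the NEON test code with these extra flags and link it 
against an object built with the standard CFLAGS". GCC documents the 
-mfloat-abi values as soft, softfp and hard, so the sketch uses hard for 
step 3.

```python
from typing import Callable, List, Optional

# Float ABIs to probe, in the order of steps 2 and 3.
# Assumption: "hard" is used because that is the value GCC accepts.
FLOAT_ABIS = ["softfp", "hard"]


def candidate_flag_sets() -> List[List[str]]:
    """Return the extra-flag combinations in the order of steps 1 to 4."""
    candidates: List[List[str]] = [[]]  # step 1: standard CFLAGS only
    # Steps 2 and 3: add -mfpu=neon with each float ABI.
    for abi in FLOAT_ABIS:
        candidates.append(["-mfpu=neon", f"-mfloat-abi={abi}"])
    # Step 4: repeat steps 2 and 3 with -mcpu=cortex-a8 added.
    for abi in FLOAT_ABIS:
        candidates.append(["-mcpu=cortex-a8", "-mfpu=neon", f"-mfloat-abi={abi}"])
    return candidates


def find_neon_flags(try_build: Callable[[List[str]], bool]) -> Optional[List[str]]:
    """Return the first flag set that try_build accepts, or None.

    try_build is a hypothetical callback: it should compile the NEON test
    code with the given extra flags, link it, and report success.
    """
    for flags in candidate_flag_sets():
        if try_build(flags):
            return flags
    return None  # no working combination: no usable NEON support


if __name__ == "__main__":
    # Fake toolchain for illustration: it only accepts softfp NEON builds.
    def fake_try_build(flags: List[str]) -> bool:
        return "-mfpu=neon" in flags and "-mfloat-abi=softfp" in flags

    print(find_neon_flags(fake_try_build))
```

Returning the first success keeps the chosen flags as small as possible, 
so -mcpu=cortex-a8 is only added when the plainer combinations fail.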

> If none of these steps succeeds, then the toolchain has no support for
> building NEON code or is inherently ABI incompatible or something else.
> 
> 
> Another option is to just use plain *.S files for all the NEON assembly and
> avoid dealing with all this gcc command line switches madness :-)

If I remember correctly, you still need to set -mfpu and the rest in 
order to prevent GCC from complaining about the Neon instructions. This 
is certainly true for inline assembly on the toolchains I've used. 
Having an assembly file for the Neon stuff might be a good idea, as much 
of arm-neon.c consists of inline assembly anyway, but it would mean that 
we'd be locked in to a particular ABI, or would have many #ifdefs to 
select appropriate ABI code for each function. Also, porting it could be 
quite a bit of work as the code currently uses several Neon intrinsics.

Thanks,
Jacob