[Pixman] Faster unorm_to_unorm for wide path processing

Sun Jun 10 10:03:53 PDT 2012

"Antti S. Lankila" <alankila at bel.fi> writes:

> Attached is a simple patch that improves wide path processing
> throughput (in Mpix/s) by around 20% thanks to a significant
> optimization of pixman_expand. On my i7 laptop, we go from:
>
>> src_8888_2x10 =  L1:  62.08  L2:  60.73  M: 59.61 (  4.30%)
>>                  HT: 46.81  VT: 42.17  R: 43.18  RT: 26.01 ( 325Kops/s)
>
> to
>
>> src_8888_2x10 =  L1:  76.94  L2:  78.43  M: 75.87 (  5.59%)
>>                  HT: 56.73  VT: 52.39  R: 53.00  RT: 29.29 ( 363Kops/s)
>
> The key to the patch is the observation that unorm_to_unorm's work can
> more easily be done with a simple multiplication and shift, when the
> function is applied repeatedly and the parameters are not compile-time
> constants. For instance, converting from 0xfe to 0xfefe (expanding
> from 8 bits to 16 bits) can be done by calculating
>
> c = c * 0x101
>
> However, sometimes the result is not a neat replication of all the
> bits. For instance, going from 10 bits to 16 bits can be done by
> calculating
>
> c = c * 0x401UL >> 4
>
> where the intermediate result is a 20-bit-wide repetition of the 10-bit
> pattern, and the shift then discards the unnecessary lowest bits.
>
> The patch adds an algorithm to calculate the factor and the shift, and
> converts the code to use it.
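
To make the factor-and-shift idea above concrete, here is a minimal C
sketch of how such parameters could be computed. It is not the code from
the patch: the names replicate_params and replicate_apply, their
signatures, and the 16-bit limit on both widths are assumptions made
purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (name and signature are assumptions, not the
 * patch's): compute a multiplier and a right shift such that
 *     (value * factor) >> shift
 * converts a from_bits-wide value to to_bits by repeating its bit
 * pattern and then dropping any surplus low bits.
 */
static void
replicate_params (int from_bits, int to_bits, uint32_t *factor, int *shift)
{
    int width = 0;

    /* Assumed limit: keeps every intermediate product within 32 bits. */
    assert (from_bits > 0 && from_bits <= 16);
    assert (to_bits > 0 && to_bits <= 16);

    *factor = 0;

    /* Set one bit every from_bits positions until to_bits is covered. */
    do
    {
        *factor |= (uint32_t) 1 << width;
        width += from_bits;
    }
    while (width < to_bits);

    /* The repeated pattern is width bits wide; drop the excess. */
    *shift = width - to_bits;
}

/* Hypothetical helper: apply the precomputed parameters to one value. */
static uint32_t
replicate_apply (uint32_t value, uint32_t factor, int shift)
{
    return (value * factor) >> shift;
}
```

Under these assumptions, replicate_params (8, 16, ...) gives factor 0x101
with shift 0, and replicate_params (10, 16, ...) gives factor 0x401 with
shift 4, matching the two examples quoted above. The point of splitting
the work this way is that the parameters are computed once and the
per-pixel cost is a single multiply and shift.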

This patch looks basically good to me, provided that "make check" still
passes. The comments I have are mainly about coding style (please see
the CODING_STYLE file). In particular:

- All the information in the mail would be useful to have in the commit
  message: the speed-up, how it works, etc.

- The function unorm_to_unorm_params() should be static

- Space before the left parenthesis

- Avoid variable declarations in the middle of the code

- Indent is four spaces

- Braces go on their own line
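
As a rough illustration of the style points above, here is a made-up
function (expand_8_to_16_example is a hypothetical name, not code from
the patch) laid out that way:

```c
#include <stdint.h>

/* Made-up example; only the layout matters here. */
static uint16_t
expand_8_to_16_example (uint8_t c)
{
    uint32_t wide;    /* declarations come first, not mid-code */

    wide = (uint32_t) c * 0x101;    /* space before the left parenthesis */

    if (wide > 0xffff)
    {
        /* Cannot happen for 8-bit input; present to show brace placement. */
        wide = 0xffff;
    }

    return (uint16_t) wide;
}
```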

But other than that, this looks like a nice speedup.

Thanks,
Søren