[Pixman] [cairo] pixman: New ARM NEON optimizations
Andrea Canciani
ranma42 at gmail.com
Sat Feb 12 14:52:18 PST 2011
On Fri, Feb 11, 2011 at 11:30 AM, Soeren Sandmann <sandmann at cs.au.dk> wrote:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
>> But at the moment, about the only thing I truly want to add to the API
>> is a LERP operator.
>
>> Cairo has a slightly different definition for [clip-] masked operators
>> where the operation is defined as:
>>
>> ((src IN mask) OP dst) LERP_clip dst
>>
>> and for SOURCE and CLEAR:
>>
>> (src OP dst) LERP_(clip IN mask) dst
>>
>> Cairo currently implements LERP as:
>>
>> a LERP_t b := (a IN t) ADD (b OUT t)
>>
>> So for example, for the typical unbounded operation (such as IN) this
>> translates to:
>>
>> result = (((src IN mask) OP dst) IN clip) ADD (dst OUT clip)
>>
>> which currently requires 3 composite operations:
>>
>> dst' = dst; /* may require a blt using SRC */
>> pixman_image_composite (OP, src, mask, dst');
>> pixman_image_composite (OUT_REVERSE, clip, NULL, dst);
>> pixman_image_composite (ADD, dst', clip, dst);
>>
>> By introducing the LERP operator we can reduce this down to 2:
>>
>> dst' = dst; /* may require a blt using SRC */
>> pixman_image_composite (OP, src, mask, dst');
>> pixman_image_composite (LERP, dst', clip, dst);
>>
>> Obviously, we could introduce a quaternary operator and do this in a
>> single pass - though the complexity would seem to outweigh this rarely
>> used slow path.
>
> So, I realized I never actually responded to this, sorry about that.
>
> If I had a time machine, I would go back and make - among others -
> these two changes to Render:
>
> (1) The equation would be
>
> (src OP dst) LERP_mask dst
>
> and not
>
> (src IN mask) OP dst
>
> (2) The RGB channels of Alpha-only images would be considered to be
> the same as the alpha channel, and not 0 as they are now. For
> example, a 0xb9 pixel in an a8 image would be considered
> equivalent to 0xb9b9b9b9 and not to 0xb9000000. That is, they
> would be considered a translucent white rather than a translucent
> black.
Isn't it possible for pixman to dynamically do that using a new iterator
type ("mask")?
Pixman would then only need component-alpha operators, but I'm
afraid it would still need to provide fastpaths for the old non-ca
operators to avoid performance regressions.
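
To make the a8 point concrete, here is a minimal sketch of what such a "mask" iterator would have to do for each pixel it fetches. The helper names are made up for illustration and are not pixman API; the only input taken from the mail above is the 0xb9 example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not pixman API: widen one a8 pixel to a8r8g8b8. */

/* Current Render rule: the missing color channels read as 0. */
uint32_t expand_a8_black (uint8_t a)
{
    return (uint32_t) a << 24;
}

/* Proposed rule: the missing color channels are copies of alpha. */
uint32_t expand_a8_white (uint8_t a)
{
    /* Replicate the byte into all four channels. */
    return (uint32_t) a * 0x01010101u;
}

/* Usage: the 0xb9 example from the quoted mail. */
void check_expand_a8 (void)
{
    assert (expand_a8_black (0xb9) == 0xb9000000u);
    assert (expand_a8_white (0xb9) == 0xb9b9b9b9u);
}
```

With the "white" expansion every mask is effectively a component-alpha mask whose four channels happen to be equal, which is why only the ca operators would be needed.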
>
> These two changes together would have the effect that (a) the equation
> would be much easier to understand visually (composite src and dst,
> then clip to the mask and write back), and (b) component alpha would
> become completely regular with no need for the "component_alpha" bit
> in pictures.
>
> Given the lack of a time machine, a possible direction might be to add
> a new set of operators CLEAR_LERP, SRC_LERP, DST_LERP, ..., that would
> follow the equation above, and also a new format type
> PIXMAN_FORMAT_TYPE_W where W stands for 'white', and the missing color
> channels would be defined to be copies of the W channel.
>
> Even better might be to add a new entry-point:
>
> pixman_image_composite_lerp (...):
>
> that would use the LERP equation and treat alpha-only formats as
> described above.
>
> With either, your LERP operator would simply be SRC_LERP, but the
> other _LERP operators might be useful for cairo too. For
> example, the
>
> ((src IN mask) OP dst) LERP_clip dst
>
> equation could become
>
> composite (SRC, src, mask, tmp);
> composite (OP_LERP, tmp, clip, dst);
>
> instead of
>
> composite (SRC, dst, NULL, tmp);
> composite (OP, src, mask, tmp);
> composite (SRC_LERP, tmp, clip, dst);
>
> Adding all those operators is obviously a lot of work, so I can see
> why it would be tempting to just add the LERP operator. However, it
> would annoy me to have one single weird operator that would behave
> according to a totally different equation than all the other
> operators.
>
> If it were a huge speedup, then maybe, but it seems to me that it's
> only useful in relatively few cases. Or alternatively, if we have to
> bite the bullet and add special-cased support for this, then maybe it
> would be better to just add the full quaternary operator as a new
> entry point, and not pretend it's just another regular operator.
Quaternary operators would avoid the temporary image, which might
provide a very nice performance boost. An even better performance
improvement might be possible if the clip was handled in pixman
as spans (either emitted by cairo or by a pixman polygon image).
Unfortunately I don't have any data to back these opinions, but IIRC
Chris has been experimenting with spans in xcb-xrender, so he
might be able to confirm or correct me.
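
For reference, the per-channel arithmetic behind Chris's definition, a LERP_t b := (a IN t) ADD (b OUT t), can be sketched as below. This assumes 8-bit channels and the usual rounded multiply by x/255; the function names are mine, not pixman's, and a real implementation would work on whole pixels or scanlines rather than single channels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not pixman API. */

/* Rounded a * b / 255 for 8-bit channel values. */
uint8_t mul_un8 (uint8_t a, uint8_t b)
{
    unsigned t = (unsigned) a * b + 0x80;
    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Saturating add, as the ADD operator does per channel. */
uint8_t add_un8 (uint8_t a, uint8_t b)
{
    unsigned s = (unsigned) a + b;
    return (uint8_t) (s > 255 ? 255 : s);
}

/* a LERP_t b := (a IN t) ADD (b OUT t), for one channel. */
uint8_t lerp_un8 (uint8_t a, uint8_t t, uint8_t b)
{
    uint8_t a_in_t  = mul_un8 (a, t);                   /* a IN t  */
    uint8_t b_out_t = mul_un8 (b, (uint8_t) (255 - t)); /* b OUT t */
    return add_un8 (a_in_t, b_out_t);
}

/* Usage: a fully set clip selects a, an empty clip keeps b. */
void check_lerp_un8 (void)
{
    assert (lerp_un8 (200, 255, 10) == 200);
    assert (lerp_un8 (200, 0, 10) == 10);
}
```

The two multiplies correspond to the OUT_REVERSE and masked ADD passes in the three-composite sequence, so a LERP operator (or a quaternary one) would fold them into a single traversal of dst.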
Andrea