[cairo] pixman: New ARM NEON optimizations
sandmann at daimi.au.dk
Thu Dec 10 09:41:49 PST 2009
Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> On Wednesday 02 December 2009, Soeren Sandmann wrote:
> > Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> > > As you noticed earlier, software RENDER extension implementation in
> > > xserver suffers from creating and destroying temporary pixman_image_t
> > > structures for each operation in fbComposite function (PicturePtr and
> > > pixman_image_t are practically duplicates of each other). But this is not
> > > a good excuse to be wasteful regarding CPU cycles in pixman too. If
> > > anything can be simplified and optimized even a bit with relatively
> > > little effort, it should probably be done. Or is it better to fix
> > > xserver first and then look at pixman performance again?
> > As long as the X server is creating and destroying images all the
> > time, I don't think it makes a lot of sense to optimize pixman for
> > tiny images.
> X server is an important pixman user, but it is not the only one. Cairo with
> image backend is one of the examples.
> Removing delegates just:
> 1. makes code smaller
> 2. makes it a bit faster
> Here is a branch for delegates removal (for pixman_blt so far)
> This can be also easily done for pixman_fill and combiners.
> This issue is definitely starting to take much more time than it deserves (it's
> not something critical, just a kind of low-hanging fruit). If it's a no
> go and delegates are going to stay, then I'm done with it and will stop
> spamming here.
I think I should probably have called them 'fallbacks' instead of
delegates. 'Delegates' sounds like some over-abstracted Enterprise
Design Pattern Disaster, which I don't think is the case here.
The whole point of the implementation/delegate mechanism is to allow
falling back from more specific implementations to more generic
ones. None of the current blt operations actually make use of this
right now, but it is easy to imagine that you could write an 8 bit
generic blt for the fast path implementation that you would then fall
back to from the architecture specific ones.
If you have a better way to deal with fallbacks than what we have now,
I'm listening. But inlining a generic implementation in every
architecture specific operation is not the answer, given that this
really is not a huge performance issue.
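
The fallback idea described above can be sketched roughly as follows. This is only an illustration, not pixman's actual code: the names impl_t, delegate_blt, generic_blt, arch_blt and the handler_t enum are all hypothetical, and the blt signature is reduced to a single bpp argument to keep the example short. It assumes a generic blt that handles 8, 16 and 32 bpp, and an architecture-specific one that only handles 32 bpp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct impl impl_t;

/* A blt entry point returns true if it handled the request. */
typedef bool (*blt_func_t) (impl_t *imp, int bpp);

/* Hypothetical tags, used only to show who did the work. */
typedef enum { HANDLER_NONE, HANDLER_GENERIC, HANDLER_ARCH } handler_t;

struct impl
{
    impl_t    *delegate;   /* next, more generic implementation, or NULL */
    blt_func_t blt;
};

static handler_t last_handler = HANDLER_NONE;

/* Pass the request on to the next implementation in the chain. */
static bool
delegate_blt (impl_t *imp, int bpp)
{
    if (imp->delegate == NULL)
        return false;
    return imp->delegate->blt (imp->delegate, bpp);
}

/* Generic blt: handles 8, 16 and 32 bpp; nothing left to fall back to. */
static bool
generic_blt (impl_t *imp, int bpp)
{
    (void) imp;
    if (bpp != 8 && bpp != 16 && bpp != 32)
        return false;
    last_handler = HANDLER_GENERIC;
    return true;
}

/* Architecture-specific blt: only 32 bpp, everything else falls back. */
static bool
arch_blt (impl_t *imp, int bpp)
{
    if (bpp != 32)
        return delegate_blt (imp, bpp);
    last_handler = HANDLER_ARCH;
    return true;
}

static impl_t generic_impl = { NULL, generic_blt };
static impl_t arch_impl    = { &generic_impl, arch_blt };

/* Entry point: always start at the most specific implementation. */
static bool
blt (int bpp)
{
    assert (arch_impl.blt != NULL);
    last_handler = HANDLER_NONE;
    return arch_impl.blt (&arch_impl, bpp);
}
```

With this arrangement, blt (32) is handled by the architecture-specific code, blt (8) falls through to the generic code, and blt (24) is rejected by both. The cost under discussion is the extra indirect call on the fallback path.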
> The point is that the delegates are easy to remove and they are easy to add if
> really needed. Right now they have no practical use, but slow down code and
> clutter it a bit.
> > When the X server is fixed, it would make a lot of sense to look into
> > how to get pixman to deal with tiny images, whether they are glyphs or
> > just small, general images. The flags branch helps a bit with that,
> > but there is still overhead on some profiles.
> I'm not sure if it is easy to fix X server properly. Keeping an extra
> pixman_image_t copy for each picture and doing lazy updates for it is going to
> increase memory overhead (pixman_image_t itself is larger than 300 bytes).
> With lots of small images (those which should benefit the most) extra storage
> overhead becomes quite apparent. Other solutions could be tried, but this
> is better discussed on the xorg-devel list.
It may make sense to reduce the size of pixman images. There are some
fairly straightforward ways to do that, one of which is to move the
virtual functions out into their own structure, much like a C++
compiler would do. Another one is to consolidate the various
pixman_bools into flags or bit fields.
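
As a rough sketch of both ideas, assuming int-sized booleans and a handful of per-image function pointers: the structures below are made up for illustration and do not match the real pixman_image_t layout. The names fat_image_t, slim_image_t, image_vtable_t and the FLAG_* constants are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct fat_image  fat_image_t;
typedef struct slim_image slim_image_t;

/* Before: each image carries its own function pointers and int booleans. */
struct fat_image
{
    void (*get_scanline) (fat_image_t *, int, uint32_t *);
    void (*store_scanline) (fat_image_t *, int, const uint32_t *);
    void (*property_changed) (fat_image_t *);
    int   has_alpha_map;
    int   component_alpha;
    int   dirty;
    int   has_clip;
};

/* After: the functions live in one table shared by all images of a type. */
typedef struct
{
    void (*get_scanline) (slim_image_t *, int, uint32_t *);
    void (*store_scanline) (slim_image_t *, int, const uint32_t *);
    void (*property_changed) (slim_image_t *);
} image_vtable_t;

/* The booleans are packed into one word of flags. */
enum
{
    FLAG_HAS_ALPHA_MAP   = 1 << 0,
    FLAG_COMPONENT_ALPHA = 1 << 1,
    FLAG_DIRTY           = 1 << 2,
    FLAG_HAS_CLIP        = 1 << 3
};

struct slim_image
{
    const image_vtable_t *vtable;   /* one pointer instead of three */
    uint32_t              flags;    /* one word instead of four ints */
};

/* Set or clear a single flag. */
static void
image_set_flag (slim_image_t *img, uint32_t flag, bool on)
{
    assert (img != NULL);
    if (on)
        img->flags |= flag;
    else
        img->flags &= ~flag;
}

/* Test a single flag. */
static bool
image_has_flag (const slim_image_t *img, uint32_t flag)
{
    return (img->flags & flag) != 0;
}
```

The saving per image grows with the number of virtual functions and booleans. The trade-off is one extra pointer dereference per virtual call.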
Also, at least for glyphs, I still think that a pixman_glyph_set
feature would make a lot of sense.