[Pixman] [PATCH 2/3] Optimize BILINEAR filter to NEAREST for identity transforms

Thu Jun 2 09:53:33 PDT 2011

Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:

>>> Even though both FILTER_NEAREST and FILTER_BILINEAR flags may be valid
>>> for certain operations, setting both of them may be no very practical,
>>> or even harmful. Just one example. Currently for ARM target we
>>> implement bilinear scaling using NEON instructions (fast path
>>> registered in 'pixman-arm-neon.c'), but nearest scaling does not need
>>> it, so we do it with just ARMv6 (with fast path registered in
>>> 'pixman-arm-simd.c'). Now let's suppose that we have both
>>> FILTER_NEAREST and FILTER_BILINEAR flags set for some scaling
>>> operation. And in this case NEON bilinear fast path will be selected
>>> instead of a better nearest fast path which is available further in
>>> the list after fallback.
>>
>> This example is an instance of a more general problem:
>>
>>   http://www.mail-archive.com/pixman@lists.freedesktop.org/msg00888.html
>>
>> It's not an easy one to fix. Setting at most one of the filter flags is
>> a band-aid for one particular case, not a fix for the underlying
>> problem. It may be worth applying this band-aid and avoiding setting
>> both flags, but this could be done in the COVER_NEAREST/BILINEAR flags
>> as well.
>>
>> The main point I wanted to make is that it would be *legal* (and
>> potentially useful) to set both. It wouldn't make pixman behave
>> incorrectly.
>
> Right now I just can't imagine the situation when it can be useful to
> set both bilinear and nearest flags. In my opinion it's a redundant
> extra and potentially confusing information.
>
> So taking special care to implement the code to set both flags and
> then additionally taking care in the fast path selection logic to
> avoid matching less optimal fast path functions seems rather counter
> productive. It looks like first creating the problem, and then working
> hard to solve it, with zero overall effect in the end. Or more like
> negative overall effect because the performance is likely going to be
> negatively impacted to some extent.

I think the fundamental problem here is that the whole flags based
selection is starting to show cracks because the underlying assumption
that you can statically order the fast paths according to how fast they
are, is not really true any more.

However, if it *were* true, then there could be cases where a
bilinear-capable fast path would be faster than a fast path that could
only deal with nearest. Maybe the bilinear fast path would only deal
with scaling for example, whereas the nearest one would deal with
general affine transformations.

My overall point is that it is important to answer this kind of
conceptual question in a clear way to preserve the sanity of the
codebase. Every time the answers get more complicated, fewer people can
write correct code, and fewer people can review patches.

I realize that some flags could be redefined to have both positive and
negative semantics, and the world probably wouldn't come to an end. I
just don't think this particular problem is important enough to change
the answer from

    "flags indicate properties that the image can provide"

to

    "flags indicate properties that the image can provide, EXCEPT one
     particular flag, which must be set to the exact value of the
     property in question, oh, and also the user-set variable for that
     property will no longer contain useful information, which you can
     tell because it now has a weird name"

Not considering that the problem can be solved within the existing
meaning.

>>> I still think that optimizing the filter as early as possible in the
>>>pipeline is the best choice. So that the rest of the pipeline just
>>>does not need to care. And while it's probably just more like a band
>>>aid (the performance is still not going to be great), optimized
>>>filter is also somewhat helping here:
>>>http://cgit.freedesktop.org/pixman/tree/pixman/pixman-bits-image.c?id=pixman-0.22.0#n565
>>
>> If this patch were the *only* way to fix an important performance
>> problem, it could be considered, but in this case there is a
>> straightforward other way: just compute COVER_BILINEAR/COVER_NEAREST
>> flags. That doesn't break the 'flags are interpreted positively' rule,
>> and doesn't impose additional rules on the compositing code such as "you
>> must access the filter through the flag bits, not through the regular
>> variable".
>
> If the call overhead is not taken into account at all, the problem can
> indeed be solved in a wide variety of different ways. If it is, then
> some solutions may suddenly become less attractive.

I don't necessarily disagree with this, but I am questioning that
computing COVER_BILINEAR/COVER_NEAREST flags is actually more overhead
than what we do now, where computing the COVER flag has to take the
extent of the filter into account.

> A relatively high call overhead is observed for fonts rendering (which
> is going to be solved eventually by introducing a better API?).

I'm fairly happy with my glyph2 branch now. There is an initial patch to
use it in the X server here:

    http://cgit.freedesktop.org/~sandmann/xserver/commit/?h=render&id=5eb61ac775cf5f81dbc28bb9f246e52fc4c45a0f

It made glyph rendering twice as fast as measured by x11perf, and more
speedup would be possible by removing overhead from
pixman_composite_glyphs().

> But there are also other use cases where call overhead may be
> important, for example a demo like this:

>     http://people.freedesktop.org/~siamashka/files/20110529/

> Still javascript overhead is dominating there. However some C demos
> might be better for benchmarking call overhead, like animating
> something like snowflakes, fireflies or just anything involving a huge
> number of really small images. Do we need such a demo/benchmark
> program?

I think such a program would be very useful. It is a pretty important
usecase.

> PS. I think that COVER flag is mostly useless together with bilinear
> filter. Because unlike nearest filter case, it is relatively common to
> sample some pixels outside the source image with bilinear filter, so
> proper support for NONE/PAD repeat in the fast path functions is
> important anyway (and it is more or less well covered now).

If that's the case, then maybe we should just redefine COVER to mean
"has enough space for a nearest filter" and delete the bilinear/cover
fast paths.

That would fix the problem at hand while reducing call overhead.

Soren