[Pixman] [PATCH 2/3] Optimize BILINEAR filter to NEAREST for identity transforms

Tue May 31 14:32:11 PDT 2011

On Tue, May 31, 2011 at 8:47 PM, Soeren Sandmann <sandmann at cs.au.dk> wrote:
> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>
>>> But this breaks down if the compositing code starts *relying* on the
>>> flags being set, for example to know what the filter is; it will no
>>> longer be the case that "Forgetting to set some of the flags is safe in
>>> the sense that it will not make pixman misbehave, pixman will just run
>>> slower because of not taking some fast paths".
>>>
>>> This is an important rule to preserve, because it allows people to work
>>> on the flag computing code without knowing the details of the
>>> compositing code, and it allows people to write a new compositing
>>> routine or implementation without knowing that the flags even exist.
>>
>> I see it more like as an extra convenience feature. It's just because
>> some flags may be difficult to set precisely, so that it is always set
>> when we do have some feature and not set when we don't. One example is
>> SAMPLE_OPAQUE flag (we would need to check all the pixels of a8r8g8b8
>> images when setting flags to be sure that we never miss any opaque
>> image, and this naturally does not make sense). But some of the flags
>> may be easily set precisely.
>
> The rule has at least two benefits: First, the code that computes flags
> is allowed to just not set some flags if it would be inconvenient. We
> take advantage of this in this piece of code:
>
>    /* Both alpha maps and convolution filters can introduce
>     * non-opaqueness in otherwise opaque images. Also
>     * an image with component alpha turned on is only opaque
>     * if all channels are opaque, so we simply turn it off
>     * unconditionally for those images.
>     */
>    if (image->common.alpha_map                                 ||
>        image->common.filter == PIXMAN_FILTER_CONVOLUTION       ||
>        image->common.component_alpha)
>    {
>        flags &= ~(FAST_PATH_IS_OPAQUE | FAST_PATH_SAMPLES_OPAQUE);
>    }
>
> where some unlikely special cases means the opaque flags can't be
> set. If the opaque flags had to be precise, this issue couldn't be dealt
> with in such a simple way, and writing and reviewing the code would
> become harder.

Nobody suggested making opaque flags precise. And my comment above
also tried to explain why it is a bad idea for this particular flag.
But there 32 bits available for various flags. And *all* flags can be
used for positive testing of the features they represent. But at the
same time, *some* of the flags de-facto can be used for negative
testing too. Just like:

if (!(image->common.flags & FAST_PATH_SAMPLES_OPAQUE))
{
    /* We can't be sure about anything here */
}

if (!(image->common.flags & FAST_PATH_ID_TRANSFORM))
{
    /* We are sure that we don't have identity transform here */
}

> Second, currently, if compositing code is relying on flags for negative
> information, it is almost certainly broken. With this patch, compositing
> code would be in some cases *required* to treat the flags in a negative
> way, and in other cases *broken* if it did. This means such code can no
> longer be written or reviewed without referencing the documentation for
> individual flags.

Proper constants naming can come to the rescue. These special flags
can get 'GUARANTEED', 'PRECISE', 'EXACT', 'USABLEFORNEGATIVETESTING'
or maybe some other better word as part of their name.

Some of the flags (accidentally or not) have a nice property of being
also useful for negative testing. This information can be either taken
into use, or ignored and obtained in some other way. And there is
surely nothing wrong with obtaining the information in another way,
but it may be just less optimal for performance and memory footprint.

>>> When such general rules break down is when the code base becomes
>>> incomprehensible and only understandable to people who are intimately
>>> familiar with it.
>>
>> The rules just need to be properly documented. Fast path flags can
>> have a longer comment section, explaining what they actually mean and
>> whether they are precise.
>
> Having to document individual flags in greater detail is another symptom
> of a misdesign going on.

It's difficult to understand what some flags mean by just looking at
their names alone (the name can't always describe the whole feature).
The names indeed provide a good hint, but reading the code is still
necessary. And the real symptom of misdesign is when the flag can't be
interpreted unambiguously even after reading the code. Adding
comments, briefly describing the purpose of less obvious flags can
save some time for the people getting familiar with the code. And also
such comments can be used for additional verification if there are
some doubts whether the code really does what was indented.

>> Even though both FILTER_NEAREST and FILTER_BILINEAR flags may be valid
>> for certain operations, setting both of them may be no very practical,
>> or even harmful. Just one example. Currently for ARM target we
>> implement bilinear scaling using NEON instructions (fast path
>> registered in 'pixman-arm-neon.c'), but nearest scaling does not need
>> it, so we do it with just ARMv6 (with fast path registered in
>> 'pixman-arm-simd.c'). Now let's suppose that we have both
>> FILTER_NEAREST and FILTER_BILINEAR flags set for some scaling
>> operation. And in this case NEON bilinear fast path will be selected
>> instead of a better nearest fast path which is available further in
>> the list after fallback.
>
> This example is an instance of a more general problem:
>
>   http://www.mail-archive.com/pixman@lists.freedesktop.org/msg00888.html
>
> It's not an easy one to fix. Setting at most one of the filter flags is
> a band-aid for one particular case, not a fix for the underlying
> problem. It may be worth applying this band-aid and avoiding setting
> both flags, but this could be done in the COVER_NEAREST/BILINEAR flags
> as well.
>
> The main point I wanted to make is that it would be *legal* (and
> potentially useful) to set both. It wouldn't make pixman behave
> incorrectly.

Right now I just can't imagine the situation when it can be useful to
set both bilinear and nearest flags. In my opinion it's a redundant
extra and potentially confusing information.

So taking special care to implement the code to set both flags and
then additionally taking care in the fast path selection logic to
avoid matching less optimal fast path functions seems rather counter
productive. It looks like first creating the problem, and then working
hard to solve it, with zero overall effect in the end. Or more like
negative overall effect because the performance is likely going to be
negatively impacted to some extent.

>> I still think that optimizing the filter as early as possible in the
>>pipeline is the best choice. So that the rest of the pipeline just does
>>not need to care. And while it's probably just more like a band aid
>>(the performance is still not going to be great), optimized filter is
>>also somewhat helping here:
>>http://cgit.freedesktop.org/pixman/tree/pixman/pixman-bits-image.c?id=pixman-0.22.0#n565
>
> If this patch were the *only* way to fix an important performance
> problem, it could be considered, but in this case there is a
> straightforward other way: just compute COVER_BILINEAR/COVER_NEAREST
> flags. That doesn't break the 'flags are interpreted positively' rule,
> and doesn't impose additional rules on the compositing code such as "you
> must access the filter through the flag bits, not through the regular
> variable".

If the call overhead is not taken into account at all, the problem can
indeed be solved in a wide variety of different ways. If it is, then
some solutions may suddenly become less attractive.

A relatively high call overhead is observed for fonts rendering (which
is going to be solved eventually by introducing a better API?). But
there are also other use cases where call overhead may be important,
for example a demo like this:
    http://people.freedesktop.org/~siamashka/files/20110529/
Still javascript overhead is dominating there. However some C demos
might be better for benchmarking call overhead, like animating
something like snowflakes, fireflies or just anything involving a huge
number of really small images. Do we need such a demo/benchmark
program?

PS. I think that COVER flag is mostly useless together with bilinear
filter. Because unlike nearest filter case, it is relatively common to
sample some pixels outside the source image with bilinear filter, so
proper support for NONE/PAD repeat in the fast path functions is
important anyway (and it is more or less well covered now).

-- 
Best regards,
Siarhei Siamashka