[Pixman] [PATCH 1/5] REPEAT_NORMAL support for nearest bilinear fast path

Wed May 25 05:15:43 PDT 2011

On Tue, May 24, 2011 at 6:02 PM, Taekyun Kim <podain77 at gmail.com> wrote:
> For the sse2 and NEON fast paths, nearest was also considerably faster
> than the general path. I will post some performance data for various
> nearest/bilinear cases later, with your comments applied.
>
> As you mentioned, the C fast path is slower than before due to function
> call overhead and division inside the loop.

Yes, we have a choice of implementing normal repeat by either having a
very small overhead per pixel in the scanline, or by taking your
approach and using existing none/pad scanline functions and stitching
them together, with less frequent but higher overhead between scanline
function calls. Both have their own advantages and weak points.
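To make the stitching approach concrete, here is a minimal sketch of the
general idea. It is not the code from the patch: the type fixed_16_16_t
(a stand-in for pixman_fixed_t) and both function names are hypothetical,
and it assumes unit_x > 0, a source width small enough that the 16.16
values fit in 32 bits, and a start position already inside the source.
Note the division needed to find the length of each run.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t; /* hypothetical stand-in for pixman_fixed_t */

/* Existing-style scanline without repeat handling: the caller guarantees
 * that all w samples fall inside the source scanline. */
static void
scanline_nearest_none (uint32_t *dst, const uint32_t *src, int w,
                       fixed_16_16_t vx, fixed_16_16_t unit_x)
{
    while (--w >= 0)
    {
        *dst++ = src[vx >> 16];
        vx += unit_x;
    }
}

/* Normal repeat built by stitching: split the destination scanline into
 * runs that do not cross the right edge of the source, and call the
 * non-repeat scanline function once per run. */
static void
scanline_nearest_normal_stitched (uint32_t *dst, const uint32_t *src,
                                  int src_width, int w,
                                  fixed_16_16_t vx, fixed_16_16_t unit_x)
{
    int64_t max_vx = (int64_t) src_width << 16;
    int64_t pos = vx;

    assert (unit_x > 0 && pos >= 0 && pos < max_vx);

    while (w > 0)
    {
        /* Samples left before pos reaches max_vx, rounded up (always >= 1). */
        int64_t n = (max_vx - pos + unit_x - 1) / unit_x;

        if (n > w)
            n = w;

        scanline_nearest_none (dst, src, (int) n, (fixed_16_16_t) pos, unit_x);

        dst += n;
        w -= (int) n;
        pos = (pos + n * unit_x) % max_vx; /* wrap back into the source */
    }
}
```

The per-pixel loop stays untouched here; the cost moves to one division,
one modulo and one function call per run, which matters most when the
source is narrow and runs are short.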

> To achieve REPEAT_NORMAL support, we can make the scanline functions
> support it (Plan A), or we can choose the break-down-to-non-repeat
> approach (Plan B). Obviously plan B causes function call overhead, so
> basically I think plan A could be the optimal solution. But it requires
> a lot of changes in the current fast path implementations.

It appears that "Plan A" is not so difficult to implement. I have a
nearly finished work-in-progress implementation here:
    http://cgit.freedesktop.org/~siamashka/pixman/log/?h=nearest-normal-repeat

The overhead can be reduced a bit more by replacing "while ()" with
"if ()", provided we restrict the fast paths to cases where
"unit_x < source image width". I'll try to tweak the code a bit to see
whether the performance can be improved. The ARM NEON fast paths already
support nearest scaling for 16bpp and 32bpp sources, and normal repeat
actually fits quite naturally there.
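As an illustration of the "while ()" versus "if ()" point, here is a
simplified sketch, not the code from the branch above. The type
fixed_16_16_t (standing in for pixman_fixed_t) and the function names are
hypothetical, and both variants assume unit_x >= 0, a start position
inside the source, and a source width small enough that the 16.16 values
fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16_t; /* hypothetical stand-in for pixman_fixed_t */

/* General variant: one step may skip over the source several times,
 * so the wrap has to be a loop. */
static void
scanline_nearest_normal_while (uint32_t *dst, const uint32_t *src,
                               int src_width, int w,
                               fixed_16_16_t vx, fixed_16_16_t unit_x)
{
    fixed_16_16_t max_vx = (fixed_16_16_t) src_width << 16;

    assert (unit_x >= 0 && vx >= 0 && vx < max_vx);

    while (--w >= 0)
    {
        *dst++ = src[vx >> 16];
        vx += unit_x;
        while (vx >= max_vx)
            vx -= max_vx;
    }
}

/* Restricted variant: with unit_x < max_vx a single step can cross the
 * right edge at most once, so one conditional subtraction is enough. */
static void
scanline_nearest_normal_if (uint32_t *dst, const uint32_t *src,
                            int src_width, int w,
                            fixed_16_16_t vx, fixed_16_16_t unit_x)
{
    fixed_16_16_t max_vx = (fixed_16_16_t) src_width << 16;

    assert (unit_x >= 0 && unit_x < max_vx && vx >= 0 && vx < max_vx);

    while (--w >= 0)
    {
        *dst++ = src[vx >> 16];
        vx += unit_x;
        if (vx >= max_vx)
            vx -= max_vx;
    }
}
```

For any input accepted by the restricted variant, both functions produce
the same pixels; the restriction only needs to be checked once, when the
fast path is selected.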

> As an alternative, we can first take plan B (because we can benefit from
> the already implemented sse2 and NEON fast paths) and later move to plan A
> one by one if it is required.

I propose that we not run after two hares, and just focus on implementing
your method for bilinear scaling first. We can then decide what to do
with the nearest scaling later.

-- 
Best regards,
Siarhei Siamashka