[Pixman] fast-scale branch performance improvements

Soeren Sandmann sandmann at daimi.au.dk
Tue Mar 16 05:26:32 PDT 2010


Hi,

> On Mon, 2010-03-15 at 03:25 +0200, Siarhei Siamashka wrote:
> 
> > But before it gets committed, some problems with potential fixed point
> > overflows when dealing with large images need to be addressed. I have
> > made a test program which can expose some of these problems:
> > http://cgit.freedesktop.org/~siamashka/pixman/log/?h=largescaling-test
> 
> I pushed a new version of my branch, rebased on:
> http://cgit.freedesktop.org/~sandmann/pixman/log/?h=alex-scaler
> on (as before):
> http://cgit.freedesktop.org/~alexl/pixman/log/?h=alex-scaler

Some comments:

* The warnings that Siarhei mentioned out are because this macro:

        #define CONVERT_0565_TO_8888(s) (CONVERT_0565_TO_0888(s) | 0xff0000000)

  has an extra zero in 0xff0000000. However, fixing this is not enough
  to get scaling-test to pass.

* The macros

        CONVERT_8888_TO_8888 and 0565_TO_0565

  should have parentheses around their expansion.

* box32_to_16_with_offsets():

  Is it safe to rely in narrowing conversions as a way of checking for
  out-of-range numbers? Maybe the check could be written like this
  instead:

     if ((box32->x1 + dx > INT16_MIN) && ((box32->x1 + dx) < INT16_MAX) &&
          ...

  That also makes it clearer that what's going on here is an overflow
  check.

  With that, it seems like the box32_to_16 function could be written
  inline in the src_extents_flags() function and be just as clear.

* src_extents_flags():

  - Similarly, it would be clearer to just directly check whether
    extents16 fits in the range INT16_MIN + 1 to INT16_MAX - 1.

  - Could image_covers() be subsumed into src_extents_flags()? They
    are doing very similar things, and the name src_extents_flags
    already suggests that it is computing all the flags based on the
    extents of the source.

* Naming

  - The macro could generate the 'fast_composite_scaled_nearest' part
    of the names with "fast_composite_scaled_nearest ## ... ".

  - Can we just pass in the operator (OVER, SRC) and the repeat mode
    (NORMAL, NONE) and check against those?  The booleans are slightly
    more difficult to understand, even with the comments.

  - If we do that, the operator and repeat parts of the name could
    also be generated.

  - FAST_PATH_SAMPLES_COVERS_CLIP should be FAST_PATH_SAMPLES_COVER_CLIP

  - I think I'd prefer compute_src_extent_flags()

* This chunk:

        if (unit_x >= 0)
            while ...
        else
            while ...

  is repeated multiple times.

  Would it be possible to simply call the existing 'repeat' inline,
  possibly extended with a 'unit' parameter?  If it is, then it seems
  like all the repeat modes could be supported fairly easily and the
  existing fast_composite_scale_nearest() could go away.

  If that is not possible, then I think the "if (unit_x >= 0) ..."
  chunk should become an inline function.

* Formatting:

  - While bodies on their own line
  - Braces around multiline statements
  - Braces around bodies when the condition is multiline. Here for
    example:

            if (extents16.x1 >= 0  && extents16.y1 >= 0 &&
                extents16.x2 <= image->bits.width &&
                extents16.y2 <= image->bits.height)
                flags |= FAST_PATH_SAMPLES_COVERS_CLIP;

  - Indentation. There are a couple of

          } else /* src */
          {
                ...
          }

   where the else should be on it's own line, and the indent of the
   else body should be four spaces.

* I think Siarhei should be mentioned in the
  feb34bd2a0e53ffec5c33c0f52daa1ae27b3fb84 commit. "Based on work by
  Siarhei Siamashka" or something like that.


Thanks,
Søren


More information about the Pixman mailing list