[Pixman] [PATCH 23 & 24/32] armv6: Add optimised scanline fetchers

Ben Avison bavison at riscosopen.org
Wed Apr 22 16:53:21 PDT 2015


On Sun, 05 Oct 2014 20:03:42 +0100, Søren Sandmann <soren.sandmann at gmail.com> wrote:
> Don't you need "FAST_PATH_STANDARD_DST_FLAGS" in the ITER_DEST case?

The selection of the correct flags in the iter_info_t struct (and in the
pixman_fast_path_t struct too for that matter) always seemed to be
something of a black art to me. Too many bits set in the required-flags
bitmasks and the accelerated routine isn't used to its fullest. Too few
and it gets used when it shouldn't - and this is doubly dangerous because
I'm not sure that we can count on the test suite to exercise every
possible flag combination and thereby call any bugs to our attention.

As far as I recall, I wasn't comfortable with the meaning of all the
flags when I wrote this, and erred on the side of caution - I started
with a bitmask for destination images that matched the one I'd deduced
for source images, and only knocked out the minimum number of bits to
ensure a match when running under lowlevel-blt-bench. If you want the
flags to be reduced to FAST_PATH_STD_DEST_FLAGS (= FAST_PATH_NO_ACCESSORS
+ FAST_PATH_NO_ALPHA_MAP + FAST_PATH_NARROW_FORMAT) then I think we need
a careful bit-by-bit analysis to ensure it's warranted.

To start with, let's look at a table of the image and iter flags that are
set for an operation using r5g6b5 source and destination. These represent
the maximum number of flags that we need to consider:

  Flag bit                                Unscaled       Scaled
                                                    Nearest   Bilinear
                                          Src  Dst  Src  Dst  Src  Dst
  Image flags
   0 FAST_PATH_ID_TRANSFORM                *    *         *         *
   1 FAST_PATH_NO_ALPHA_MAP                *    *    *    *    *    *
   2 FAST_PATH_NO_CONVOLUTION_FILTER       *    *    *    *    *    *
   3 FAST_PATH_NO_PAD_REPEAT               *    *    *    *    *    *
   4 FAST_PATH_NO_REFLECT_REPEAT           *    *    *    *    *    *
   5 FAST_PATH_NO_ACCESSORS                *    *    *    *    *    *
   6 FAST_PATH_NARROW_FORMAT               *    *    *    *    *    *
   7 FAST_PATH_SAMPLES_OPAQUE              *    *    *    *    *    *
   9 FAST_PATH_UNIFIED_ALPHA               *    *    *    *    *    *
  10 FAST_PATH_SCALE_TRANSFORM                       *         *
  11 FAST_PATH_NEAREST_FILTER              *    *    *    *         *
  12 FAST_PATH_HAS_TRANSFORM                         *         *
  13 FAST_PATH_IS_OPAQUE                   *         *
  14 FAST_PATH_NO_NORMAL_REPEAT            *    *    *    *    *    *
  16 FAST_PATH_X_UNIT_POSITIVE             *    *    *    *    *    *
  17 FAST_PATH_AFFINE_TRANSFORM            *    *    *    *    *    *
  18 FAST_PATH_Y_UNIT_ZERO                 *    *    *    *    *    *
  19 FAST_PATH_BILINEAR_FILTER                                 *
  23 FAST_PATH_SAMPLES_COVER_CLIP_NEAREST  *         *         *
  25 FAST_PATH_BITS_IMAGE                  *    *    *    *    *    *
  Iter flags
   0 ITER_NARROW                           *    *    *    *    *    *
   2 ITER_LOCALIZED_ALPHA                  *         *
   4 ITER_IGNORE_RGB                            *         *
   5 ITER_SRC                              *         *         *
   6 ITER_DEST                                  *         *         *

The set of destination image flags in common between the three cases are
the ones shown below. The flags included in my original patch are marked
with an A, and the flags included in FAST_PATH_STD_DEST_FLAGS with a B.
Those needed to pass "make check" have a C.

   0 FAST_PATH_ID_TRANSFORM           A   D
   1 FAST_PATH_NO_ALPHA_MAP           ABC
   2 FAST_PATH_NO_CONVOLUTION_FILTER  A     F
   3 FAST_PATH_NO_PAD_REPEAT               E
   4 FAST_PATH_NO_REFLECT_REPEAT           E
   5 FAST_PATH_NO_ACCESSORS           ABC
   6 FAST_PATH_NARROW_FORMAT          AB
   7 FAST_PATH_SAMPLES_OPAQUE
   9 FAST_PATH_UNIFIED_ALPHA
  11 FAST_PATH_NEAREST_FILTER         A     F
  14 FAST_PATH_NO_NORMAL_REPEAT            E
  16 FAST_PATH_X_UNIT_POSITIVE            D
  17 FAST_PATH_AFFINE_TRANSFORM           D
  18 FAST_PATH_Y_UNIT_ZERO                D
  25 FAST_PATH_BITS_IMAGE             A

Having thought about things, I would contend that we shouldn't care about:
* any flag related to transforms (D) - although any image can have a
   transform assigned to it, it is only significant when that image is
   used as a source or mask
* any flag related to repeat type (E) - again, although any image can
   have a repeat type, it is not significant when used as a destination
* any flag related to filtering (F) - same again
* FAST_PATH_NARROW_FORMAT because it's implied by the fact that
   ITER_NARROW is set
* FAST_PATH_SAMPLES_OPAQUE because it isn't used directly either in
   selecting fast paths or in rendering, but only for calculating
   FAST_PATH_IS_OPAQUE (which is unset due to the COVER_CLIP flags being
   unset)
* FAST_PATH_UNIFIED_ALPHA because it's only significant when the image is
   used as a mask

That leaves the significant flags for destination images as
FAST_PATH_NO_ALPHA_MAP
FAST_PATH_NO_ACCESSORS
FAST_PATH_BITS_IMAGE

Bearing in mind the segfault that Pekka noted when selecting a solid
destination (see his email of 2015-04-13) I'm tempted to say we should
actually add FAST_PATH_BITS_IMAGE to FAST_PATH_STD_DEST_FLAGS. I'll post
a patch to do this next, then updated versions of the ARMv6 unscaled
fetcher patches, modified to use FAST_PATH_STD_DEST_FLAGS for writeback
routines and to drop FAST_PATH_NEAREST_FILTER from the fetchers' required
image flags (this being implied by the combination of
FAST_PATH_ID_TRANSFORM and FAST_PATH_NO_CONVOLUTION_FILTER).

> Also, since you have a writeback function for 565, you might want to
> copy the fast_dest_fetch_noop setup from pixman-fast-path.c

As far as I can see, this writeback function can only be used for CLEAR
and SRC combiners so it has limited use. However, it's easy enough to
add. I've included it in patch 23 v2.

Ben


More information about the Pixman mailing list