[Pixman] [ssse3]Optimization for fetch_scanline_x8r8g8b8

Soeren Sandmann sandmann at daimi.au.dk
Thu Sep 2 15:31:52 PDT 2010

"Xu, Samuel" <samuel.xu at intel.com> writes:

> Hi, Siarhei Siamashka:
> 	Attached patch has updated copyright part (only copyright change). we referred http://cgit.freedesktop.org/pixman/tree/COPYING.
> 	Yes, As you assumed, we tested on multiple 32/64 bit boxes w/o
> and w/ SSSE3. 

Here are some more comments:

* The 64 bit CPU detection is broken. It doesn't use SSE2 and MMX when
  SSSE3 is not detected.

* Siarhei asked whether it would be possible to unify the 32 and 64
  bit assembly sources. I don't think you commented on that.

* We need a more detailed description in the git commit log

* The check for the Sun Studio compiler is wrong. If SSSE3 is only
  supported in the latest version of Sun Studio, then configure should
  check for that version.

* Will Sun Studio actually accept GNU assembly files?

* Why check for SSSE3 intrinsics at all? You don't use them for
  anything. All use of SSSE3 is in assembly files. The thing that
  needs to be checked for is whether the assembly files will pass
  through the compiler.

* Store forwarding

  - We need some comments in the assembly about the store forwarding
    that Ma Ling described.

  - Siarhei's questions about it should be answered, preferably in
    those comments:

     So it is basically a store forwarding aliasing problem which is
     described in " Store Forwarding" section from "Intel® 64
     and IA-32 Architectures Optimization Reference Manual", right?

     Wouldn't just the use of MOVD/MOVSS instructions here also solve
     this problem?  Store forwarding does not seem to be used for SIMD
     according to the manual. I haven't benchmarked anything yet


* About this line in the 64 bit assembly and the corresponding one in
  the 32 bit assembly:

   #if (defined(__amd64__) || defined(__x86_64__) ||defined(_M_AMD64))

   Makoto Kato asked:

        Why _M_AMD64?  _M_AMD64 macro is for MSVC++ x64 version.  This
        assembler is for GNU as.

   You didn't comment on this.

* We might as well be consistent with the naming used in the ARM NEON
  assembly files:

  - The assembly file should be called "pixman-ssse3-asm.S", or
    "pixman-ssse3-x86-32-asm.S and pixman-ssse3-x86-64-asm.S" if there
    is a good reason they can't be unified.

  - The function in the assembly should be called something like


    "line" to indicate the fact that it doesn't do full 2D, and the
    "fast_path" in the current name is confusing because it suggest a
    connection to the plain C 'fast path' implementation (there is no
    such connection).

* Cut-and-paste issues in pixman-ssse3:

  - #include "pixman-combine32.h"

  This is not used as far as I can tell

  - /* PIXMAN_OP_OVER */
  The two fast paths are both SRC fast paths, not OVER.

  - /*---------------------------------------------------------------------
    * composite_over_8888_n_8888
    The function is src_x888_8888(), not over_8888_n_8888(). But the
    comment in unneccessary since there is only one function.


More information about the Pixman mailing list