[Pixman] [ssse3]Optimization for fetch_scanline_x8r8g8b8
Soeren Sandmann
sandmann at daimi.au.dk
Thu Sep 2 15:31:52 PDT 2010
"Xu, Samuel" <samuel.xu at intel.com> writes:
> Hi, Siarhei Siamashka:
> Attached patch has updated copyright part (only copyright change). we referred http://cgit.freedesktop.org/pixman/tree/COPYING.
> Yes, As you assumed, we tested on multiple 32/64 bit boxes w/o
> and w/ SSSE3.
Here are some more comments:
* The 64 bit CPU detection is broken. It doesn't use SSE2 and MMX when
SSSE3 is not detected.
* Siarhei asked whether it would be possible to unify the 32 and 64
bit assembly sources. I don't think you commented on that.
* We need a more detailed description in the git commit log
* The check for the Sun Studio compiler is wrong. If SSSE3 is only
supported in the latest version of Sun Studio, then configure should
check for that version.
* Will Sun Studio actually accept GNU assembly files?
* Why check for SSSE3 intrinsics at all? You don't use them for
anything. All use of SSSE3 is in assembly files. The thing that
needs to be checked for is whether the assembly files will pass
through the compiler.
* Store forwarding
- We need some comments in the assembly about the store forwarding
that Ma Ling described.
- Siarhei's questions about it should be answered, preferably in
those comments:
So it is basically a store forwarding aliasing problem which is
described in "12.3.3.1 Store Forwarding" section from "Intel® 64
and IA-32 Architectures Optimization Reference Manual", right?
Wouldn't just the use of MOVD/MOVSS instructions here also solve
this problem? Store forwarding does not seem to be used for SIMD
according to the manual. I haven't benchmarked anything yet
though.
http://lists.cairographics.org/archives/pixman/2010-August/000425.html
* About this line in the 64 bit assembly and the corresponding one in
the 32 bit assembly:
#if (defined(__amd64__) || defined(__x86_64__) ||defined(_M_AMD64))
Makoto Kato asked:
Why _M_AMD64? _M_AMD64 macro is for MSVC++ x64 version. This
assembler is for GNU as.
You didn't comment on this.
* We might as well be consistent with the naming used in the ARM NEON
assembly files:
- The assembly file should be called "pixman-ssse3-asm.S", or
"pixman-ssse3-x86-32-asm.S and pixman-ssse3-x86-64-asm.S" if there
is a good reason they can't be unified.
- The function in the assembly should be called something like
composite_line_src_x888_8888_ssse3
"line" to indicate the fact that it doesn't do full 2D, and the
"fast_path" in the current name is confusing because it suggest a
connection to the plain C 'fast path' implementation (there is no
such connection).
* Cut-and-paste issues in pixman-ssse3:
- #include "pixman-combine32.h"
This is not used as far as I can tell
- /* PIXMAN_OP_OVER */
The two fast paths are both SRC fast paths, not OVER.
- /*---------------------------------------------------------------------
* composite_over_8888_n_8888
*/
The function is src_x888_8888(), not over_8888_n_8888(). But the
comment in unneccessary since there is only one function.
Soren
More information about the Pixman
mailing list