[Pixman] [ssse3]Optimization for fetch_scanline_x8r8g8b8
Xu, Samuel
samuel.xu at intel.com
Tue Nov 16 21:47:39 PST 2010
Hi, Soeren Sandmann and Siarhei Siamashka:
Glad to send out this refreshed patch to address points we discussed for this SSSE3 patch.
In this new patch, we merged 32 bit and 64 bit asm code into unified code (Macros are moved to header file, and make them more reusable and easily to be read). We also fixed issue like CPU detection/git log/make file checking/file name...etc. For MOVD, we simplified the backward copy code since pervious code is too long and not gain obvious performance,
Thanks!
Samuel
-----Original Message-----
From: Xu, Samuel
Sent: Thursday, September 09, 2010 12:56 PM
To: 'sandmann at daimi.au.dk'; Siarhei Siamashka; Ma, Ling; Liu, Xinyun
Subject: RE: [Pixman] [ssse3]Optimization for fetch_scanline_x8r8g8b8
Hi, Soeren Sandmann and Siarhei Siamashka:
It is really a long discussion and thanks for your patience! I believe all of us has same target to make pixman faster and better. Let's continue...
Merge 32 bit and 64 bit asm code into unified asm file is possible while has pros and cons. E.g. I just tried Sun studio(my first experience on SS) and found SS 12 can't expand Macro correctly, so failed to build some asm files containing Macro. To make SS work, we originally want to remove existing macros, while it seems we should use more macros.
Anyway, we will go ahead to following your 2 more suggestions: 1) movd adoption 2) unify 32 bit and 64 bit asm code via macro.
So the new patch will be shaped as:
1. Fix 64 bit CPU detection issue for MMX and SSE2 2. Add more comments for git commit log 3. change SSSE3 intrinsics check to SSSE3 asm check in makefile 4. remove #include "pixman-combine32.h" and composite_over_8888_n_8888in pixman-ssse3.c 5. ASM files changes:
1) Unify 32 bit and 64 bit asm code via macro
2) change asm file name to pixman-ssse3-x86-32-asm.S and pixman-ssse3-x86-64-asm.S
3) change the asm function name to "composite_line_src_x888_8888_ssse3"
4) remove "defined(_M_AMD64))"
5) Comments in the assembly about the store forwarding
6) Use MOVD as Siarhei Siamashka's idea
Please raise the flag if those change are not completed or if Sun studio's priority is higher than unified 32 bit and 64 bit code.
Thanks!
Samuel
-----Original Message-----
From: sandmann at daimi.au.dk [mailto:sandmann at daimi.au.dk]
Sent: Thursday, September 09, 2010 10:16 AM
To: Xu, Samuel
Cc: Siarhei Siamashka; pixman at lists.freedesktop.org; Liu, Xinyun; Ma, Ling
Subject: Re: [Pixman] [ssse3]Optimization for fetch_scanline_x8r8g8b8
"Xu, Samuel" <samuel.xu at intel.com> writes:
> Hi, Soeren Sandmann and Siarhei Siamashka:
>
> As a wrap of current discussion, combining you two's comments, can we assume this new patch of SSSE3 is ok?
> New patch might contains:
> 1. Fix 64 bit CPU detection issue for MMX and SSE2 2. Add more
> comments for git commit log 3. change SSSE3 intrinsics check to SSSE3
> asm check in makefile 4. remove #include "pixman-combine32.h" and
> composite_over_8888_n_8888in pixman-ssse3.c 5. ASM files changes:
> 1) change asm file name to pixman-ssse3-x86-32-asm.S and pixman-ssse3-x86-64-asm.S
> 2) change the asm function name to "composite_line_src_x888_8888_ssse3"
> 3) remove "defined(_M_AMD64))"
> 4) Comments in the assembly about the store forwarding
Such changes will certainly improve the patch.
> We know there are some discussion on asm code, e.g. MOVD, unify 32 bit
> and 64 bit code. While it won't introduce real defection. We still can
> put further change in next wave, to avoid current sticking.
No, we are not going to apply any patches that we know will need to be fixed later, because I don't believe anyone is going to do that work.
It's fine to send a patch series if you think it's easier that way.
I am also interested in the answer to the question of whether this code was generated by passing intrinsics through the Intel compiler.
> I am not sure for the issue of sun studio. Sun studio declares GNU asm
> compatibility, I am not sure whether it is 100% compatible. If issue
> is caused by Sun Studio itself, can we add #ifdef to avoid
> SSSE3 patch of Sun studio? In this case, how to determine Sun studio?
There are various approaches you can take with respect to unusual compilers such as Sun Studio. The most common approach is to only enable the new optimizations for the compilers that you are personally interested in, and leave other compilers to people who are interested.
However, *you* introduced all this Sun Studio handling for SSSE3, which means *you* need to be able to explain what it does, and justify why it is there.
Or to put it even more bluntly: Do not cut and paste code you don't understand and can't test, and then ask me to rewrite the code for you until it is good enough to go in.
Soren
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-SSSE3-asm-composite_line_src_x888_8888.patch
Type: application/octet-stream
Size: 36980 bytes
Desc: 0001-SSSE3-asm-composite_line_src_x888_8888.patch
URL: <http://lists.freedesktop.org/archives/pixman/attachments/20101117/e07dfecb/attachment-0001.obj>
More information about the Pixman
mailing list