[Pixman] [PATCH 2/2] ARM: Add 'neon_composite_over_n_8888_0565' fast path
Taekyun Kim
podain77 at gmail.com
Wed Apr 13 02:11:59 PDT 2011
For a particular pixel block, we have to do the following steps
(for the over_n_8888_0565 case):
1. fetch dest
2. fetch mask
3. combine_mask_ca
4. convert dest to x888
5. combine_over_ca part A
6. combine_over_ca part B
7. convert result to 0565
8. store result
(put cache preload somewhere)
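As a rough scalar reference for the steps above, here is a minimal C sketch of what one pixel goes through. It assumes the usual component-alpha OVER arithmetic (dest = src*mask + dest*(255 - src_alpha*mask)/255 per channel), an a8r8g8b8 solid source and mask, and bit replication when expanding 0565. The function name, the helper names, and the exact split of combine_over_ca into part A (multiply) and part B (divide by 255 and saturating add) are assumptions made for illustration, not taken from the patch.

```c
#include <stdint.h>

/* Rounded x / 255, exact for 0 <= x <= 255 * 255. */
static uint8_t div255(uint32_t x)
{
    uint32_t t = x + 128;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Saturating 8-bit add. */
static uint8_t add_sat(uint8_t a, uint8_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint8_t)(s > 255 ? 255 : s);
}

/* Hypothetical scalar reference for one pixel.
 * src:  solid a8r8g8b8 colour, mask: a8r8g8b8 (per-channel), dest: r5g6b5.
 * Steps 1, 2 and 8 (fetch/store) are the caller's job here. */
uint16_t over_n_8888_0565_ca_pixel(uint32_t src, uint32_t mask, uint16_t dest)
{
    uint8_t sa = (uint8_t)(src >> 24);
    uint8_t r5 = (dest >> 11) & 0x1f;
    uint8_t g6 = (dest >> 5) & 0x3f;
    uint8_t b5 = dest & 0x1f;
    uint8_t d[3], out[3];
    int c;

    /* Step 4: convert dest to x888 (assumed: replicate the top bits). */
    d[0] = (uint8_t)((r5 << 3) | (r5 >> 2));
    d[1] = (uint8_t)((g6 << 2) | (g6 >> 4));
    d[2] = (uint8_t)((b5 << 3) | (b5 >> 2));

    for (c = 0; c < 3; c++) {
        int shift = 16 - 8 * c;                 /* R, G, B byte positions */
        uint8_t s = (uint8_t)(src >> shift);
        uint8_t m = (uint8_t)(mask >> shift);

        /* Step 3: combine_mask_ca. */
        uint8_t sm = div255((uint32_t)s * m);   /* source term            */
        uint8_t am = div255((uint32_t)sa * m);  /* per-channel alpha      */

        /* Step 5: combine_over_ca part A (16-bit product). */
        uint32_t prod = (uint32_t)d[c] * (255u - am);

        /* Step 6: combine_over_ca part B (normalize, saturating add). */
        out[c] = add_sat(sm, div255(prod));
    }

    /* Step 7: convert result to 0565. */
    return (uint16_t)(((out[0] & 0xf8) << 8) |
                      ((out[1] & 0xfc) << 3) |
                      (out[2] >> 3));
}
```

With a zero mask the destination pixel comes back unchanged, which is a quick sanity check for any reordering of the steps.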
Your version is the case with
head = (3, 4, 5)
tail = (6, 7)
tail_head = (6, 7, 3, 4, 5, 8), with 1 and 2 in the middle of block 6
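To make the head / tail / tail_head split concrete, the following C sketch records the order in which the numbered steps would run over several pixel blocks. It is a simplified model under stated assumptions: the loop shape (prologue with head, steady state of tail_head, epilogue with tail) is how I understand the pipelined loop to be assembled, fetches 1 and 2 are placed right after step 6 rather than interleaved with its instructions, and `trace_entry`, `emit` and `pipelined_schedule` are hypothetical names.

```c
#include <stddef.h>

/* One trace entry: which step (1..8 from the list above) ran on which block. */
typedef struct {
    int step;
    int block;
} trace_entry;

/* Append an entry if there is room; always returns the new logical count. */
static size_t emit(trace_entry *out, size_t n, size_t cap, int step, int block)
{
    if (n < cap) {
        out[n].step = step;
        out[n].block = block;
    }
    return n + 1;
}

/* Hypothetical model of the software-pipelined loop.
 * Returns the number of entries the schedule contains. */
size_t pipelined_schedule(int nblocks, trace_entry *out, size_t cap)
{
    size_t n = 0;
    int i, s;

    if (nblocks <= 0)
        return 0;

    /* Prologue: fetch block 0, then head = (3, 4, 5). */
    n = emit(out, n, cap, 1, 0);
    n = emit(out, n, cap, 2, 0);
    for (s = 3; s <= 5; s++)
        n = emit(out, n, cap, s, 0);

    /* Steady state: tail_head = tail of block i overlapped with
     * the fetches and head of block i + 1, then the store of block i. */
    for (i = 0; i + 1 < nblocks; i++) {
        n = emit(out, n, cap, 6, i);
        n = emit(out, n, cap, 1, i + 1);
        n = emit(out, n, cap, 2, i + 1);
        n = emit(out, n, cap, 7, i);
        for (s = 3; s <= 5; s++)
            n = emit(out, n, cap, s, i + 1);
        n = emit(out, n, cap, 8, i);
    }

    /* Epilogue: tail = (6, 7) for the last block, then its store. */
    for (s = 6; s <= 8; s++)
        n = emit(out, n, cap, s, nblocks - 1);

    return n;
}
```

Every block still gets all eight steps in dependency order; only the interleaving between neighbouring blocks changes.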
We can figure out the input/output/temp registers of each block,
so the dependency chain and critical path can be identified.
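One way to do this bookkeeping mechanically is sketched below in C: each instruction is described by the set of D registers it writes and reads (with qN aliasing d(2N) and d(2N+1)), and a helper reports how far back the nearest producer of an input sits. This is only an illustration; `insn`, `raw_distance` and the `step7` table are hypothetical, and a short distance merely hints at a possible bubble, since real stall cycles depend on the core's instruction latencies, which are not modelled here.

```c
#include <stdint.h>

/* Register sets as bitmasks over d0..d31; qN aliases d(2N) and d(2N+1). */
#define D(n) (UINT32_C(1) << (n))
#define Q(n) (D(2 * (n)) | D(2 * (n) + 1))

typedef struct {
    const char *text;   /* instruction as written            */
    uint32_t writes;    /* D registers written                */
    uint32_t reads;     /* D registers read                   */
} insn;

/* Distance, in instructions, from code[i] back to the nearest earlier
 * instruction that writes a register code[i] reads (read-after-write).
 * Returns 0 if no producer is found in the list. */
int raw_distance(const insn *code, int i)
{
    int j;
    for (j = i - 1; j >= 0; j--)
        if (code[j].writes & code[i].reads)
            return i - j;
    return 0;
}

/* The 0565 conversion sequence from the tail_head macro under discussion.
 * vsri reads its destination as well, because it only inserts bits. */
static const insn step7[] = {
    { "vshll.u8 q14, d18, #8",  Q(14), D(18) },
    { "vshll.u8 q10, d17, #8",  Q(10), D(17) },
    { "vshll.u8 q15, d16, #8",  Q(15), D(16) },
    { "vsri.u16 q14, q10, #5",  Q(14), Q(14) | Q(10) },
    { "vsri.u16 q14, q15, #11", Q(14), Q(14) | Q(15) },
};
```

For this table, the last `vsri` has a distance of 1 (it consumes q14 from the instruction immediately before it), which is the kind of back-to-back dependency worth separating with unrelated work.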
Let's look at the core tail_head block:
.macro n_8888_0565_ca_tail_head
    /* 6. combine_over_ca part B */
    vrshr.u16   q10, q6, #8
    vrshr.u16   q14, q7, #8
    /* 1. fetch dest */
    vrshr.u16   q15, q11, #8
    vraddhn.u16 d16, q10, q6
    vraddhn.u16 d17, q14, q7
    vraddhn.u16 d18, q15, q11
    /* 2. fetch mask */
    /* bubble if above block 2 does not exist */
    vqadd.u8    q8, q0, q8
    /* bubble if above block 2 does not exist */
    vqadd.u8    d18, d2, d18
    /* bubble with following block 7 */
    /* 7. convert result to 0565 */
    vshll.u8    q14, d18, #8
    vshll.u8    q10, d17, #8
    vshll.u8    q15, d16, #8
    vsri.u16    q14, q10, #5
    /* bubble */
    vsri.u16    q14, q15, #11
    cache_preload 8, 8
    /* 3. combine_mask_ca */
    /* 4. convert dest to x888 */
    /* 5. combine_over_ca part A */
    /* 8. store destination */
.endm
I marked the bubbles that I could find.
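For readers less familiar with these instructions, the per-lane arithmetic of steps 6 and 7 can be modelled in scalar C as below. The `lane_*` helpers follow my understanding of the NEON instruction semantics; `step6_lane` and `step7_lane` are hypothetical names. The channel order (d18 = red, d17 = green, d16 = blue) is inferred from the shift amounts, and treating the first operand of vqadd as the already-masked source term is an assumption.

```c
#include <stdint.h>

/* vrshr.u16 qD, qM, #8: rounding shift right by 8. */
static uint16_t lane_vrshr_u16_8(uint16_t x)
{
    return (uint16_t)(((uint32_t)x + 128u) >> 8);
}

/* vraddhn.u16 dD, qN, qM: rounding add, keep the high byte of the sum. */
static uint8_t lane_vraddhn_u16(uint16_t a, uint16_t b)
{
    return (uint8_t)(((uint32_t)a + b + 128u) >> 8);
}

/* vqadd.u8: saturating add. */
static uint8_t lane_vqadd_u8(uint8_t a, uint8_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint8_t)(s > 255u ? 255u : s);
}

/* vshll.u8 qD, dM, #8: widen to 16 bits and shift left by 8. */
static uint16_t lane_vshll_u8_8(uint8_t x)
{
    return (uint16_t)((uint32_t)x << 8);
}

/* vsri.u16 qD, qM, #n: shift m right by n and insert it into d,
 * keeping the top n bits of d (1 <= n <= 15 in this sketch). */
static uint16_t lane_vsri_u16(uint16_t d, uint16_t m, int n)
{
    uint16_t keep = (uint16_t)(0xFFFFu << (16 - n));
    return (uint16_t)((d & keep) | (m >> n));
}

/* Step 6 for one channel: vrshr + vraddhn give a rounded prod / 255
 * (for prod <= 255 * 255), then vqadd adds the other 8-bit term. */
uint8_t step6_lane(uint16_t prod, uint8_t add)
{
    uint8_t q = lane_vraddhn_u16(lane_vrshr_u16_8(prod), prod);
    return lane_vqadd_u8(add, q);
}

/* Step 7 for one pixel: three vshll and two vsri pack 8-bit
 * r, g, b into r5g6b5. */
uint16_t step7_lane(uint8_t r, uint8_t g, uint8_t b)
{
    uint16_t px = lane_vshll_u8_8(r);
    px = lane_vsri_u16(px, lane_vshll_u8_8(g), 5);
    px = lane_vsri_u16(px, lane_vshll_u8_8(b), 11);
    return px;
}
```

The model makes the chains visible: each vraddhn needs its matching vrshr result, each vqadd needs the vraddhn results, and the second vsri needs the first, so these are the places where independent instructions are needed in between.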
Here we can make step 3 independent of (or less dependent on) steps 6 and
7 above by allocating registers properly.
Then some instructions from step 3 can be inserted into the bubble
positions above.
The output of step 1 (fetch dest) is read in step 4, and the output of step
2 (fetch mask) is read in step 3.
So I think you can fetch the mask first and then dest at the beginning of
the tail_head block, and the remaining bubbles can be filled with
instructions from step 3.
This may not work, or there may be better ways to achieve
optimal performance.
--
Best Regards,
Taekyun Kim