[Pixman] [RFC] Performance reporting capabilities for pixman?

Siarhei Siamashka siarhei.siamashka at gmail.com
Wed Dec 15 07:55:27 PST 2010


On Wednesday 27 October 2010 02:52:26 Siarhei Siamashka wrote:
> The slow path reporting code discovers some interesting things, for example
> 'over_n_8_8' fast path seems to be needed for the Firefox browser when
> opening http://pandaboard.org/ page:
> 
> Oct 27 02:38:15 i7 firefox: pixman slow path: op=3 s=00010000|002E2A7F
> m=08018000|002F0A7F d=08018000|002E0A7F - 99/45254 (30.818 MPix)
> OVER
>     solid                 a8                    a8
>     -- src --             -- mask --            -- dest --
>     NARROW_FORMAT         NARROW_FORMAT         NARROW_FORMAT
>     NO_ACCESSORS          NO_ACCESSORS          NO_ACCESSORS
>     NO_ALPHA_MAP          NO_ALPHA_MAP          NO_ALPHA_MAP
>     UNIFIED_ALPHA         UNIFIED_ALPHA         UNIFIED_ALPHA
>     NO_NORMAL_REPEAT      NO_NORMAL_REPEAT      NO_NORMAL_REPEAT
>     NO_PAD_REPEAT         NO_PAD_REPEAT         NO_PAD_REPEAT
>     NO_REFLECT_REPEAT     NO_REFLECT_REPEAT     NO_REFLECT_REPEAT
>     NEAREST_FILTER        NEAREST_FILTER        NEAREST_FILTER
>     NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
>     AFFINE_TRANSFORM      AFFINE_TRANSFORM      AFFINE_TRANSFORM
>     ID_TRANSFORM          ID_TRANSFORM          ID_TRANSFORM
>     X_UNIT_POSITIVE       X_UNIT_POSITIVE       X_UNIT_POSITIVE
>     Y_UNIT_ZERO           Y_UNIT_ZERO           Y_UNIT_ZERO
>     IS_OPAQUE             SAMPLES_COVER_CLIP

At least this one has been optimized for ARM NEON in pixman git master
recently, along with some others.
 
> Surely there are some other not yet optimized pixman usage cases which can
> be encountered in the wild. And revisiting cairo traces may make sense too
> in order to make sure that we have all the optimizations which could be
> easily done.
> 
> As there are no more comments/opinions, I'm going to prepare some more or
> less final patches based on what we have now. They will be posted to the
> mailing list shortly.

The final variant of this code may need to wait because I don't quite like how
it looks. I would also like to test it more by using it in practice to hunt
for pixman slow paths and see whether it is effective.

Anyway, I ran the cairo-perf-trace benchmark with all the fast paths disabled
just to see what kinds of operations are used and how heavily. This may help
when introducing optimizations for a new platform or when looking for
opportunities to improve the performance of the existing optimizations.
A short snippet of the most heavily used operations is listed at the end, and
a full log is attached.

Basically, all the operations fall into several groups, ordered by complexity:
1. non-scaled operations - easy to implement, except maybe for some cases
involving the a1 format
2. nearest scaling without a mask - also easy to implement because the main
loop template code is now available in 'pixman-fast-path.h', with the
possibility to override single scanline processing
3. two variants of nearest scaling with a mask: an a8 mask with the
SAMPLES_COVER_CLIP flag (the most heavily used cases) and just a solid mask
(the rest of the cases). Support for both of these is reasonably easy to add
to the existing main loop template.
4. bilinear scaling, which eventually has to be SIMD optimized
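
To illustrate what "single scanline processing" means for group 2, here is a
minimal sketch of a nearest-scaled SRC scanline for 32 bpp pixels. It is not
the code from pixman; the function name and the fixed_16_16 typedef are made
up for this example, and it assumes the caller has already made sure that
every sampled position lies inside the source row.

```c
#include <stdint.h>

/* 16.16 fixed point: upper 16 bits are the integer part (name is hypothetical). */
typedef int32_t fixed_16_16;

/*
 * Hypothetical scanline worker: copy 'width' pixels into dst, sampling src
 * at positions vx, vx + unit_x, vx + 2 * unit_x, ... (all in 16.16 fixed
 * point) and taking the pixel whose index is the integer part of the
 * position. The caller must guarantee that every index stays inside the
 * source row.
 */
static void
scaled_nearest_scanline_src_32 (uint32_t       *dst,
                                const uint32_t *src,
                                int             width,
                                fixed_16_16     vx,
                                fixed_16_16     unit_x)
{
    while (width-- > 0)
    {
        *dst++ = src[vx >> 16];
        vx += unit_x;
    }
}
```

A main loop would call such a function once per destination row, after
picking the source row for that y coordinate. Other operations or pixel
formats would only need a different inner statement.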

REFLECT repeat does not seem to be used anywhere (neither in the
cairo-perf-trace logs, nor in real applications on my typical Linux desktop).
So is it even worth adding any optimizations for it in pixman, considering
that it is more complex than the other types of repeat? Yes, that's somewhat
similar to rotation, which almost nobody uses, but at least it is easier to
imagine some valid use cases for rotation.
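
For comparison, the coordinate mapping of the three repeat types can be
sketched as below. The helper names are made up for this example and work on
plain integer pixel coordinates. The extra trouble with REFLECT is that every
other tile is traversed backwards, so a scanline loop cannot simply keep
stepping forward through the source.

```c
/* NORMAL repeat: tile the image; the modulo is adjusted so x < 0 works too. */
static int
repeat_normal (int x, int size)
{
    int m = x % size;
    return m < 0 ? m + size : m;
}

/* PAD repeat: clamp to the nearest edge pixel. */
static int
repeat_pad (int x, int size)
{
    if (x < 0)
        return 0;
    if (x >= size)
        return size - 1;
    return x;
}

/* REFLECT repeat: tile with period 2 * size and mirror the second half. */
static int
repeat_reflect (int x, int size)
{
    int m = repeat_normal (x, 2 * size);
    return m < size ? m : 2 * size - 1 - m;
}
```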

The other things not covered in this log are gradients. And gradients 
contribute a lot to the performance of some cairo traces. But they are
another story.


Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=1 s=20020888|000F6AFF 
m=00000000|00000000 d=20020888|000E4AFF - 52/18175 (1897.906 MPix)
SRC
    x8r8g8b8              null                  x8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT                               NARROW_FORMAT        
    NO_ACCESSORS                                NO_ACCESSORS         
    NO_ALPHA_MAP                                NO_ALPHA_MAP         
    UNIFIED_ALPHA                               UNIFIED_ALPHA        
    NO_NORMAL_REPEAT                            NO_NORMAL_REPEAT     
    NO_PAD_REPEAT                               NO_PAD_REPEAT        
    NO_REFLECT_REPEAT                           NO_REFLECT_REPEAT    
    NEAREST_FILTER                              NEAREST_FILTER       
    NO_CONVOLUTION_FILTER                       NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM                            AFFINE_TRANSFORM     
    ID_TRANSFORM                                ID_TRANSFORM         
    X_UNIT_POSITIVE                             X_UNIT_POSITIVE      
    Y_UNIT_ZERO                                 Y_UNIT_ZERO          
    IS_OPAQUE                                   SAMPLES_OPAQUE       
    SAMPLES_COVER_CLIP                                               
    SAMPLES_OPAQUE                                                   

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F 
m=08018000|000F4A7F d=20020888|000E4AFF - 8/1168 (1279.996 MPix)
OVER
    solid                 a8                    x8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT         NARROW_FORMAT         NARROW_FORMAT        
    NO_ACCESSORS          NO_ACCESSORS          NO_ACCESSORS         
    NO_ALPHA_MAP          NO_ALPHA_MAP          NO_ALPHA_MAP         
    UNIFIED_ALPHA         UNIFIED_ALPHA         UNIFIED_ALPHA        
    NO_NORMAL_REPEAT      NO_NORMAL_REPEAT      NO_NORMAL_REPEAT     
    NO_PAD_REPEAT         NO_PAD_REPEAT         NO_PAD_REPEAT        
    NO_REFLECT_REPEAT     NO_REFLECT_REPEAT     NO_REFLECT_REPEAT    
    NEAREST_FILTER        NEAREST_FILTER        NEAREST_FILTER       
    NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM      AFFINE_TRANSFORM      AFFINE_TRANSFORM     
    ID_TRANSFORM          ID_TRANSFORM          ID_TRANSFORM         
    X_UNIT_POSITIVE       X_UNIT_POSITIVE       X_UNIT_POSITIVE      
    Y_UNIT_ZERO           Y_UNIT_ZERO           Y_UNIT_ZERO          
    IS_OPAQUE             SAMPLES_COVER_CLIP    SAMPLES_OPAQUE       

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F 
m=08018000|000F4A7F d=20028888|000E4A7F - 3/638 (702.111 MPix)
OVER
    solid                 a8                    a8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT         NARROW_FORMAT         NARROW_FORMAT        
    NO_ACCESSORS          NO_ACCESSORS          NO_ACCESSORS         
    NO_ALPHA_MAP          NO_ALPHA_MAP          NO_ALPHA_MAP         
    UNIFIED_ALPHA         UNIFIED_ALPHA         UNIFIED_ALPHA        
    NO_NORMAL_REPEAT      NO_NORMAL_REPEAT      NO_NORMAL_REPEAT     
    NO_PAD_REPEAT         NO_PAD_REPEAT         NO_PAD_REPEAT        
    NO_REFLECT_REPEAT     NO_REFLECT_REPEAT     NO_REFLECT_REPEAT    
    NEAREST_FILTER        NEAREST_FILTER        NEAREST_FILTER       
    NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM      AFFINE_TRANSFORM      AFFINE_TRANSFORM     
    ID_TRANSFORM          ID_TRANSFORM          ID_TRANSFORM         
    X_UNIT_POSITIVE       X_UNIT_POSITIVE       X_UNIT_POSITIVE      
    Y_UNIT_ZERO           Y_UNIT_ZERO           Y_UNIT_ZERO          
    IS_OPAQUE             SAMPLES_COVER_CLIP                         

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F 
m=20028888|000F497F d=20020888|000E4AFF - 7/203 (677.611 MPix)
OVER
    solid                 a8r8g8b8              x8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT         NARROW_FORMAT         NARROW_FORMAT        
    NO_ACCESSORS          NO_ACCESSORS          NO_ACCESSORS         
    NO_ALPHA_MAP          COMPONENT_ALPHA       NO_ALPHA_MAP         
    UNIFIED_ALPHA         NO_ALPHA_MAP          UNIFIED_ALPHA        
    NO_NORMAL_REPEAT      NO_NORMAL_REPEAT      NO_NORMAL_REPEAT     
    NO_PAD_REPEAT         NO_PAD_REPEAT         NO_PAD_REPEAT        
    NO_REFLECT_REPEAT     NO_REFLECT_REPEAT     NO_REFLECT_REPEAT    
    NEAREST_FILTER        NEAREST_FILTER        NEAREST_FILTER       
    NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM      AFFINE_TRANSFORM      AFFINE_TRANSFORM     
    ID_TRANSFORM          ID_TRANSFORM          ID_TRANSFORM         
    X_UNIT_POSITIVE       X_UNIT_POSITIVE       X_UNIT_POSITIVE      
    Y_UNIT_ZERO           Y_UNIT_ZERO           Y_UNIT_ZERO          
    IS_OPAQUE             SAMPLES_COVER_CLIP    SAMPLES_OPAQUE       

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F 
m=01011000|000F4A7F d=20020888|000E4AFF - 21/1657 (610.733 MPix)
OVER
    solid                 a1                    x8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT         NARROW_FORMAT         NARROW_FORMAT        
    NO_ACCESSORS          NO_ACCESSORS          NO_ACCESSORS         
    NO_ALPHA_MAP          NO_ALPHA_MAP          NO_ALPHA_MAP         
    UNIFIED_ALPHA         UNIFIED_ALPHA         UNIFIED_ALPHA        
    NO_NORMAL_REPEAT      NO_NORMAL_REPEAT      NO_NORMAL_REPEAT     
    NO_PAD_REPEAT         NO_PAD_REPEAT         NO_PAD_REPEAT        
    NO_REFLECT_REPEAT     NO_REFLECT_REPEAT     NO_REFLECT_REPEAT    
    NEAREST_FILTER        NEAREST_FILTER        NEAREST_FILTER       
    NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM      AFFINE_TRANSFORM      AFFINE_TRANSFORM     
    ID_TRANSFORM          ID_TRANSFORM          ID_TRANSFORM         
    X_UNIT_POSITIVE       X_UNIT_POSITIVE       X_UNIT_POSITIVE      
    Y_UNIT_ZERO           Y_UNIT_ZERO           Y_UNIT_ZERO          
    IS_OPAQUE             SAMPLES_COVER_CLIP    SAMPLES_OPAQUE       

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=1 s=20020888|000F6AFF 
m=00000000|00000000 d=20028888|000E4A7F - 198/202467 (564.478 MPix)
SRC
    x8r8g8b8              null                  a8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT                               NARROW_FORMAT        
    NO_ACCESSORS                                NO_ACCESSORS         
    NO_ALPHA_MAP                                NO_ALPHA_MAP         
    UNIFIED_ALPHA                               UNIFIED_ALPHA        
    NO_NORMAL_REPEAT                            NO_NORMAL_REPEAT     
    NO_PAD_REPEAT                               NO_PAD_REPEAT        
    NO_REFLECT_REPEAT                           NO_REFLECT_REPEAT    
    NEAREST_FILTER                              NEAREST_FILTER       
    NO_CONVOLUTION_FILTER                       NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM                            AFFINE_TRANSFORM     
    ID_TRANSFORM                                ID_TRANSFORM         
    X_UNIT_POSITIVE                             X_UNIT_POSITIVE      
    Y_UNIT_ZERO                                 Y_UNIT_ZERO          
    IS_OPAQUE                                                        
    SAMPLES_COVER_CLIP                                               
    SAMPLES_OPAQUE                                                   

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E4A7F 
m=08018000|000F4A7F d=20028888|000E4A7F - 411/582133 (541.384 MPix)
OVER
    solid                 a8                    a8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT         NARROW_FORMAT         NARROW_FORMAT        
    NO_ACCESSORS          NO_ACCESSORS          NO_ACCESSORS         
    NO_ALPHA_MAP          NO_ALPHA_MAP          NO_ALPHA_MAP         
    UNIFIED_ALPHA         UNIFIED_ALPHA         UNIFIED_ALPHA        
    NO_NORMAL_REPEAT      NO_NORMAL_REPEAT      NO_NORMAL_REPEAT     
    NO_PAD_REPEAT         NO_PAD_REPEAT         NO_PAD_REPEAT        
    NO_REFLECT_REPEAT     NO_REFLECT_REPEAT     NO_REFLECT_REPEAT    
    NEAREST_FILTER        NEAREST_FILTER        NEAREST_FILTER       
    NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM      AFFINE_TRANSFORM      AFFINE_TRANSFORM     
    ID_TRANSFORM          ID_TRANSFORM          ID_TRANSFORM         
    X_UNIT_POSITIVE       X_UNIT_POSITIVE       X_UNIT_POSITIVE      
    Y_UNIT_ZERO           Y_UNIT_ZERO           Y_UNIT_ZERO          
                          SAMPLES_COVER_CLIP                         

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=12 s=20028888|000F497F 
m=00000000|00000000 d=20028888|000E497F - 4/24 (511.706 MPix)
ADD
    a8r8g8b8              null                  a8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT                               NARROW_FORMAT        
    NO_ACCESSORS                                NO_ACCESSORS         
    COMPONENT_ALPHA                             COMPONENT_ALPHA      
    NO_ALPHA_MAP                                NO_ALPHA_MAP         
    NO_NORMAL_REPEAT                            NO_NORMAL_REPEAT     
    NO_PAD_REPEAT                               NO_PAD_REPEAT        
    NO_REFLECT_REPEAT                           NO_REFLECT_REPEAT    
    NEAREST_FILTER                              NEAREST_FILTER       
    NO_CONVOLUTION_FILTER                       NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM                            AFFINE_TRANSFORM     
    ID_TRANSFORM                                ID_TRANSFORM         
    X_UNIT_POSITIVE                             X_UNIT_POSITIVE      
    Y_UNIT_ZERO                                 Y_UNIT_ZERO          
    SAMPLES_COVER_CLIP                                               

Dec  1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=20028888|000E9E7E 
m=00000000|00000000 d=20020888|000E4AFF - 243/247950 (485.485 MPix)
OVER
    a8r8g8b8              null                  x8r8g8b8             
    -- src --             -- mask --            -- dest --           
    NARROW_FORMAT                               NARROW_FORMAT        
    NO_ACCESSORS                                NO_ACCESSORS         
    NO_ALPHA_MAP                                NO_ALPHA_MAP         
    UNIFIED_ALPHA                               UNIFIED_ALPHA        
    NO_NONE_REPEAT                              NO_NORMAL_REPEAT     
    NO_PAD_REPEAT                               NO_PAD_REPEAT        
    NO_REFLECT_REPEAT                           NO_REFLECT_REPEAT    
    NEAREST_FILTER                              NEAREST_FILTER       
    NO_CONVOLUTION_FILTER                       NO_CONVOLUTION_FILTER
    AFFINE_TRANSFORM                            AFFINE_TRANSFORM     
    HAS_TRANSFORM                               ID_TRANSFORM         
    SCALE_TRANSFORM                             X_UNIT_POSITIVE      
    X_UNIT_POSITIVE                             Y_UNIT_ZERO          
    Y_UNIT_ZERO                                 SAMPLES_OPAQUE       
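
In the entries above, each of the s=, m= and d= fields is a pair of hex words,
format code first and flags second. The format codes can be unpacked by hand.
The sketch below assumes the bit layout of the public PIXMAN_FORMAT macro from
pixman.h as it was when this log was taken (bpp in the top byte, then the
type, then four bits each for a, r, g, b). The struct and function names are
made up for this example, and the flags word is only parsed, not decoded.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the unpacked fields of a format code. */
struct format_fields
{
    unsigned bpp, type, a, r, g, b;
};

/* Unpack a format code, assuming the classic PIXMAN_FORMAT bit layout. */
static struct format_fields
decode_format (uint32_t code)
{
    struct format_fields f;

    f.bpp  = (code >> 24) & 0xff;
    f.type = (code >> 16) & 0xff;
    f.a    = (code >> 12) & 0x0f;
    f.r    = (code >> 8) & 0x0f;
    f.g    = (code >> 4) & 0x0f;
    f.b    = code & 0x0f;
    return f;
}

/*
 * Parse one field such as "d=20020888|000E4AFF" into its format code and
 * flags word. Returns 1 on success and 0 if the text does not match.
 */
static int
parse_image_field (const char *text, uint32_t *format, uint32_t *flags)
{
    char tag;
    unsigned int fmt, flg;

    if (sscanf (text, " %c=%8x|%8x", &tag, &fmt, &flg) != 3)
        return 0;
    *format = fmt;
    *flags = flg;
    return 1;
}
```

For example, d=20020888 unpacks to 32 bpp with a=0, r=8, g=8, b=8, which is
the x8r8g8b8 shown in the table, and m=08018000 unpacks to 8 bpp with a=8,
which is a8.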


-- 
Best regards,
Siarhei Siamashka
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cairo-perf-trace-all-fast-path.txt.gz
Type: application/x-gzip
Size: 2437 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/pixman/attachments/20101215/c2ce482e/attachment.bin>