[Pixman] [RFC] Performance reporting capabilities for pixman?
Siarhei Siamashka
siarhei.siamashka at gmail.com
Wed Dec 15 07:55:27 PST 2010
On Wednesday 27 October 2010 02:52:26 Siarhei Siamashka wrote:
> The slow path reporting code discovers some interesting things, for example
> 'over_n_8_8' fast path seems to be needed for the Firefox browser when
> opening http://pandaboard.org/ page:
>
> Oct 27 02:38:15 i7 firefox: pixman slow path: op=3 s=00010000|002E2A7F
> m=08018000|002F0A7F d=08018000|002E0A7F - 99/45254 (30.818 MPix)
> OVER
> solid a8 a8
> -- src -- -- mask -- -- dest --
> NARROW_FORMAT NARROW_FORMAT NARROW_FORMAT
> NO_ACCESSORS NO_ACCESSORS NO_ACCESSORS
> NO_ALPHA_MAP NO_ALPHA_MAP NO_ALPHA_MAP
> UNIFIED_ALPHA UNIFIED_ALPHA UNIFIED_ALPHA
> NO_NORMAL_REPEAT NO_NORMAL_REPEAT NO_NORMAL_REPEAT
> NO_PAD_REPEAT NO_PAD_REPEAT NO_PAD_REPEAT
> NO_REFLECT_REPEAT NO_REFLECT_REPEAT NO_REFLECT_REPEAT
> NEAREST_FILTER NEAREST_FILTER NEAREST_FILTER
> NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
> AFFINE_TRANSFORM AFFINE_TRANSFORM AFFINE_TRANSFORM
> ID_TRANSFORM ID_TRANSFORM ID_TRANSFORM
> X_UNIT_POSITIVE X_UNIT_POSITIVE X_UNIT_POSITIVE
> Y_UNIT_ZERO Y_UNIT_ZERO Y_UNIT_ZERO
> IS_OPAQUE SAMPLES_COVER_CLIP
At least this one has been optimized for ARM NEON in pixman git master
recently, along with some others.
> Surely there are some other not yet optimized pixman usage cases which can
> be encountered in the wild. And revisiting cairo traces may make sense too
> in order to make sure that we have all the optimizations which could be
> easily done.
>
> As there are no more comments/opinions, I'm going to prepare some more or
> less final patches based on what we have now. They will be posted to the
> mailing list shortly.
The final variant of this code may need to wait because I don't quite like how
it looks. I would also like to test it more by using it in practice to hunt
for some pixman slow paths and see whether it is effective.
Anyway, I did run the cairo-perf-trace benchmark with all the fast paths
disabled, just to see which operations are used and how heavily. It may help
when introducing optimizations for a new platform or when looking for
opportunities to improve the performance of the existing optimizations.
A short snippet of the most heavily used operations is listed at the end, and
a full log is attached.
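For anyone who wants to rank the entries of such a log by pixel count, here is
a minimal sketch in Python. It is not part of the reporting patch. It assumes
only the header layout visible in the entries quoted in this message
(op=, s=, m=, d=, then a pair of counters and a MPix figure), possibly wrapped
over two lines. The second hex word of each pair and the two counters are
matched but not interpreted, and the names SlowPath and parse_slow_paths are
made up for this illustration.

```python
import re
from typing import NamedTuple

# Header of one "pixman slow path" entry. \s also matches the line break
# that syslog wrapping puts in the middle of the header.
HEADER_RE = re.compile(
    r"pixman slow path: op=(?P<op>\d+)\s+"
    r"s=(?P<src>[0-9A-Fa-f]{8})\|\s*[0-9A-Fa-f]{8}\s+"
    r"m=(?P<mask>[0-9A-Fa-f]{8})\|\s*[0-9A-Fa-f]{8}\s+"
    r"d=(?P<dst>[0-9A-Fa-f]{8})\|\s*[0-9A-Fa-f]{8}\s+"
    r"-\s+\d+/\d+\s+\((?P<mpix>\d+(?:\.\d+)?) MPix\)"
)


class SlowPath(NamedTuple):
    op: int       # operator number as printed after "op="
    src: int      # first hex word of the s= pair
    mask: int     # first hex word of the m= pair
    dst: int      # first hex word of the d= pair
    mpix: float   # megapixel figure printed in parentheses


def parse_slow_paths(text):
    """Return all slow path headers found in text, largest MPix first."""
    entries = [
        SlowPath(
            int(m["op"]),
            int(m["src"], 16),
            int(m["mask"], 16),
            int(m["dst"], 16),
            float(m["mpix"]),
        )
        for m in HEADER_RE.finditer(text)
    ]
    return sorted(entries, key=lambda e: e.mpix, reverse=True)


# Three headers copied from the log snippet in this message.
SAMPLE = """\
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F
m=08018000|000F4A7F d=20020888|000E4AFF - 8/1168 (1279.996 MPix)
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=1 s=20020888|000F6AFF
m=00000000|00000000 d=20020888|000E4AFF - 52/18175 (1897.906 MPix)
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=12 s=20028888|
000F497F m=00000000|00000000 d=20028888|000E497F - 4/24 (511.706 MPix)
"""

for entry in parse_slow_paths(SAMPLE):
    print("op=%d src=%08X mask=%08X dst=%08X %.3f MPix"
          % (entry.op, entry.src, entry.mask, entry.dst, entry.mpix))
```

Sorting by the MPix figure is just one possible ordering; it matches how the
snippet at the end of this message appears to be ordered.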
Basically, all the operations fall into several groups, in order of increasing
complexity:
1. nonscaled operations - easy to implement, except maybe for some cases
involving the a1 format
2. nearest scaling without mask - also easy to implement, because the main loop
template code is now available in 'pixman-fast-path.h', with the possibility to
override single scanline processing
3. two variants of nearest scaling with mask: a8 mask with the
SAMPLES_COVER_CLIP flag (the most heavily used cases) and just a solid mask
(the rest of the cases). Support for both of these is reasonably easy to add to
the existing main loop template.
4. bilinear scaling, which eventually has to be SIMD optimized
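The grouping above could be applied mechanically to the flag names printed in
the log. The following Python sketch is only an illustration of that idea, not
code from pixman. The flag names ID_TRANSFORM, SCALE_TRANSFORM, NEAREST_FILTER
and SAMPLES_COVER_CLIP are taken from the log entries in this message;
BILINEAR_FILTER does not appear in the snippet and is an assumed name, and the
function classify is hypothetical.

```python
def classify(src_flags, mask_format=None, mask_flags=frozenset()):
    """Map one logged operation to one of the four groups.

    src_flags   -- set of flag names from the "src" column
    mask_format -- format name of the mask ("null", "solid", "a8", ...)
    mask_flags  -- set of flag names from the "mask" column
    """
    # Identity transform: no scaling involved at all.
    if "ID_TRANSFORM" in src_flags:
        return "1: nonscaled"
    # Assumed flag name, not seen in the quoted log snippet.
    if "BILINEAR_FILTER" in src_flags:
        return "4: bilinear scaling"
    if "SCALE_TRANSFORM" in src_flags and "NEAREST_FILTER" in src_flags:
        if mask_format in (None, "null"):
            return "2: nearest scaling, no mask"
        if mask_format == "a8" and "SAMPLES_COVER_CLIP" in mask_flags:
            return "3: nearest scaling, a8 mask"
        if mask_format == "solid":
            return "3: nearest scaling, solid mask"
    return "unclassified"


# Source flags of a scaled OVER a8r8g8b8 -> x8r8g8b8 entry, as an example.
scaled_src = {"NEAREST_FILTER", "AFFINE_TRANSFORM", "HAS_TRANSFORM",
              "SCALE_TRANSFORM", "X_UNIT_POSITIVE", "Y_UNIT_ZERO"}
print(classify(scaled_src, "null"))
```

Anything that does not match one of the four groups is reported as
"unclassified" rather than guessed at.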
REFLECT repeat does not seem to be used anywhere (neither in cairo-perf-trace
logs, nor in real applications during my typical Linux desktop use). So is it
even worth optimizing in pixman, considering that it is more complex than the
other types of repeat? Yes, that's somewhat similar to rotation, which almost
nobody uses, but at least it is easier to imagine some valid use cases for
rotation.
The other thing not covered in this log is gradients, and gradients
contribute a lot to the performance of some cairo traces. But they are
another story.
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=1 s=20020888|000F6AFF
m=00000000|00000000 d=20020888|000E4AFF - 52/18175 (1897.906 MPix)
SRC
x8r8g8b8 null x8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP NO_ALPHA_MAP
UNIFIED_ALPHA UNIFIED_ALPHA
NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO
IS_OPAQUE SAMPLES_OPAQUE
SAMPLES_COVER_CLIP
SAMPLES_OPAQUE
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F
m=08018000|000F4A7F d=20020888|000E4AFF - 8/1168 (1279.996 MPix)
OVER
solid a8 x8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP NO_ALPHA_MAP NO_ALPHA_MAP
UNIFIED_ALPHA UNIFIED_ALPHA UNIFIED_ALPHA
NO_NORMAL_REPEAT NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO Y_UNIT_ZERO
IS_OPAQUE SAMPLES_COVER_CLIP SAMPLES_OPAQUE
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F
m=08018000|000F4A7F d=20028888|000E4A7F - 3/638 (702.111 MPix)
OVER
solid a8 a8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP NO_ALPHA_MAP NO_ALPHA_MAP
UNIFIED_ALPHA UNIFIED_ALPHA UNIFIED_ALPHA
NO_NORMAL_REPEAT NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO Y_UNIT_ZERO
IS_OPAQUE SAMPLES_COVER_CLIP
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F
m=20028888|000F497F d=20020888|000E4AFF - 7/203 (677.611 MPix)
OVER
solid a8r8g8b8 x8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP COMPONENT_ALPHA NO_ALPHA_MAP
UNIFIED_ALPHA NO_ALPHA_MAP UNIFIED_ALPHA
NO_NORMAL_REPEAT NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO Y_UNIT_ZERO
IS_OPAQUE SAMPLES_COVER_CLIP SAMPLES_OPAQUE
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E6A7F
m=01011000|000F4A7F d=20020888|000E4AFF - 21/1657 (610.733 MPix)
OVER
solid a1 x8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP NO_ALPHA_MAP NO_ALPHA_MAP
UNIFIED_ALPHA UNIFIED_ALPHA UNIFIED_ALPHA
NO_NORMAL_REPEAT NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO Y_UNIT_ZERO
IS_OPAQUE SAMPLES_COVER_CLIP SAMPLES_OPAQUE
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=1 s=20020888|000F6AFF
m=00000000|00000000 d=20028888|000E4A7F - 198/202467 (564.478 MPix)
SRC
x8r8g8b8 null a8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP NO_ALPHA_MAP
UNIFIED_ALPHA UNIFIED_ALPHA
NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO
IS_OPAQUE
SAMPLES_COVER_CLIP
SAMPLES_OPAQUE
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=00010000|000E4A7F
m=08018000|000F4A7F d=20028888|000E4A7F - 411/582133 (541.384 MPix)
OVER
solid a8 a8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP NO_ALPHA_MAP NO_ALPHA_MAP
UNIFIED_ALPHA UNIFIED_ALPHA UNIFIED_ALPHA
NO_NORMAL_REPEAT NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO Y_UNIT_ZERO
SAMPLES_COVER_CLIP
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=12 s=20028888|000F497F
m=00000000|00000000 d=20028888|000E497F - 4/24 (511.706 MPix)
ADD
a8r8g8b8 null a8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS
COMPONENT_ALPHA COMPONENT_ALPHA
NO_ALPHA_MAP NO_ALPHA_MAP
NO_NORMAL_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM
ID_TRANSFORM ID_TRANSFORM
X_UNIT_POSITIVE X_UNIT_POSITIVE
Y_UNIT_ZERO Y_UNIT_ZERO
SAMPLES_COVER_CLIP
Dec 1 02:18:31 i7 cairo-perf-trace: pixman slow path: op=3 s=20028888|000E9E7E
m=00000000|00000000 d=20020888|000E4AFF - 243/247950 (485.485 MPix)
OVER
a8r8g8b8 null x8r8g8b8
-- src -- -- mask -- -- dest --
NARROW_FORMAT NARROW_FORMAT
NO_ACCESSORS NO_ACCESSORS
NO_ALPHA_MAP NO_ALPHA_MAP
UNIFIED_ALPHA UNIFIED_ALPHA
NO_NONE_REPEAT NO_NORMAL_REPEAT
NO_PAD_REPEAT NO_PAD_REPEAT
NO_REFLECT_REPEAT NO_REFLECT_REPEAT
NEAREST_FILTER NEAREST_FILTER
NO_CONVOLUTION_FILTER NO_CONVOLUTION_FILTER
AFFINE_TRANSFORM AFFINE_TRANSFORM
HAS_TRANSFORM ID_TRANSFORM
SCALE_TRANSFORM X_UNIT_POSITIVE
X_UNIT_POSITIVE Y_UNIT_ZERO
Y_UNIT_ZERO SAMPLES_OPAQUE
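
The first hex word in each s=/m=/d= pair above lines up with the format name
printed under the operator (20020888 with x8r8g8b8, 08018000 with a8, and so
on). A small Python sketch for decoding these codes follows. It assumes the
usual pixman format code layout (bpp in the top byte, then the type, then four
bits each for a, r, g, b) and only handles the A, ARGB and ABGR types; the
helper names decode_format and describe_format are made up for this example.

```python
# Type values assumed from pixman's format code layout.
PIXMAN_TYPE_A = 1
PIXMAN_TYPE_ARGB = 2
PIXMAN_TYPE_ABGR = 3


def decode_format(code):
    """Split a 32-bit format code into its bit fields."""
    return {
        "bpp": (code >> 24) & 0xFF,
        "type": (code >> 16) & 0xFF,
        "a": (code >> 12) & 0xF,
        "r": (code >> 8) & 0xF,
        "g": (code >> 4) & 0xF,
        "b": code & 0xF,
    }


def describe_format(code):
    """Return a name such as 'x8r8g8b8' for the codes seen in the log."""
    if code == 0:
        return "null"                      # no image (e.g. no mask)
    f = decode_format(code)
    if f["bpp"] == 0 and f["type"] == PIXMAN_TYPE_A:
        return "solid"                     # printed as "solid" in the log
    if f["type"] == PIXMAN_TYPE_A:
        return "a%d" % f["a"]
    if f["type"] in (PIXMAN_TYPE_ARGB, PIXMAN_TYPE_ABGR):
        color_bits = f["r"] + f["g"] + f["b"]
        if f["a"]:
            lead = "a%d" % f["a"]
        elif f["bpp"] > color_bits:
            lead = "x%d" % (f["bpp"] - color_bits)   # unused padding bits
        else:
            lead = ""
        if f["type"] == PIXMAN_TYPE_ARGB:
            return lead + "r%dg%db%d" % (f["r"], f["g"], f["b"])
        return lead + "b%dg%dr%d" % (f["b"], f["g"], f["r"])
    return "0x%08X" % code                 # type not handled by this sketch


for code in (0x20020888, 0x20028888, 0x08018000, 0x01011000, 0x00010000, 0):
    print("%08X %s" % (code, describe_format(code)))
```

Codes with other type values are printed back as hex instead of being named.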
--
Best regards,
Siarhei Siamashka
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cairo-perf-trace-all-fast-path.txt.gz
Type: application/x-gzip
Size: 2437 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/pixman/attachments/20101215/c2ce482e/attachment.bin>