[Pixman] [RFC] Performance reporting capabilities for pixman?

Wed Oct 20 05:46:57 PDT 2010

On Wednesday 20 October 2010 13:40:26 Maarten Bosmans wrote:
> 2010/10/20 Siarhei Siamashka <siarhei.siamashka at gmail.com>:
> > Here is a work-in-progress branch with the initial variant slow path
> > reporting code:
> > http://cgit.freedesktop.org/~siamashka/pixman/log/?h=perfstat-wip
> 
> I tried to compile it for Windows using mingw. It failed because
> pthread and syslog are unavailable.

Thanks for testing this early preview code. I never expected that anybody
would be interested in having it work on Windows, and had planned to
implement it only as an optional feature for Linux. It's great that there
is demand for this functionality on other systems too.

> I "solved" the pthread issue by commenting out all the mutex lines.
> (my application does all drawing from one thread) The right approach
> is probably to include a cross-platform mutex in pixman-compiler.h. At
> least the mingw part can be factored out from the existing
> PIXMAN_DEFINE_THREAD_LOCAL implementation.
> 
> For missing syslog, a patch is attached. If no syslog is present, perf
> stats are just printed to stdout. It obviously isn't a complete patch,
> as I have not altered the autotools machinery to check for syslog.h
> and set HAVE_SYSLOG_H. Also the definition of PERF_LOG is rather ugly,
> with the missing opening brace on places the macro is used. My C
> preprocessor skills aren't sufficient to make something nice of it.
> But at least the patch suggests the changes necessary to make the perf
> stats work on mingw.
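
On the mutex point, a thin wrapper along the following lines might be enough.
This is only a rough sketch: the perf_mutex_* names are hypothetical and exist
neither in pixman-compiler.h nor in the perfstat-wip branch. It assumes that
critical sections are acceptable on Windows and pthread mutexes elsewhere.

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical wrapper type: a Win32 critical section. */
typedef CRITICAL_SECTION perf_mutex_t;

static void perf_mutex_init (perf_mutex_t *m)   { InitializeCriticalSection (m); }
static void perf_mutex_lock (perf_mutex_t *m)   { EnterCriticalSection (m); }
static void perf_mutex_unlock (perf_mutex_t *m) { LeaveCriticalSection (m); }

#else
#include <pthread.h>

/* Hypothetical wrapper type: a plain pthread mutex. */
typedef pthread_mutex_t perf_mutex_t;

static void perf_mutex_init (perf_mutex_t *m)   { pthread_mutex_init (m, NULL); }
static void perf_mutex_lock (perf_mutex_t *m)   { pthread_mutex_lock (m); }
static void perf_mutex_unlock (perf_mutex_t *m) { pthread_mutex_unlock (m); }
#endif

/* Illustrative shared state: a counter guarded by the wrapper. */
static perf_mutex_t  counter_lock;
static unsigned long slow_path_hits;

/* Increment the counter under the lock and return the new value. */
static unsigned long
count_slow_path_hit (void)
{
    unsigned long result;

    perf_mutex_lock (&counter_lock);
    result = ++slow_path_hits;
    perf_mutex_unlock (&counter_lock);

    return result;
}
```

Note that a critical section has no static initializer, so perf_mutex_init
has to be called once before the first use, and that one-time call has to be
made thread safe by some other means.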

I am still not sure what the right destination for the output is (especially
on Windows). My intention with syslog was to make it easy to collect pixman
usage data from the X server. Maybe an environment variable could select
whether we use syslog, stdout, stderr or something else?
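
To make the environment variable idea a bit more concrete, here is a minimal
sketch. Everything in it is an assumption for the sake of discussion: the
variable name PIXMAN_PERF_LOG, the perf_sink_t type and the perf_* functions
are hypothetical and not part of the perfstat-wip branch. HAVE_SYSLOG_H is
the macro suggested in the patch above; when it is not defined, a request
for syslog falls back to stdout.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifdef HAVE_SYSLOG_H
#include <syslog.h>
#endif

/* Hypothetical set of output destinations. */
typedef enum
{
    PERF_SINK_SYSLOG,
    PERF_SINK_STDOUT,
    PERF_SINK_STDERR
} perf_sink_t;

/* Map a setting string to a sink. An unset or unknown value keeps
 * the syslog default. */
static perf_sink_t
perf_sink_from_string (const char *value)
{
    if (value != NULL && strcmp (value, "stdout") == 0)
        return PERF_SINK_STDOUT;
    if (value != NULL && strcmp (value, "stderr") == 0)
        return PERF_SINK_STDERR;
    return PERF_SINK_SYSLOG;
}

/* Read the (hypothetical) PIXMAN_PERF_LOG environment variable. */
static perf_sink_t
perf_sink_from_env (void)
{
    return perf_sink_from_string (getenv ("PIXMAN_PERF_LOG"));
}

/* Format one message and send it to the selected sink. */
static void
perf_log (perf_sink_t sink, const char *fmt, ...)
{
    char    buf[512];
    va_list ap;

    va_start (ap, fmt);
    vsnprintf (buf, sizeof (buf), fmt, ap);
    va_end (ap);

#ifdef HAVE_SYSLOG_H
    if (sink == PERF_SINK_SYSLOG)
    {
        syslog (LOG_INFO, "%s", buf);
        return;
    }
#endif
    /* Without syslog.h, a syslog request ends up on stdout. */
    fprintf (sink == PERF_SINK_STDERR ? stderr : stdout, "%s\n", buf);
}
```

Using a function instead of a macro would also sidestep the missing opening
brace problem with PERF_LOG, at the cost of formatting into a fixed-size
buffer.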

> > Anyway, here is an example of usage of this code at the moment. It would
> > be nice if somebody could also try it and provide some feedback. And
> > there is also a possibility that it may help to find some good candidate
> > for optimizations even now.
> 
> Below are the results for the six worst slow paths, on Windows. My app
> does scrolling text (pre rendered to a cairo image surface), with
> individual lines fading in and out when they come into the viewport.

Thanks for sharing the data. Maybe you can try to make a cairo trace from
your application somehow? So that it could be used as one of the standard
cairo benchmarks?

> Perhaps an extra empty line after the decoded result and before the
> next undecoded line would be better for readability.

Right, thanks.

> pixman slow path: op=1 s=00000002|002EAA77 m=00000000|00000000
> d=20020888|002E0AFF - 21/1260 (0,054 MPix)
> 
> SRC
>     pixbuf                null                  x8r8g8b8
>     -- src --             -- mask --            -- dest --
>     NARROW_FORMAT                               NARROW_FORMAT
>     NO_ACCESSORS                                NO_ACCESSORS
>     NO_ALPHA_MAP                                NO_ALPHA_MAP
>     UNIFIED_ALPHA                               UNIFIED_ALPHA
>     NO_NONE_REPEAT                              NO_NORMAL_REPEAT
>     NO_NORMAL_REPEAT                            NO_PAD_REPEAT
>     NO_REFLECT_REPEAT                           NO_REFLECT_REPEAT
>     NEAREST_FILTER                              NEAREST_FILTER
>     NO_CONVOLUTION_FILTER                       NO_CONVOLUTION_FILTER
>     AFFINE_TRANSFORM                            AFFINE_TRANSFORM
>     ID_TRANSFORM                                ID_TRANSFORM
>     X_UNIT_POSITIVE                             X_UNIT_POSITIVE
>     Y_UNIT_ZERO                                 Y_UNIT_ZERO
>     IS_OPAQUE                                   SAMPLES_OPAQUE

Actually, as I see it now, 'solid' and 'pixbuf' are just decoded incorrectly;
these are likely not even BITS images. So the reported "slow paths" involving
them are somewhat bogus. Still, it may be useful to check that there are no
redundant pixel data copies anywhere (i.e. that we are able to fetch directly
into the destination).

> pixman slow path: op=5 s=20028888|002F0A7F m=00000000|00000000
> d=20028888|002E0A7F - 465/352935 (321,171 MPix)
> 
> IN
>     a8r8g8b8              null                  a8r8g8b8
>     -- src --             -- mask --            -- dest --
>     [...]
>     SAMPLES_COVER_CLIP
> pixman slow path: op=5 s=20028888|002E0A7F m=00000000|00000000
> d=20028888|002E0A7F - 465/352935 (169,056 MPix)
> 
> IN
>     a8r8g8b8              null                  a8r8g8b8
>     -- src --             -- mask --            -- dest --

These look like valid candidates for optimization (unless there are some other
bugs) and may provide a good performance improvement for your application once
optimized. I only wonder why we have the SAMPLES_COVER_CLIP flag set for the
source image in one case, but not in the other?
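
For reference, the two source flag words in the undecoded lines (002F0A7F and
002E0A7F) can be compared directly with an XOR. The small sketch below uses
hypothetical helper names and assumes nothing about pixman beyond the two
hexadecimal values quoted above.

```c
#include <stdint.h>

/* Return the bits that are set in exactly one of the two flag words. */
static uint32_t
flags_diff (uint32_t a, uint32_t b)
{
    return a ^ b;
}

/* Return 1 if exactly one bit is set in x, 0 otherwise. */
static int
is_single_bit (uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```

flags_diff (0x002F0A7F, 0x002E0A7F) yields 0x00010000, a single bit. That is
consistent with SAMPLES_COVER_CLIP being the only difference between the two
decoded reports, though it does not explain why the flag is set in one case
and not in the other.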

-- 
Best regards,
Siarhei Siamashka