[Pixman] [PATCH] MIPS: DSPr2: Added over_n_8888_8888_ca and over_n_8888_0565_ca fast paths.

Siarhei Siamashka siarhei.siamashka at gmail.com
Mon Mar 12 16:11:04 PDT 2012

On Mon, Mar 12, 2012 at 11:20 PM, Lukic, Nemanja <nlukic at mips.com> wrote:
> Hi Soren,
> I usually select the cairo-perf-trace that utilizes the optimized fast path the most.
> In this case, xfce4-terminal-a1 proved to be that one. I use oprofile to check CPU utilization. Here is the oprofile log I got for xfce4-terminal-a1:
> CPU: MIPS 74K, speed 0 MHz (estimated)
> Counted CYCLES events (Cycles) with a unit mask of 0x00 (No unit mask) count 40000
> samples  %        image name               app name                 symbol name
> 2658517  50.3337  no-vmlinux               no-vmlinux               /no-vmlinux
> 1216517  23.0323  libpixman-1.so           libpixman-1.so           pixman_composite_over_n_8888_8888_ca_asm_mips
> 270995    5.1308  libc-2.11.2.so           libc-2.11.2.so           memset
> 165057    3.1250  libm-2.11.2.so           libm-2.11.2.so           floor
> 139880    2.6483  libpixman-1.so           libpixman-1.so           pixman_fill_buff32_mips_dsp
> 136303    2.5806  libpixman-1.so           libpixman-1.so           fetch_scanline_a8
> 61821     1.1705  libc-2.11.2.so           libc-2.11.2.so           memcpy
> ...
> None of the other traces utilize this fast path that much (this is what my oprofile runs on the test system showed).
> If you know of a more suitable trace (or a system configuration I need to have, such as installed fonts), please let me know, and I'll re-run the benchmarks and update the commit.

You can try installing the Terminus font
(http://terminus-font.sourceforge.net/) just to check whether this has
any effect on the fast paths used. However, the trace will then no
longer be useful for benchmarking your over_n_8888_8888_ca and
over_n_8888_0565_ca optimizations. Anyway, the purpose of running
benchmarks is to confirm the performance improvement, so I guess this
trace is also fine even though it does not behave as originally intended.

By the way, oprofile logs are also quite informative and may be useful
as part of the commit message. It is also a good idea to configure
oprofile to collect statistics separately per process instead of
producing a flat report for the whole system. This can be done in
the following way:

    # opcontrol --deinit
    # opcontrol --separate=kernel
    # opcontrol --init

Then collect the statistics:

    # opcontrol --reset
    # opcontrol --start
    # ./some-test-binary
    # opcontrol --stop

And show it:

    # opreport -l ./some-test-binary
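
The sequence above can be wrapped in a small shell function so that
every run uses exactly the same steps. This is only a sketch: the names
profile_binary, run and DRY_RUN are made up for illustration and are
not part of oprofile. It assumes the opcontrol and opreport tools are in
PATH, that oprofile is otherwise configured as it was for the flat
report, and that it is run as root.

```shell
#!/bin/sh
# Sketch only: profile_binary, run and DRY_RUN are illustrative names,
# not oprofile features. Assumes opcontrol/opreport are installed and
# that the caller is root.

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Profile a single binary with per-application separation enabled.
profile_binary() {
    bin="$1"

    # Reconfigure the daemon; --deinit may fail harmlessly if it is not loaded.
    run opcontrol --deinit
    run opcontrol --separate=kernel || return 1
    run opcontrol --init || return 1

    # Collect samples only while the test binary runs.
    run opcontrol --reset || return 1
    run opcontrol --start || return 1
    run "$bin"
    run opcontrol --stop

    # Per-symbol report restricted to the profiled binary.
    run opreport -l "$bin"
}
```

After sourcing the file, `profile_binary ./some-test-binary` runs the
whole sequence, and `DRY_RUN=1 profile_binary ./some-test-binary` only
prints the commands that would be executed.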

When the statistics are collected per process, the idle time currently
attributed to no-vmlinux will disappear, the results should become
perfectly reproducible across multiple runs and can also be used to
evaluate the effect of optimizations.

Best regards,
Siarhei Siamashka
