[Pixman] [PATCH 00/32] (Mostly) ARMv6 speedups

Ben Avison bavison at riscosopen.org
Thu Aug 7 09:49:56 PDT 2014


I'll apologise in advance for mailbombing everyone with 33 emails at once.
There's a team who have been working on optimising browser performance on
the ARM11, and they've identified a number of Pixman operations which
haven't been prioritised in the past because they are under-represented
in the cairo-perf-trace benchmarks. I have been working away for
the last couple of months to try to address these; here are the results.

These patches fall into a number of categories:
* Improvements to benchmarking facilities
* Better handling of solid images
* Additional unscaled ARMv6 fast paths (and a few generic C ones)
* ARMv6 combiner functions
* ARMv6 unscaled scanline fetch and write-back functions
* ARMv6 nearest scaled scanline fetch functions
* ARMv6 nearest scaled fast paths

Don't worry if I take a while to follow up any comments - I'm just about to
go offline for a few days, but I will deal with them when I get back.

Ben Avison (32):
  armv6: Fix typo in preload macro
  lowlevel-blt-bench: Parse test name strings in general case
  test: Add a new benchmarker targeting affine operations
  pixman-general: Tighten up calculation of temporary buffer sizes
  pixman-image: Early detection of solid 1x1 repeating source images
  pixman-fast-path: Add over_n_8888 fast path
  armv6: Add over_n_8888 fast path
  pixman-fast-path: Add over_n_0565 fast path
  armv6: Add over_n_0565 fast path
  pixman-fast-path: Add in_8888_8 fast path
  armv6: Add in_8888_8 fast path
  armv6: Add over_8888_8_0565 fast path
  armv6: Add over_8888_n_0565 fast path
  armv6: Add in_n_8888 fast path
  armv6: Improved over_8888_8888 fast path
  armv6: Add ability to generate single-scanline fast paths
  arm: Move BIND_COMBINE_U from NEON code to a generic ARM header
  armv6: Add ADD combiner
  armv6: Add OVER_REVERSE combiner
  armv6: Add IN, IN_REVERSE, OUT and OUT_REVERSE combiners
  armv6: Add OVER combiner
  armv6: Add SRC combiner
  armv6: Add optimised scanline fetchers and writeback for r5g6b5 and
    a8
  armv6: Add optimised scanline fetcher for a1r5g5b5
  armv6: Add src_1555_8888 fast path
  armv6: Add fetcher for a8r8g8b8 nearest-neighbour transformed images
  armv6: Add fetcher for r5g6b5 nearest-neighbour transformed images
  armv6: Add fetcher for x8r8g8b8 nearest-neighbour transformed images
  armv6: Add fetcher for a8 nearest-neighbour transformed images
  armv6: Add nearest-scaled-cover src_8888_8888 fast path
  armv6: Add nearest-scaled-cover src_0565_0565 fast path
  armv6: Add four more nearest-scaled-cover fast paths

 pixman/pixman-arm-common.h          |   82 ++
 pixman/pixman-arm-neon.c            |   33 +-
 pixman/pixman-arm-simd-asm-scaled.S |   49 +
 pixman/pixman-arm-simd-asm-scaled.h |  418 ++++++++
 pixman/pixman-arm-simd-asm.S        | 1826 +++++++++++++++++++++++++++++++++--
 pixman/pixman-arm-simd-asm.h        |  108 ++-
 pixman/pixman-arm-simd.c            |  350 +++++++-
 pixman/pixman-fast-path.c           |  109 +++
 pixman/pixman-general.c             |    4 +-
 pixman/pixman-image.c               |    6 +
 test/Makefile.sources               |    1 +
 test/affine-bench.c                 |  416 ++++++++
 test/lowlevel-blt-bench.c           |  129 +++-
 13 files changed, 3404 insertions(+), 127 deletions(-)
 create mode 100644 pixman/pixman-arm-simd-asm-scaled.h
 create mode 100644 test/affine-bench.c

-- 
1.7.5.4


