[Pixman] Optimize Graphic Routines for s390x in Pixman - Queries
mattst88 at gmail.com
Sat Feb 8 22:49:15 UTC 2020
On Sat, Jan 25, 2020 at 4:57 AM Naveen Naidu <naveennaidu479 at gmail.com> wrote:
> Hello Everyone,
> I am Naveen, a senior-year Computer Science undergraduate from India. I am planning to apply for the Open Mainframe Project Internship program (https://github.com/openmainframeproject-internship/resources), one of whose proposed projects is to optimize graphics routines for s390x in pixman.
> The description of the project is as follows:
>> With the introduction of VirtIO GPU hardware (virtual graphic adapter for KVM-based virtual machines) for the s390x platform it makes sense to provide optimized routines in the pixman library also for the s390x architecture.
> From what I gather from the description, s390x supports vector (i.e. SIMD) instructions, and since these instructions speed up processing, the project asks us to write an implementation of pixman that uses the vector instructions for s390x.
> I have also been going through the implementation for Power VMX SIMD, which was created to use the vector instructions for PowerPC. But I must confess that I am a little lost.
> It would be really kind of you all if you could guide me on what I would need to learn and do in order to implement the project. I took a course on computer graphics as an undergraduate, so I do understand the fundamentals. But I would really like to know the right sequence of steps for the project so that I can get a better understanding of it.
> Thank you very much for your time,
Here are some snippets of an email I sent to someone else interested in
contributing optimizations to pixman:
Background information for the operations pixman implements:
http://ssp.impulsetrain.com/porterduff.html (written by the author of Pixman)
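To give a feel for the kind of per-pixel work these operators involve, here is a minimal scalar sketch of the Porter-Duff OVER operator on premultiplied 32-bit ARGB pixels. The function names `mul_un8` and `over_8888` are made up for this illustration and are not pixman's actual API; a SIMD implementation would do the same arithmetic on several pixels at once.

```c
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding.
 * Computes round(a * b / 255) using the usual add-and-shift trick. */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* OVER for premultiplied pixels, applied to each of the four channels:
 *     result = src + dest * (1 - src_alpha)
 * Hypothetical helper for illustration only. */
static uint32_t over_8888(uint32_t src, uint32_t dest)
{
    uint8_t inv_alpha = (uint8_t)(255 - (src >> 24));
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = (dest >> shift) & 0xff;
        uint32_t c = s + mul_un8((uint8_t)d, inv_alpha);

        if (c > 255)    /* only reachable if src is not premultiplied */
            c = 255;
        result |= c << shift;
    }
    return result;
}
```

An opaque source replaces the destination, a fully transparent source leaves it unchanged, and anything in between blends the two.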
`lowlevel-blt-bench` lives in pixman's test/ directory. It's a small
self-contained benchmark. Run with
etc. The -b (bilinear) and -n (nearest) options are useful as well.
Firefox traces will show lots of usage of bilinear and nearest scaling.
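For readers new to the two scaling modes, the following is a simplified scalar sketch of what nearest and bilinear sampling of one scanline compute, stepping through the source in 16.16 fixed point. The names (`scale_nearest_scanline`, `scale_bilinear_scanline`, `lerp_8888`) are hypothetical, the sketch interpolates horizontally only, and it ignores the pixel-center offsets a real implementation has to handle; it is not pixman's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend two a8r8g8b8 pixels per channel; w is a weight in 0..255,
 * where 0 selects a. Hypothetical helper for illustration. */
static uint32_t lerp_8888(uint32_t a, uint32_t b, uint32_t w)
{
    uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t ca = (a >> shift) & 0xff;
        uint32_t cb = (b >> shift) & 0xff;
        result |= ((ca * (256 - w) + cb * w) >> 8) << shift;
    }
    return result;
}

/* Nearest: pick the source pixel at the integer part of x.
 * x and step are unsigned 16.16 fixed-point values. */
static void scale_nearest_scanline(uint32_t *dst, size_t dst_width,
                                   const uint32_t *src, size_t src_width,
                                   uint32_t x, uint32_t step)
{
    for (size_t i = 0; i < dst_width; i++) {
        size_t sx = x >> 16;
        if (sx >= src_width)
            sx = src_width - 1;         /* clamp at the right edge */
        dst[i] = src[sx];
        x += step;
    }
}

/* Bilinear (horizontal part only): blend the two neighbouring source
 * pixels using the top 8 bits of the fractional part of x. */
static void scale_bilinear_scanline(uint32_t *dst, size_t dst_width,
                                    const uint32_t *src, size_t src_width,
                                    uint32_t x, uint32_t step)
{
    for (size_t i = 0; i < dst_width; i++) {
        size_t left = x >> 16;
        if (left >= src_width)
            left = src_width - 1;
        size_t right = (left + 1 < src_width) ? left + 1 : left;
        dst[i] = lerp_8888(src[left], src[right], (x >> 8) & 0xff);
        x += step;
    }
}
```

With a step of 0x8000 (0.5), both functions upscale a scanline by a factor of two; the bilinear version costs several multiplies per pixel, which is why it benefits so much from vector instructions.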
There's an environment variable named PIXMAN_DISABLE=... which is very
useful for getting side-by-side performance comparisons of MMX vs SSE2
vs AVX2. (For s390x, since it doesn't have any optimizations yet,
it may not be particularly useful.) It works for both
lowlevel-blt-bench and cairo-perf-trace.
`cairo-perf-trace` lives in cairo's perf directory. Run with
CAIRO_TEST_TARGET=image16,image ./perf/cairo-perf-trace ~/path/to/trace
The trace files in cairo-traces are .lzma files which will have to be
decompressed. Decompress with lzma -dk trace.lzma or alternatively run
make in cairo-traces to uncompress them all. Pass the uncompressed
file to cairo-perf-trace. The arguments to CAIRO_TEST_TARGET specify
what backend Cairo should use. 'image' corresponds to 32-bit visuals,
and 'image16' is 16-bit visuals.
Here are a couple of my blog posts about some work I did on pixman.
Maybe you can find something valuable in them.
I would look at the pixman-sse2.c file for examples of what pixman
optimizations look like. That may be a better starting point than the
POWER optimizations. I have a small branch here
that demonstrates adding a set of optimizations for a new instruction set.
I expect it would be helpful to look over.