[Pixman] [PATCH 0/7] lowlevel-blt-bench test pattern parser

Tue Apr 14 01:36:38 PDT 2015

On Mon, 13 Apr 2015 18:42:45 +0100
"Ben Avison" <bavison at riscosopen.org> wrote:

> On Mon, 13 Apr 2015 12:31:35 +0100, Pekka Paalanen <ppaalanen at gmail.com> wrote:
> > Apart from restructuring, there is one significant(?) difference to
> > Ben's patches: for a solid mask, Ben used a8r8g8b8, but my code uses a8.
> 
> I had to remind myself how things hang together to check if this is
> significant or not...
> 
> When lowlevel-blt-bench tests a solid image (source or mask), it achieves
> it not by calling pixman_image_create_solid_fill() but by creating a 1x1
> pixel bitmap using pixman_image_create_bits(). However, the effect is
> effectively the same because compute_image_info() detects single-pixel
> bitmaps and records it in image->common.extended_format_code before we
> start doing fast path lookup. However, image->type remains as BITS rather
> than SOLID because the non-common parts of the pixman_image union retain
> their original meaning - for a native solid fill image, it's just the
> colour in three different formats, but for bitmap images it's a pointer
> to and sizes of the bitmap and colour LUT (if applicable), various helper
> function pointers and so on.
> 
> A typical ARM assembly fast path is written to assume that any argument
> corresponding to a solid colour is already in an 8888 format where the
> red-blue ordering is the same as the destination image. The magic that
> ensures this is hidden in the wrapper C function generated by
> PIXMAN_ARM_BIND_FAST_PATH_N_DST and similar, where it calls
> _pixman_image_get_solid(). You'll see similar calls more explicitly in
> pixman-fast-path.c (architecture-neutral fast paths) and the equivalent
> sources files for other architectures.
> 
> There are a few circumstances where you might not want to use
> _pixman_image_get_solid()  - perhaps most obviously are the operations
> which use a solid image in pixel format a2r10g10b10 because you'd lose 2
> bits of precision per colour component. There are a few example of those
> amongst those you picked out in special_patterns[].
> 
> It's worth noting that _pixman_image_get_solid() has optimised code to
> extract the colour from image->type==BITS images if the source/mask image
> is of format a8r8g8b8, x8r8g8b8 or a8. Other formats will still work, but
> more laboriously.
> 
> Not that any of this should matter at all for the purposes of
> lowlevel-blt-bench. _pixman_image_get_solid() is only called once per
> call of pixman_image_composite(), so it is part of the overhead that
> should be being accounted for by the code wrapped in #if EXCLUDE_OVERHEAD.

Ok, so if I understood you right, this difference should be harmless,
especially with the addition mentioned below.

> > should we also add a condition, that if a test has CA flag and a solid
> > mask, then that mask needs to be a8r8g8b8?
> 
> That might be desirable, if only because lowlevel-blt-bench initialises
> all its images using memset(0xCC) so an a8 solid image would be converted
> by _pixman_image_get_solid() to 0xCC000000 whereas an a8r8g8b8 would be
> 0xCCCCCCCC. When you're not in component alpha mode, only the alpha byte
> matters for the mask image, but in the case of component alpha
> operations, a fast path might decide that it can save itself a lot of
> multiplications if it spots that 3 constant mask components are already 0.

Ok, good. I think I will make that a separate follow-up patch, so I can
properly record the reasons and effects.

> > None of the existing tests has a solid mask for CA tests. I'm not sure
> > how much it makes sense.
> 
> When you're in component alpha mode, mathematically the source and mask
> images are interchangeable. So you could implement over_8888_n_8888_ca by
> calling over_n_8888_8888_ca, and we already have a number of fast paths
> for operations of the latter form. (I'm not aware that Pixman is - yet -
> aware that it can make such substitutions, but I'm sure it could be
> added.)

Hmm, yes, interesting, at least in the cases where the operator indeed
allows it... I don't know the operators by heart so I don't know if
there are any that wouldn't work like that. An idea to remember.

> > Ben, when we eventually get to the actual optimization patches, I expect
> > I will reproduce the benchmarks myself while reviewing. Is that ok?
> 
> That's definitely fine by me, thanks for offering. Fingers crossed it
> doesn't substantially change the results :-)
> 
> > What do you think of this series?
> 
> In general, it looks like an improvement to me. I don't think the test
> programs have had this much attention in a while! One minor point in
> patch 6's log message, you copied my typo:
> 
>    <operation>_<src>[_<mask]<dst>[_ca]
> 
> should be
> 
>    <operation>_<src>[_<mask]_<dst>[_ca]

Ahh, yes, I'll fix that.

So, can I take it that you gave your Reviewed-by for the whole series?

For the record, if the terminology of S-o-b, R-b, Acked-by, etc. is
foreign to you or anyone, they are documented in the Linux kernel:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n407
Sections 11-13. IMO it's enough to read those once and understand the
concepts. I have a feeling most projects use those tags on a bit looser
terms than the kernel.

Signed-off-by (depending on the project) and Reviewed-by are the most
important.

Thanks,
pq