[Pixman] Pixman glyph compositing
Søren Sandmann
sandmann at cs.au.dk
Wed Jan 23 15:43:31 PST 2013
David Herrmann <dh.herrmann at googlemail.com> writes:
> While working on kmscon the main rendering task I am faced with is
> blending a glyph into the main framebuffer with a constant foreground
> and background color. The code I have been using is a per-pixel
> blending operation on each color value:
>
> For each pixel "i" I do:
> r = alpha[i] * foreground.r + (255 - alpha[i]) * background.r
> g = alpha[i] * foreground.g + (255 - alpha[i]) * background.g
> b = alpha[i] * foreground.b + (255 - alpha[i]) * background.b
> r /= 255;
> g /= 255;
> b /= 255;
> dst[i] = (r << 16) | (g << 8) | b;
>
> So I have an 8bit alpha channel from the glyph as input and an xrgb32
> output framebuffer. The 24bit foreground/background values are
> constant during a single blending operation.
>
> I already optimized this by special-casing alpha[i] == 0 or 255 and I
This is usually a win if it avoids a memory access, but that's not the
case here, since you don't read the destination at all.
> changed the division to 256 instead of 255. However, I was wondering
> whether pixman can provide a better alternative. Unfortunately, the
> fastest code I could come up with was (using shadow-buffer):
>
> pixman_fill(shadow, background);
> pixman_composite(OVER, foreground, alpha, shadow);
> pixman_blt(shadow, dst);
>
> I use a shadow buffer as I _really_ want to avoid compositing
> directly into the hardware buffer (which is in most cases way slower
> than the extra pixman_blt). However, this scenario requires writing
> the data three times and even reading it during the composite
> operation. But still, thanks to pixman-optimizations, this turns out
> to be almost exactly as fast as my own trivial implementation. So I
> was wondering whether anyone has ideas how to speed this up?
>
> Is there a way to perform this operation with a single pixman call?
If bg happens to be black, then the whole thing could be done with

    composite (SRC, fg, alpha, hardware_buffer)
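
A minimal sketch of why the black-background case collapses to a single
masked SRC, assuming a masked SRC writes "source times mask" into each
color channel. The helper names (blend_channel, src_masked_channel,
check_black_background) are hypothetical and not part of pixman, and the
rounding used here is only illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* General per-channel blend from the original question, with rounding. */
static inline uint32_t blend_channel(uint32_t fg, uint32_t bg, uint32_t a)
{
    return (a * fg + (255 - a) * bg + 127) / 255;
}

/* Assumed result of a masked SRC for one channel: source times mask. */
static inline uint32_t src_masked_channel(uint32_t fg, uint32_t a)
{
    return (a * fg + 127) / 255;
}

/* With bg == 0 the background term vanishes, so both must agree. */
static void check_black_background(void)
{
    for (uint32_t fg = 0; fg <= 255; fg++)
        for (uint32_t a = 0; a <= 255; a++)
            assert(blend_channel(fg, 0, a) == src_masked_channel(fg, a));
}
```

For any other background color the (255 - a) * bg term does not vanish,
which is why the single-call shortcut only covers black.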
You could speed this up by caching the glyphs in a pixman_glyph_cache_t
structure and then using pixman_composite_glyphs(), or if you are sure
that your glyphs will never overlap each other,
pixman_composite_glyphs_no_mask().
But in the general case, I don't think it's possible to do this in one
pass with the current pixman API.
> If not, are there any other optimizations I should consider?
Some random comments:
- The x / 255 can be done with

      t = x + 0x80;
      return (t + (t >> 8)) >> 8;

- If you want to stick with a division by 256, you may want to add 0xff
  before shifting. That way 0xff * 0xff = 0xff instead of 0xfe.

- There are some macros in pixman/pixman-combine32.h that can do these
  types of computations on two channels at a time.

- If you are using a shadow buffer that is the size of the full screen,
  then it may be interesting to reduce it to the size of one glyph so
  that it fits in L1.

- You might want to consider caching pre-composited glyphs indexed by
  fg, bg under the assumption that the number of color combinations
  isn't that large.
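
The arithmetic suggestions above can be sketched in plain C as follows.
This is only an illustration: div255, div256_biased, blend_xrgb,
blend_xrgb_2ch and check_blend_helpers are hypothetical names, and the
two-lane version is written from scratch here rather than copied from
the macros in pixman/pixman-combine32.h.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded x / 255 without a division; valid for 0 <= x <= 255 * 255. */
static inline uint32_t div255(uint32_t x)
{
    uint32_t t = x + 0x80;
    return (t + (t >> 8)) >> 8;
}

/* Cheaper x / 256 variant, biased by 0xff so that 0xff * 0xff gives 0xff. */
static inline uint32_t div256_biased(uint32_t x)
{
    return (x + 0xff) >> 8;
}

/* Reference blend of constant fg over constant bg, one channel at a time. */
static inline uint32_t blend_xrgb(uint32_t fg, uint32_t bg, uint8_t a)
{
    uint32_t ia = 255u - a;
    uint32_t r = div255(a * ((fg >> 16) & 0xff) + ia * ((bg >> 16) & 0xff));
    uint32_t g = div255(a * ((fg >> 8) & 0xff) + ia * ((bg >> 8) & 0xff));
    uint32_t b = div255(a * (fg & 0xff) + ia * (bg & 0xff));
    return (r << 16) | (g << 8) | b;
}

/* Same blend, with red and blue carried in two 16-bit lanes of one word. */
static inline uint32_t blend_xrgb_2ch(uint32_t fg, uint32_t bg, uint8_t a)
{
    uint32_t ia = 255u - a;
    /* Each lane holds at most 255 * 255, so the lanes cannot overflow. */
    uint32_t rb = a * (fg & 0xff00ff) + ia * (bg & 0xff00ff);
    uint32_t g = a * ((fg >> 8) & 0xff) + ia * ((bg >> 8) & 0xff);

    /* div255 applied to both lanes at once. */
    rb += 0x800080;
    rb = ((rb + ((rb >> 8) & 0xff00ff)) >> 8) & 0xff00ff;

    return rb | (div255(g) << 8);
}

/* Exhaustive checks of the claims above; call once from a test or main(). */
static void check_blend_helpers(void)
{
    for (uint32_t x = 0; x <= 255u * 255u; x++)
        assert(div255(x) == (x + 127) / 255);

    assert(((255u * 255u) >> 8) == 0xfe);
    assert(div256_biased(255u * 255u) == 0xff);

    for (uint32_t a = 0; a <= 255; a++)
        assert(blend_xrgb_2ch(0x12c8ff, 0x804020, (uint8_t)a) ==
               blend_xrgb(0x12c8ff, 0x804020, (uint8_t)a));
}
```

Calling check_blend_helpers() once, for example at startup in a debug
build, should pass silently. Green is left in its own word here to keep
the sketch short.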
Søren