problem with exaBufferGlyph()
Michel Dänzer
michel at daenzer.net
Sat Jan 7 07:44:51 UTC 2017
On 23/12/16 03:39 PM, Michael wrote:
> Hello,
>
> first some context - I've been writing EXA support for Permedia 2 and
> 3, mostly because these cards are still kinda useful on sparc and alpha
> hardware. For pm2 there's actually documentation, and the chip can be
> used to accelerate at least some xrender operations.
> The problem: this chip can't deal with A8 masks for rendering glyphs.
> It's perfectly happy to render ARGB though, and that's where it
> collides with current EXA.
> As it is right now, exaGlyphs() will call CheckComposite() with an A8
> Picture as destination to see if the driver supports that, and fall
> back to ARGB if it doesn't. That's fine, although it may be better to
> do that test once on startup instead of every time a glyph needs to be
> drawn.
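>
> Roughly, that probe amounts to something like the following (a sketch,
> not the exact exa_glyphs.c code; the helper name and argument list are
> made up for illustration):
>
>     static Bool
>     canAccelA8Mask(ExaScreenPrivPtr pExaScr, PicturePtr pSrc,
>                    PicturePtr pA8Mask)
>     {
>         /* No CheckComposite hook means the driver accepts everything. */
>         if (!pExaScr->info->CheckComposite)
>             return TRUE;
>         /* Glyphs are accumulated into the A8 mask with PictOpAdd, so
>          * the mask picture is the destination of the composite op. */
>         return (*pExaScr->info->CheckComposite) (PictOpAdd, pSrc, NULL,
>                                                  pA8Mask);
>     }
>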
> The problem is that exaBufferGlyph() will always cache glyphs in the
> format returned by GetGlyphPicture(), not the one requested in the
> destination Picture handed to it. For drivers that can't support A8,
> this makes the cache unusable to the accelerator, so glyphs are
> constantly copied back and forth between video and main memory, which
> kills performance to the point that software rendering is faster.
>
> So, what I'm proposing is something like this:
> diff -u -r1.2 exa_glyphs.c
> --- exa_glyphs.c 22 Dec 2016 21:31:08 -0000 1.2
> +++ exa_glyphs.c 23 Dec 2016 05:42:08 -0000
> @@ -544,7 +544,20 @@
>                 INT16 ySrc, INT16 xMask, INT16 yMask, INT16 xDst, INT16 yDst)
>  {
>      ExaScreenPriv(pScreen);
> +    /*
> +     * XXX
> +     * Request the new glyph in the format we need to draw in, not whatever
> +     * GetGlyphPicture() hands us, which will (almost?) always be A8.
> +     * That way drivers that can't handle A8 but can do Xrender ops in ARGB
> +     * will be able to do hardware rendering in and out of the glyph cache.
> +     * This results in a major performance boost on such hardware.
> +     * Drivers that can handle A8 shouldn't see any difference.
> +     */
> +#if 1
> +    unsigned int format = pDst->format;
> +#else
>      unsigned int format = (GetGlyphPicture(pGlyph, pScreen))->format;
> +#endif
>      int width = pGlyph->info.width;
>      int height = pGlyph->info.height;
>      ExaCompositeRectPtr rect;
>
> Without this, I get about 9000/s with x11perf -aaftext on an Ultra 60;
> software rendering yields 15000/s. With this change it's 75000/s. Not
> earth-shatteringly fast, but still more than I expected from such an
> old chip that wasn't exactly known for its speed even back in its day.
>
> Any thoughts? Am I missing something?
Your change makes sense to me. Please submit a patch which just changes
the format assignment; there's no need for the comment and preprocessor
guards.
Maybe it can also remove this code, since I don't think composite
operations to 1bpp destinations can ever be accelerated:
    if (PICT_FORMAT_BPP(format) == 1)
        format = PICT_a8;
If so, the local variable format could be eliminated in favour of using
pDst->format directly.
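To spell that out, the start of exaBufferGlyph() would then look
something like this (a sketch of the end state only, assuming nothing
further down genuinely needs a local copy of the format):

    ExaScreenPriv(pScreen);
    /* was: unsigned int format = (GetGlyphPicture(pGlyph, pScreen))->format;
     *      if (PICT_FORMAT_BPP(format) == 1)
     *          format = PICT_a8;
     * now: later code simply reads pDst->format wherever it used format */
    int width = pGlyph->info.width;
    int height = pGlyph->info.height;
    ExaCompositeRectPtr rect;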
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer