CairoSDPR status
Michael Meeks
michael.meeks at collabora.com
Tue Jul 9 20:31:35 UTC 2024
Hi all,
I append Armin's update with his permission on some exciting work he's
been doing with us to improve Cairo rendering performance :-)
Bit of a thread here that should have happened in public:
FYI,
Michael.
Hi everyone,
I wanted to take the opportunity to give a short update on the state of
the development of the Cairo-based System-Dependent Primitive Renderer
(in short: CairoSDPR). I spent quite some time in June getting that
going (you will see :-) and moving it towards product quality. I started
with the one provided by Caolan (thanks!), which he based on copying and
adapting the Direct2D/Windows prototype that I did.
I vastly extended it (basically leaving no stone where it was). It is
already a 'complete' renderer in the sense that all primitives that
*have* to be supported are supported. It is now well beyond those
minimum requirements, supporting a number of primitives directly, which
makes rendering much faster/more efficient. As planned when designing
it, it is possible to map a lot of stuff very directly to Cairo, and I
did that. It is still not at product quality, mainly because direct text
rendering is missing - one of the big next Primitives to support
directly. More on that later.
For the basic/necessary primitives: It supports buffering of Bitmap data
(plus MipMapping, this time 'diagonal' with equal factors in X/Y to not
exceed mem limits) and path data using the stable system-dependent
buffering mechanism available for quite some time. Bitmaps directly
support RGBA, but also RGB. I also experimented with RGB16, which
works well but does not really show huge speedups AFAICS - can be
checked by a local define if wanted (look for TEST_RGB16). That may need
some more love e.g. when creating masks - special processing of that
16bit rgb coding (R:5,G:6,B:5).
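
To make the 16bit coding mentioned above concrete, here is a minimal sketch of packing and unpacking R:5,G:6,B:5 pixels. The function names are made up for illustration and are not taken from the CairoSDPR code; the bit layout (red in the top five bits) is the common RGB565 convention and is an assumption here.

```python
def pack_rgb565(r: int, g: int, b: int) -> int:
    """Pack 8-bit R, G, B into one 16-bit value laid out as R:5, G:6, B:5."""
    # Keep the most significant 5/6/5 bits of each channel.
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_rgb565(value: int) -> tuple[int, int, int]:
    """Expand a 16-bit R:5, G:6, B:5 value back to 8 bits per channel."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    # Replicate the high bits into the low bits so that full intensity
    # maps back to 255 instead of 248 (or 252 for green).
    return (
        (r5 << 3) | (r5 >> 2),
        (g6 << 2) | (g6 >> 4),
        (b5 << 3) | (b5 >> 2),
    )
```

The round trip is lossy (three, two and three low bits are dropped), which is one reason mask creation from such data would need the special handling mentioned above.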
For in-between results the renderer can directly use RGBA buffers, thus
avoiding many adaptations/pixel calculations that the VCLRenderer has to
make. All in all I know of not a single place where it is or can
theoretically be slower than the VCLRenderer using the backend,
including the 8-10 layers that VCLRenderer and backend have to process.
On the contrary: Much stuff can use Cairo directly now, and it shows.
Some examples:
UnifiedTransparencePrimitive2D: Most transparency cases are of that
type; this avoids having to decompose and use the much more expensive
general TransparencePrimitive2D. Can be rendered directly in Cairo.
PolygonStrokePrimitive2D: Lines with widths and pattern(s) are processed
directly, no decompose/own tessellation needed. Thankfully what Cairo
does is nearly 100% compatible with what our definition requires.
FillGraphicPrimitive2D: Used for repeated graphics/Pattern fill
(including vector data). Uses size-dependent fallback for prepared
bitmap/direct vector data dependent on target display size (as the
Direct2D version does), also buffered, so no 'jitters' will appear when
zooming into vector graphics. Also seamless and fast for VERY MANY
repeats now. Much faster and looking better than the decompose.
FillGradientPrimitive2D: That was most work, but *all six* variations we
internally have are now directly mapped/supported despite the strange
old stuff that happens/is defined. I made one exception: for the
Elliptical I use standard-circular instead of that insane
step-in-two-pixels-and-draw-an-ellipse stuff, this is not really visible
in smooth gradients. Also gradients are much smoother that way.
Rectangle gradients were hard but solvable: they have to be stitched
together from four filled polygons, but that works and is also much
faster (meshes do not work, they have no color steps).
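
As a rough illustration of the 'four filled polygons' idea for rectangle gradients, the sketch below splits an axis-aligned rectangle into four triangles that meet at the center, each paired with the start and end points of a linear gradient running from its outer edge to the center. This is only a guess at the general shape of the approach: the function names are hypothetical, and a centered, unrotated gradient without border or offset is assumed, which the real renderer does not require.

```python
Point = tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Unsigned area of the triangle a-b-c."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2


def rect_gradient_parts(x0: float, y0: float, x1: float, y1: float):
    """Split a rectangle into four triangles with one linear gradient each.

    Returns a list of (triangle, gradient_start, gradient_end) entries.
    Each gradient runs from the middle of the outer edge to the center,
    i.e. perpendicular to that edge, so neighbouring parts meet seamlessly.
    """
    center = ((x0 + x1) / 2, (y0 + y1) / 2)
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    parts = []
    for i in range(4):
        a = corners[i]
        b = corners[(i + 1) % 4]
        edge_mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        parts.append(((a, b, center), edge_mid, center))
    return parts
```

Each entry could then be filled as one polygon with one linear gradient, using the outer color at the edge midpoint and the inner color at the center.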
For some stuff I found multiple solutions which were similar in speed.
In those cases - since Cairo backends may differ - I kept both,
preferring one using a static bool. It might be worth doing a 'test-run'
at first startup to measure that stuff and define some switches for the
renderer, just a possibility.
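
The 'test-run' idea could look roughly like the following sketch: time each candidate implementation a few times and remember the name of the fastest one as the switch. Everything here is hypothetical (the function name, the dictionary of variants, the injectable clock); nothing like this exists in the renderer as far as this mail says.

```python
import time
from typing import Callable, Mapping


def pick_faster(
    variants: Mapping[str, Callable[[], object]],
    repeats: int = 5,
    clock: Callable[[], float] = time.perf_counter,
) -> str:
    """Return the name of the variant with the smallest measured run time."""
    if not variants:
        raise ValueError("need at least one variant to measure")
    best_name = ""
    best_time = float("inf")
    for name, run in variants.items():
        samples = []
        for _ in range(repeats):
            start = clock()
            run()
            samples.append(clock() - start)
        # Use the minimum of several runs to reduce noise from other load.
        elapsed = min(samples)
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name
```

The clock is a parameter only so the selection logic can be checked deterministically; in real use the default timer would be kept and the result stored as the renderer switch.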
I added many comments to make it easier to read and change/correct in
the future.
Todos:
The TransparencePrimitive2D (which supports a general alpha definition
independent from content) should be optimized: I found no simple way
(yet) to render Cairo RGB(A) to a Cairo mask using the
standard-LuminanceToAlpha calculation (Direct2D has one...sigh). Worth
experimenting...
Of course: TextRendering. Does not look bad - the fallback decompose
gets the outlines and draws them AAed, but not professional quality.
SVG Gradients: These use 'own' Primitives which still get decomposed. To
solve that, either support directly or (better, all usages would profit)
change to standard gradient Primitives now that we have MultiColorGradients.
PatternFill: It sometimes feels slow, but it's NOT the rendering, but
the HitTest using Primitives when moving the mouse hovering: Needs to
directly support FillGraphicPrimitive2D instead of using the decompose.
ColorModifierStack: Add buffering of Bitmaps that have to be
ColorModified -> will speedup stuff not only for this renderer, but for
all Primitive renderers, maybe even VCL backends.
XOR: Not yet supported. Two possibilities: Get rid of it (would be good
but requires some work, some already done, not too many cases remaining)
or support it (some work on how to do that already exists in the Cairo
backend).
General stabilization: Still may have errors...
General optimization: Many more possibilities for speedups...
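
For the TransparencePrimitive2D item above, the per-pixel calculation in question can be sketched as below. The coefficients are the ones the SVG specification gives for feColorMatrix type="luminanceToAlpha"; whether the renderer uses exactly these is an assumption, and the function name is made up. The sketch works on plain, non-premultiplied (R, G, B) tuples and ignores Cairo details such as premultiplied ARGB32 data and row stride.

```python
# Luminance weights as listed for SVG feColorMatrix "luminanceToAlpha".
LUMA_R, LUMA_G, LUMA_B = 0.2125, 0.7154, 0.0721


def luminance_to_alpha(rgb_pixels) -> bytes:
    """Convert (r, g, b) pixels with 8-bit channels into 8-bit alpha values."""
    mask = bytearray()
    for r, g, b in rgb_pixels:
        alpha = round(LUMA_R * r + LUMA_G * g + LUMA_B * b)
        # Clamp in case rounding pushes the value out of the 8-bit range.
        mask.append(max(0, min(255, alpha)))
    return bytes(mask)
```

The resulting bytes correspond to what an 8-bit alpha mask would hold, one value per pixel.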
Notes on how to use the CairoSDPR: The change on gerrit
(https://gerrit.libreoffice.org/c/core/+/168911) is green for now and
can go to master soon; it is in a good in-between state to do so. It can
be used/tested by setting the TEST_SYSTEM_PRIMITIVE_RENDERER env var. I
already checked/compared with pro versions. Maybe you can take a look at
this in your own use cases and give feedback.
In the later product: The CairoSDPR *completely* replaces the
VCLPixelPrimitiveRenderer, so all Primitives will be rendered to
VDevs/Wins using Cairo directly and NO vcl at all. To repeat: This is
true for Draw/Impress completely, for Calc/Writer it's the inserted
DrawObjects, Graphics and overlay things (Markers, Grid, TextSelection,
...). For more, more needs to be adapted to Primitives...
That's the state of things. I plan to just continue, but let me know
about suggestions/thoughts from your side.
Regards,
Armin
My reply:
On 02/07/2024 16:41, Michael Meeks wrote:
> Hi Armin,
>
> Let me start from the end:
>
> On 02/07/2024 15:51, Armin Le Grand wrote:
> > That's the state of things. I plan to just continue, but let me know
> > about suggestions/thoughts from your side.
>
> TLDR; it sounds awesome :-) exciting times.
>
> In more detail; it'd be worth sharing this on the public dev list
> if I can bother you to do that & then having the discussion there;
> please feel free to fwd my response too =)
>
> We can of course, start to back-port to 24.04 and enable
> conditionally for some time on some demo servers to measure performance
> there.
>
> And then the questions:
>
> * does this bin the "render everything twice" problem;
> we draw to alpha transparent surfaces, and will increasingly
> be rendering to alpha layers and compositing them on the
> client: do we still have to do that twice ? or can that be
> avoided for LOK users ?
>
> * winding / de-composing polygons horror: I've spent my life
> seeing this take an extremely long time in profiles - and of
> course cairo will do this as it rasterizes: is there a
> flag / short-cut we can trigger to avoid the polygon winding /
> de-composition as we render ? - ie. not caching the result,
> but avoiding that altogether ?
>
> And a few comments in-line:
>
>> buffering mechanism available for quite some time. Bitmaps are
>> directly supporting RGBA, but also RGB. I also experimented using
>> RGB16, that works well but does not really show huge speedups AFAICS
>
> I'm reliably informed by our graphics team, and my AVX2 experience
> that from a CPU perspective RGBX and RGBA are the only things we want;
> everything else is far slower to process.
>
>> For some stuff I found multiple solutions which were similar in speed.
>> In those cases - since Cairo backends may be different - I kept both
>> preferring one using a static bool. It might be worth to do a
>> 'test-run' at 1st startup and measure that stuff and define some
>> switches for the renderer, just a possibility.
>
> Fun - of course, it's a CPU workload; we don't have super-reliable
> benchmarks; we could get a load of demo users to do the operations twice
> and time them each way for a week of real-world work I guess ;-) and
> come up with a "right answer".
>
>> Of course: TextRendering. Does not look bad - the fallback decompose
>> gets the outlines and draws the AAed, but not professional quality.
>
> Ah; of course we should not try rendering paths ourselves outside
> fontconfig, that'd not be a good idea =)
>
>> XOR: Not yet supported. Two possibilities: Get rid of (would be good
>> but requires some work, some already done, not too many cases
>> remaining) or support it (also some work already exists in Cairo
>> backend how to do that).
>
> I guess this is in meta-files, both WMF/EMF and SVP - so - not sure
> how to avoid that really; cairo can support.
>
>> In the later product: The CairoSDPR *completely* replaces the
>> VCLPixelPrimitiveRenderer, so all Primitives will be rendered to
>> VDevs/Wins using Cairo directly and NO vcl at all. To repeat: This is
>> true for Draw/Impress completely, for Calc/Writer it's the inserted
>> DrawObjects, Graphics and overlay things (Markers, Grid,
>> TextSelection, ...). For more, more needs to be adapted to Primitives...
>
> I guess the double-render / alpha query is there still =)
>
>> That's the state of things. I plan to just continue, but let me know
>> about suggestions/thoughts from your side.
>
> Great stuff; sounds like really good work,
>
> It's not hyper-urgent for us; but as/when it's done - we could
> include this as an option in a CODE release if the impact on the rest of
> the code is small.
>
> Thanks !
>
> Michael.
>
And more - but best to have the discussion on the list ...