[cairo] [PATCH] cairo-gl: Make VBO size run-time settable

Tue Sep 3 18:36:04 PDT 2013

On Fri, Aug 30, 2013 at 10:11:12AM +0100, Chris Wilson wrote:
> On Thu, Aug 29, 2013 at 06:56:21PM -0400, Behdad Esfahbod wrote:
> > On 13-08-29 01:55 PM, Bryce W. Harrington wrote:
> > > Chris, btw, sort of an aside question...
> > > 
> > > As I've been running various performance tests for each of the GL
> > > compositors, I am noticing that spans and traps basically have identical
> > > performance (any differences are in the noise).  I'm aware of the
> > > implementational differences between the two, and I've expected to see
> > > spans perform better than traps on at least a few of these tests, but
> > > nothing so far.  I'm guessing the tests simply aren't exercising spans'
> > > talents, or I'm not running the right tests.
> 
> This is what I measured on one of my systems:
> 
> old: gl-traps
> new: gl-spans
> Speedups
> ========
>    gl           firefox-fishtank: 55.16x speedup
>    gl             grads-heat-map: 16.03x speedup
>    gl             firefox-canvas: 12.33x speedup
>    gl         swfdec-giant-steps: 11.99x speedup
>    gl       firefox-canvas-alpha: 11.55x speedup
>    gl         firefox-chalkboard:  8.96x speedup
>    gl          firefox-paintball:  7.02x speedup
>    gl               firefox-tron:  6.93x speedup
>    gl       gnome-system-monitor:  6.10x speedup
>    gl          firefox-particles:  5.63x speedup
>    gl           firefox-fishbowl:  5.43x speedup
>    gl          firefox-talos-svg:  5.41x speedup
> etc.

Thanks; I don't get anywhere near these differences, so I assume that
means I'm not running the tests properly.  I'll poke around and
hopefully figure it out.

One question though: if spans is so much better than traps, why do we
still have traps as an option?  Are there cases where spans may
underperform, or situations it can't handle and we must fall back to
traps?

> > Speaking of which, Chris, can you explain to those of us not following cairo
> > closely these days how all the various new compositors work?
> 
> The difference between the compositors of cairo-1.12 and the single
> trapezoid compositor of cairo-1.0 is that there are more of them! The surface
> backends have to plug directly into the high level surface API (the old
> low level compositor API is removed) and explicitly decide how they want
> to render each individual operation. We have a few common strategies,
> the trapezoid compositor (based on the original Xrender approach), the
> spans scanline compositor (efficient for image based software
> rendering), and a "mask" compositor (where the backend can render the
> various channels separately and the horrible logic of combining mask with
> the clip with the source onto the destination is handled by the
> compositor).
> 
> For example, with cairo-gl it will first use its msaa compositor,
> falling back to the spans compositor, and then to a mask compositor
> (with a stage for glyphs to use the code from the traps compositor),
> with a final fallback to the CPU.

'traps' being the final fallback here?

> Since each compositor receives the high level state at each stage, we
> are not restricted by any of the decisions at an earlier point. (Which
> was an issue with the previous lowlevel approach that couldn't use spans
> for rendering an image fallback after it hit the trapezoid paths etc).
> The compositor pipeline does a few computations upfront (for computing
> extents and pattern reductions which were at the time common for all,
> but now bypassed for msaa) and then walks a delegate chain of
> compositor function tables, calling the operation on each until it is
> claimed. (So like the pixman approach we also suffer from static
> assignment of the best technique not always being hit first.) Each
> compositor function in that chain inspects the state (is the path
> rectangular? do we have a complex clip? can I handle the operator?) and
> if it finds the operation acceptable begins to process it. For the
> common compositors, it will break the operation down into various
> lowlevel callbacks that the backend provides (which often then use
> another library function to implement with further callbacks e.g. span
> rendering.) The stacks build very quickly with obvious overhead for
> trivial operations, so typically the backend may check upfront for the
> simplest operations and do them directly.
>
> The basic idea is that each backend is free to build a pipeline of how
> to render any operation, with the common stages being helper functions.
> -Chris
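
To check my understanding of the delegate chain described above, here is
a minimal sketch in C. Everything in it is an assumption made for
illustration: the names (compositor_t, op_state_t, composite_fill, the
*_fill functions) and the acceptance rules are hypothetical and are not
cairo's actual types or logic. It only shows the shape of the idea: every
stage receives the same high-level state, inspects it, and either claims
the operation or reports it as unsupported so the next table in the chain
is tried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes, not cairo's real enum. */
typedef enum { STATUS_SUCCESS, STATUS_UNSUPPORTED } status_t;

/* Hypothetical high-level state, passed unchanged to every stage. */
typedef struct {
    bool path_is_rectangular;
    bool clip_is_complex;
    bool operator_supported;
} op_state_t;

typedef struct compositor compositor_t;
struct compositor {
    const char *name;
    const compositor_t *delegate;  /* next table in the chain, NULL at the end */
    status_t (*fill)(const compositor_t *self, const op_state_t *state);
};

/* Last resort: accepts anything (stands in for the CPU fallback). */
static status_t fallback_fill(const compositor_t *self, const op_state_t *state)
{
    (void) self; (void) state;
    return STATUS_SUCCESS;
}

/* Made-up rule: spans declines operators it cannot handle. */
static status_t spans_fill(const compositor_t *self, const op_state_t *state)
{
    (void) self;
    return state->operator_supported ? STATUS_SUCCESS : STATUS_UNSUPPORTED;
}

/* Made-up rule: msaa additionally declines complex clips. */
static status_t msaa_fill(const compositor_t *self, const op_state_t *state)
{
    (void) self;
    if (state->clip_is_complex || !state->operator_supported)
        return STATUS_UNSUPPORTED;
    return STATUS_SUCCESS;
}

static const compositor_t fallback_compositor = { "fallback", NULL, fallback_fill };
static const compositor_t spans_compositor = { "spans", &fallback_compositor, spans_fill };
static const compositor_t msaa_compositor = { "msaa", &spans_compositor, msaa_fill };

/* Walk the chain until one stage claims the operation.
 * Returns the claiming compositor, or NULL if every stage declined. */
static const compositor_t *
composite_fill(const compositor_t *chain, const op_state_t *state)
{
    assert(state != NULL);
    for (const compositor_t *c = chain; c != NULL; c = c->delegate) {
        if (c->fill(c, state) == STATUS_SUCCESS)
            return c;
    }
    return NULL;
}
```

Calling composite_fill(&msaa_compositor, &state) returns whichever stage
claimed the operation. Because the order of the chain is fixed at build
time, a later stage that would have been a better fit is never consulted
once an earlier one accepts, which is the static-assignment caveat
mentioned above.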

Thanks for writing this up; very illuminating.

Bryce