[cairo] cairo-gl glyph rendering performance

Mon May 2 12:57:59 PDT 2011

I'm pretty certain that for some operations, in particular OVER, drawing 
the glyphs atop each other produces identical results to compositing 
them together and then doing a single operation.

Say the pixel has color B, and two glyphs are drawn with OVER in color C, 
the first one with opacity a and the second with b:

   result = (B(1-a)+Ca)(1-b)+Cb
          = (B-Ba+Ca)(1-b)+Cb
          = B-Ba+Ca-Bb+Bab-Cab+Cb
          = B(1-a-b+ab)+C(a+b-ab)

If instead we composite a and b first (b OVER a, i.e. a(1-b)+b) we get 
a+b-ab. Using this as the opacity with color C we get:

   result = B(1-(a+b-ab))+C(a+b-ab)
          = B(1-a-b+ab)+C(a+b-ab)
          = the first result
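
For anyone who wants to check the algebra numerically, here is a minimal 
sketch in Python. It is not cairo code: the function names are made up for 
illustration, and it assumes a single color channel with values in [0, 1] 
and an opaque color C, as in the derivation above.

```python
import itertools

def over(dst, color, coverage):
    """OVER of an opaque color onto dst through a coverage (mask) value."""
    return dst * (1.0 - coverage) + color * coverage

def draw_separately(dst, color, a, b):
    """Draw glyph a, then glyph b, directly on the destination."""
    return over(over(dst, color, a), color, b)

def draw_via_mask(dst, color, a, b):
    """Combine both glyph coverages into one mask, then draw once."""
    mask = a + b - a * b  # b OVER a
    return over(dst, color, mask)

# Compare the two paths over a small grid of B, C, a, b values.
samples = [0.0, 0.25, 0.5, 0.75, 1.0]
max_diff = max(
    abs(draw_separately(B, C, a, b) - draw_via_mask(B, C, a, b))
    for B, C, a, b in itertools.product(samples, repeat=4)
)
print("largest difference:", max_diff)
```

The largest difference should be zero up to floating-point rounding. This 
only covers OVER; it says nothing about the other operators.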

I suspect this is true for several of the other operators. It would 
certainly be worth it to make a list and avoid the mask for these.

Chris Wilson wrote:
> On Wed, 20 Apr 2011 01:30:58 +0300, Alexandros Frantzis <alexandros.frantzis at linaro.org> wrote:
>> Hi all!
>>
>> I have been investigating the cairo-gl glyph implementation to see if we
>> can improve the glyph rendering performance.
>>
>> I have found that one source of performance loss is the overzealous selection
>> of the "via mask" path when rendering glyphs. When using the "via mask" path,
>> glyphs are first rendered to a temporary surface which is then used as a mask
>> to draw the glyphs on the final destination.
>>
>> In the current code, one of the reasons to use the mask path is because the
>> glyphs overlap. Is this valid?
> 
> Yes. It is deeply engrained in the API that a single operation acts as a
> single mask. If the overlapping glyphs of the glyph string were to be
> rendered individually then you would operate twice on the overlapping
> pixels. Hence why we need to construct a mask for the entire string to
> apply it as a single operation.
> 
>> In any case, the overlap detection test as implemented in
>> _cairo_scaled_font_glyph_device_extents() is not suited for our needs for two
>> reasons:
> 
> We know. Applying the KISS rule to avoid over-engineering.
>  
>> 1. The overlap detection algorithm checks the extents of each glyph against the
>>    current total extent of previously processed glyphs. This works fine as long
>>    as the glyph group is limited to a single line and drawn sequentially.
> 
> This is the *extremely* common case due to historical interface
> limitations i.e. code that has evolved from using X interfaces or through
> pango will perform line breaking.
> 
>> 2. Due to font kerning, glyph extents are often found to be overlapping,
>>    although the glyphs themselves are not actually overlapping.
> 
> Right, this is relatively common, about 25% of cases in ff, iirc.
> 
>> The important question here is how we can actually achieve using the "via mask"
>> path less. Can we remove the overlap factor completely? Assuming that not using
>> a mask is wrong, how wrong are the results going to be? If the visual
>> difference is small enough perhaps we can make this compromise to increase
>> performance (or use an environment variable and leave it to the user to force
>> the fast behavior).
> 
> No, the visual result is wrong and text output is one that people care
> immensely about. The performance you measured is about 5-10x slower than
> what can be achieved using an intermediate mask (guesstimating based on the
> i965 timings). So the extra step is not the bottleneck per se.
> 
> Once I no longer feel embarrassed by the ddx performance, I'll gladly
> embarrass mesa instead.
> -Chris
>