[HarfBuzz] caching plans with user features

Wed Oct 30 08:02:50 PDT 2013

On 30/10/13 14:42, Behdad Esfahbod wrote:
> In the meantime, I think we should switch hb-shape / hb-view to cache
> their own plan.

I considered doing that, but adding it to the engine itself seemed more 
generally useful - it's likely to benefit many HB clients. (And then 
doing it in the test tools would be redundant.)

>
> On Oct 29, 2013 10:10 PM, "Jonathan Kew" <jfkthame at googlemail.com
> <mailto:jfkthame at googlemail.com>> wrote:
>
>     Hey Behdad,
>
>     I figured out why the dashboard figures make it look as though we're
>     surprisingly slow when running the XP fonts with simple scripts
>     (e.g. Latin, Cyrillic).
>
>     The problem is that my test framework passes "--features kern=0" for
>     these tests in order to suppress use of the legacy 'kern' table, so
>     that we'll better match Uniscribe's shaping results. But (somewhat
>     surprisingly, at first glance) running with kern=0 is substantially
>     *slower* than running without it:
>
>     $ time hb-shape arial-winxp.ttf --text-file ru.txt > /dev/null
>     real    0m38.831s
>     user    0m38.726s
>     sys     0m0.106s
>
>     $ time hb-shape arial-winxp.ttf --text-file ru.txt --features kern=0 > /dev/null
>     real    0m50.087s
>     user    0m49.984s
>     sys     0m0.104s
>
>     Explicitly specifying kern=1 is slower still, although the shaped
>     results will be identical to the no-user-feature case:
>
>     $ time hb-shape arial-winxp.ttf --text-file ru.txt --features kern=1 > /dev/null
>     real    0m56.122s
>     user    0m56.022s
>     sys     0m0.101s
>
>     The reason for this, as you no doubt realized right away, is that
>     passing *any* user features will prevent hb_shape() taking advantage
>     of a cached shape plan in the font, and so we re-create the plan for
>     every string. This is pretty expensive, and results in the slowdown
>     here.
>
>     Caching plans with arbitrary user features may be a bit tricky, but
>     what I suggest we could do to address this for the common use case
>     is to cache plans with user features *provided* all the user
>     features are "global" (i.e. they have start=0, end=-1). And only use
>     a cached plan if the list of features is exactly the same - i.e. the
>     same tags and values (and global ranges) and listed in the same
>     order. Make no attempt to decide whether different feature lists
>     could in fact share the same plan, just refuse to use a cached plan
>     if there's *any* difference. This makes it reasonably easy and cheap
>     to do the caching, and in practice it's still likely to hit the vast
>     majority of the use cases.
>
>     The attached patch implements this, and with this applied, I now see
>     kern=0 resulting in a substantial speed-up, as expected, instead of
>     a slow-down compared to the default shaping:
>
>     $ time hb-shape arial-winxp.ttf --text-file ru.txt --features kern=0 > /dev/null
>     real    0m30.807s
>     user    0m30.722s
>     sys     0m0.086s
>
>     And explicitly setting kern=1 shows no significant difference from
>     the original no-features case (the variation from the first run
>     above is within the noise level):
>
>     $ time hb-shape arial-winxp.ttf --text-file ru.txt --features kern=1 > /dev/null
>     real    0m37.544s
>     user    0m37.453s
>     sys     0m0.091s
>
>     So I'd suggest it's worth doing something like this - unless of
>     course you want to go the whole way and implement "smart" plan
>     caching with user features, but IMO that sounds like it might be
>     more effort than it's worth.
>
>     JK
>