[HarfBuzz] HB_CLOSURE_MAX_STAGES (was: harfbuzz: Branch 'master')

Behdad Esfahbod behdad at behdad.org
Tue Jul 31 00:04:42 UTC 2018


On Thu, Jul 26, 2018 at 12:06 AM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:

> On Tue, 24 Jul 2018 16:31:50 +0000 (UTC)
> behdad at kemper.freedesktop.org (Behdad Esfahbod) wrote:
>
> The following change bothers me:
>
> >  src/hb-ot-layout-common-private.hh |    7 +++++++
> >  src/hb-ot-layout.cc                |    5 ++++-
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > New commits:
> > commit 85646fdadb2f102333485e07425361795b4e0412
> > Author: Garret Rieger <grieger at google.com>
> > Date:   Mon Jul 23 15:37:18 2018 -0700
> >
> >     [subset] Limit the iterations of the closure algorithm.
> >     Prevents O(n^2) run times.
> >
> > diff --git a/src/hb-ot-layout-common-private.hh
> > b/src/hb-ot-layout-common-private.hh index 21caf9e9..7ff0dbeb 100644
> > --- a/src/hb-ot-layout-common-private.hh
> > +++ b/src/hb-ot-layout-common-private.hh
> > @@ -41,6 +41,13 @@
> >  #ifndef HB_MAX_CONTEXT_LENGTH
> >  #define HB_MAX_CONTEXT_LENGTH        64
> >  #endif
> > +#ifndef HB_CLOSURE_MAX_STAGES
> > +/*
> > + * The maximum number of times a lookup can be applied during
> > shaping.
> > + * Used to limit the number of iterations of the closure algorithm.
> > + */
> > +#define HB_CLOSURE_MAX_STAGES        8
> > +#endif
>
> I presume that this is intended to prevent a denial of service attack,
>

Correct.


> at the cost of trashing a subset font.
>

Not really.


> In non-malicious use, how is the victim supposed to detect that and
> then how he needs to change HarfBuzz or his font?  Does he have to read
> all the text using the subset font simply to detect a problem?  How
> does one test that a font does not hit this limit?


It's impossible to hit that limit...  Ok, it would be impossible if we
increase it to 32.  I'll do that.


>   Does one have to
> iterate over the power set of the supported characters for each
> script?  That's O(2^n) - impossible to do!
>
> The description of HB_CLOSURE_MAX_STAGES is completely wrong.  I was
> initially alarmed because I have lookups that are invoked in more than
> 8 places in substitution subtables.  A more accurate, but still not
> perfect, definition, would be 'the maximum number of times lookup can
> change a bit of text'.
>

Nope.  Stage is a technical term in HarfBuzz GSUB processing.

According to OpenType spec, lookups are processed in increasing order of
their indices.  This implies that each lookup is processed one.  But then
the script shaping specs say some features are applied separately.  Each of
those separated list of features/lookups applied are called one stage.  The
total number of stages in any shaper is the total number of times a lookup
can be applied in theory.  Note that this does NOT limit recursion through
Context and ChainContext lookups.

commit 66ccd8ac405c9c25b37de9eb467a7382880dda35 (HEAD -> master,
origin/master)
Author: Behdad Esfahbod <behdad at behdad.org>
Date:   Mon Jul 30 17:03:06 2018 -0700

    [serialize] Increase stage count from 8 to 32

    Indic shaper uses many stages.  Now we are provably not limiting
    functionality whereas the previous limit of 8 was assuming real-world
    practices.

diff --git a/src/hb-ot-layout-common-private.hh
b/src/hb-ot-layout-common-private.hh
index dec237c8..1cf530ea 100644
--- a/src/hb-ot-layout-common-private.hh
+++ b/src/hb-ot-layout-common-private.hh
@@ -45,8 +45,10 @@
 /*
  * The maximum number of times a lookup can be applied during shaping.
  * Used to limit the number of iterations of the closure algorithm.
+ * This must be larger than the number of times add_pause() is
+ * called in a collect_features call of any shaper.
  */
-#define HB_CLOSURE_MAX_STAGES  8
+#define HB_CLOSURE_MAX_STAGES  32
 #endif





> A limit of 8 does not strike me as obviously generous.  Some contextual
> changes can ripple through a string, and I would not be totally
> surprised to find that 8+1 or more lookups act on some irreducible
> strings in my Da Lekh font.  The consolations are that there are
> probably shorter paths to create the resultant glyphs from the input
> set, and one iteration will often process several lookups in the
> correct sequence.
>
> Richard.
>

-- 
behdad
http://behdad.org/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/harfbuzz/attachments/20180730/5df721d4/attachment.html>


More information about the HarfBuzz mailing list