[HarfBuzz] optimization for ASCII-only text

Steven R. Loomis srl at icu-project.org
Thu Aug 9 11:14:35 PDT 2012


ICU has quick-check functions
http://icu-project.org/apiref/icu4c/unorm2_8h.html#ad81711834f00bbeb97738004f4f08450
which can return YES, NO, or MAYBE as to whether the text is already
normalized (and so whether normalization is required). If you're making a
pass over the data anyway, this is not *much* more expensive than just
checking for non-ASCII. Something to consider, either if ICU is used, or in
principle.
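
To make the "in principle" part concrete, here is a rough sketch in C of a
buffer that tracks an ASCII-only flag as characters are added and then gives
a conservative quick-check answer. This is not ICU's or HarfBuzz's actual
code: sketch_buffer_t, sketch_buffer_add, sketch_quick_check and the fixed
capacity are all hypothetical names and choices made up for illustration. It
only ever answers YES (ASCII-only, skip normalization) or MAYBE (run the real
normalizer); a true NO would need real normalization data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAPACITY 256  /* arbitrary size, assumption for the sketch */

/* Hypothetical result type: only two of the three quick-check answers. */
typedef enum { CHECK_YES, CHECK_MAYBE } quick_check_t;

/* Hypothetical stand-in for a shaping buffer; not HarfBuzz's real struct. */
typedef struct {
    uint32_t codepoints[SKETCH_CAPACITY];
    size_t   len;
    bool     ascii_only;  /* true until a code point >= 0x80 is added */
} sketch_buffer_t;

static void sketch_buffer_init(sketch_buffer_t *b)
{
    b->len = 0;
    b->ascii_only = true;
}

/* Append one code point; the ASCII test is a single comparison. */
static bool sketch_buffer_add(sketch_buffer_t *b, uint32_t cp)
{
    if (b->len >= SKETCH_CAPACITY)
        return false;  /* buffer full, nothing added */
    b->codepoints[b->len++] = cp;
    if (cp >= 0x80)
        b->ascii_only = false;
    return true;
}

/* YES: already fine, skip normalization. MAYBE: hand off to the normalizer. */
static quick_check_t sketch_quick_check(const sketch_buffer_t *b)
{
    return b->ascii_only ? CHECK_YES : CHECK_MAYBE;
}
```

A caller would run the normalization pass only when sketch_quick_check()
returns CHECK_MAYBE. Swapping the single comparison for a real quick check
would let more non-ASCII text take the fast path, at some extra cost per
character.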

-s

On Thu, Aug 9, 2012 at 10:32 AM, Jonathan Kew <jfkthame at googlemail.com> wrote:

> Hi Behdad,
>
> While complex-script shaping is obviously far more interesting, in
> practice there is a lot of very simple ASCII text on the web. So what would
> you think of adding a minor optimization that looks like it can give us
> about 10% gain on shaping ASCII text with simple fonts? The idea is to make
> hb_buffer_add check whether any non-ASCII characters have been put in the
> buffer; and if not, there's no need to run the normalization pass.
>
> (Of course, there are plenty of non-ASCII characters that could also be
> present without normalization becoming relevant, but I didn't want to make
> the check any more expensive than a simple character-code comparison, and
> optimizing performance of ASCII-only runs will benefit a lot of real-world
> text for minimal effort.)
>
> This was prompted by profile data such as
> http://people.mozilla.com/~bgirard/cleopatra/?report=c2e6bea3647461c0675e59441b78c0f5c409ac0d
> (see https://bugzilla.mozilla.org/show_bug.cgi?id=762710#c25),
> which relates to layout of a large, almost purely ASCII document. This
> shows the normalization pass - which we know is redundant for ASCII-only
> text - contributing around 10% of the total shaping time. With this patch,
> that time simply vanishes from the profile.
>
> JK