[HarfBuzz] Unified Text Layout Engine?
emader at icu-project.org
Thu Feb 1 11:38:02 PST 2007
The ICU LayoutEngine uses an abstract base class to represent fonts and
so is independent of any particular font format or OS. (Though the model
in the base class assumes that a font contains tables w/ four-byte
I've thought for some time that the Indic shaper code in ICU is too
fragile. When I wrote it, I thought I could get away with a single
routine to analyze and tag all Indic scripts. Since then, I've found out
that there are lots of script-specific exceptions and it's sometimes
hard to fix one script without breaking any of the others...
A couple of years ago I talked w/ Owen Taylor about rewriting the code
to have a (potentially) different shaper for each script. At that time I
thought that almost all the bugs were fixed and it wasn't worth the
effort. Subsequent experience has shown this belief to be optimistic. :-)
As you probably know, Microsoft changed the spec. for Indic OpenType
fonts for Vista. As soon as they publish the new spec. we'll have to
adapt our shaper to deal with the new fonts. I don't know all the
details of what's required yet, but it looks like the shaper may have to
"probe" the 'GSUB' table w/ trial lookups to determine, for example,
which characters have pre- and post-base forms.
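
To make the "probing" idea concrete, here is a minimal sketch under loud assumptions. It is not ICU code and not the new spec, which isn't published yet. The font is a toy: each feature is a plain dict from a glyph-name sequence to its replacement. 'pref' and 'pstf' are the registered OpenType feature tags for pre-base and post-base forms, but the (virama, consonant) probe order, the glyph names, and the helpers would_substitute() and classify_consonants() are all hypothetical.

```python
# Toy model only: a "feature" is a dict mapping an input glyph-name tuple
# to the substituted glyph name. Real 'GSUB' lookups are far richer.

VIRAMA = "virama"  # hypothetical glyph name


def would_substitute(feature_rules, glyphs):
    """Return True if the toy feature has a rule for exactly this sequence."""
    return tuple(glyphs) in feature_rules


def classify_consonants(font_features, consonants):
    """Probe the 'pref' and 'pstf' features with a trial sequence per consonant."""
    forms = {}
    for consonant in consonants:
        trial = (VIRAMA, consonant)  # assumed probe order, not from any spec
        if would_substitute(font_features.get("pref", {}), trial):
            forms[consonant] = "pre-base"
        elif would_substitute(font_features.get("pstf", {}), trial):
            forms[consonant] = "post-base"
        else:
            forms[consonant] = "base-only"
    return forms


# A made-up font in which "ra" has a pre-base form and "ya" a post-base form.
toy_font = {
    "pref": {("virama", "ra"): "ra.pref"},
    "pstf": {("virama", "ya"): "ya.pstf"},
}
result = classify_consonants(toy_font, ["ka", "ra", "ya"])
```

The point of the sketch is only that the shaper asks the font what it can do instead of relying on a built-in per-script table of consonant classes.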
The ICU LayoutEngine has a few other tricks in it that HarfBuzz might
want. For example, it uses a "canned" 'GSUB' table to do presentation
form based shaping of Arabic text if there's no 'GSUB' table in the font.
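
A rough sketch of presentation-form shaping, to show what such a fallback has to decide. This is not the LayoutEngine's canned table. It covers only three letters, ignores the lam-alef ligature, marks, and joining controls, and the FORMS table and shape() helper are made up for illustration. The code points are the Unicode Arabic Presentation Forms-B characters for those letters.

```python
# Presentation forms per letter: (isolated, final, initial, medial).
# None means the letter has no such form because it never connects to
# the letter that follows it.
FORMS = {
    "\u0627": ("\uFE8D", "\uFE8E", None, None),          # ALEF
    "\u0628": ("\uFE8F", "\uFE90", "\uFE91", "\uFE92"),  # BEH
    "\u0644": ("\uFEDD", "\uFEDE", "\uFEDF", "\uFEE0"),  # LAM
}


def joins_forward(ch):
    """True if ch is a known letter that connects to the following letter."""
    forms = FORMS.get(ch)
    return forms is not None and forms[2] is not None


def shape(text):
    """Replace known Arabic letters (logical order) with presentation forms."""
    out = []
    for i, ch in enumerate(text):
        forms = FORMS.get(ch)
        if forms is None:
            out.append(ch)  # not in the toy table: pass through unchanged
            continue
        isolated, final, initial, medial = forms
        joined_before = i > 0 and joins_forward(text[i - 1])
        joined_after = (
            initial is not None and i + 1 < len(text) and text[i + 1] in FORMS
        )
        if joined_before and joined_after:
            out.append(medial)
        elif joined_before:
            out.append(final)
        elif joined_after:
            out.append(initial)
        else:
            out.append(isolated)
    return "".join(out)


# BEH ALEF BEH: initial BEH, final ALEF, then an isolated BEH,
# because ALEF does not connect to the letter after it.
shaped = shape("\u0628\u0627\u0628")
```

A font with no 'GSUB' table but with glyphs for the presentation-form code points can then render the shaped string directly.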
There's some similar code to deal with canonical forms. For example, if
the input text contains "a" followed by umlaut and the font contains an
a-umlaut glyph, it will substitute that. Also, if the input text
contains an a-umlaut character and the font does not have a glyph for
a-umlaut, it will substitute an "a" followed by an umlaut. This produces
better rendering of the "basic" scripts if there's no 'GSUB' table.
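
As an illustration of the canonical-form fallback described above, here is a minimal sketch using Python's standard unicodedata module. It is not the LayoutEngine's implementation. The font is modelled as a plain set of characters it has glyphs for, and both that model and the map_for_font() helper are assumptions made for the example. Normalizing the whole string to NFC first is a simplification.

```python
import unicodedata


def map_for_font(text, font_chars):
    """Choose composed or decomposed forms based on what the font covers.

    font_chars is a hypothetical stand-in for the font's character coverage.
    """
    out = []
    # Compose first, so "a" + combining diaeresis becomes one candidate.
    for ch in unicodedata.normalize("NFC", text):
        if ch in font_chars:
            out.append(ch)
            continue
        # No glyph for the composed character: try its canonical decomposition.
        decomposed = unicodedata.normalize("NFD", ch)
        if all(c in font_chars for c in decomposed):
            out.append(decomposed)
        else:
            out.append(ch)  # leave it for normal missing-glyph handling
    return "".join(out)


# The font has a-umlaut: "a" + U+0308 collapses to the single character.
composed = map_for_font("a\u0308", {"\u00e4"})
# The font lacks a-umlaut but has the pieces: U+00E4 is split apart.
split = map_for_font("\u00e4", {"a", "\u0308"})
```

Either way the text ends up in whichever form the font can actually draw, with no 'GSUB' table involved.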
We're deep into feature planning for ICU 3.8 right now, and I don't know
yet how much time I'll get to work on LayoutEngine related tasks for
this release. I'm going to lobby hard to at least do the Vista-related work
Behdad Esfahbod wrote:
> On Thu, 2007-02-01 at 12:36 -0500, Eric Mader wrote:
>> What's happening on this project these days? I think the goal is very
>> worthwhile, and I'd like to proceed with the design and
>> Eric Mader
> I'm trying to find some more time to finish the OpenType Layout engine
> rewrite that I started. That one will not be depending on FreeType.
> With that done I'll proceed to the shaper part..
> While I have your attention Eric, am I reinventing ICU's OpenType Layout
> engine? I mean, does ICU have one already that uses mmapped files?
> I think what we can most use your expertise with is the Indic shaper.
> It's got out of my control in Pango, and most of the patches committed
> to it recently are all wrong and introduce lots of other bugs I believe.
> We need someone to go over Pango's bugs, ICU's, and Qt's and merge them
> back together. This needs to be done anyway for the HarfBuzz shaper.
> The rest of the shapers should be good enough in Pango. So, if you can
> tackle the Indic one, that would be really helpful.