[HarfBuzz] Unified Text Layout Engine?

Eric Mader emader at icu-project.org
Thu Feb 1 11:38:02 PST 2007


Hi Behdad,

The ICU LayoutEngine uses an abstract base class to represent fonts and 
so is independent of any particular font format or OS. (Though the model 
in the base class assumes that a font contains tables w/ four-byte 
names. ;-)
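
To make the idea concrete, here is a minimal sketch in Python of a format-independent font abstraction. It is not ICU's actual C++ interface: the names FontInstance, DictFont, make_tag, get_table and map_char_to_glyph are hypothetical, chosen only for illustration. The one assumption carried over from the model is that tables are looked up by a four-byte tag.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional
import struct


def make_tag(name: str) -> int:
    """Pack a four-character ASCII table name into a 32-bit big-endian tag."""
    data = name.encode("ascii")
    if len(data) != 4:
        raise ValueError("table names must be exactly four bytes")
    return struct.unpack(">I", data)[0]


class FontInstance(ABC):
    """Hypothetical abstract font: the layout code only ever sees this."""

    @abstractmethod
    def get_table(self, tag: int) -> Optional[bytes]:
        """Return the raw bytes of a table, or None if the font lacks it."""

    @abstractmethod
    def map_char_to_glyph(self, code_point: int) -> int:
        """Return a glyph ID, with 0 meaning 'no glyph'."""


class DictFont(FontInstance):
    """Toy in-memory implementation, standing in for a real font backend."""

    def __init__(self, tables: Dict[int, bytes], cmap: Dict[int, int]):
        self._tables = tables
        self._cmap = cmap

    def get_table(self, tag: int) -> Optional[bytes]:
        return self._tables.get(tag)

    def map_char_to_glyph(self, code_point: int) -> int:
        return self._cmap.get(code_point, 0)
```

A backend for a particular OS or font file format would subclass FontInstance in the same way, and the layout code would not need to change.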

I've thought for some time that the Indic shaper code in ICU is too 
fragile. When I wrote it, I thought I could get away with a single 
routine to analyze and tag all Indic scripts. Since then, I've found out 
that there are lots of script-specific exceptions and it's sometimes 
hard to fix one script without breaking any of the others...

A couple of years ago I talked w/ Owen Taylor about rewriting the code 
to have a (potentially) different shaper for each script. At that time I 
thought that almost all the bugs were fixed and it wasn't worth the 
effort. Subsequent experience has shown this belief to be optimistic. :-)

As you probably know, Microsoft changed the spec. for Indic OpenType 
fonts for Vista. As soon as they publish the new spec. we'll have to 
adapt our shaper to deal with the new fonts. I don't know all the 
details of what's required yet, but it looks like the shaper may have to 
"probe" the 'GSUB' table w/ trial lookups to determine, for example, 
which characters have pre- and post-base forms.

The ICU LayoutEngine has a few other tricks in it that HarfBuzz might 
want. For example, it uses a "canned" 'GSUB' table to do presentation 
form based shaping of Arabic text if there's no 'GSUB' table in the font.

There's some similar code to deal with canonical forms. For example, if 
the input text contains "a" followed by umlaut and the font contains an 
a-umlaut glyph, it will substitute that. Also, if the input text 
contains an a-umlaut character and the font does not have a glyph for 
a-umlaut, it will substitute an "a" followed by an umlaut. This produces 
better rendering of the "basic" scripts if there's no 'GSUB' table.
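
The fallback can be sketched roughly as follows. This is only an illustration in Python using the standard unicodedata module, not the LayoutEngine code itself; the function name canonical_fallback and the has_glyph callback are hypothetical, and the sketch only handles a base character followed by a single combining mark.

```python
import unicodedata
from typing import Callable


def canonical_fallback(text: str, has_glyph: Callable[[str], bool]) -> str:
    """Compose or decompose characters depending on what the font can show.

    has_glyph is an assumed callback that reports whether the font has a
    glyph for a single character.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]

        # Case 1: base + combining mark, and the font has the composed glyph.
        if i + 1 < len(text) and unicodedata.combining(text[i + 1]):
            composed = unicodedata.normalize("NFC", ch + text[i + 1])
            if len(composed) == 1 and has_glyph(composed):
                out.append(composed)
                i += 2
                continue

        # Case 2: the font lacks this character but has all decomposed parts.
        if not has_glyph(ch):
            decomposed = unicodedata.normalize("NFD", ch)
            if decomposed != ch and all(has_glyph(c) for c in decomposed):
                out.append(decomposed)
                i += 1
                continue

        # Otherwise leave the character alone.
        out.append(ch)
        i += 1
    return "".join(out)


# A font with "a" and a combining diaeresis, but no precomposed a-umlaut.
font_without_precomposed = {"a", "\u0308"}
print(canonical_fallback("\u00e4", font_without_precomposed.__contains__))

# A font with a precomposed a-umlaut.
font_with_precomposed = {"a", "\u00e4"}
print(canonical_fallback("a\u0308", font_with_precomposed.__contains__))
```

In the first call the a-umlaut is split into "a" plus the combining mark; in the second the pair is replaced by the single precomposed character.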

We're deep into feature planning for ICU 3.8 right now, and I don't know 
yet how much time I'll get to work on LayoutEngine related tasks for 
this release. I'm going to lobby hard to at least do the Vista-related work.

Regards,
Eric

Behdad Esfahbod wrote:
> On Thu, 2007-02-01 at 12:36 -0500, Eric Mader wrote:
>> Hi,
>>
>> What's happening on this project these days? I think the goal is very 
>> worthwhile, and I'd like to proceed with the design and
>> implementation.
>>
>> Regards,
>> Eric Mader 
> 
> I'm trying to find some more time to finish the OpenType Layout engine
> rewrite that I started.  That one will not be depending on FreeType.
> With that done I'll proceed to the shaper part..
> 
> While I have your attention Eric, am I reinventing ICU's OpenType Layout
> engine?  I mean, does ICU have one already that uses mmapped files?
> 
> I think what we can most use your expertise with is the Indic shaper.
> It's got out of my control in Pango, and most of the patches committed
> to it recently are all wrong and introduce lots of other bugs I believe.
> We need someone to go over Pango's bugs, ICU's, and Qt's and merge them
> back together.  This needs to be done anyway for the HarfBuzz shaper.
> The rest of the shapers should be good enough in Pango.  So, if you can
> tackle the Indic one, that would be really helpful.
> 
> Regards,
