[HarfBuzz] Discovering 'vert' feature

Jonathan Kew jfkthame at gmail.com
Fri Oct 10 01:46:05 PDT 2014


On 10/10/14 02:12, Behdad Esfahbod wrote:
> I'm helping Dominik port Chrome's vertical text support to HarfBuzz, and we've
> hit a buggy font in Windows, that we need to work around.
>
> "SimSun" on Windows 7, only declares the 'vert' feature under script 'hani'
> (fine) language-system "CHN ".  No tag maps to "CHN ".  It's a font bug.
>
> In Windows 8 family, someone **tried** to fix it.  They tweaked the features
> until it worked for them.  So it has the 'vert' feature for script 'hani', now
> in language system "ZHS " as well as "CHN ", but also for script 'latn' as
> default language-system.  So now, if the itemizer failed to mark the correct
> script, HarfBuzz tries 'latn' and gets the right feature.  But if it looks
> under 'hani' but there's no language tag, it still fails.
>
> So, I'm tempted to try this fixup: if feature is 'vert', and we found no
> feature, then walk the feature list and enable the first feature found that
> has tag 'vert'.
>
> How does that sound?
>
> https://github.com/behdad/harfbuzz/issues/63
>

Sounds like it would probably work, but this makes me really uneasy - 
it's too much of a hack. Yuck.

Here's a proposal for an alternative fixup that *maybe* feels more 
acceptable to me (maybe, because I haven't thought about it for very 
long, and perhaps it's just as hackish really):

If the script we're using has no langSys matching the buffer's 
language, and no default langSys is defined either, then look up the 
"typical" language system(s) for the script (e.g. ENG for 'latn'). This 
would be a list of tags, so that for 'hani', for instance, we could 
list ZHS, ZHT, JAN ... and append the unofficial CHN at the end.

So we'd need to have a mapping of script tag -> langSys tag(s), often 
just one "prototypical" language that uses the script, but sometimes 
several candidates. This would address the comparable issue of a font 
that (for example) provides features only under arab/ARA (or deva/HIN), 
and is presented with text tagged as Persian (or Nepali)... ISTM it'd be 
better to use the Arabic- (or Hindi-) language features here than to 
fail altogether.
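
A rough sketch of that lookup order, in plain C and deliberately not 
using any real HarfBuzz API. Everything here is hypothetical: the 
font_script_t model of a font's script table, the fallbacks table and 
its contents, the select_langsys name, and the use of "dflt" as the 
return value meaning "use the default langSys". It only illustrates 
the order of the checks (exact match, then default langSys, then the 
per-script candidate list).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping: script tag -> candidate langSys tags,
 * in order of preference, NULL-terminated. */
typedef struct {
    const char *script;
    const char *langsys[5];
} script_fallback_t;

static const script_fallback_t fallbacks[] = {
    { "latn", { "ENG ", NULL } },
    { "hani", { "ZHS ", "ZHT ", "JAN ", "CHN ", NULL } }, /* CHN is unofficial */
    { "arab", { "ARA ", NULL } },
    { "deva", { "HIN ", NULL } },
};

/* Simplified, hypothetical model of one script table in a font. */
typedef struct {
    const char *script;          /* e.g. "hani" */
    int has_default;             /* non-zero if a default langSys exists */
    const char *const *langsys;  /* langSys tags the font defines */
    size_t count;                /* number of entries in langsys */
} font_script_t;

static int font_has_langsys(const font_script_t *fs, const char *tag)
{
    for (size_t i = 0; i < fs->count; i++)
        if (strcmp(fs->langsys[i], tag) == 0)
            return 1;
    return 0;
}

/* Returns the langSys tag to use, "dflt" for the default langSys,
 * or NULL if nothing usable was found. `requested` may be NULL. */
static const char *select_langsys(const font_script_t *fs, const char *requested)
{
    assert(fs != NULL);

    /* 1. Exact match for the buffer's language. */
    if (requested && font_has_langsys(fs, requested))
        return requested;

    /* 2. The script's default langSys, if the font defines one. */
    if (fs->has_default)
        return "dflt";

    /* 3. Otherwise try the "typical" languages for this script. */
    for (size_t i = 0; i < sizeof fallbacks / sizeof fallbacks[0]; i++) {
        if (strcmp(fallbacks[i].script, fs->script) != 0)
            continue;
        for (size_t j = 0; fallbacks[i].langsys[j] != NULL; j++)
            if (font_has_langsys(fs, fallbacks[i].langsys[j]))
                return fallbacks[i].langsys[j];
    }
    return NULL;
}
```

With a font modelled like the Windows 7 SimSun described above ('hani' 
with only CHN and no default langSys), select_langsys falls through to 
step 3 and picks CHN. A font with only arab/ARA asked for Persian 
would likewise end up with ARA.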

(This is obviously related, though not identical, to the idea of Pango's 
pango_script_get_sample_language function.)

WDYT?


