[HarfBuzz] 'vert' substitutions in CJK fonts

suzuki toshiya mpsuzuki at hiroshima-u.ac.jp
Mon Feb 4 19:02:37 PST 2013


Hi,

Grigori Goronzy wrote:
> On 02/05/2013 01:50 AM, suzuki toshiya wrote:
>> Hi,
>>
>> Thank you for opening the interesting discussion.
>> I think the script names in the OpenType spec are not identical to the
>> block names in Unicode; "kana" does not specify only the small group of
>> katakana and hiragana, but also the group including katakana, hiragana,
>> CJK ideographs, CJK punctuation, CJK symbols, etc.
>>
> 
> They are not identical to Unicode, but "kana" indeed means just Katakana
> and Hiragana in OpenType, at least according to the specification:
> 
> https://www.microsoft.com/typography/otspec/scripttags.htm

Oh, I had overlooked that the "kana" tag appears twice in that list, once
for Hiragana and once for Katakana.
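
To make the point concrete, here is a minimal sketch in Python of a few rows
of that script tag list as I now read it. It is only an excerpt, and the names
SCRIPT_TAGS and scripts_for_tag are my own, hypothetical ones, not part of any
library.

```python
# A small excerpt of the OpenType script tag list, keyed by script name.
# SCRIPT_TAGS and scripts_for_tag are hypothetical names for illustration.
SCRIPT_TAGS = {
    "Hiragana": "kana",
    "Katakana": "kana",
    "CJK Ideographic": "hani",
    "Hangul": "hang",
    "Bopomofo": "bopo",
}


def scripts_for_tag(tag):
    """Return the script names that share the given OpenType script tag."""
    return sorted(name for name, t in SCRIPT_TAGS.items() if t == tag)


print(scripts_for_tag("kana"))  # ['Hiragana', 'Katakana']
print(scripts_for_tag("hani"))  # ['CJK Ideographic']
```

So, read strictly, "kana" covers the two kana scripts only, and CJK ideographs
fall under "hani".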

> The lack of detail in the OpenType specification is really bad... in
> this case it just says it's not always similar to Unicode but doesn't
> explain how it differs from it either. :(

Indeed. I should ask the SC29/WG11 font AHG people whether further
clarification is possible.

>> When I worked on poppler (a PDF rendering library), I ran into a similar problem:
>> http://lists.freedesktop.org/archives/poppler/2012-March/008860.html
>> I should note that the default language system strategy would not work
>> well with (old versions of?) the Batang font (a Korean font bundled with
>> Microsoft Windows).
>>
> 
> Hmm, interesting... but lack of language-specific matching is not the
> problem here.

I'm sorry, you are right. In PDF, the reference to a non-embedded CID-keyed
font often carries additional information about the script, so the method
I used there cannot be applied to the rendering of plain Unicode text.

>> when vertical text is requested without an embedded font, how the OpenType
>> layout features should be configured; I used the combinations CHN/hani for
>> Simplified or Traditional Chinese, JAN/kana for Japanese, and KOR/hang for Korean.
>> But that was designed to fit the internal design of poppler; more
>> comprehensive consideration would be needed for real i18n software.
>>
> 
> So you use a fixed script for a given language? I don't know, but this
> seems to be quite hacky. Often you don't even know what language you're
> going to display. This might work in poppler's case, but in my case
> (render some line of Unicode text with arbitrary languages) it does not.

Indeed, it's not easy to guess the language from plain Unicode text.
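
For illustration, the fixed mapping quoted above could be sketched in Python
roughly as follows. This is not the actual poppler code; the names VERT_SETUP,
ot_tag and vert_setup_for are hypothetical, and the keys "chinese", "japanese"
and "korean" are assumptions standing in for whatever the caller knows about
the document. The tag packing follows the usual convention of four ASCII
bytes, padded with spaces, stored big-endian.

```python
# Fixed (language system, script) pairs from the quoted text above.
# Hypothetical structure for illustration, not poppler's real data.
VERT_SETUP = {
    "chinese": ("CHN", "hani"),   # Simplified or Traditional
    "japanese": ("JAN", "kana"),
    "korean": ("KOR", "hang"),
}


def ot_tag(s):
    """Pack a tag of 1 to 4 ASCII characters into a 32-bit integer,
    padding with spaces on the right."""
    b = s.encode("ascii")
    if not 1 <= len(b) <= 4:
        raise ValueError("tag must have 1 to 4 characters")
    return int.from_bytes(b.ljust(4, b" "), "big")


def vert_setup_for(language):
    """Return (langsys_tag, script_tag) as integers, or None when the
    language is not known in advance."""
    pair = VERT_SETUP.get(language)
    if pair is None:
        return None
    langsys, script = pair
    return ot_tag(langsys), ot_tag(script)


print(vert_setup_for("japanese") == (ot_tag("JAN"), ot_tag("kana")))  # True
print(vert_setup_for("unknown"))  # None
```

The None case is exactly your situation: for a line of plain Unicode text in
arbitrary languages there is no key to look up, so a table like this gives
no answer.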

As I understand it, the problem in your first post was that the OpenType
script tag is ambiguous or, if the spec is read strictly, too narrow in
practice to cover the Unicode characters that the feature should control.
Is that correct?

Regards,
mpsuzuki


