[HarfBuzz] Documenting OpenType shaping

Nathan Willis nwillis at glyphography.com
Fri Jun 15 22:53:41 UTC 2018


On Wed, Jun 6, 2018 at 2:29 PM, Richard Wordingham <
richard.wordingham at ntlworld.com> wrote:

> On Tue, 5 Jun 2018 09:42:38 -0500
> Nathan Willis <nwillis at glyphography.com> wrote:
>
> > Your feedback and help is appreciated!
>
> * Malayalam Remarks *
>
> In Sections 2.2 and 2.3, how are multiple vowels handled, such as
> U+0D4A and U+0D4B?  I'm particularly interested in the handling of
> multiple left matras.
>

Hmm. So, as I understand it, in HarfBuzz the presence of multiple matras
(on any side) would be an issue dealt with by the syllable-identification
regular expressions, before getting to the reordering stuff.

It seems like this is what is used (the same regexps are used for all
scripts in HarfBuzz's Indic shaper):

matra_group = z{0,3}.M.N?.(H | forced_rakar)?;
[...]
halant_or_matra_group = (final_halant_group | (H.ZWJ)? matra_group{0,4});

... and that only permits four matras (total) per syllable.
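
To make the effect of that {0,4} bound concrete, here is a rough Python
approximation of the two rules quoted above. It is only a sketch, not
HarfBuzz's actual state machine: the one-letter category alphabet is an
assumption made up for illustration, the forced_rakar alternative and the
optional (H.ZWJ) prefix are left out, and a lone consonant stands in for
everything that precedes the matras in a real syllable.

```python
import re

# Toy one-letter category alphabet (an assumption for this sketch only):
#   C = consonant, M = matra, N = nukta, H = halant, z = ZWJ/ZWNJ
# Ragel's "." is concatenation, so z{0,3}.M.N?.(H)? becomes z{0,3}MN?H?
MATRA_GROUP = r"(?:z{0,3}MN?H?)"  # forced_rakar alternative omitted

# A lone consonant followed by at most four matra groups.
TOY_SYLLABLE = re.compile(r"C" + MATRA_GROUP + r"{0,4}")


def matched_prefix(categories):
    """Return the leading part of `categories` that the toy rule accepts."""
    m = TOY_SYLLABLE.match(categories)
    return m.group(0) if m else ""
```

With five matras in a row, matched_prefix("CMMMMM") stops after the fourth
one, so the fifth matra falls outside the toy syllable.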

I vaguely recall seeing a commit message or comment or something indicating
that this limit was there to maintain compatibility with how Uniscribe
matches syllables, but I searched around and couldn't find it today. It was
something along the lines of the Microsoft docs saying "one matra for each
type [L,R,T,B] is permitted," but it isn't clear whether that's justified
by orthography at all or is just a practical concession that they made for
some reason.

Others with more Uniscribe knowledge may know.

That having been said, I *think* that HarfBuzz doesn't rearrange two
adjacent codepoints that have the same sort-ordering tags. So
"Consonant,U+0D4A,U+0D4B" ought to get the matras decomposed, then the two
left-side parts move together as-is to the left of the consonant, and the
two right-side parts remain unchanged.
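
As a sketch of what I mean (this is not HarfBuzz code), the expected
outcome can be imitated in Python with canonical decomposition followed by
a stable sort. The LEFT/BASE/RIGHT classes and the POSITION table are
hypothetical names invented for this example. The decompositions
themselves come from the Unicode data: U+0D4A is U+0D46 + U+0D3E, and
U+0D4B is U+0D47 + U+0D3E.

```python
import unicodedata

# Hypothetical position classes, for this sketch only.
LEFT, BASE, RIGHT = 0, 1, 2
POSITION = {
    "\u0D15": BASE,   # MALAYALAM LETTER KA
    "\u0D46": LEFT,   # MALAYALAM VOWEL SIGN E
    "\u0D47": LEFT,   # MALAYALAM VOWEL SIGN EE
    "\u0D3E": RIGHT,  # MALAYALAM VOWEL SIGN AA
}


def decompose_and_reorder(text):
    """Split two-part matras, then stably sort by position class."""
    decomposed = unicodedata.normalize("NFD", text)
    # sorted() is stable, so codepoints that share a position class keep
    # their relative order, which is the behavior guessed at above.
    ordered = sorted(decomposed, key=lambda ch: POSITION[ch])
    return ["U+%04X" % ord(ch) for ch in ordered]
```

For "\u0D15\u0D4A\u0D4B" this gives U+0D46, U+0D47, U+0D15, U+0D3E,
U+0D3E: both left parts ahead of the consonant in their original order,
and both right parts after it.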

You could test that with

hb-view /usr/share/fonts/truetype/noto/NotoSerifMalayalam-Regular.ttf \
--unicodes=0d15,0d4a,0d4b

I'm on a new machine right at the moment and apparently don't have all of
Noto installed just yet, or I'd just try it. Will update later.

In the meantime, I honestly can't speak to whether or not that's the
correct behavior for the script.

Behdad? Any thoughts on that?


> In Section 3, how does tagging interact with substitutions?  Features
> can in general split and merge glyphs.
>
>
The tagging described in stage 2 is just the reordering / syllable-position
tags. So once all of that is done, sorting the syllable into its final
order is (AIUI) the last point at which the tags come into play.

I do know that HarfBuzz keeps track of other sorts of state that it may
refer to internally as tags, but I don't think any of these docs reference
those, just the reordering position tags.

So applying the features in stage 3 doesn't interact with the tags — at
least, not directly. If the tagging was wrong, of course, then the final
sorted order might be wrong and sequences wouldn't match up to the
substitution rules in GSUB.  But, if I follow HarfBuzz's logic right, the
reordering stuff cannot be switched off, so it always happens completely
before any substitutions start, and that seems to be what other shapers did
first.

Should there be a wording change to address that in the document itself?

Thanks,
Nate



> Richard.



-- 
nathan.p.willis
nwillis at glyphography.com <http://identi.ca/n8>