[HarfBuzz] Tai Tham Shaping Question #2 : MEDIAL RA

Mon Jan 7 06:59:07 PST 2013

On Mon, Jan 7, 2013 at 6:20 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
> On 13-01-07 05:04 AM, Theppitak Karoonboonyanan wrote:
>> On Wed, Dec 26, 2012 at 3:12 PM, Theppitak Karoonboonyanan
>> <thep at linux.thai.net> wrote:

>> Note that only 'ccmp' is used in the font, as there seems to be no
>> indic_configs[] entry for Tai Tham yet. And 'blwf' just doesn't work
>> without the virama info.
>
> We don't necessarily need indic_config for Tai Tham.

Do you mean Tai Tham will never use 'blwf' for subjoined forms?

>> - NA (ᨶ) + vowel AA (ᩣ) ligature is not always shaped if there is
>>   something in between, such as subjoined consonant (line 1).
>>   (For this one, I'm not sure what's the right thing to do, between
>>   using GSUB and letting HB do appropriate preprocessing.
>
> That's easy to fix.  Check MATRA_POS_BOTTOM in
> hb-ot-shape-complex-indic-private.h.

Did you mean MATRA_POS_RIGHT?
I've tried adding an IS_LANA case, using POS_AFTER_MAIN and
POS_BEFORE_SUB, so that the vowel AA is placed right after NA.
But I still don't get the ligatures.

diff --git a/src/hb-ot-shape-complex-indic-private.hh b/src/hb-ot-shape-complex-
index e36090e..03acd0d 100644
--- a/src/hb-ot-shape-complex-indic-private.hh
+++ b/src/hb-ot-shape-complex-indic-private.hh
@@ -167,6 +167,7 @@ enum indic_matra_category_t {
 #define IS_MLYM(u) (IN_HALF_BLOCK (u, 0x0D00))
 #define IS_SINH(u) (IN_HALF_BLOCK (u, 0x0D80))
 #define IS_KHMR(u) (IN_HALF_BLOCK (u, 0x1780))
+#define IS_LANA(u) (hb_in_range<hb_codepoint_t> (u, 0x1A20, 0x1AAF))


 #define MATRA_POS_LEFT(u)      POS_PRE_M
@@ -182,6 +183,7 @@ enum indic_matra_category_t {
                                  IS_MLYM(u) ? POS_AFTER_POST : \
                                  IS_SINH(u) ? POS_AFTER_SUB  : \
                                  IS_KHMR(u) ? POS_AFTER_POST : \
+                                 IS_LANA(u) ? POS_AFTER_MAIN : \
                                  /*default*/  POS_AFTER_SUB    \
                                )
 #define MATRA_POS_TOP(u)       ( /* BENG and MLYM don't have top matras. */ \
@@ -194,6 +196,7 @@ enum indic_matra_category_t {
                                  IS_KNDA(u) ? POS_BEFORE_SUB : \
                                  IS_SINH(u) ? POS_AFTER_SUB  : \
                                  IS_KHMR(u) ? POS_AFTER_POST : \
+                                 IS_LANA(u) ? POS_AFTER_POST : \
                                  /*default*/  POS_AFTER_SUB    \
                                )
 #define MATRA_POS_BOTTOM(u)    ( \
@@ -208,6 +211,7 @@ enum indic_matra_category_t {
                                  IS_MLYM(u) ? POS_AFTER_POST : \
                                  IS_SINH(u) ? POS_AFTER_SUB  : \
                                  IS_KHMR(u) ? POS_AFTER_POST : \
+                                 IS_LANA(u) ? POS_AFTER_POST : \
                                  /*default*/  POS_AFTER_SUB    \
                                )
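
To illustrate what the patch above is meant to achieve, here is a
minimal standalone sketch. It is not HarfBuzz code: in_range(),
matra_pos_right(), glyph_t and reorder_na_cluster() are hypothetical
stand-ins, the enum lists only a few position classes in the relative
order assumed here, and the only reordering step modelled is a stable
sort by position class. The sample cluster uses NA (U+1A36), SAKOT
(U+1A60), LOW TA (U+1A34) and vowel sign AA (U+1A63).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

typedef uint32_t codepoint_t;

// A few position classes, in the relative order assumed for this sketch.
enum position_t {
  POS_BASE_C,
  POS_AFTER_MAIN,
  POS_BELOW_C,
  POS_AFTER_SUB,
  POS_AFTER_POST
};

// Hypothetical stand-in for the range helper used in the patch.
static inline bool in_range (codepoint_t u, codepoint_t lo, codepoint_t hi)
{ return lo <= u && u <= hi; }

#define IS_LANA(u) (in_range ((u), 0x1A20, 0x1AAF))

// Same shape as the patched right-matra macro, with other scripts omitted.
static position_t matra_pos_right (codepoint_t u, bool patched)
{ return (patched && IS_LANA (u)) ? POS_AFTER_MAIN : POS_AFTER_SUB; }

struct glyph_t { codepoint_t cp; position_t pos; };

static bool by_position (const glyph_t &a, const glyph_t &b)
{ return a.pos < b.pos; }

// Builds NA + SAKOT + LOW TA + AA, stable-sorts the cluster by position
// class, and returns the resulting code point order.
static std::vector<codepoint_t> reorder_na_cluster (bool patched)
{
  std::vector<glyph_t> s = {
    { 0x1A36, POS_BASE_C },                        // NA, the base
    { 0x1A60, POS_BELOW_C },                       // SAKOT
    { 0x1A34, POS_BELOW_C },                       // subjoined LOW TA
    { 0x1A63, matra_pos_right (0x1A63, patched) }  // vowel sign AA
  };
  std::stable_sort (s.begin (), s.end (), by_position);

  std::vector<codepoint_t> out;
  for (size_t i = 0; i < s.size (); i++)
    out.push_back (s[i].cp);
  return out;
}
```

In this toy model, reorder_na_cluster (true) puts AA directly after NA,
while reorder_na_cluster (false) leaves it after the subjoined
consonant. The real shaper has further steps, which is presumably why
the ligature still does not form with the patch alone.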


>
>>   How is this usually done in, say, Khmer "បុប្ផា"?)
>>
>> - Leading vowels are correctly reordered, but medial RA (U+1A55)
>>   is not (line 2-3).
>
> I'll fix that.  We have the same issue in Myanmar and Cham too.  What you need
> to do to get it reordered (after I fix the code) is to have the 'pref' feature
> apply to the medial RA.  It can be a ContextSubst that just applies and has no
> recursive lookups.  That's how HarfBuzz detects whether this character needs
> to be reordered to pre-base position.

OK, I'll just wait for it.
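
As a rough illustration of the detection idea described above (a
character is moved to pre-base position only if the font's 'pref'
feature would apply to it), here is a minimal toy model. Nothing in it
is HarfBuzz API: toy_font_t, pref_would_apply() and reorder_pref() are
hypothetical names, the font is reduced to the set of characters its
'pref' lookup matches, and the base is assumed to be the first element
of the cluster.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

typedef uint32_t codepoint_t;

// Hypothetical stand-in for a font: only the set of characters that its
// 'pref' lookup would match is modelled.
struct toy_font_t { std::set<codepoint_t> pref_coverage; };

static bool pref_would_apply (const toy_font_t &font, codepoint_t u)
{ return font.pref_coverage.count (u) != 0; }

// Moves the first post-base character matched by 'pref' to the front of
// the cluster, i.e. before the base (assumed to be at index 0).
static std::vector<codepoint_t>
reorder_pref (const toy_font_t &font, std::vector<codepoint_t> cluster)
{
  for (size_t i = 1; i < cluster.size (); i++)
    if (pref_would_apply (font, cluster[i]))
    {
      // Rotate [0, i] so that element i lands at index 0.
      std::rotate (cluster.begin (), cluster.begin () + i,
                   cluster.begin () + i + 1);
      break;
    }
  return cluster;
}
```

With a font whose 'pref' set contains medial RA (U+1A55), the cluster
{ U+1A20, U+1A55, U+1A63 } becomes { U+1A55, U+1A20, U+1A63 }; with an
empty 'pref' set the cluster is returned unchanged.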

>> - Final NGA (U+1A59) with virama following is not reordered after
>>   the next base consonant (at the end of line 4).
>
> Oh, that's new.  We need to figure out how to implement that.  That one will
> be tricky.

I think this is also required for Myanmar, as I got the encoding scheme
from it (ASAT - U+103A).

Regards,
-- 
Theppitak Karoonboonyanan
http://linux.thai.net/~thep/