[HarfBuzz] unbreaking mixed-up khmer fonts

Jonathan Kew jfkthame at googlemail.com
Tue Nov 20 01:26:33 PST 2012


On 19/11/12 22:53, Jonathan Kew wrote:
> On 19/11/12 21:35, Jonathan Kew wrote:
>
>> I've put a test page at http://people.mozilla.org/~jkew/kh/test.html
>> that renders the sequences "ក្រុ ខេ គៀ" with 100+ fonts from several
>> sources. They're mostly different versions of KhmerOS fonts, but there
>> are a few others as well.

I dumped a listing of the GSUB/GPOS features present in each of these
fonts, to see how we might distinguish the various kinds of
brokenness. See attached "kh-features".
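
For anyone who wants to slice the listing themselves, here is a minimal
sketch of a parser for one line of it. It assumes the layout seen in the
attachment ("tag tag ... : path"); the names FontFeatures and
parse_listing_line are made up for this example and are not part of
HarfBuzz or of the script that produced the listing.

```cpp
#include <set>
#include <sstream>
#include <string>

// One entry of the "kh-features" listing (hypothetical helper type).
struct FontFeatures {
    std::set<std::string> tags;  // feature tags; repeated tags collapse
    std::string path;            // font path as written in the listing
};

// Parse a line of the assumed form "tag tag ... : dir/font.ttf".
// A line with nothing before the colon yields an empty tag set.
FontFeatures parse_listing_line(const std::string& line) {
    FontFeatures out;
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos)
        return out;  // not a listing line

    // Everything before the colon is a whitespace-separated tag list.
    std::istringstream tag_stream(line.substr(0, colon));
    for (std::string tag; tag_stream >> tag; )
        out.tags.insert(tag);

    // Everything after the colon is the path; it may contain spaces,
    // so only the leading whitespace is skipped.
    const std::string::size_type start =
        line.find_first_not_of(" \t", colon + 1);
    if (start != std::string::npos)
        out.path = line.substr(start);
    return out;
}
```

With this, the fonts of interest are the entries whose tags contain
"liga", and the question is which other tags separate Hanuman from the
rest of them.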

On this basis, I suggest that we add an extra test to the Khmer case in 
hb_ot_shape_complex_categorize, and only switch to the generic shaper 
for 'khmr' script if there is a 'pres' feature in addition to 'liga'. 
This seems to characterise the problematic fonts that use 'liga' to fake 
the split vowels (and hence have problems in the Indic shaper), while 
still allowing Hanuman to work properly.
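
To make the proposed rule concrete, here is a stand-alone sketch of the
decision, written against a plain set of feature tags rather than the
real planner. The Shaper enum and choose_khmer_shaper are hypothetical
names for illustration only; found_script stands in for
planner->map.found_script[0].

```cpp
#include <set>
#include <string>

// Which shaper the rule picks (illustrative, not a HarfBuzz type).
enum class Shaper { Default, Indic };

// found_script: the font has a GSUB script entry for Khmer.
// features:     GSUB feature tags available for that script/language.
Shaper choose_khmer_shaper(bool found_script,
                           const std::set<std::string>& features) {
    const bool has_liga = features.count("liga") > 0;
    const bool has_pres = features.count("pres") > 0;

    // No Khmer tables at all, or 'liga' together with 'pres':
    // leave the font to the generic shaper.
    if (!found_script || (has_liga && has_pres))
        return Shaper::Default;

    // Everything else, including 'liga' without 'pres' (Hanuman),
    // goes through the Indic shaper.
    return Shaper::Indic;
}
```

Going by the feature listing, Hanuman (liga, pref, no pres) would get the
Indic shaper, the Kh-* and NiDA* fonts (liga and pres) would get the
generic one, and the KhmerOS fonts (no liga) would stay on the Indic
shaper as before.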

Attached "khmer-fix" has the suggested patch.

Admittedly, this is a very ad-hoc hack, but it works for the fonts
I've tested so far. The alternative, I think, would be to bring back
the hard-coded Ra support for fonts that lack 'pref', so that they can
still go through the Indic shaper (with 'liga' disabled), but that
looked like it would be ugly too. It would probably be closer to what
old engines did, though (I guess).

WDYT?

-------------- next part --------------
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_31/KhmerOS.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_31/KhmerOSfasthand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_31/KhmerOSfreehand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_31/KhmerOSmc.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_31/KhmerOSmuol.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_31/KhmerOSsys.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOS.ttf
abvf abvm abvs blwf blwm blws ccmp clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOSbattambang.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOSbokor.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOSfasthand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOSfreehand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOSmc.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOSmuol.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_4.0/KhmerOSsys.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS .ttf
abvf abvm abvs blwf blwm blws ccmp clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_battambang.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_bokor.ttf
abvf abvm abvs blwf blwm blws ccmp clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_content.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_fasthand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_freehand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_metalchrieng.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_muol.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_muollight.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_muolpali.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_siemreap.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : All_KhmerOS_5.0/KhmerOS_sys.ttf
abvs blwf clig liga pref pstf : Hanuman/hanuman.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Battambang.ttf
abvf blwf blws clig liga pres : KhUnicode210/Kh-Bokor.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Content.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Dangrek.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Fasthand.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Freehand.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Kangrey.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Koulen.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-KoulenL.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Metal-Chrieng.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Muol-Pali.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Muol.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-Siemreap.ttf
abvf blwf blws clig liga pres psts : KhUnicode210/Kh-System.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOS.ttf
abvf abvm abvs blwf blwm blws ccmp clig mkmk pref pres pstf psts : KhmerOS/KhmerOSbattambang.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSbokor.ttf
abvf abvm abvs blwf blwm blws ccmp clig mkmk pref pres pstf psts : KhmerOS/KhmerOScontent.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSfasthand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSfreehand.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSmetalchrieng.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSmuol.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSmuollight.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSmuolpali.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSsiemreap.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerOS/KhmerOSsys.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMCHANTHA.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMCHRIENG1.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMERKEP.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMERMEF1.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMERMEF2.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMKAMPONGTRACH.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMKAMPOT.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMKOLAB.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMMOOL.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMMOOL1.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMNETTRA.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMOLD.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMVANARA.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMVIRAVUTH.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/KHMWATPHNOM.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Battambang.ttf
abvf blwf blws clig liga pres : KhmerUnicodeFonts/Kh-Bokor.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Content.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Dangrek.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Fasthand.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Freehand.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Kangrey.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Koulen.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-KoulenL.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Metal-Chrieng.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Muol-Pali.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Muol.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-Siemreap.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/Kh-System.ttf
abvf abvs blwf blwm blws clig dist mkmk pref pres pstf psts : KhmerUnicodeFonts/KhUniF1.ttf
abvf abvs blwf blwm blws clig dist mkmk pref pres pstf psts : KhmerUnicodeFonts/KhUniL1.ttf
abvf abvs blwf blwm blws clig dist mkmk pref pres pstf psts : KhmerUnicodeFonts/KhUniN2.ttf
abvf abvs blwf blwm blws clig dist mkmk pref pres pstf psts : KhmerUnicodeFonts/KhUniR1.ttf
abvf abvs blwf blwm blws clig dist mkmk pref pres pstf psts : KhmerUnicodeFonts/KhUniR2.ttf
abvf abvs blwf blwm blws clig dist mkmk pref pres pstf psts : KhmerUnicodeFonts/KhUniSerif.ttf
blwf blws clig liga pres : KhmerUnicodeFonts/KhmerMuol.ttf
abvf abvm abvs blwf blwm blws clig mkmk pref pres pstf psts : KhmerUnicodeFonts/KunKhmer.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/NiDAAngkor.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/NiDABayon.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/NiDAChenla.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/NiDAFunan.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/NiDAKhmerEmpire.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/NiDASowanaphum.ttf
abvf blwf blws clig liga pres psts : KhmerUnicodeFonts/NiDATaprom.ttf
: KhmerUnicodeFonts/Sankor.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts rlig : KhmerUnicodeFonts/kMon40hV2E3s.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts rlig zz01 zz01 zz02 zz02 zz03 zz03 zz04 zz04 zz05 zz05 zz06 zz06 zz07 zz07 zz08 zz08 zz09 zz09 zz10 zz10 zz11 zz11 zz12 zz12 zz13 zz13 zz14 zz14 zz15 zz15 zz16 zz16 zz17 zz17 zz18 zz18 zz19 zz19 zz20 zz20 zz21 zz21 zz22 zz22 zz23 zz23 zz24 zz24 zz25 zz25 zz26 zz26 zz27 zz27 zz28 zz28 zz29 zz29 zz30 zz30 zz31 zz31 zz32 zz32 zz33 zz33 zz34 zz34 zz35 zz35 zz36 zz36 zz37 zz37 zz38 zz38 zz39 zz40 zz41 zz42 zz43 zz44 zz45 zz46 zz47 zz48 zz49 zz50 zz51 zz52 zz53 : KhmerUnicodeFonts/kMons40V2E3.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts rlig : KhmerUnicodeFonts/kMons40dicV2E3s.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts rlig : KhmerUnicodeFonts/kMons40spcV2E3s.ttf
abvf abvm abvs blwf blwm blws ccmp clig dist pref pres pstf psts rlig : KhmerUnicodeFonts/kMons60bV2E3s.ttf
abvf abvm abvs blwf blwm blws blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/kMons80V1E2s.ttf
abvf abvm abvs blwf blwm blws blws ccmp clig dist pref pres pstf psts : KhmerUnicodeFonts/kMons99V1E2s.ttf
abvf abvm abvs blwf blwm blws blws ccmp clig pref pres pstf psts : KhmerUnicodeFonts/mond40uhgs.ttf
-------------- next part --------------
diff --git a/gfx/harfbuzz/src/hb-ot-shape-complex-private.hh b/gfx/harfbuzz/src/hb-ot-shape-complex-private.hh
--- a/gfx/harfbuzz/src/hb-ot-shape-complex-private.hh
+++ b/gfx/harfbuzz/src/hb-ot-shape-complex-private.hh
@@ -278,23 +278,31 @@ hb_ot_shape_complex_categorize (const hb
     case HB_SCRIPT_TAKRI:
 
       /* Only use Indic shaper if the font has Indic tables. */
       if (planner->map.found_script[0])
 	return &_hb_ot_complex_shaper_indic;
       else
 	return &_hb_ot_complex_shaper_default;
 
+
     case HB_SCRIPT_KHMER:
-      /* If the font has 'liga', let the generic shaper do it. */
+      /* If the font has both 'liga' and 'pres', let the generic shaper do it
+       * as it's probably a font that tries to do "khmer shaping" with the
+       * generic 'liga' lookups, and explicitly applying the khmer rules via
+       * the indic shaper will duplicate the pre-base part of split vowels. */
       if (!planner->map.found_script[0] ||
-	  hb_ot_layout_language_find_feature (planner->face, HB_OT_TAG_GSUB,
-					      planner->map.script_index[0],
-					      planner->map.language_index[0],
-					      HB_TAG ('l','i','g','a'), NULL))
+	  (hb_ot_layout_language_find_feature (planner->face, HB_OT_TAG_GSUB,
+					       planner->map.script_index[0],
+					       planner->map.language_index[0],
+					       HB_TAG ('l','i','g','a'), NULL) &&
+	   hb_ot_layout_language_find_feature (planner->face, HB_OT_TAG_GSUB,
+					       planner->map.script_index[0],
+					       planner->map.language_index[0],
+					       HB_TAG ('p','r','e','s'), NULL)))
 	return &_hb_ot_complex_shaper_default;
       else
 	return &_hb_ot_complex_shaper_indic;
 
 
     case HB_SCRIPT_MYANMAR:
       /* For Myanmar, we only want to use the Indic shaper if the "new" script
        * tag is found.  For "old" script tag we want to use the default shaper. */

