[HarfBuzz] Harfbuzz-ng and Myanmar

Keith Stribley devel at thanlwinsoft.org
Mon Oct 26 01:11:13 PDT 2009


Hi Behdad,

I've been trying out harfbuzz-ng with pango 1.26 and come across a few 
issues with Myanmar Unicode fonts.

a) Lookup Flags

The Padauk font (http://scripts.sil.org/Padauk) uses mark and mkmk 
positioning, but this seems to have been disabled. I tracked this 
down to what appears to be an over-restrictive use of the lookup flags 
and glyph properties. My current understanding is that lookup_flags can 
have a mark attachment type set within the 0xFF00 mask, in which case 
the mark attachment type bits should match the glyph property bits.

In my test case of ကု the glyph has a property of 0x208 and the lookup 
flags on the mark lookup that should be applied are 0x200. I was able to 
get mark positioning working again with the following patch.

diff --git a/pango/opentype/hb-ot-layout.cc b/pango/opentype/hb-ot-layout.cc
index 3b6b8da..2c33cab 100644
--- a/pango/opentype/hb-ot-layout.cc
+++ b/pango/opentype/hb-ot-layout.cc
@@ -148,7 +148,7 @@ _hb_ot_layout_check_glyph_property (hb_face_t    *face,
   /* Not covered, if, for example, glyph class is ligature and
    * lookup_flags includes LookupFlags::IgnoreLigatures
    */
-  if (property & lookup_flags)
+  if (0x00FFu & property & lookup_flags)
     return false;
 
   if (property & HB_OT_LAYOUT_GLYPH_CLASS_MARK)

I'm not sure if the mask of 0xFF is correct, or whether it needs to be 
tightened further to just the ignore flags.
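
For illustration, a minimal sketch of the tighter variant, masking with 
only the three Ignore* bits (0x000E) instead of 0xFF. The sketch_* names 
are hypothetical and not taken from the harfbuzz-ng source. It assumes 
the glyph class bits in the property (base 0x0002, ligature 0x0004, mark 
0x0008) line up with the corresponding Ignore* bits of the lookup flags, 
as the 0x208 test value suggests.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; the values follow the OpenType
 * LookupFlag bit layout. */
#define SKETCH_IGNORE_FLAGS     0x000Eu /* IgnoreBaseGlyphs|IgnoreLigatures|IgnoreMarks */
#define SKETCH_MARK_ATTACH_TYPE 0xFF00u /* mark attachment class filter */
#define SKETCH_GLYPH_CLASS_MARK 0x0008u /* assumed glyph class bit for marks */

/* Returns true if the lookup should consider the glyph, false if the
 * lookup skips it. */
bool
sketch_check_glyph_property (unsigned int property, unsigned int lookup_flags)
{
  /* Skip glyphs whose class bit matches one of the Ignore* flags. */
  if (property & lookup_flags & SKETCH_IGNORE_FLAGS)
    return false;

  /* For marks, a non-zero attachment type in the lookup flags acts as a
   * filter: only marks of that attachment class are considered. */
  if ((property & SKETCH_GLYPH_CLASS_MARK) &&
      (lookup_flags & SKETCH_MARK_ATTACH_TYPE))
    return (lookup_flags & SKETCH_MARK_ATTACH_TYPE) ==
           (property & SKETCH_MARK_ATTACH_TYPE);

  return true;
}
```

With the values from the test case, property 0x208 and lookup flags 
0x200, the first test no longer rejects the glyph and the attachment 
types match, so the mark lookup is applied.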

http://www.thanlwinsoft.org/~keith/tmp/Padauk-noMark.svg
shows the current harfbuzz-ng behaviour. Blue is ICU rendering, red is 
Harfbuzz-ng.

http://www.thanlwinsoft.org/~keith/tmp/Padauk-maskingLookupFlags.svg
shows the correct rendering after the above patch has been applied.

b) Order of applying features

The Myanmar3 font uses a mixture of clig and liga. In Uniscribe and ICU 
these lookups are applied in lookup order regardless of which particular 
feature is being used. In pango all the liga lookups are applied before 
the clig lookups in the basic shaper, which results in incorrect 
rendering since some of the clig lookups are expecting the liga lookups 
to have been applied in order. My understanding is that kern, mark and 
mkmk should also be applied together, in lookup order.
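
A rough sketch of the ordering described above: gather the lookup 
indices of every feature that is applied together, then sort them so 
that they run in lookup order regardless of the feature they came from. 
The sketch_feature_t type and the function names are hypothetical 
stand-ins, not the pango or harfbuzz-ng API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one feature's list of lookup indices. */
typedef struct {
  const unsigned short *lookups;
  size_t                count;
} sketch_feature_t;

static int
sketch_compare_index (const void *a, const void *b)
{
  unsigned short x = *(const unsigned short *) a;
  unsigned short y = *(const unsigned short *) b;
  return (x > y) - (x < y);
}

/* Merges the lookups of all features into out[], sorted ascending with
 * duplicates removed. out[] must have room for the sum of all counts.
 * Returns the number of lookup indices written. */
size_t
sketch_collect_lookups (const sketch_feature_t *features, size_t n_features,
                        unsigned short *out)
{
  size_t n = 0, unique = 0, i, j;

  for (i = 0; i < n_features; i++)
    for (j = 0; j < features[i].count; j++)
      out[n++] = features[i].lookups[j];

  qsort (out, n, sizeof out[0], sketch_compare_index);

  /* A lookup shared by two features is applied only once. */
  for (i = 0; i < n; i++)
    if (unique == 0 || out[unique - 1] != out[i])
      out[unique++] = out[i];

  return unique;
}
```

For example, if liga refers to lookups 0 and 3 and clig to lookups 1, 2 
and 3, the merged order is 0, 1, 2, 3, rather than 0, 3 followed by 
1, 2, 3.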

The difference in rendering is shown between
http://www.thanlwinsoft.org/~keith/tmp/Myanmar3-PangoFeatureOrder.svg 
(wrong)
http://www.thanlwinsoft.org/~keith/tmp/Myanmar3-LookupOrder.svg (correct)

I'm not sure if the relative order of applying lookups in different 
features is specified in the standard, but I would have thought it would 
be better to be consistent with Uniscribe and ICU's implementations. The 
Padauk font gets around this by just using the clig feature for the GSUB 
table lookups.

c) A Myanmar shaper

Myanmar3 and Padauk implement reordering using successive clig lookups, 
which is hard to implement, but does allow correct rendering. A while 
ago in the old harfbuzz a Myanmar shaper was written for one 
interpretation of Unicode 4, but this does not work with Unicode 5.1 
implementations and is the reason why Myanmar Unicode 5.1 and 5.2 text 
cannot be displayed correctly in Qt applications. Fortunately, Pango 
doesn't use that shaper.

Has the process of porting the old harfbuzz shapers to harfbuzz-ng been 
started, perhaps from Jonathan Kew? If so, where can I view it? Am I 
right to assume that this will eventually replace the code in pango/modules?

I would be interested in writing a new Myanmar shaper to perform the 
reordering required for Unicode 5.2, which would make it faster and 
easier to write Myanmar Unicode 5.1/5.2 compliant fonts. I will make 
sure that it doesn't break existing fonts such as Padauk and 
Myanmar3, which don't require it.

d) Harfbuzz-ng interface

The SVGs above were generated with a utility that I use to compare 
rendering between ICU, Harfbuzz and Graphite 
(http://scripts.sil.org/svn-public/graphite/grutils/trunk). I realise 
that hb_shape is currently just a stub, but I was able to implement the 
OT rendering by looking at basic_engine_shape, 
_pango_ot_info_substitute, _pango_ot_info_position and apply_gpos_ltr. 
The use of the positions array from hb_buffer_get_glyph_positions was 
probably the least intuitive part; the other aspects were fine.

In the process of this I started to write some implementations of the 
hb_font_funcs. Should the hb_font_get_contour_point_func_t have an extra 
argument for the point index? Currently it is defined as:

typedef hb_bool_t (*hb_font_get_contour_point_func_t) (hb_font_t *font,
                               hb_face_t *face, const void *user_data,
                               hb_codepoint_t glyph,
                               hb_position_t *x, hb_position_t *y);
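
To make the question concrete, here is a sketch of what a callback with 
a point index could look like. GPOS anchors can refer to a contour point 
by index, so the callback needs to be told which point is wanted. All 
sketch_* types and names are hypothetical stand-ins so that the example 
compiles on its own; the font and face arguments are left out, and this 
is not the actual harfbuzz-ng interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for hb_position_t and hb_codepoint_t. */
typedef int          sketch_position_t;
typedef unsigned int sketch_codepoint_t;

typedef struct { sketch_position_t x, y; } sketch_point_t;

typedef struct {
  const sketch_point_t *points;
  unsigned int          n_points;
} sketch_outline_t;

/* Toy font data passed as user_data: one outline per glyph id. */
typedef struct {
  const sketch_outline_t *outlines;
  unsigned int            n_glyphs;
} sketch_font_data_t;

/* Assumed callback shape: the extra point_index argument selects the
 * contour point whose coordinates are returned in *x and *y. */
typedef bool (*sketch_get_contour_point_func_t) (const void *user_data,
                               unsigned int point_index,
                               sketch_codepoint_t glyph,
                               sketch_position_t *x, sketch_position_t *y);

/* Returns false if the glyph or the point index is out of range. */
bool
sketch_get_contour_point (const void *user_data, unsigned int point_index,
                          sketch_codepoint_t glyph,
                          sketch_position_t *x, sketch_position_t *y)
{
  const sketch_font_data_t *data = (const sketch_font_data_t *) user_data;
  const sketch_outline_t *outline;

  if (glyph >= data->n_glyphs)
    return false;
  outline = &data->outlines[glyph];
  if (point_index >= outline->n_points)
    return false;

  *x = outline->points[point_index].x;
  *y = outline->points[point_index].y;
  return true;
}
```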

What will be the contents of the hb_glyph_metrics_t? I made a guess at:

struct _hb_glyph_metrics_t
{
    hb_position_t x;
    hb_position_t y;
    hb_position_t x_offset;
    hb_position_t y_offset;
    hb_position_t width;
    hb_position_t height;
};

cheers,
Keith