[HarfBuzz] Questions about Itemization in QT / Pango

Ed Trager ed.trager at gmail.com
Wed Jan 2 12:33:47 PST 2008


Hi, Behdad,

Thanks for the information. I'm glad to hear that N'Ko is now supported.

BTW, it looks like it would be very easy to also support New Tai Lue.
The SIL Dai Banna font page describes just four vowel signs which need
to be re-ordered to visually precede base consonants:

U+19B5 		NEW TAI LUE VOWEL SIGN E
U+19B6 		NEW TAI LUE VOWEL SIGN AE
U+19B7 		NEW TAI LUE VOWEL SIGN O
U+19BA 		NEW TAI LUE VOWEL SIGN AY

See the following URL:

       http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=DaiBannaSIL

The SIL Dai Banna font is an SIL OFL'ed font, so it would also be the
logical choice for testing.
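
To make the idea concrete, here is a rough Python sketch of that
re-ordering. It is only an illustration, not how HarfBuzz or Pango
would do it: the function names are made up, the consonant range
U+1980..U+19A9 is my own assumption from the code chart, and it assumes
the text is stored in logical order (consonant first, vowel sign
second).

```python
# The four vowel signs listed on the SIL Dai Banna font page.
PREBASE_VOWELS = {"\u19B5", "\u19B6", "\u19B7", "\u19BA"}


def is_consonant(ch):
    # Assumption: New Tai Lue consonant letters sit in U+1980..U+19A9.
    return "\u1980" <= ch <= "\u19A9"


def reorder_prebase(text):
    """Swap each consonant + pre-base vowel pair into visual order."""
    out = list(text)
    for i in range(1, len(out)):
        if out[i] in PREBASE_VOWELS and is_consonant(out[i - 1]):
            # Move the vowel sign in front of its base consonant.
            out[i - 1], out[i] = out[i], out[i - 1]
    return "".join(out)
```

For example, reorder_prebase("\u1982\u19B5") gives "\u19B5\u1982",
while a consonant followed by any other vowel sign is left alone.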

Best Wishes -- Ed Trager

On Jan 1, 2008 7:29 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
> On Sun, 2007-12-30 at 14:23 -0500, Ed Trager wrote:
> > Hi, Behdad, Simon, and everyone,
>
> Hello Ed,
>
> > I have been wondering recently a little bit about how QT and Pango
> > handle itemization:
> >
> >   (1) Do QT and Pango fully support itemization of all scripts now
> > present in Unicode 5 ?
>
> Yes, Pango 1.18 supports Unicode 5.0.  1.20 will support Unicode 5.1.
>
>
> > In other words, even if HarfBuzz does not yet handle OpenType
> > layout of the N'Ko or New Tai Lue scripts, would the itemizers in
> > QT and Pango correctly identify segments of text in N'Ko and New
> > Tai Lue (and other recent Unicode script additions) as belonging to
> > those respective scripts?
>
> Pango 1.18 in fact does support N'Ko.  See:
>
>   http://mces.blogspot.com/2007/05/pango-opentype-update.html
>
>
> >   (2) What about Plane 2 CJK?  If I created a text containing BMP CJK
> > with a smattering of Plane 2 CJK thrown in, how will QT and Pango
> > itemize or segment that text ?
> >
> >   (3) What about itemization of other Plane 1 scripts in Unicode, like
> > Linear B, etc.?
>
> Pango (and I believe Qt too) uses the Unicode Character Database.  So, all
> the characters marked as Script Han will be grouped together.
>
>
> >   (4) How do QT and Pango handle IPA phonetic characters?  Officially,
> > one could consider IPA and other phonetic extensions in Unicode as
> > belonging to "Latin" (latn) script.    Some might say that is a bit of
> > a stretch, because some IPA symbols might actually be closer to Greek
> > in origin, but certainly Michael Everson, inter alia, will give IPA a
> > "Latin" appellation.  But when actually laying out text, a user might
> > need or desire to use a special font (such as SIL Gentium, for
> > example) for laying out segments of IPA phonetics.  For example,
> > suppose I am writing a dictionary and my words and definitions are in
> > one font, while I might desire that my phonetic pronunciations are in
> > a different font tailored for such things.  Of course my word
> > processor or page layout program will permit me to manually select
> > which fonts to use for which parts of my document, and that is fine.
> > I am just wondering if QT or Pango have any special code to handle
> > such things in a more automated fashion, or on a level closer to
> > fontconfig's font matching attempts?
>
> IPA Extensions are marked as script Latin in Unicode.
>
>
> >   (5) A similar question for mathematical, scientific, and other
> > miscellaneous symbols.  Unicode now contains a number of blocks which
> > make up a rather extensive set of mathematical, scientific, and
> > miscellaneous technical symbols.  Fonts such as the STIX font set are
> > now available to specifically address the needs of scientific,
> > mathematical, and other technical users.  So, once again I am just
> > wondering how QT and Pango handle itemization/segmentation of runs of
> > text containing such symbols?  Are such symbols just treated as being
> > neutral?  I'm just wondering if one can make an argument for defining
> > a separate script category for "symbols" and then having a text
> > itemizer automatically break out segments of text containing such
> > symbols as separate items which can then be rendered using a font or
> > set of fonts that are tailored for such things.  One can imagine
> > having a category for "symbol fonts" as part of the fontconfig
> > pipeline, so that fontconfig could provide automatic substitution for
> > such text segments.  Does that make sense?
>
> No.  You don't want the period, question mark, brackets, quotations, etc.
> in your Latin text to be rendered using a separate font.  This all really
> belongs at a higher level, which should mark text appropriately with the
> desired font.
>
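A rough Python sketch of the behaviour described in the answers above:
characters are grouped by script, and script-neutral characters
(punctuation, spaces) simply join the run they sit in instead of
forming their own item. This is not Pango's or Qt's actual code. The
script_of() table is a tiny hypothetical stand-in for the Unicode
Character Database, and itemize() is a made-up name.

```python
def script_of(ch):
    # Hypothetical, heavily abbreviated stand-in for the UCD script data.
    cp = ord(ch)
    if 0x41 <= cp <= 0x5A or 0x61 <= cp <= 0x7A:
        return "Latin"   # Basic Latin letters
    if 0x250 <= cp <= 0x2AF:
        return "Latin"   # IPA Extensions are script Latin
    if 0x4E00 <= cp <= 0x9FFF or 0x20000 <= cp <= 0x2A6DF:
        return "Han"     # BMP and supplementary-plane ideographs
    return "Common"      # everything else is treated as neutral here


def itemize(text):
    """Split text into (script, substring) runs; neutrals join a neighbour."""
    runs = []  # each entry is [script, substring]
    for ch in text:
        s = script_of(ch)
        if runs and (s == "Common" or runs[-1][0] in ("Common", s)):
            if runs[-1][0] == "Common":
                # A run that began with neutrals adopts the first real script.
                runs[-1][0] = s
            runs[-1][1] += ch
        else:
            runs.append([s, ch])
    return [(s, t) for s, t in runs]
```

With this sketch, itemize("Hi, \u4e2d\U00020000!") yields one Latin run
"Hi, " and one Han run holding both ideographs plus the trailing "!".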
>
> > Since I don't know how QT and Pango currently do these things, I
> > thought I would ask.
> >
> > Best Wishes for a Happy and Prosperous New Year to all! -- Ed Trager
>
> Happy New Year to all,
>
> --
> behdad
> http://behdad.org/
>
> "Those who would give up Essential Liberty to purchase a little
>  Temporary Safety, deserve neither Liberty nor Safety."
>         -- Benjamin Franklin, 1759
>
>


