[Fontconfig] Font matching in Unicode locales
Keith Packard
keithp at keithp.com
Sun Oct 26 15:37:38 EST 2003
Around 23 o'clock on Oct 25, Ambrose Li wrote:
> If I could ask the question again, would this mean that
> "Big5" would be considered "supported" only when all the
> 13051 characters are supported?
That's a trick question -- 'Big5' is not a language, but a text encoding.
Fontconfig doesn't concern itself with supporting encodings, only with
supporting 'orthographies' for individual languages: the Unicode values
necessary to represent the bulk of words in the standard character set for
the language as used in a particular territory.
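As a rough illustration of what "supporting an orthography" means, here is
a minimal Python sketch, not fontconfig's actual code. The function name,
the max_missing tolerance and the tiny sample character sets are all
hypothetical, chosen only to show the idea of comparing a font's coverage
against a per-language set of Unicode values.

```python
def supports_language(font_chars, orthography, max_missing=0):
    """Return True if the font covers the orthography.

    font_chars   -- set of Unicode code points the font has glyphs for
    orthography  -- set of code points required for the language
    max_missing  -- hypothetical tolerance for absent characters
    """
    missing = orthography - font_chars
    return len(missing) <= max_missing


# Hypothetical, heavily abbreviated sample data for illustration only.
basic_latin = set(range(ord("a"), ord("z") + 1)) | set(range(ord("A"), ord("Z") + 1))
german_extra = {ord(c) for c in "äöüÄÖÜß"}

orthography_de = basic_latin | german_extra
ascii_only_font = set(range(0x20, 0x7F))
latin1_font = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))

print(supports_language(ascii_only_font, orthography_de))
print(supports_language(latin1_font, orthography_de))
```

Note that the question asked of a font is "does it cover this language's
characters", never "does it cover this encoding".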
I gratefully accept authoritative changes to the orthographies that
fontconfig uses in computing language support for each font; the ones that
I have were gathered from a wide variety of sources. For European
scripts, I was able to rely on the fine work of Michael Everson
(http://www.evertype.com). For other Latin and Cyrillic scripts, I found
the Institute of the Estonian Language (http://www.eki.ee) very useful.
For other languages, I scrounged around the net. I recall spending a day
or so looking for an authoritative reference for the orthography of
Luxembourgish, which is related to German but had no official written
representation until sometime after WWII.
The Unicode standard provided quite a bit of help with languages using
unique scripts, although the coverage for those languages is often far
more comprehensive than what is used with any kind of regularity (or
provided in fonts, for some). Again, local expertise is the best source
of information; unfortunately, fluency in a language does not equate to
expertise in the character set.
For the Han languages, I relied heavily on the tables which transcode
between Unicode and standard local encodings. I know those are probably
way too inclusive, but I don't have a better source at the current time,
and as the goal is to identify fonts supporting a particular language, it
turns out to work relatively well -- most fonts for zh-tw started as Big5
fonts and generally cover all of the Han glyphs in that encoding quite
well. I'd really like to get better references for these orthographies;
perhaps someone on this list can point me at national standards for
each language that list exactly the characters considered "required" for
representing each one.
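To make the "transcoding table" approach concrete, here is a small Python
sketch that derives a candidate set of Han code points from an encoding.
It is not how the fontconfig orthography files were actually generated; it
assumes Python's built-in "big5" codec as the transcoding table, and the
function name and byte ranges are illustrative choices.

```python
def codepoints_from_encoding(codec="big5"):
    """Collect every code point reachable from a two-byte sequence."""
    points = set()
    # Assumed Big5 layout: lead byte 0x81-0xFE, trail byte 0x40-0x7E
    # or 0xA1-0xFE. Sequences the codec rejects are simply skipped.
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))
    for lead in range(0x81, 0xFF):
        for trail in trail_bytes:
            try:
                text = bytes([lead, trail]).decode(codec)
            except UnicodeDecodeError:
                continue
            if len(text) == 1:
                points.add(ord(text))
    return points


big5_points = codepoints_from_encoding("big5")

# Keep only the CJK Unified Ideographs block, dropping punctuation,
# symbols and other non-Han characters the encoding also carries.
big5_han = {cp for cp in big5_points if 0x4E00 <= cp <= 0x9FFF}

print(len(big5_han), "Han code points reachable via the big5 codec")
```

A set built this way is exactly as inclusive as the encoding itself, which
is the weakness described above: it says what the encoding can represent,
not what the language requires.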
-keith