[Fontconfig] Font matching in Unicode locales

Keith Packard keithp at keithp.com
Sun Oct 26 15:37:38 EST 2003


Around 23 o'clock on Oct 25, Ambrose Li wrote:

> If I could ask the question again, would this mean that
> "Big5" would be considered "supported" only when all the
> 13051 characters are supported?

That's a trick question -- 'Big5' is not a language, but a text encoding.

Fontconfig doesn't concern itself with supporting encodings, only with
supporting 'orthographies' for individual languages: those Unicode values
necessary to represent the bulk of words in the standard character set for
the language as used in a particular territory.
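
To make the distinction concrete, here is a minimal sketch in Python of
what "supporting an orthography" could look like as a set comparison.  It
is not fontconfig's actual code: the toy orthographies, the function names
and the max_missing tolerance are all assumptions made for illustration.

```python
# Hypothetical sketch: an orthography is modelled as a set of Unicode
# code points, and a font as the set of code points it has glyphs for.

def code_points(text):
    """Return the set of Unicode code points used in text."""
    return {ord(ch) for ch in text}

LATIN = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Toy orthographies, far smaller than real ones; illustrative only.
ORTHOGRAPHIES = {
    "en": code_points(LATIN),
    "de": code_points(LATIN + "äöüÄÖÜß"),
}

def supported_languages(font_coverage, orthographies=ORTHOGRAPHIES, max_missing=0):
    """List languages whose orthography the font covers.

    max_missing is an assumed tolerance (0 means strict coverage); it is
    not a claim about how fontconfig treats missing characters.
    """
    return sorted(
        lang
        for lang, required in orthographies.items()
        if len(required - font_coverage) <= max_missing
    )
```

With these toy tables, a font covering only the basic Latin letters
reports "en" alone, because seven characters required for "de" are missing.
Note that no encoding appears anywhere in the comparison.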

I gratefully accept authoritative changes to the orthographies that
fontconfig uses in computing language support for each font; the ones that
I have were gathered from a wide variety of sources.  For European
scripts, I was able to rely on the fine work of Michael Everson
(http://www.evertype.com).  For other Latin and Cyrillic scripts, I found
the Institute of the Estonian Language (http://www.eki.ee) very useful.
For other languages, I scrounged around the net.  I recall spending a day
or so looking for an authoritative reference for the orthography of
Luxembourgish, which is related to German but had no official written
representation until sometime after WWII.

The Unicode standard provided quite a bit of help with languages using
unique scripts, although the coverage for those languages is often far
more comprehensive than what is used with any kind of regularity (or
provided in fonts, for some).  Again, local expertise is the best source
of information; unfortunately, fluency in a language does not equate to
expertise in its character set.

For the Han languages, I relied heavily on the tables which transcode 
between Unicode and standard local encodings.  I know those are probably 
way too inclusive, but I don't have a better source at the current time, 
and as the goal is to identify fonts supporting a particular language, it 
turns out to work relatively well -- most fonts for zh-tw started as Big5 
fonts and generally cover all of the Han glyphs in that encoding quite 
well.  I'd really like to get better references for these orthographies;
perhaps someone on this list can point me at national standards for
each language that list exactly the characters considered "required" for
representing it.
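
As a rough illustration of deriving a Han character set from a transcoding
table, the sketch below walks the two-byte Big5 range and keeps whatever
Python's built-in "big5" codec can decode.  The byte ranges, the function
names and the restriction to the CJK Unified Ideographs block are
assumptions of this sketch, not a description of how the fontconfig tables
were actually generated.

```python
# Hypothetical sketch: build a set of Unicode code points from an
# encoding's transcoding table by decoding every candidate byte pair.

def charset_from_big5(codec="big5"):
    """Collect code points reachable from two-byte Big5 sequences."""
    found = set()
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))
    for lead in range(0xA1, 0xFA):
        for trail in trail_bytes:
            try:
                text = bytes([lead, trail]).decode(codec)
            except UnicodeDecodeError:
                continue  # byte pair is unassigned in this table
            if len(text) == 1:
                found.add(ord(text))
    return found

def han_only(charset):
    """Keep only code points in the CJK Unified Ideographs block."""
    return {cp for cp in charset if 0x4E00 <= cp <= 0x9FFF}

if __name__ == "__main__":
    han = han_only(charset_from_big5())
    print(len(han), "Han code points found via the big5 codec")
```

The resulting set is exactly as inclusive as the encoding, which is the
weakness described above: it records what the encoding can represent, not
what the language requires.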

-keith





