[Fontconfig] Improving Latin font selection for CJK locales
Gerrit Sangel
z0idberg at gmx.de
Mon Jan 28 14:22:10 PST 2008
On Monday, 28 January 2008, Ed Trager wrote:
> Currently, locales are composed of just two elements:
>
> (1) A "language" code (ISO-639-1, -2 : "en", "ja", "zh", "th", etc.)
> and (2) A "region" code ("US", "CA", "FR", "TW", "HK", "SG", etc. )
>
> This concept is incomplete. A THIRD ELEMENT, SCRIPT, NEEDS TO BE ADDED.
> Using four-letter ISO-15924 (
> http://unicode.org/iso15924/iso15924-codes.html ) codes is the obvious
> answer:
>
> (3) "Script" code (ISO-15924 : "arab", "cyrl", "hans"
> (simplified Chinese), "hant" (traditional Chinese), etc.)
>
> Both "region" and "script" can be considered as "optional". So we
> could now enumerate locales such as:
I think I suggested this some months ago, and I still strongly support it.
It would also be necessary for a German locale written in Fraktur (for which
I am currently gathering information).
>
> => "Fully Specified" locales with all three elements:
>
> az_AZ_latn
> az_AZ_cyrl
> az_IR_arab
>
> zh_HK_hans
> zh_HK_hant
>
> => Locales missing "region" would also be permissible (and I think
> this variant would be extremely useful and I think translators would
> perhaps favor the generality that this option provides in many
> real-life applications):
I also strongly support this, for example for de_Latf.
But I would urge that the script code be written with its first letter
capitalized, so that it can be clearly distinguished from the language and
region codes.
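To make the capitalization idea concrete, here is a rough Python sketch of how such a tag could be split. The names LocaleTag, parse_locale and format_locale are made up for illustration and do not belong to Fontconfig, Pango or any existing library. It assumes that the subtags after the language are told apart by length (2 letters = region, 4 letters = script), and it normalizes the case on output so that the script code always starts with a capital letter.

```python
from typing import NamedTuple, Optional


class LocaleTag(NamedTuple):
    """Hypothetical container for the three proposed locale elements."""
    language: str
    region: Optional[str]
    script: Optional[str]


def _is_ascii_alpha(text: str) -> bool:
    return text.isascii() and text.isalpha()


def parse_locale(tag: str) -> LocaleTag:
    """Split a tag such as 'az_AZ_latn' or 'de_Latf' into its parts.

    Assumption: a 2-letter subtag is a region and a 4-letter subtag is a
    script, so the two may appear in either order after the language.
    """
    parts = tag.replace("-", "_").split("_")
    language = parts[0].lower()
    if not (_is_ascii_alpha(language) and 2 <= len(language) <= 3):
        raise ValueError(f"bad language code in {tag!r}")
    region = None
    script = None
    for part in parts[1:]:
        if len(part) == 2 and _is_ascii_alpha(part) and region is None:
            region = part.upper()          # region: all upper case
        elif len(part) == 4 and _is_ascii_alpha(part) and script is None:
            script = part.capitalize()     # script: first letter capitalized
        else:
            raise ValueError(f"unexpected subtag {part!r} in {tag!r}")
    return LocaleTag(language, region, script)


def format_locale(loc: LocaleTag) -> str:
    """Join the parts that are present, in language_REGION_Script order."""
    return "_".join(p for p in (loc.language, loc.region, loc.script) if p)
```

With this, parse_locale("az_AZ_latn") and parse_locale("az_Latn_AZ") give the same result, and format_locale() writes both back as "az_AZ_Latn", so the case alone already tells a reader which element is which.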
>
> az_latn
> az_arab
> az_cyrl
>
> zh_hans
> zh_hant
>
> => Locales missing "script" are of course also permissible (this is the
> current "status quo"). Systems would have to have rules for the
> "default" script:
>
> az_AZ : defaults to "latn" (Latin became official in
> Azerbaijan in 1991 although uptake has been apparently slow)
> az_IR : defaults to "arab"
>
> zh_HK : defaults to "hant"
> zh_SG : defaults to "hans"
>
> => Locales missing both "region" and "script" are also permissible
> (again this does not differ from current "status quo"):
>
> ja : implies (defaults to) "ja_JP_jpan"
> th : implies (defaults to) "th_TH_thai"
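The default rules quoted above could be kept in two small lookup tables. The following Python sketch only contains the examples from this message. The names DEFAULT_REGION, DEFAULT_SCRIPT and expand_locale are invented for illustration, and a real system would presumably take such data from somewhere like CLDR instead of hard-coding it.

```python
# Defaults copied from the examples quoted above; a real table would be
# much larger. All names here are hypothetical.
DEFAULT_REGION = {
    "ja": "JP",
    "th": "TH",
}

DEFAULT_SCRIPT = {
    ("az", "AZ"): "latn",
    ("az", "IR"): "arab",
    ("zh", "HK"): "hant",
    ("zh", "SG"): "hans",
    ("ja", "JP"): "jpan",
    ("th", "TH"): "thai",
}


def expand_locale(language, region=None, script=None):
    """Fill in a missing region and script from the default tables.

    Returns a (language, region, script) tuple. An element stays None
    when it was not given and no default is known for it.
    """
    if region is None:
        region = DEFAULT_REGION.get(language)
    if script is None:
        script = DEFAULT_SCRIPT.get((language, region))
    return (language, region, script)
```

For example, expand_locale("ja") yields ("ja", "JP", "jpan"), while expand_locale("az", "IR") yields ("az", "IR", "arab"). An explicitly given script is never overridden, so expand_locale("zh", "HK", "hans") stays as it is.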
>
> The CLDR community is one obvious place for discussions about this,
> and I apologize that I have not had the time to investigate how far
> discussions on this topic have gotten in CLDR or other relevant
> communities (like maybe Linux LSB folks?).
>
> Adding a four-letter script code to Locale is the obvious remedy.
> Perhaps the Pango and Fontconfig communities could take the lead in
> creating the minor changes in infrastructure needed to support this
> addition ?
Another question, although I do not know which applications this would
concern: for German Fraktur, the application would sometimes have to switch
fonts within a message string, for some foreign words or upper-case
abbreviations (maybe this is similar to the CJK Latin font problem). So the
translation files would somehow need a way to change the script and (maybe)
the language on the fly, similar to HTML (with <span xml:lang="de-Latf">).
The problem with Fraktur is that it is unified with ordinary Latin, so the
two can only be distinguished via an optional parameter that states which
script is to be used.
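As a rough illustration of the on-the-fly switching, here is a Python sketch that cuts a translated message into runs, each tagged with the language and script it should be rendered in. The function name split_runs, the sample sentence and the tag "de-Latn" for the Antiqua run are my own assumptions. The sketch only handles simple, non-nested span elements, not full HTML or XML.

```python
import re

# Matches one non-nested <span xml:lang="...">...</span> element.
SPAN_RE = re.compile(r'<span xml:lang="([^"]+)">(.*?)</span>', re.DOTALL)


def split_runs(message, default_lang):
    """Return a list of (lang, text) runs for the message.

    Text outside any span gets default_lang; text inside a span gets
    the value of that span's xml:lang attribute.
    """
    runs = []
    pos = 0
    for match in SPAN_RE.finditer(message):
        if match.start() > pos:
            # Plain text before this span.
            runs.append((default_lang, message[pos:match.start()]))
        runs.append((match.group(1), match.group(2)))
        pos = match.end()
    if pos < len(message):
        # Plain text after the last span.
        runs.append((default_lang, message[pos:]))
    return runs


if __name__ == "__main__":
    msg = 'Die <span xml:lang="de-Latn">NATO</span> tagt.'
    for lang, text in split_runs(msg, "de-Latf"):
        print(lang, repr(text))
```

A renderer could then ask for a Fraktur font for the "de-Latf" runs and an ordinary Latin font for the abbreviation.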
Gerrit Sangel