[Fontconfig] Font matching in Unicode locales

Owen Taylor otaylor at redhat.com
Sat Oct 25 23:35:24 EST 2003


On Sat, 2003-10-25 at 01:09, John Alexander Thacker wrote:
> I have a real problem with fontconfig-2.2.1 selecting the wrong fonts for 
> the Sans alias when I'm in a Unicode locale, specifically en_US.UTF-8.  It 
> seems that fontconfig prioritizes the current language setting too much, 
> which is a real problem when the intended glyphs come from a language other 
> than English.
> 
> Specifically, I occasionally view Japanese Unicode files, though I want 
> my UI to remain English, and hence keep my locale as en_US.UTF-8.  I'd 
> like the Sans alias to use the preferred Kochi Gothic for viewing Japanese 
> glyphs, and similarly to use Kochi Mincho for Serif.  In fact, these are
> listed as preferred by default in fonts.conf, if I understand the file.
> 
> However, neither Kochi Gothic nor Kochi Mincho lists "en" as a supported 
> language.  This means that fontconfig will strongly prefer any other font
> which has those glyphs to Kochi Gothic or Kochi Mincho.  On a RedHat 9
> installation, this will mean preferring to use the far inferior hiragana
> and katakana contained in the MiscFixed bitmap font to that in Kochi Gothic 
> or Kochi Mincho.  (MiscFixed is installed by default as part of the
> bitmap-fonts package, and placed in /usr/share/fonts/bitmap-fonts.)
> Mozilla, of course, will completely refuse to use Kochi Gothic or Kochi
> Mincho for Unicode at all, since, as previously discussed, it is even
> more restrictive based on the language supported by a font.

Yes, this is a known difficulty. I've discussed it with Keith at length
several times, but we've never come up with a satisfactory solution.

I have a pretty good workaround that will be in Pango-1.4. Pango
computes the script for each run of text (Latin, Arabic, Han,
Hiragana, etc.). If that script isn't one of the scripts used for the
language tag, the language tag is removed and replaced either with an
appropriate language tag for the script (Arabic => ar, Greek => el,
etc.), or if, as for Han characters, there is no good default
language tag, with ??.

This approach could be copied by other apps, though they'd need
an equivalent of the codepoint => script and script <=> language code
that Pango has.
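
To make the idea concrete, here is a rough sketch in Python of the
per-run retagging described above. It is not Pango's code: the range
table, the language => scripts table, the Hiragana/Katakana => ja
defaults, and the names script_of, split_runs and retag are all
made up for illustration, and the tables cover only a handful of
scripts.

```python
# Illustrative only: tiny hand-written tables, NOT Pango's real data.
UNKNOWN = "??"  # placeholder tag when a script has no good default language

# (first codepoint, last codepoint, script) -- a very partial table
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x00C0, 0x024F, "Latin"),
    (0x0370, 0x03FF, "Greek"),
    (0x0600, 0x06FF, "Arabic"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "Han"),
]

# Scripts normally used to write each language (assumed, partial).
LANG_SCRIPTS = {
    "en": {"Latin"},
    "el": {"Greek"},
    "ar": {"Arabic"},
    "ja": {"Han", "Hiragana", "Katakana"},
}

# Default language for a script; Han is deliberately absent because
# it could equally be Chinese, Japanese or Korean.
DEFAULT_LANG = {
    "Greek": "el",
    "Arabic": "ar",
    "Hiragana": "ja",   # assumption for this sketch
    "Katakana": "ja",   # assumption for this sketch
}


def script_of(ch):
    """Return the script of one character, or "Common" if not in the table."""
    cp = ord(ch)
    for first, last, script in SCRIPT_RANGES:
        if first <= cp <= last:
            return script
    return "Common"


def split_runs(text):
    """Split text into (script, substring) runs; Common chars join a neighbour."""
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and (script == "Common" or script == runs[-1][0]):
            runs[-1][1] += ch
        elif runs and runs[-1][0] == "Common":
            # A leading run of Common characters adopts the first real script.
            runs[-1][0] = script
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, run) for script, run in runs]


def retag(text, lang):
    """Return (run, script, language tag) triples for text tagged with lang."""
    result = []
    for script, run in split_runs(text):
        if script == "Common" or script in LANG_SCRIPTS.get(lang, set()):
            tag = lang  # the script fits the language, keep the tag
        else:
            tag = DEFAULT_LANG.get(script, UNKNOWN)
        result.append((run, script, tag))
    return result


if __name__ == "__main__":
    for run, script, tag in retag("abc 漢字 かな", "en"):
        print(repr(run), script, tag)
```

With an "en" tag, the Latin run keeps "en", the Han run falls back to
??, and the Hiragana run is retagged "ja", so the font matcher is no
longer asked for an English font when the run contains kana.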

[...]

> It seems to me that the promise of Unicode and fontconfig would be that
> I could just set all my applications to use "Sans" (or "Monospace" or 
> "Serif") by default, and that would always select the proper font.  I
> shouldn't have to change my locale or my font just to view something in
> Japanese-- that breaks the whole point of Unicode.

Well, actually, you are almost certainly going to have to tell the
rendering system that you are rendering Japanese to get good results
when you render Japanese. There is no reliable automatic way to tell
that text is in Japanese rather than Chinese or Korean.

Generally, this has to occur at the application level. For Pango-1.4
I may add some hooks so that you can configure things so that Han
characters will get, say, a ja language tag when no other information
is available instead of ??, but that is going to be an expert thing.

It's also possible to put magic in your fonts.conf to replace 'en'
language tags with 'en,ja'. That would be an immediate workaround
for your problem, though I don't consider it a solution.

Regards,
						Owen





