[Fontconfig] ISO 15924 font selection

Behdad Esfahbod behdad at behdad.org
Mon Dec 3 23:54:32 PST 2007


On Sun, 2007-12-02 at 19:13 -0800, Keith Packard wrote:
> On Sun, 2007-12-02 at 13:43 +0100, Gerrit Sangel wrote:
> > Hello,
> > 
> > Sorry if this was asked before, but I could not find anything in the archive.
> > 
> > Has fontconfig the option to define fonts according to ISO 15924 (and not only 
> > according to ISO 639)?
> 
> I chose to use ISO 639 because this tagging already existed in the HTML
> standard, and because I could readily find orthographies identifiable
> with specific ISO 639 languages.
> 
> Adding support for the 15924 values seems like it would be easy to do in
> a compatible fashion; as those values do not conflict with either 639-1
> or 639-2, we could simply add orthographies for the script codes and
> things should 'just work'.

I've thought about passing script knowledge from Pango to fontconfig
before:

  http://bugzilla.gnome.org/show_bug.cgi?id=346043

It's useful indeed, but I don't think using scripts *instead* of
language makes sense.  What I imagine would be useful is a pattern
element such as script=arabic, which can then be matched for font
tailoring.

I also had Unicode scripts in mind, rather than ISO 15924, in a
user-readable form like "arabic" and "latin".  Pango already has that
information, and it can be deduced from the standard Unicode script
names.  That doesn't mean it can't be ISO 15924 names, but the mapping
is not one to one, and I really don't understand why Fraktur is a
different script from Latin in ISO 15924.  I don't think this feature,
if added, should be used for things like Fraktur.
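
As a rough illustration of the "not one to one" point, here is a
minimal Python sketch.  The table is a small hand-picked subset of
ISO 15924 codes, and the names ISO15924_TO_SCRIPT, readable_script and
codes_for are made up for this example; none of this is fontconfig or
Pango API.

```python
# Hypothetical lookup: ISO 15924 code -> user-readable script name.
# Only a handful of codes are listed; the real registry is much larger.
ISO15924_TO_SCRIPT = {
    "Arab": "arabic",
    "Latn": "latin",
    "Latf": "latin",  # Fraktur variant of Latin
    "Latg": "latin",  # Gaelic variant of Latin
    "Hani": "han",
    "Hans": "han",    # Han, Simplified variant
    "Hant": "han",    # Han, Traditional variant
}


def readable_script(code):
    """Return the readable script name for an ISO 15924 code, or None."""
    # ISO 15924 codes are conventionally written "Latn", so normalise case.
    return ISO15924_TO_SCRIPT.get(code.strip().title())


def codes_for(script):
    """Return every listed ISO 15924 code that folds into one script name."""
    return sorted(c for c, s in ISO15924_TO_SCRIPT.items() if s == script)
```

Several codes fold into a single readable name, so a script=latin
element would cover Latn, Latf and Latg alike, and going back from
"latin" to one ISO 15924 code is ambiguous.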


> > In my opinion, it is much more flexible than defining fonts according to a 
> > specific region (e.g. TW or CN). In some cases, it is even necessary, because 
> > the region does not differ.
> 
> Yeah, conflicts among multiple scripts used for the same language in the
> same territory do exist, which fontconfig doesn't handle well at all.

If we add script tags in addition to language tags, orthographies can
then be extended to say which script each one uses.  Matching can skip
a candidate if the script tags don't match.
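
A minimal sketch of that idea in Python, assuming each orthography
carries one language tag and one script tag.  The Orthography and Font
classes, the matches and select functions, and the two font families
are all hypothetical; fontconfig's real matcher scores and sorts
candidates rather than filtering them like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orthography:
    lang: str    # ISO 639 language tag, e.g. "sr"
    script: str  # readable script name, e.g. "cyrillic"


@dataclass
class Font:
    family: str
    orthographies: list


def matches(font, lang, script=None):
    """True if the font has an orthography for lang (and script, if given)."""
    for orth in font.orthographies:
        if orth.lang != lang:
            continue
        # Skip orthographies whose script tag disagrees with the request.
        if script is not None and orth.script != script:
            continue
        return True
    return False


def select(fonts, lang, script=None):
    """Return the families of all fonts that pass the lang/script check."""
    return [f.family for f in fonts if matches(f, lang, script)]


# Serbian is written in both Cyrillic and Latin, so lang alone is ambiguous.
FONTS = [
    Font("Example Cyrillic", [Orthography("sr", "cyrillic")]),
    Font("Example Latin", [Orthography("sr", "latin"),
                           Orthography("en", "latin")]),
]
```

With these definitions, select(FONTS, "sr") keeps both families, while
select(FONTS, "sr", "latin") keeps only "Example Latin".  Leaving the
script unset preserves today's language-only behaviour.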


> > Do I understand this correctly, that the user can specify a font in the config 
> > file according to a specific language?
> 
> You can match on the language and prepend a family name to make that
> preferred.
> 
> > I see this in Firefox (even though it does not seem to use fontconfig, but I 
> > guess an addon could be written to solve it)
> 
> firefox does use fontconfig, although the language-based selection is
> internal, not based on modifying fontconfig matching rules.
> 
> > So I think a possible way would be to define a general rule for a language 
> > (according to ISO-639) or a script (ISO 15924) at first and then a specific 
> > rule for a language or script which would override the general rule.
> 
> The pattern matching and editing rules should be able to handle this
> without change, except for the addition of ISO 15924 script codes to the
> existing set of language/territory pairs.

Another piece of information that can improve language matching is
ISO 639-3 macrolanguage information.  That can tell fontconfig, for
example, that Dari is a Persian language:

  http://bugzilla.gnome.org/show_bug.cgi?id=470907
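
A small Python sketch of how a macrolanguage fallback could work.  The
MACROLANGUAGE table is a two-entry excerpt of the ISO 639-3 data, and
lang_candidates and supports are hypothetical helpers, not fontconfig
functions.  It also assumes three-letter codes throughout, whereas
fontconfig orthographies mostly use the two-letter ISO 639-1 tags.

```python
# Excerpt of ISO 639-3 macrolanguage data:
# individual language code -> macrolanguage code.
MACROLANGUAGE = {
    "prs": "fas",  # Dari -> Persian
    "pes": "fas",  # Iranian Persian -> Persian
}


def lang_candidates(lang):
    """Return the tags to try, most specific first."""
    candidates = [lang]
    macro = MACROLANGUAGE.get(lang)
    if macro is not None:
        candidates.append(macro)
    return candidates


def supports(font_langs, lang):
    """True if the font covers lang directly or via its macrolanguage."""
    return any(tag in font_langs for tag in lang_candidates(lang))
```

A request for Dari ("prs") would then be satisfied by a font that only
advertises Persian ("fas"), while an exact match is still tried first.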

-- 
behdad
http://behdad.org/

...very few phenomena can pull someone out of Deep Hack Mode, with two
noted exceptions: being struck by lightning, or worse, your *computer*
being struck by lightning.  -- Matt Welsh


