[Fontconfig] asian font configuration

Keith Packard keithp at keithp.com
Thu Dec 9 21:11:24 EST 2004


Around 13 o'clock on Dec 9, Tor Andersson wrote:

> stratifying this into two layers would be a solution. how about splitting
> the categories into separate scripts and chaining them together at
> the end. this way if you ask for a typically cyrillic font and one doesnt
> find an 'exact' substitute you'll go through the list of cyrillic generic fonts
> first before trying arial unicode or falling back to verdana?

We currently have three layers here: the first is exact family matches
("strong" family matches), the second is language and territory matches,
and the third is generic alias matches ("weak" family matches).  This is
supposed to do exactly what you want.
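
As a rough illustration of that ordering, the three layers can be treated
as a lexicographic sort key.  This is a toy model only, not fontconfig's
actual matcher: the font records, the rank_fonts helper, the language sets
and the "Example Cyrillic Sans" family are all made up for the sketch.

```python
def rank_fonts(fonts, family, lang, generic_aliases):
    """Order candidate fonts by the three layers, most important first.

    Toy model: fonts are dicts with "family" and "langs" keys, and
    generic_aliases is the preference-ordered family list that a generic
    name such as sans-serif is assumed to expand to.
    """
    def key(font):
        strong = font["family"] == family            # layer 1: exact family
        lang_ok = lang in font["langs"]              # layer 2: language
        if font["family"] in generic_aliases:        # layer 3: generic alias
            weak = generic_aliases.index(font["family"])
        else:
            weak = len(generic_aliases)
        # False sorts before True, so a match in an earlier layer always wins.
        return (not strong, not lang_ok, weak)

    return sorted(fonts, key=key)


# Hypothetical installed fonts and alias list, for illustration only.
fonts = [
    {"family": "Verdana", "langs": {"en"}},
    {"family": "Arial Unicode MS", "langs": {"en", "ru", "ja"}},
    {"family": "Example Cyrillic Sans", "langs": {"ru"}},
]
aliases = ["Example Cyrillic Sans", "Arial Unicode MS", "Verdana"]

# Ask for a Cyrillic family that is not installed, with lang "ru".
ranked = rank_fonts(fonts, "Missing Cyrillic Font", "ru", aliases)
print([f["family"] for f in ranked])
```

With no exact family match available, the fonts covering the requested
language are tried in alias order before the one that does not cover it.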

> * classify known fonts into generic+script -- Kochi Gothic to 'sans-serif+japan'

That's done by adding the Kochi Gothic -> sans-serif alias entry.  The 
Japanese classification is done automatically by fontconfig's language 
support detection code.

> the above CMaps map from CID to unicode. inferring the reverse mapping
> should be trivial. if you need code, i have a cmap data structure and
> parser that are part of my pdf project that should be easy to extract,
> or you can just generate static mapping tables with a python script.
> i'd be happy to code one up if you need it.

Fontconfig needs to do two things.  The first is construct a list of 
Unicode values supported by the font.  For this it needs to enumerate the
encoded glyphs and compute Unicode values for each one.

The second ability is mapping from Unicode codepoints to glyphs; this is
largely a convenience for applications which don't want to deal with the
obscurities of non-Unicode mappings.  Fontconfig already has a couple of
built-in transcoding tables to handle Apple Roman and Adobe Symbol encoded 
fonts, which often are either missing Unicode mappings or which have 
broken Unicode mappings (I have about 1200 fonts with broken Unicode 
mappings which have functional Apple Roman mappings).
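
To make the fallback idea concrete, here is a minimal sketch of a Unicode
to glyph lookup that falls back to an Apple Roman charmap.  It is not the
code in fontconfig: the glyph_for_unicode helper and the two dict-based
charmaps are assumptions for the example, and Python's built-in mac_roman
codec stands in for a transcoding table.

```python
def glyph_for_unicode(ch, unicode_cmap, mac_roman_cmap):
    """Return a glyph index for the character ch, or 0 if none is found.

    Both charmaps are plain dicts from character code to glyph index;
    0 is used as the "missing glyph" value.
    """
    # Try the font's Unicode charmap first.
    glyph = unicode_cmap.get(ord(ch), 0)
    if glyph:
        return glyph
    # Fall back: transcode the character to Apple Roman and retry.
    try:
        code = ch.encode("mac_roman")[0]
    except UnicodeEncodeError:
        return 0          # character does not exist in Apple Roman
    return mac_roman_cmap.get(code, 0)


# Hypothetical font: empty (broken) Unicode charmap, usable Apple Roman one.
unicode_cmap = {}
mac_roman_cmap = {0x41: 36, 0x8A: 105}   # 0x8A is "ä" in Apple Roman

print(glyph_for_unicode("A", unicode_cmap, mac_roman_cmap))
print(glyph_for_unicode("ä", unicode_cmap, mac_roman_cmap))
print(glyph_for_unicode("Ж", unicode_cmap, mac_roman_cmap))
```

The application only ever hands over Unicode; which charmap finally
produced the glyph stays hidden inside the lookup.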

FreeType has functions to enumerate the encoded glyphs in a font 
(FT_Get_First_Char and FT_Get_Next_Char) which fontconfig happily uses to 
enumerate glyphs, but for non-Unicode mappings it will need an additional 
function to convert from the encoded value to a Unicode value.

Please take a look at FcFreeTypeCharSetAndSpacing in fcfreetype.c to see 
how that all works.
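
In outline, the enumeration plus conversion step looks something like the
sketch below.  This is a simplified model, not a transcription of
fcfreetype.c: the list of encoded codes stands in for what the
FT_Get_First_Char / FT_Get_Next_Char loop would yield, and the
unicode_coverage and mac_roman_to_unicode names are made up here.  A CMap
based font would plug in its own conversion function in the same place.

```python
def mac_roman_to_unicode(code):
    """Convert one Apple Roman code to a Unicode value, or None if invalid."""
    try:
        return ord(bytes([code]).decode("mac_roman"))
    except ValueError:
        # Covers codes outside 0..255 and undecodable bytes
        # (UnicodeDecodeError is a subclass of ValueError).
        return None


def unicode_coverage(encoded_codes, to_unicode):
    """Build the set of Unicode values covered by a non-Unicode charmap.

    encoded_codes: iterable of character codes in the font's own encoding.
    to_unicode:    function mapping one such code to a Unicode value,
                   or None when the code has no Unicode equivalent.
    """
    coverage = set()
    for code in encoded_codes:
        ucs4 = to_unicode(code)
        if ucs4 is not None:
            coverage.add(ucs4)
    return coverage


# Hypothetical codes reported by a font with an Apple Roman charmap;
# 0x300 is deliberately out of range to show that it is skipped.
codes = [0x41, 0x8A, 0x300]
coverage = unicode_coverage(codes, mac_roman_to_unicode)
print(sorted(hex(u) for u in coverage))
```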

> now how about a char *FcFindCMap(char *name) function?
> 
> if i find the CID-font with fontconfig, it is useless without the CMaps.
> CMaps are basically a size-optimisation, instead of adding encoding
> tables to all of the fonts like you do with truetype.

As I recall, the CMaps are external to the fonts themselves, just as
kerning data is external to .pfa files (it lives in separate .afm files).
Is there a standard naming convention which is used to locate CMap files
for particular font files?  Or could we construct such a convention?  I
don't see how fontconfig would otherwise locate the files, and if there is
a suitable convention, we should just get applications to use the same one.

-keith

