[Fontconfig] asian font configuration
Keith Packard
keithp at keithp.com
Thu Dec 9 21:11:24 EST 2004
Around 13 o'clock on Dec 9, Tor Andersson wrote:
> stratifying this into two layers would be a solution. how about splitting
> the categories into separate scripts and chaining them together at
> the end. this way if you ask for a typically cyrillic font and one doesnt
> find an 'exact' substitute you'll go through the list of cyrillic generic fonts
> first before trying arial unicode or falling back to verdana?
We currently have three layers here: the first is exact family matches
("strong" family matches), the second is language and territory matches,
and the third is generic alias matches ("weak" family matches). This is
supposed to do exactly what you want.
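As a rough illustration of that ordering, the three layers can be read as
a lexicographic sort key. This is a toy model only, not fontconfig's actual
matcher, which scores many more properties. The Font record, the rank()
function and the "Some Cyrillic ..." families are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Font:
    family: str
    langs: set = field(default_factory=set)
    generics: set = field(default_factory=set)  # generic aliases, e.g. "sans-serif"


def rank(fonts, family, lang, generic):
    """Order candidates by the three layers; False sorts before True."""
    def key(font):
        return (
            font.family != family,         # layer 1: "strong" family match
            lang not in font.langs,        # layer 2: language match
            generic not in font.generics,  # layer 3: "weak" (generic alias) match
        )
    return sorted(fonts, key=key)


fonts = [
    Font("Verdana", {"en"}, {"sans-serif"}),
    Font("Some Cyrillic Serif", {"ru"}, {"serif"}),
    Font("Some Cyrillic Sans", {"ru"}, {"sans-serif"}),
]

# Ask for a family that is not installed, in Russian, with a sans-serif alias.
ordered = [f.family for f in rank(fonts, "Missing Cyrillic Font", "ru", "sans-serif")]
```

With no exact family match available, the language layer puts both
Cyrillic fonts ahead of Verdana, and the alias layer then prefers the
sans-serif one.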
> * classify known fonts into generic+script -- Kochi Gothic to 'sans-serif+japan'
That's done by adding the Kochi Gothic -> sans-serif alias entry. The
Japanese classification is done automatically by fontconfig's language
support detection code.
> the above CMaps map from CID to unicode. inferring the reverse mapping
> should be trivial. if you need code, i have a cmap data structure and
> parser that are part of my pdf project that should be easy to extract,
> or you can just generate static mapping tables with a python script.
> i'd be happy to code one up if you need it.
Fontconfig needs to do two things. The first is construct a list of
Unicode values supported by the font. For this it needs to enumerate the
encoded glyphs and compute Unicode values for each one.
The second ability is mapping from Unicode codepoints to glyphs; this is
largely a convenience for applications which don't want to deal with the
obscurities of non-Unicode mappings. Fontconfig already has a couple of
built-in transcoding tables to handle Apple Roman and Adobe Symbol encoded
fonts, which often are either missing Unicode mappings or which have
broken Unicode mappings (I have about 1200 fonts with broken Unicode
mappings which have functional Apple Roman mappings).
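A minimal sketch of that second job, assuming a font whose only usable
character map is Apple Roman. The table below is a small excerpt of the
Apple Roman encoding, not fontconfig's built-in table, and
glyph_for_unicode() and font_cmap are hypothetical names used only for
illustration.

```python
# A few Apple Roman byte values and their Unicode equivalents.
# This is only an excerpt; a real table covers all 256 byte values.
APPLE_ROMAN_TO_UNICODE = {
    0x41: 0x0041,  # LATIN CAPITAL LETTER A
    0x80: 0x00C4,  # LATIN CAPITAL LETTER A WITH DIAERESIS
    0x83: 0x00C9,  # LATIN CAPITAL LETTER E WITH ACUTE
    0xA5: 0x2022,  # BULLET
    0xD0: 0x2013,  # EN DASH
}

# Invert once so that lookups by Unicode codepoint are direct.
UNICODE_TO_APPLE_ROMAN = {u: b for b, u in APPLE_ROMAN_TO_UNICODE.items()}


def glyph_for_unicode(codepoint, font_cmap):
    """Return the glyph index for a Unicode codepoint, or 0 if unmapped.

    font_cmap is a hypothetical dict from Apple Roman byte value to glyph
    index, standing in for the font's own non-Unicode character map.
    """
    encoded = UNICODE_TO_APPLE_ROMAN.get(codepoint)
    if encoded is None:
        return 0
    return font_cmap.get(encoded, 0)
```

The application only ever passes Unicode values; the transcoding step
stays hidden behind the lookup.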
FreeType has functions to enumerate the encoded glyphs in a font
(FT_Get_First_Char and FT_Get_Next_Char) which fontconfig happily uses to
enumerate glyphs, but for non-Unicode mappings it will need an additional
function to convert from the encoded value to a Unicode value.
Please take a look at FcFreeTypeCharSetAndSpacing in fcfreetype.c to see
how that all works.
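The shape of that loop can be sketched as follows. This is not the code
in fcfreetype.c and it does not call FreeType; FakeCharMap is a
hypothetical stand-in whose two methods imitate how FT_Get_First_Char and
FT_Get_Next_Char report a character code together with a glyph index,
with glyph index 0 meaning there are no more characters. The to_unicode
argument plays the role of the additional conversion function that
non-Unicode mappings would need.

```python
class FakeCharMap:
    """Hypothetical stand-in for a face's active charmap (illustration only)."""

    def __init__(self, mapping):
        # mapping: encoded character code -> non-zero glyph index
        self._mapping = dict(mapping)
        self._codes = sorted(self._mapping)

    def get_first_char(self):
        # Imitates FT_Get_First_Char: (charcode, glyph_index), glyph 0 if empty.
        if not self._codes:
            return 0, 0
        code = self._codes[0]
        return code, self._mapping[code]

    def get_next_char(self, code):
        # Imitates FT_Get_Next_Char: next code after `code`, glyph 0 at the end.
        for candidate in self._codes:
            if candidate > code:
                return candidate, self._mapping[candidate]
        return 0, 0


def unicode_coverage(charmap, to_unicode):
    """Collect the Unicode values reachable through a non-Unicode charmap."""
    covered = set()
    code, glyph = charmap.get_first_char()
    while glyph != 0:
        ucs4 = to_unicode(code)  # returns None when the code has no Unicode value
        if ucs4 is not None:
            covered.add(ucs4)
        code, glyph = charmap.get_next_char(code)
    return covered
```

Encoded values that have no Unicode equivalent are simply left out of the
resulting set.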
> now how about a char *FcFindCMap(char *name) function?
>
> if i find the CID-font with fontconfig, it is useless without the CMaps.
> CMaps are basically a size-optimisation, instead of adding encoding
> tables to all of the fonts like you do with truetype.
As I recall, the CMaps are external to the fonts themselves, just like
kerning data for .pfa files, which lives in separate .afm files. Is there
a standard naming convention
which is used to locate CMap files for particular font files? Or could
we construct such a convention? I don't see how fontconfig would
otherwise locate the files, and if there is a suitable convention, we
should just get applications to use the same.
-keith