[Fontconfig] asian font configuration

Tor Andersson tor.andersson at gmail.com
Thu Dec 9 15:43:30 EST 2004


hi

> Thanks muchly for the additional data.  The list of alternate common (but
> non-free) family names is really nice to have.

the list is far from complete, but at least a starting point.
 
> If this stuff isn't working for you, we've just got bugs in fontconfig
> that need fixing.

this approach as-is has two problems for asian fonts. one: if you forget to
select a language you tend to end up with times or verdana. two: even if you
do set the language, you may end up with a less-than-preferred font.
for example, selecting lang=zh-tw may give you arial unicode or a
simplified chinese font...

asian fonts in the same family are more substitutable than latin fonts,
since they are mostly mono-spaced. they sit somewhere in between the
exact-alias level ('times' to 'times new roman') and the generic level
('serif') of the current categorisation.

stratifying this into two layers would be a solution. how about splitting
the categories into separate scripts and chaining them together at
the end? this way, if you ask for a typically cyrillic font and no 'exact'
substitute is found, you'll go through the list of cyrillic generic fonts
first before trying arial unicode or falling back to verdana.

how about a font config that does the following:

first, exact aliases:

* alias to truetype substitutes -- Times to Times New Roman
* alias to ghostscript urw substitutes -- Times to Nimbus Roman No9 L

then, classification of known fonts:

* classify known fonts into generic+script -- Kochi Gothic to 'sans-serif+japan'

some cleaning of the pattern, tie together generic families with priority:

* if asked for generic and language tag is set, change to generic+script
* after the 'generic+script' append a pure 'generic' to catch unicode fonts

* if not classified and language tag is set, add sans-serif+script
* if not classified yet, add 'sans-serif'

* after pure 'generic' add all 'generic+script' versions to catch the
  case of wanting all 'generic' fonts

and finally...

* font substitution 'generic+script' to preferred -- serif+korea to
  Baekmuk Batang
* font substitution 'generic' to preferred unicode font

example:

  Batang
  Batang, serif+korea
  Batang, serif+korea, serif
  Batang, serif+korea, serif, serif+latin, serif+chinasimp, ...
  Batang, Baekmuk Batang, serif+korea, serif, ...latin fonts...,
    serif+latin, ...chinasimp fonts...

does that sound reasonable?
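
to make the ordering of the steps above concrete, here is a rough python sketch of the family-list expansion. it is not the attached subs.conf and not fontconfig code: the function name `expand`, the tiny lookup tables, the language-to-script mapping and the classification of Times as serif+latin are all assumptions made up for illustration.

```python
# Hypothetical tables; the real lists would be much longer.
GENERICS = ("serif", "sans-serif", "monospace")
SCRIPTS = ["latin", "chinasimp", "japan", "korea"]
ALIASES = {"Times": ["Times New Roman", "Nimbus Roman No9 L"]}
CLASSES = {
    "Kochi Gothic": ("sans-serif", "japan"),
    "Batang": ("serif", "korea"),
    "Times": ("serif", "latin"),  # assumed classification
}
LANG_TO_SCRIPT = {"ja": "japan", "ko": "korea", "zh-cn": "chinasimp"}  # assumed
PREFERRED = {"serif+korea": ["Baekmuk Batang"]}


def expand(family, lang=None):
    """Return the ordered family list for a requested family name."""
    script = LANG_TO_SCRIPT.get(lang)
    if family in GENERICS:
        # asked for a generic: with a language tag it becomes generic+script
        generic = family
        chain = [f"{generic}+{script}"] if script else []
    else:
        # exact aliases first, then the classification of the known font
        chain = [family] + ALIASES.get(family, [])
        if family in CLASSES:
            generic, known_script = CLASSES[family]
            chain.append(f"{generic}+{known_script}")
        else:
            # not classified: fall back to sans-serif (+script if lang is set)
            generic = "sans-serif"
            if script:
                chain.append(f"{generic}+{script}")
    # pure generic catches unicode fonts, then every other generic+script
    chain.append(generic)
    for s in SCRIPTS:
        tag = f"{generic}+{s}"
        if tag not in chain:
            chain.append(tag)
    # final substitution: preferred fonts go in front of their tag
    result = []
    for name in chain:
        for font in PREFERRED.get(name, []):
            if font not in result:
                result.append(font)
        if name not in result:
            result.append(name)
    return result
```

with these toy tables, `expand("Batang")` starts with Batang, Baekmuk Batang, serif+korea, serif, which is the shape of the example above.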

attached is a subs.conf that should be included instead of the
current alias/substitute stuff in the default fonts.conf.
 
> > i understand that due to the incapability of freetype to use CMaps to
> > encode CID fonts, the ability to use CID-fonts with fontconfig is severely
> > limited. however, it would be really really nice if fontconfig were extended
> > in this area.
> 
> I don't have a lot of experience with Type1 CID fonts as I've tried to
> stick to TrueType which supports Unicode so much more nicely.
> 
> If you know of code or even documentation which clearly shows how to get
> from Unicode to CID stuff, it would be greatly appreciated.

Adobe-CNS1-UCS2
Adobe-GB1-UCS2
Adobe-Japan1-UCS2
Adobe-Korea1-UCS2

the above CMaps map from CID to unicode. inferring the reverse mapping
should be trivial. if you need code, i have a cmap data structure and
parser that are part of my pdf project that should be easy to extract,
or you can just generate static mapping tables with a python script.
i'd be happy to code one up if you need it.
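
as a sketch of the reverse mapping, assuming the CMap has already been parsed into a plain dict from CID to unicode code point (the parsing itself is not shown, and the sample values in the usage note are invented, not taken from a real CMap). if several CIDs share a code point, something has to pick one; keeping the lowest CID is just an assumption here.

```python
def invert_cid_to_unicode(cid_to_unicode):
    """Build a unicode -> CID table from a CID -> unicode table.

    When several CIDs map to the same code point, the lowest CID wins
    (an arbitrary tie-break chosen for this sketch).
    """
    unicode_to_cid = {}
    for cid in sorted(cid_to_unicode):
        code_point = cid_to_unicode[cid]
        # setdefault keeps the first (lowest) CID seen for a code point
        unicode_to_cid.setdefault(code_point, cid)
    return unicode_to_cid
```

for example, `invert_cid_to_unicode({10: 0x4E00, 11: 0x4E01, 12: 0x4E00})` gives `{0x4E00: 10, 0x4E01: 11}`.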

> If I can get from Unicode to CID, I can generate FC_LANG tags, but I'm not
> sure what other information belongs in fontconfig itself; remember that
> fontconfig is designed to provide information needed to select among
> fonts, not all of the information needed once you have a font in hand.

now how about a char *FcFindCMap(char *name) function?

if i find the CID-font with fontconfig, it is useless without the CMaps.
CMaps are basically a size optimisation: the encoding tables live in
shared files instead of being added to every font like you do with truetype.
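
to show what such a lookup might do, here is a minimal python sketch. FcFindCMap is only a proposal in this mail, not an existing fontconfig call; the name `find_cmap`, the `search_dirs` parameter and the idea that a CMap is a file named after the CMap inside one of several directories are assumptions for illustration.

```python
import os


def find_cmap(name, search_dirs):
    """Return the path of the first file called `name` in search_dirs, or None."""
    for directory in search_dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None
```

a caller that found a CID font through fontconfig could then ask for, say, `find_cmap("Adobe-Japan1-UCS2", dirs)` with whatever directory list the configuration provides.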

tor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: subs.conf
Type: application/octet-stream
Size: 13539 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20041209/9e3cd194/subs.obj

