[Fontconfig] Re: fc-cache sometimes loses fonts

James Cloos cloos at jhcloos.com
Tue Nov 1 05:09:16 EST 2005

>>>>> "Patrick" == Patrick Lam <plam at MIT.EDU> writes:

Patrick> I've reverted this patch, as it breaks fontconfig-using
Patrick> applications (the ones that subsequently use Pango to fetch
Patrick> fonts).  I'm not sure what the correct solution is, but
Patrick> stripping basenames isn't it...  I guess that I have to put
Patrick> full pathnames, but I don't know how to get canonical full
Patrick> pathnames.  Anyone?

I guess I should have read further before hijacking a thread with a
postscript.... [SIGH]

Anyway, to answer the question you do ask above, in general you don't.

Going from filename to path is in the same problem space as going
from hash to source.  With some additional info you can come up
with plausible results, but others could also exist.

For fc-cache, you can probably presume that <dir/> entries are
absolute.  Whether you should traverse the path and expand out
symlinks is an open question, but that should be the only one.

(You do, though, have to convert anything matching '^~/(.+)$'
into "$HOME/$1".  But otherwise relative paths in <dir/> entries
can probably be defined as errors.)

For paths specified in ARGV[] you need to replace '^\./(.+)$'
with "$CWD/$1" as well as doing the tilde substitution above.
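As a rough illustration of the two substitutions above, here is a minimal
Python sketch.  The function name normalize_dir_entry and its parameters are
hypothetical, not fontconfig API.  It assumes POSIX-style paths, and it
assumes that any other relative path is rejected, both for <dir/> entries
and for ARGV[] paths; the latter is a guess, since only "./" is covered
above.

```python
import posixpath
import re


def normalize_dir_entry(path, home, cwd=None):
    """Expand a leading '~/' and, when cwd is given, a leading './'.

    Pass cwd only for paths taken from the command line; <dir/> entries
    from the config are expected to be absolute after tilde expansion.
    """
    m = re.match(r'^~/(.+)$', path)
    if m:
        return posixpath.join(home, m.group(1))
    if cwd is not None:
        m = re.match(r'^\./(.+)$', path)
        if m:
            return posixpath.join(cwd, m.group(1))
    if not posixpath.isabs(path):
        # Assumption: remaining relative paths are treated as errors.
        raise ValueError("relative path not allowed: %r" % path)
    return path
```

Whether to follow this with symlink expansion (for example via
os.path.realpath) is the open question mentioned earlier, so the sketch
leaves it out.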

The problem -- and one I should have noticed when I first realized you
were storing absolute paths in the cache-2 files -- is that paths are
not fixed.  Between networked file systems, networked block devices
(including things like fibre channel), bind mounts, process-specific
namespaces, etc., you cannot presume that the absolute paths are
the same everywhere.

Each box mounting an enterprise-wide -- or even just group-wide --
filesystem could use a different mountpoint.  Chroot(2)s might bind
the box's master locations at different points.  Distributed
workstations might mount some central filesystem under the logged-in
user's $HOME.  

In short, cache-2's support for multiple machine-specific chunks and
its use of absolute paths are pi radians out of phase.

The cache-1 files do not use absolute paths; the API allows the
library users to locate the fonts anyway.  The cache-2 API will have
to replicate that behavior.

AFAICS, that means each process will have to store the path to each
cache-2 file and for each font opened will have to create a string
in malloc(3)ed VM from that pathname and the string in the cache-2
file to pass to open(2) or mmap(2).  This will regress the VM savings
somewhat -- especially for programs that like to show all of the
available fonts' names rendered in said font -- but I don't see any
real alternative.  (Though I'd love to be proved wrong. :)
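To make the path reconstruction concrete, here is a minimal Python sketch.
It assumes each cache file sits in the font directory it describes and
stores only names relative to that directory; font_path and the example
file names are hypothetical, not fontconfig API.

```python
import posixpath


def font_path(cache_file, stored_name):
    """Build the path to open from the cache file's own location.

    cache_file is wherever this process found the cache on this machine;
    stored_name is the relative name recorded inside the cache.
    """
    cache_dir = posixpath.dirname(cache_file)
    return posixpath.join(cache_dir, stored_name)
```

The same cache contents then resolve correctly under different
mountpoints, at the cost of building one string per font opened:
font_path("/mnt/a/fonts/fonts.cache-2", "Foo.ttf") and
font_path("/home/u/net/fonts/fonts.cache-2", "Foo.ttf") each yield a path
under the directory where that process actually found the cache.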

How this issue compares with the idea of moving the cache files into a
single dir, naming them something like .../$(hash $pathname).cache-$ver,
is another interesting question.  There, absolute pathnames may be OK,
provided the cache files are re-written whenever they are discovered
to be out of date or corrupt.   (Another interesting question:  if
the library causes each caller to write out a cache file when needed,
what happens when 1000 apps notice the cache file is corrupt or old
and start simultaneously writing out a replacement?)
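A minimal Python sketch of the single-directory naming scheme, plus one
common way to keep simultaneous writers from corrupting the file.  The
function names are hypothetical, SHA-256 is an arbitrary stand-in for the
unspecified hash, and write-to-temp-then-rename is an assumption about how
one might do it, not something fontconfig is known to do.

```python
import hashlib
import os
import tempfile


def cache_file_name(cache_dir, font_dir, version):
    """Name the cache file after a hash of the font directory's path."""
    digest = hashlib.sha256(font_dir.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, "%s.cache-%s" % (digest, version))


def write_cache_atomically(target, data):
    """Write data to a temp file in the same directory, then rename it.

    The rename replaces the target in one step, so readers see either
    the old file or the new one, never a partially written one.
    """
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(target) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, target)
    except BaseException:
        os.unlink(tmp)  # the temp file still exists if we got here
        raise
```

This only guarantees that the last writer wins with a complete file.  It
does nothing to stop 1000 apps from each doing the redundant work of
regenerating the cache.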

James H. Cloos, Jr. <cloos at jhcloos.com>
