[Fontconfig] Caching strategy improvements

Behdad Esfahbod behdad at behdad.org
Tue Feb 27 02:57:46 UTC 2018


Thanks for the clarification. No worries.

Rewriting the cache is an interesting challenge, but so far we don't have
any volunteers.

On Mon, Feb 26, 2018 at 6:28 PM, Kurt Kartaltepe <kkartaltepe at gmail.com>
wrote:

> I have rebuilt 2.12.93 tonight and it appears I was mistaken. I had
> attempted to replace 2.12.6 in my build chain but that must have been
> reverted as 2.12.93 indeed provides ~100x improvement and builds the
> cache on my system in 600ms.
>
> I still hold that this is not a "blanket you should improve it" post.
> A hashmap on disk (or any mapping between patterns and files that
> allows rapid validation of non-dirty files) that reuses the patterns
> for files that didn't change is what I have been suggesting from the
> start. I don't see why this needs to be lock-free, as the entire
> structure can be atomically updated using the same mechanisms already
> in use for the cache. I defer to your experience on whether this cache
> is contended enough to warrant such a structure.
>
> I understand this would be a significant project, which is why I
> brought it to the mailing list, and now that font cache build times are
> in seconds for large font libraries it is indeed harder to justify.
> Thank you very much for your time, and sorry this all started due to a
> mistake on my own part. (I hope I have not been using terms
> inappropriately, but I have been using font/file interchangeably and
> from your replies it appears this may have been a mistake.)
>
> --Kurt Kartaltepe
>
> On Mon, Feb 26, 2018 at 5:26 PM, Behdad Esfahbod <behdad at behdad.org>
> wrote:
> > On my laptop, warm fc-cache -f of over 2000 fonts takes 3.5s. So maybe
> > worth checking what's taking so much time on your setup.
> >
> > Believe me, if we knew how to make it faster easily, we would have done.
> > So, any blanket "you should improve it" has no information content
> > whatsoever.
> >
> > Caching per font is not realistic.
> >
> > The best I can think of requires a complete rewrite of the caching, and
> > would use a single cache file that implements a lock-free hashmap on
> > disk and reuses patterns for files that didn't change. But that's a very
> > significant project to undertake.
> >
> > On Mon, Feb 26, 2018 at 3:38 AM, Kurt Kartaltepe <kkartaltepe at gmail.com>
> > wrote:
> >>
> >> Sorry, it appears I have not been replying to the list.
> >>
> >> I would like to add that testing on 2.12.6, before the much improved
> >> performance changes, showed ~40s cache build times with significant
> >> disk I/O. So the newly improved scanning is much appreciated but
> >> doesn't solve all the issues with cache build times.
> >>
> >> On Mon, Feb 26, 2018 at 5:29 AM, Kurt Kartaltepe
> >> <kkartaltepe at gmail.com> wrote:
> >> > 2.12.93 as released on
> >> > https://www.freedesktop.org/software/fontconfig/release/
> >> >
> >> > On Mon, Feb 26, 2018 at 5:27 AM, Behdad Esfahbod <behdad at behdad.org>
> >> > wrote:
> >> >> Just to make sure we are on the same page, which fontconfig version
> >> >> are you testing with?
> >> >>
> >> >> On Mon, Feb 26, 2018 at 3:21 AM, Kurt Kartaltepe
> >> >> <kkartaltepe at gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> For clarification, I have tested with ONLY the TTF fonts on my
> >> >>> system. In this case the normal 18s cache build step takes 15s.
> >> >>> This suggests to me there is no significant difference between FON
> >> >>> and TTF, as the FON files made up ~21% of my fonts and removing
> >> >>> them resulted in a proportionate saving. Sorry if my OP
> >> >>> misleadingly suggested that FON files were exceptionally slow. I
> >> >>> only meant that they may not have received the same improvement as
> >> >>> TTF files, which may just be my misunderstanding of the changes you
> >> >>> made and lack of testing.
> >> >>>
> >> >>> I am concerned with why it seems acceptable to rebuild the entire
> >> >>> cache when only a tiny portion of it has actually changed. Users
> >> >>> for whom rebuilding the cache is a significant event are those with
> >> >>> large font libraries. These users are by their very nature more
> >> >>> likely to add or remove fonts from their library. It seems that
> >> >>> this is the worst possible case for the current caching strategy,
> >> >>> and *this* seems like an issue worth fixing.
> >> >>>
> >> >>> In this case, if checksumming files is slower than scanning them,
> >> >>> the issue still stands. Why checksum files that haven't changed?
> >> >>> Does fontconfig not trust filesystem metadata? It would appear
> >> >>> directory change times are used in detecting when to rescan, so why
> >> >>> can this not be extended to files instead of the expensive checksum?
> >> >>>
> >> >>> FWIW an md5sum of my entire font library takes ~1s with hot caches,
> >> >>> which I still find unacceptable, as my library is possibly
> >> >>> significantly smaller and my system significantly more powerful
> >> >>> than a potential user's.
> >> >>>
> >> >>> --Kurt Kartaltepe
> >> >>>
> >> >>> On Sun, Feb 25, 2018 at 9:10 PM, Behdad Esfahbod
> >> >>> <behdad at behdad.org> wrote:
> >> >>> > What's with fon files being slow? Please report *that* and let's
> >> >>> > fix it.
> >> >>> >
> >> >>> > We've made scanning, like, 100x faster already. 2007 stats are
> >> >>> > irrelevant. Checksumming files is slower than scanning them now.
> >> >>> >
> >> >>> > On Sun, Feb 25, 2018 at 8:08 AM, Kurt Kartaltepe
> >> >>> > <kkartaltepe at gmail.com>
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> While trying to move a project to the pango stack I noticed the
> >> >>> >> native font selection backends were bad/useless on some
> >> >>> >> platforms (like windows, see [1]). So I opted to try and use
> >> >>> >> fontconfig on all platforms, as it performs outstandingly and
> >> >>> >> has wonderful defaults for all platforms.
> >> >>> >>
> >> >>> >> However, during this transition I noticed that there are some
> >> >>> >> major issues with cache build speed, and during investigation I
> >> >>> >> see that there has recently been effort to improve the
> >> >>> >> situation [2]. From what I can tell, the fontconfig team has
> >> >>> >> maintained that these cache issues were irrelevant for the
> >> >>> >> primary fontconfig platform (linux) [3]. On linux, of course,
> >> >>> >> the cache is global and usually maintained by font packages,
> >> >>> >> ensuring it's up to date. However, it was precisely these slow
> >> >>> >> cache build times that led to package managers being required
> >> >>> >> to build in additional tooling to avoid rebuilding the cache
> >> >>> >> for every font installed [4].
> >> >>> >>
> >> >>> >> Anyway, I hope that is enough reason to persuade you that there
> >> >>> >> are substantial improvements to make to the caching strategy
> >> >>> >> and that they are beneficial not only for the odd platforms
> >> >>> >> (osx, windows) but also for Linux.
> >> >>> >>
> >> >>> >> My question is whether fontconfig would be receptive to
> >> >>> >> building/accepting a patch modifying the caching strategy to
> >> >>> >> include checksums per file instead of/in addition to per
> >> >>> >> directory. Currently any change to a directory (such as adding
> >> >>> >> a new font) invalidates all fonts within that directory. This
> >> >>> >> means that for directories like the system directory it results
> >> >>> >> in rescans of hundreds or more fonts. Thankfully this is faster
> >> >>> >> on platforms like linux where all fonts are on freetype.
> >> >>> >> However, this improvement in scanning did not carry over to
> >> >>> >> windows with its many FNT files (150 on the average install),
> >> >>> >> and even on my very robust development machine building a cache
> >> >>> >> for a mere 650 files takes half a minute. This might be
> >> >>> >> acceptable on install of the application, where we can take our
> >> >>> >> time building the cache, but what happens when a user installs
> >> >>> >> 1 more font? A change to cache individual file checksums would
> >> >>> >> give fontconfig a way to require the expensive coverage check
> >> >>> >> for only a single font instead of the entirety of a user's
> >> >>> >> library. I dare say that with this exact change the faster,
> >> >>> >> less robust coverage check that made scanning freetype fonts
> >> >>> >> faster may be unneeded, as the number of scans required to
> >> >>> >> rebuild a cache would be reduced 100x or more on the average
> >> >>> >> system.
> >> >>> >>
> >> >>> >> I'm certain such a change would be highly appreciated by all
> >> >>> >> fontconfig consumers who are hoping to use its powerful feature
> >> >>> >> set in a multiplatform context.
> >> >>> >>
> >> >>> >> --Kurt Kartaltepe
> >> >>> >>
> >> >>> >> [1] https://bugzilla.gnome.org/show_bug.cgi?id=162681
> >> >>> >> [2] https://lists.freedesktop.org/archives/fontconfig/2017-August/005986.html
> >> >>> >> [3] https://bugs.freedesktop.org/show_bug.cgi?id=64766
> >> >>> >> [4] https://lists.freedesktop.org/archives/fontconfig/2007-October/002728.html
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>> > --
> >> >>> > behdad
> >> >>> > http://behdad.org/



-- 
behdad
http://behdad.org/
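
As an illustration of the idea discussed in the quoted thread (reuse the
cached pattern for any file whose filesystem metadata has not changed, and
replace the cache file atomically), here is a minimal Python sketch. It is
only a sketch under stated assumptions: the names scan_font, load_cache and
rebuild_cache are hypothetical, the JSON layout is invented for the example,
and none of it reflects fontconfig's actual cache format or API. It trusts
mtime and size rather than checksums, which is exactly the trade-off the
thread debates, and it assumes the cache file lives outside the font
directory.

```python
import json
import os
import tempfile


def scan_font(path):
    """Hypothetical stand-in for the expensive per-font scan."""
    with open(path, "rb") as f:
        data = f.read()
    return {"file": os.path.basename(path), "bytes": len(data)}


def load_cache(cache_path):
    """Return the previous cache as a dict, or {} if missing or unreadable."""
    try:
        with open(cache_path, "r", encoding="utf-8") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def rebuild_cache(font_dir, cache_path, scan=scan_font):
    """Rebuild the cache, rescanning only new or modified files.

    Returns the sorted list of file names that had to be rescanned.
    """
    old = load_cache(cache_path)
    new = {}
    rescanned = []
    for name in sorted(os.listdir(font_dir)):
        path = os.path.join(font_dir, name)
        if not os.path.isfile(path):
            continue
        st = os.stat(path)
        entry = old.get(name)
        if (entry is not None
                and entry["mtime_ns"] == st.st_mtime_ns
                and entry["size"] == st.st_size):
            # Metadata unchanged: reuse the cached pattern without scanning.
            new[name] = entry
        else:
            new[name] = {
                "mtime_ns": st.st_mtime_ns,
                "size": st.st_size,
                "pattern": scan(path),
            }
            rescanned.append(name)

    # Write to a temporary file in the same directory, then rename over the
    # old cache so readers see either the old or the new file, never a mix.
    cache_dir = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(new, f)
        os.replace(tmp, cache_path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return rescanned
```

With this layout, adding one font to a directory of hundreds costs one scan
plus one stat call per existing file. Files removed from the directory drop
out of the cache because only files found on disk are carried over.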