[Fontconfig] Next steps for a reproducible Fontconfig?
Alexander Larsson
alexander.larsson at gmail.com
Thu Jan 10 10:53:58 UTC 2019
On Wed, Jan 9, 2019 at 8:29 PM Keith Packard <keithp at keithp.com> wrote:
>
> Akira TAGOH <akira at tagoh.org> writes:
>
> > As of the discussion on the list, Keith's changes doesn't address the
> > original purpose - allow sharing caches on bind-mounts in flatpaks.
> > particularly for the case where flatpaks uses same location in
> > sandbox.
>
> I'm probably forgetting a bunch of context here, but I think the problem
> was that flatpaks have /run/host/fonts pointing to the "real"
> /usr/share/fonts and then a separate /usr/share/fonts of their own with
> a small set of fonts, and so you end up with collisions in the cache
> file namespace as both directories end up generating the same cache file
> name.
Yes, and there are two aspects to this problem:
1) A sandbox has something in /usr/share/fonts that isn't the same as
what is on the host, but the cache is shared, so we pick up the host
cache for the sandbox directory (or the reverse), causing misrendering
of fonts and whatnot.
2) A version of the host /usr/share/fonts is available to the sandbox
under a different pathname, causing the host cache to not be used and
thus have to be regenerated, which is slow the first time the app is
started.
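To make the two aspects concrete, here is a rough Python sketch. It
assumes the cache file name is derived only from the directory path;
the cache_key helper and the use of MD5 are illustrative stand-ins, not
a copy of the fontconfig code.

```python
import hashlib


def cache_key(font_dir: str) -> str:
    """Hypothetical cache key: a hash of the directory path and nothing else."""
    return hashlib.md5(font_dir.encode("utf-8")).hexdigest()


# Aspect 1: host and sandbox each have a *different* directory at the same
# path, so a path-only key maps both of them to one cache file.
host_view = "/usr/share/fonts"
sandbox_view = "/usr/share/fonts"
collides = cache_key(host_view) == cache_key(sandbox_view)

# Aspect 2: the host directory is visible in the sandbox under another path,
# so the key differs and the existing host cache is never found.
host_in_sandbox = "/run/host/fonts"
misses = cache_key(host_view) != cache_key(host_in_sandbox)

print("same path, different dirs share a key:", collides)
print("same dir, different paths miss the cache:", misses)
```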
I'd like to repeat that this is not really flatpak specific as such.
The issue can happen in multiple cases like nfs mounts, multi-boot
systems, docker containers, etc.
Also, the focus on /usr/share/fonts here is mainly for illustration,
the same issue could happen with any other path. For instance,
separate containers with a shared $HOME would have their font path
(whatever it is, as long as it is the same) conflict between the
containers.
Flatpak has an additional weakness here, which is that we don't store
mtimes (to maximize content sharing abilities the mtime is not part of
the content addressing). This means the mtime can't be used to detect
a stale cache, so we use the uuid to detect such changes.
> Making the build reproducible means having all content generated
> deterministically based only on the source package and toolchain. The
> current UUID files are generated randomly making them
> non-deterministic.
>
> For the font cache, making it reproducible requires that the keys
> mapping directories to cache filenames be the same each time the cache
> is built. This means we cannot use the current randomly generated UUID
> values and also have a reproducible system.
I think this misrepresents what the UUID is for. The UUID represents
a unique identifier for the *location*, not the contents, and the goal
is to make it independent of how you found the directory. If you add
fonts to that directory, then you're supposed to keep the UUID because
you want to regenerate the same cache for the new content.
I realize this is not what the reproducible builds project wants, but
it is what the UUID was added for.
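As a minimal sketch of that "identity of the location" idea, assuming a
per-directory marker file named .uuid and a hypothetical location_key
helper (neither is meant as the real fontconfig API): the key survives
adding a font, and it is the same no matter which path leads to the
directory.

```python
import os
import tempfile
import uuid

UUID_FILE = ".uuid"  # assumed name of the per-directory marker file


def location_key(font_dir: str) -> str:
    """Return the directory's UUID if it has one, else fall back to its path."""
    marker = os.path.join(font_dir, UUID_FILE)
    if os.path.exists(marker):
        with open(marker, encoding="ascii") as f:
            return f.read().strip()
    return font_dir


with tempfile.TemporaryDirectory() as root:
    fonts = os.path.join(root, "fonts")
    os.mkdir(fonts)
    with open(os.path.join(fonts, UUID_FILE), "w", encoding="ascii") as f:
        f.write(str(uuid.uuid4()))

    key_before = location_key(fonts)

    # Adding a font changes the contents but not the identity of the location.
    with open(os.path.join(fonts, "new-font.ttf"), "wb") as f:
        f.write(b"placeholder")
    key_after = location_key(fonts)

    # The same directory reached through another path keeps the same key.
    alias = os.path.join(root, "alias")
    os.symlink(fonts, alias)
    key_via_alias = location_key(alias)

print(key_before == key_after == key_via_alias)
```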
> I considered whether we might provide a mechanism to generate UUID
> values deterministically for purposes of packaging. However, this would
> mean that we couldn't use these same packages when creating a flatpak as
> the deterministic UUID values would collide if those same packages were
> used in the outer system.
I think this furthers the misunderstanding from above, but let's
continue with this idea.
If the UUID really *was* content addressed, then it would change each
time some font was added, and old font caches would become stale (and
reaped via some other way like mtimes). In this case the fact that
caches between the sandbox and the host collide is not even a
problem, since the cached data is identical and could be shared. The
problem is rather that the font directory is mutable, and if it
changes without immediately updating the uuid you run into issues.
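For comparison, a content-addressed identifier could look roughly like
the following. This is only a sketch of the thought experiment; the
content_id and write_font helpers and the choice of SHA-256 are my own
assumptions, and nothing like this exists in fontconfig.

```python
import hashlib
import os
import tempfile


def content_id(font_dir: str) -> str:
    """Hypothetical content-addressed ID: hash file names and bytes, no mtimes."""
    digest = hashlib.sha256()
    for name in sorted(os.listdir(font_dir)):
        path = os.path.join(font_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            data = f.read()
        encoded = name.encode("utf-8")
        # Length prefixes keep (name, data) pairs unambiguous in the hash input.
        digest.update(len(encoded).to_bytes(8, "big") + encoded)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()


def write_font(font_dir: str, name: str, data: bytes) -> None:
    with open(os.path.join(font_dir, name), "wb") as f:
        f.write(data)


with tempfile.TemporaryDirectory() as host, tempfile.TemporaryDirectory() as sandbox:
    write_font(host, "a.ttf", b"AAAA")
    write_font(sandbox, "a.ttf", b"AAAA")

    # Identical contents give identical IDs, so a "collision" is harmless.
    same_content_same_id = content_id(host) == content_id(sandbox)

    # Adding a font changes the ID, which makes the old cache stale.
    id_before = content_id(host)
    write_font(host, "b.ttf", b"BBBB")
    id_changes_on_add = content_id(host) != id_before

print(same_content_same_id, id_changes_on_add)
```

The weak spot is the window between changing the directory and
recomputing the ID, which is the mutability problem described above.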
> Without deterministic UUID values, I'm left with the feeling that
> our only available solutions involve changing how flatpaks reference
> fonts.
>
> If we agree that a solution to this involves changing the flatpak
> mechanism, I'd like to suggest that the most straightforward fix for the
> overall system would be to expose the external fonts using the external
> path names -- bind mounting the external /usr/share/fonts as
> /usr/share/fonts within the flatpak, and creating a new
> /usr/share/fonts-minimal (or whatever) to hold the fonts provided by the
> flatpak itself. With this change, we can simply delete the UUID code
> from fontconfig and go back to using global font paths as keys to the
> font cache database.
I'm willing to make *some* changes to flatpak, but I'm not sure this
is the right approach. First of all it just looks at a tiny subset of
the problem (only flatpak, and only one directory). Secondly, it is
likely to run into issues having non-standard paths. For example, the
fedora flatpak runtime is created from the standard fedora rpms, so it
will have to be tweaked post install, and it's possible that some code
hard-codes /usr/share/fonts/some-specific-font, which is in the app but
not the host.
Also, we'll be guaranteeing that caches for /usr/share/fonts and
/usr/share/fonts-minimal don't conflict, but there is no guarantee
that different versions of /usr/share/fonts-minimal don't conflict. In
flatpak we set XDG_CACHE_HOME separate for each app, so this doesn't
happen cross-apps like it would with e.g. docker. However it will
cause flatpak to fail to detect an update of /usr/share/fonts-minimal
due to the mtime issue. I can imagine changing flatpak to modify the
mtime of the fonts directory after install (since only files are
hardlink-shared between apps, not directories), but this would cause
problems for all outstanding flatpak deployments that don't do this.
As a side point, flatpak also maps /run/host/user-fonts, which points
to the ~/.fonts or ~/.local/share/fonts directory even when the app
doesn't have regular homedir access. It would similarly have to be
changed to use the real path if the above approach is used.
> I'd love to hear about alternative ideas which might lead to solutions
> that make builds involving fontconfig reproducible. I'd be happy to take
> even vague hints at this point; all I've got at this point are a
> collection of dead ends.
Here is my proposal:
Make the uuid *generation* optional and manual. Then, when we create
the flatpak runtime we run fc-cache --make-uuid (or something) to
generate the uuid files. Then fontconfig would never confuse the
sandboxed /usr/share/fonts with any other, and since we would get a
new uuid each time we regenerated the runtime it would correctly pick
up stale caches when we update the runtime (even with no mtime
change).
This would make the default installation of fontconfig reproducible,
and it would solve the first problem (don't mix up sandboxed and host
font dirs). It would also let you opt-in to the uuid in other cases
where it makes sense. For instance, you could have a uuid file on an
NFS share or USB drive font dir, so that any caches for it will always
be the same no matter how it happens to be mounted.
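Roughly, the behaviour I have in mind could be sketched like this.
Everything here is hypothetical: make_uuid stands in for the manual
"fc-cache --make-uuid (or something)" step, cache_name for however
fontconfig would pick the cache file, and the .uuid file name and MD5
fallback are assumptions.

```python
import hashlib
import os
import tempfile
import uuid

UUID_FILE = ".uuid"  # assumed marker file name


def make_uuid(font_dir: str) -> str:
    """Manual, opt-in step: write a fresh UUID into the directory."""
    value = str(uuid.uuid4())
    with open(os.path.join(font_dir, UUID_FILE), "w", encoding="ascii") as f:
        f.write(value)
    return value


def cache_name(font_dir: str) -> str:
    """Use the UUID when one was opted into, otherwise a hash of the path."""
    marker = os.path.join(font_dir, UUID_FILE)
    if os.path.exists(marker):
        with open(marker, encoding="ascii") as f:
            return "uuid-" + f.read().strip()
    return "path-" + hashlib.md5(font_dir.encode("utf-8")).hexdigest()


with tempfile.TemporaryDirectory() as runtime_fonts:
    # Default install: no UUID file, so the name is deterministic.
    default_a = cache_name(runtime_fonts)
    default_b = cache_name(runtime_fonts)

    # Building the runtime opts in; rebuilding it generates a new UUID,
    # so caches from the previous build no longer match, mtime or not.
    make_uuid(runtime_fonts)
    first_build = cache_name(runtime_fonts)
    make_uuid(runtime_fonts)
    second_build = cache_name(runtime_fonts)

print(default_a == default_b, first_build != second_build)
```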
We still wouldn't have a way to reuse host caches which were mounted
in a different way, but if we assume all conflicting directories use
uuids (like they would in the flatpak case), then we could solve this
in a pretty simple way by a config file saying "treat all instances of
/run/host/fonts as /usr/share/fonts", and I could make flatpak
generate such a file.
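The remapping itself would be simple. As a sketch only, with the REMAP
table standing in for the config file and canonical_dir being a
hypothetical helper name:

```python
import posixpath

# Hypothetical contents of the config file flatpak would generate.
REMAP = {"/run/host/fonts": "/usr/share/fonts"}


def canonical_dir(path: str, remap: dict) -> str:
    """Rewrite a sandbox path to the host path it should share caches with."""
    path = posixpath.normpath(path)
    for alias, target in remap.items():
        # Match the alias itself or anything below it, but not siblings
        # that merely share the prefix (e.g. /run/host/fonts-extra).
        if path == alias or path.startswith(alias + "/"):
            return target + path[len(alias):]
    return path


print(canonical_dir("/run/host/fonts/dejavu", REMAP))
print(canonical_dir("/run/host/fonts-extra", REMAP))
```

The cache lookup would then use the canonical path instead of the path
as seen inside the sandbox.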