[Mesa-dev] shader cache backward compatibility

Bas Nieuwenhuizen bas at basnieuwenhuizen.nl
Fri Aug 31 19:36:11 UTC 2018


On Fri, Aug 31, 2018 at 4:05 PM Emil Velikov <emil.l.velikov at gmail.com> wrote:
>
> On 31 August 2018 at 14:36, Timothy Arceri <tarceri at itsqueeze.com> wrote:
> > On 31/08/18 21:07, Emil Velikov wrote:
> >>
> >> On 31 August 2018 at 11:37, Timothy Arceri <tarceri at itsqueeze.com> wrote:
> >>>
> >>> On 31/08/18 20:10, Emil Velikov wrote:
> >>>>
> >>>>
> >>>> Hi Timothy,
> >>>>
> >>>> On 31 August 2018 at 10:57, Timothy Arceri <tarceri at itsqueeze.com>
> >>>> wrote:
> >>>>>
> >>>>>
> >>>>> On 31/08/18 19:40, Bas Nieuwenhuizen wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> +Timothy
> >>>>>>
> >>>>>> On Fri, Aug 31, 2018 at 11:32 AM Alexander Larsson <alexl at redhat.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Hi, I'm the developer behind flatpak (https://flatpak.org/) and we've
> >>>>>>> recently run into some issues with the mesa shader cache. Flatpak has
> >>>>>>> an app/runtime split where the runtime is shared between apps and
> >>>>>>> provides /usr. The runtime contains a version of mesa, but this can
> >>>>>>> be
> >>>>>>> overridden by runtime extensions to add other OpenGL drivers.
> >>>>>>>
> >>>>>>> Each app has a separate $XDG_CACHE_HOME, pointing into the per-app
> >>>>>>> writable storage. For example, gedit has
> >>>>>>> XDG_CACHE_HOME="/home/alex/.var/app/org.gnome.gedit/cache". This
> >>>>>>> causes mesa to store the shader cache per-app in
> >>>>>>> $XDG_CACHE_HOME/mesa_shader_cache.
> >>>>>>>
> >>>>>>> In the regular case this works fine, but sometimes the version of
> >>>>>>> mesa
> >>>>>>> is changed, with the shader cache being left in place. For example,
> >>>>>>> sometimes we update the mesa version in the runtime, and sometimes
> >>>>>>> the
> >>>>>>> app switches to a new runtime which has a different mesa version.
> >>>>>>>
> >>>>>>> Such updates have caused a lot of issues for us, ranging from direct
> >>>>>>> crashes at startup as in
> >>>>>>> https://github.com/flatpak/flatpak/issues/2052 and sometimes just
> >>>>>>> super-slow performance. In all cases, blowing away the shader cache
> >>>>>>> directory fixed all issues.
> >>>>>>>
> >>>>>>> The steam flatpak has a bunch of workarounds for the cache:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> https://github.com/flathub/com.valvesoftware.Steam/blob/master/steam_wrapper/steam_wrapper.py#L35
> >>>>>>> But we can't expect every app to do this.
> >>>>>>>
> >>>>>>> So, my question is, is the cache supposed to be backward compatible,
> >>>>>>> or at least versioned? Are we missing something in our mesa builds to
> >>>>>>> make that work? Is this fixed somewhere with a patch I can backport?
> >>>>>>> And if not, do we need to add some magic to flatpak to automatically
> >>>>>>> clean up the shader cache after an update?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> It is supposed to be versioned automatically by mesa.
> >>>>>>
> >>>>>
> >>>>> Hi Alexander,
> >>>>>
> >>>>> We depend on build timestamps of the mesa/llvm binaries when generating
> >>>>> the
> >>>>> sha for cache items. So if flatpak results in two versions of mesa
> >>>>> having
> >>>>> the same timestamp then there are likely going to be issues.
> >>>>>
> >>>>> static inline bool
> >>>>> disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
> >>>>> {
> >>>>>      Dl_info info;
> >>>>>      struct stat st;
> >>>>>      if (!dladdr(ptr, &info) || !info.dli_fname) {
> >>>>>         return false;
> >>>>>      }
> >>>>>      if (stat(info.dli_fname, &st)) {
> >>>>>         return false;
> >>>>>      }
> >>>>>      *timestamp = st.st_mtime;
> >>>>>      return true;
> >>>>> }
> >>>>>
> >>>> Have you tried using the build-id from src/util/build_id.c?
> >>>>
> >>>
> >>> Hi Emil,
> >>>
> >>> Honestly I've got no idea what that code does. Maybe someone who does
> >>> could
> >>> write patches to switch to it along with an explanation of why it's
> >>> better.
> >>> Even just adding some comments in that file would be helpful.
> >>>
> >>> I don't want to be the one responsible for it (and any new issues with
> >>> the
> >>> cache) when I'm not aware of how it works :(
> >>>
> >> In a few words - it retrieves the unique ID, in our case a sha1, for the
> >> binary. Any change in Mesa source will lead to a different build-id.
> >> The SO thread has longer/better explanation [1]. You can skim through
> >> git log for details and poke contributors with specific questions.
> >
> >
> > > Hmm, I think part of the reason we never did this is that we need an id for
> > llvm also.
> >
> Valid point - I forgot about that.

Did we actually check that LLVM typically does not have a build-id in
typical setups?

One reason we use the timestamp in radv is that I did not know about
build-id back then, and we had this before Intel implemented the build-id
solution.

I suppose I can use the build-id and fall back to the timestamp if it is
not available.
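
Roughly what I have in mind, as a standalone sketch rather than a patch.
To keep it compilable on its own it does not call the helpers in
src/util/build_id.c; it walks the program headers with dl_iterate_phdr()
and looks for the NT_GNU_BUILD_ID note itself, which assumes glibc and
ELF. The names driver_id, find_build_id_cb and get_driver_id are made up
for this example and are not existing Mesa symbols. If no build-id note
is found, it falls back to the file mtime, like the existing
disk_cache_get_function_timestamp().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <elf.h>
#include <link.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>

#define ALIGN4(x) (((x) + 3u) & ~3u)

/* Hypothetical types: not part of Mesa. */
enum driver_id_kind { DRIVER_ID_NONE, DRIVER_ID_BUILD_ID, DRIVER_ID_MTIME };

struct driver_id {
   enum driver_id_kind kind;
   uint8_t bytes[64];
   size_t len;
};

struct find_ctx {
   const void *addr;
   struct driver_id *out;
};

static int
find_build_id_cb(struct dl_phdr_info *info, size_t size, void *data)
{
   struct find_ctx *ctx = data;
   uintptr_t addr = (uintptr_t)ctx->addr;
   bool contains = false;
   (void)size;

   /* Only look at the object whose PT_LOAD segments contain addr. */
   for (unsigned i = 0; i < info->dlpi_phnum; i++) {
      const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
      uintptr_t start = (uintptr_t)(info->dlpi_addr + ph->p_vaddr);
      if (ph->p_type == PT_LOAD && addr >= start && addr < start + ph->p_memsz)
         contains = true;
   }
   if (!contains)
      return 0; /* keep iterating over the other loaded objects */

   /* Scan the PT_NOTE segments of that object for the GNU build-id. */
   for (unsigned i = 0; i < info->dlpi_phnum; i++) {
      const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
      if (ph->p_type != PT_NOTE)
         continue;
      const uint8_t *p = (const uint8_t *)(uintptr_t)(info->dlpi_addr + ph->p_vaddr);
      const uint8_t *end = p + ph->p_memsz;
      while ((size_t)(end - p) >= sizeof(ElfW(Nhdr))) {
         const ElfW(Nhdr) *nh = (const ElfW(Nhdr) *)p;
         size_t left = (size_t)(end - p) - sizeof(*nh);
         size_t name_sz = ALIGN4((size_t)nh->n_namesz);
         if (name_sz > left || nh->n_descsz > left - name_sz)
            break; /* malformed or differently aligned note, give up */
         const uint8_t *name = p + sizeof(*nh);
         const uint8_t *desc = name + name_sz;
         if (nh->n_type == NT_GNU_BUILD_ID && nh->n_namesz == 4 &&
             memcmp(name, "GNU", 4) == 0 && nh->n_descsz > 0 &&
             nh->n_descsz <= sizeof(ctx->out->bytes)) {
            memcpy(ctx->out->bytes, desc, nh->n_descsz);
            ctx->out->len = nh->n_descsz;
            ctx->out->kind = DRIVER_ID_BUILD_ID;
            return 1;
         }
         if (ALIGN4((size_t)nh->n_descsz) > left - name_sz)
            break;
         p = desc + ALIGN4((size_t)nh->n_descsz);
      }
   }
   return 1; /* right object but no build-id note: stop iterating */
}

/* Prefer the build-id of the object containing addr; else use its mtime. */
static bool
get_driver_id(const void *addr, struct driver_id *out)
{
   struct find_ctx ctx = { addr, out };
   Dl_info info;
   struct stat st;

   assert(out != NULL);
   memset(out, 0, sizeof(*out));

   dl_iterate_phdr(find_build_id_cb, &ctx);
   if (out->kind == DRIVER_ID_BUILD_ID)
      return true;

   /* Fallback: same idea as disk_cache_get_function_timestamp(). */
   if (!dladdr(addr, &info) || !info.dli_fname || stat(info.dli_fname, &st))
      return false;
   uint32_t timestamp = (uint32_t)st.st_mtime;
   memcpy(out->bytes, &timestamp, sizeof(timestamp));
   out->len = sizeof(timestamp);
   out->kind = DRIVER_ID_MTIME;
   return true;
}
```

The caller would pass the address of some function inside the driver and
feed out->bytes into the cache key. Recording the kind alongside the
bytes keeps a build-id and a timestamp from ever being mistaken for each
other. On glibc older than 2.34 this needs -ldl for dladdr().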

>
> A couple of ideas come to mind:
>  - static link LLVM (Flatpak already does it)
> No LLVM changes needed.
>
>  - shared link LLVM
> LLVM add -Wl,--build-id=sha1
>
> In both cases Mesa will need something like
> s/disk_cache_get_function_timestamp/build_id_find_nhdr_for_addr/
>
> HTH
> Emil

