[Mesa-dev] [PATCH] glsl: fix disk cache eviction issue
Tapani Pälli
tapani.palli at intel.com
Tue Jan 10 12:58:18 UTC 2017
On 01/10/2017 01:45 PM, Timothy Arceri wrote:
> On Tue, 2017-01-10 at 22:41 +1100, Timothy Arceri wrote:
>> On Tue, 2017-01-10 at 22:34 +1100, Timothy Arceri wrote:
>>> On Tue, 2017-01-10 at 12:20 +0200, Tapani Pälli wrote:
>>>>
>>>> On 01/09/2017 05:42 PM, Emil Velikov wrote:
>>>>> On 9 January 2017 at 15:36, Emil Velikov <emil.l.velikov at gmail.
>>>>> co
>>>>> m>
>>>>> wrote:
>>>>>> On 9 December 2016 at 05:58, Tapani Pälli <tapani.palli at intel
>>>>>> .c
>>>>>> om
>>>>>>> wrote:
>>>>>>> When 'evict_random_item' attempts to remove cache content
>>>>>>> to
>>>>>>> make more
>>>>>>> space, it may try to remove from new cache directory it
>>>>>>> just
>>>>>>> created
>>>>>>> which only has '.tmp' content at this phase and not proper
>>>>>>> cache files,
>>>>>>> this will cause eviction to fail and 'test_put_and_get'
>>>>>>> subtest
>>>>>>> assumptions about cache size and item count won't hold.
>>>>>>>
>>>>>>> Patch fixes this issue by adding 'exception' argument for
>>>>>>> random file
>>>>>>> matching so that algorithm won't pick the new directory.
>>>>>>>
>>>>>>
>>>>>> Hi Tapani,
>>>>>>
>>>>>> I've mentioned this over at #android-ia, but adding it here
>>>>>> for
>>>>>> posterity.
>>>>>>
>>>>>>
>>>>>> The issue itself is actually quite different - in my
>>>>>> experiments
>>>>>> with
>>>>>> importing SHA1 implementation to mesa, I've noticed that
>>>>>> there
>>>>>> bug
>>>>>> lies with a buggy (and/or undefined behaviour) SHA1
>>>>>> implementation
>>>>>> given the input data.
>>>>>>
>>>>>> Namely:
>>>>>> As one observes the output of _mesa_sha1_compute they will
>>>>>> notice
>>>>>> that
>>>>>> it differs across implementations. This in itself combined
>>>>>> with
>>>>>> the
>>>>>> (afaict) correct assumption in test_put_and_get() that the
>>>>>> beginning
>>>>>> for the SHA1blob_key_byte_zero
>>>>>>
>>>>>
>>>>> [Pressed enter to quickly]
>>>>>
>>>>> ...the SHA1 matches blob_key_byte_zero (as mentioned in the
>>>>> comment
>>>>> of
>>>>> said function) makes things work ... sometimes.
>>>>
>>>> OK looks like this is going to be pretty tricky to debug.
>>>> Independent
>>>> of
>>>> this test issue however I still believe my patch is
>>>> required/correct.
>>>> There is nothing currently in cache eviction code that would
>>>> prevent
>>>> it
>>>> from picking the directory created for new entry. If it picks it,
>>>> it
>>>> does not delete anything away (since no cache files inside) and
>>>> therefore does not respect the maximum size limit set for the
>>>> cache.
>>>
>>> There are some interesting problems to be solved/improved with
>>> regards
>>> to eviction. The current version of shader-cache stores everything
>>> in
>>> a
>>> single directory which I intend to change so that we instead store
>>> things in directory structures like so:
>>>
>>> ./Mesa-13.0.0/i965-BDW/
>>> ./Mesa-13.0.1/i965-BDW/
>>> ./Mesa-13.0.1/radeonsi/
>>>
>>> This will help avoid costly fallbacks that are in the current
>>> patchset
>>> where we fallback to a recompile at drawtime the first time we use
>>> Mesa
>>> 13.0.1 on a shader that was cached in Mesa-13.0.0. This will also
>>> help
>>> 3rd parties wishing to push pre-compiled shaders to a users
>>> machine.
>>>
>>> In this case we would always want to evict only from older Mesa
>>> versions one question is how much we should delete at a time.
>>>
>>> Getting back to this patch one thing I don't understand is how we
>>> could
>>> hit this problem in a realistic scenario. Directories names are
>>> only
>>> two chars long and are hexidecimals which means a max of 256
>>> directories it should be impossible to reach the max size before
>>> all
>>> possible directories are created unless the cache size is extremely
>>> small in which case we are going to have other problems.
>>>
>>> It may just make more sense to generate all possible directories
>>> when
>>> the parent directory is first created.
>>
>> Oh right the problem is that its empty. Still unless the cache is
>> very
>> small this seems unlikely. Did you actually come across this problem
>> somehow?
>
> Oh there is a bug report. I should have looked at the original email,
> those max sizes seem a bit ridiculous.
Purpose of that test is to test if eviction works (and honors the cache
size) so the reason it sets small size is to reach the max. I'm OK with
not fixing this though but then we should maybe remove that test.
>>
>> Another problem I've considered is that we always delete smaller
>> files
>> than the ones we create and go over the cache size that way although
>> that seems unlikely also.
>>
>> Hopefully we can land it soon and get some real world testing.
>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>>
>>>>> So in a Tl;Dr; we want to drop the ~5 different codepaths, in
>>>>> favour
>>>>> of a single one, used by everyone - devs, testers, and users.
>>>>>
>>>>
>>>> Right, I agree this would be very nice. I will still check what
>>>> supporting boringssl would mean, at least from quick glance it
>>>> has
>>>> quite
>>>> different API from others (which is also unstable ..) which did
>>>> not
>>>> encourage me very much last time.
>>>>
>>>> // Tapani
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list