[Mesa-dev] [PATCH 7/7] util/disk_cache: also write key_blob to cache entry

Timothy Arceri tarceri at itsqueeze.com
Thu Mar 16 13:08:28 UTC 2017


On 16/03/17 23:40, Grazvydas Ignotas wrote:
> On Thu, Mar 16, 2017 at 3:28 AM, Timothy Arceri <tarceri at itsqueeze.com> wrote:
>> On 16/03/17 10:09, Grazvydas Ignotas wrote:
>>>
>>> This can be used to deal with key hash collisions from different
>>> versions (should we find that to actually happen) and to find
>>> which mesa version produced the entry.
>>>
>>> Signed-off-by: Grazvydas Ignotas <notasas at gmail.com>
>>> ---
>>>  src/util/disk_cache.c | 24 ++++++++++++++++++++++--
>>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
>>> index ad591be..8ac9e48 100644
>>> --- a/src/util/disk_cache.c
>>> +++ b/src/util/disk_cache.c
>>> @@ -766,6 +766,7 @@ destroy_put_job(void *job, int thread_index)
>>>  struct cache_entry_file_data {
>>>     uint32_t crc32;
>>>     uint32_t uncompressed_size;
>>> +   uint32_t key_blob_size;
>>>  };
>>>
>>>  static void
>>> @@ -839,6 +840,7 @@ cache_put(void *job, int thread_index)
>>>     struct cache_entry_file_data cf_data;
>>>     cf_data.crc32 = util_hash_crc32(dc_job->data, dc_job->size);
>>>     cf_data.uncompressed_size = dc_job->size;
>>> +   cf_data.key_blob_size = dc_job->cache->key_blob_size;
>>>
>>>     size_t cf_data_size = sizeof(cf_data);
>>>     ret = write_all(fd, &cf_data, cf_data_size);
>>> @@ -847,6 +849,16 @@ cache_put(void *job, int thread_index)
>>>        goto done;
>>>     }
>>>
>>> +   /* Write the key_blob, this can be used to find information about the
>>> +    * mesa version that produced the entry or deal with hash collisions,
>>> +    * should that ever become a real problem.
>>> +    */
>>> +   ret = write_all(fd, dc_job->cache->key_blob,
>>> +                   dc_job->cache->key_blob_size);
>>> +   if (ret == -1) {
>>> +      unlink(filename_tmp);
>>> +      goto done;
>>> +   }
>>> +
>>>     /* Now, finally, write out the contents to the temporary file, then
>>>      * rename them atomically to the destination filename, and also
>>>      * perform an atomic increment of the total cache size.
>>> @@ -859,7 +871,7 @@ cache_put(void *job, int thread_index)
>>>     }
>>>     rename(filename_tmp, filename);
>>>
>>> -   file_size += cf_data_size;
>>> +   file_size += cf_data_size + dc_job->cache->key_blob_size;
>>>     p_atomic_add(dc_job->cache->size, file_size);
>>>
>>>   done:
>>> @@ -966,8 +978,16 @@ disk_cache_get(struct disk_cache *cache, const cache_key key, size_t *size)
>>>           goto fail;
>>>     }
>>>
>>> +   if (cf_data.key_blob_size != cache->key_blob_size)
>>> +      goto fail;
>>> +
>>> +   /* Right now we don't use the key_blob, just skip it */
>>> +   ret = lseek(fd, cache->key_blob_size, SEEK_CUR);
>>> +   if (ret == -1)
>>> +      goto fail;
>>> +
>>>     /* Load the actual cache data. */
>>> -   size_t cache_data_size = sb.st_size - cf_data_size;
>>> +   size_t cache_data_size = sb.st_size - cf_data_size -
>>> +      cache->key_blob_size;
>>>     for (len = 0; len < cache_data_size; len += ret) {
>>>        ret = read(fd, data + len, cache_data_size - len);
>>>        if (ret == -1)
>>>
>>
>> I'm not sure how much I like writing this as a blob. It makes reading the
>> header difficult.
>
> I was thinking how to make it parsable and decided to make it an
> opaque blob. You see, this is becoming kind of a user ABI that will
> be difficult to maintain without breaking existing tools, because
> things are rapidly changing in projects like mesa.

Yes, that's kind of the point: we want to be able to easily parse this 
information. I doubt this will change much, but if it does it's up to the 
app devs to either update their tools or provide a fix for Mesa.

Concatenating the llvm-id + mesa-id is fine, we just need these values 
to be parsable. I can see the idea behind an opaque blob, but for our use 
case it is not going to work (at least not without making things more 
difficult than they need to be).

> We don't know what
> will be needed in future and if those 3 strings you listed below will
> be enough, not to mention driver differences. For example, it might
> easily turn out that shaders also have a kernel dependency (kernel
> enables some feature in GPU needed for shader to run), or there is
> some unexpected dependency on libdrm or something, and we need to take
> their version too. What will we do, add a few more version strings?
> But the tools only look at first 3. Concatenate? Then why have mesa
> and llvm separate?
>
> I guess the tools could use conventional means to find the metadata
> they need (GPU name, etc) through GL strings, set the cache directory
> and compile a test shader to get the version blob and use that to
> identify other entries to collect. Otherwise if you think it must be
> strings, I'd say it should be just 2 strings, combined version of
> everything relevant and gpu_name, separate mesa+llvm does not suit all
> drivers and is likely not future proof anyway.
>
> Gražvydas
>
>> For example, radeonsi will contain an llvm_id string while
>> i965 will not. Can we not just define a struct of driver keys to be
>> passed to disk_cache_create()?
>>
>> struct cache_driver_keys {
>>    char *mesa_id;
>>    char *llvm_id;
>>    char *gpu_id;
>> };
>>
>> It does mean we can't just use cache_entry_file_data for these strings, but
>> that's not a big deal. We do need to at least write 0 when llvm_id is NULL.
>>
>> As well as those 3, we would probably want to add pointer size to
>> cache_entry_file_data.

