[Mesa-dev] [PATCH 32/32] r600/radeonsi: enable glsl/tgsi on-disk cache

Nicolai Hähnle nhaehnle at gmail.com
Wed Feb 15 12:49:55 UTC 2017


On 14.02.2017 23:51, Marek Olšák wrote:
> On Tue, Feb 14, 2017 at 11:47 PM, Timothy Arceri <tarceri at itsqueeze.com> wrote:
>>
>>
>> On 15/02/17 08:35, Marek Olšák wrote:
>>>
>>> On Tue, Feb 14, 2017 at 9:53 PM, Timothy Arceri <tarceri at itsqueeze.com>
>>> wrote:
>>>>
>>>> On 15/02/17 02:14, Marek Olšák wrote:
>>>>
>>>>> On Tue, Feb 14, 2017 at 1:52 AM, Timothy Arceri <tarceri at itsqueeze.com>
>>>>> wrote:
>>>>>>
>>>>>> ---
>>>>>>    src/gallium/drivers/r600/r600_pipe.c          | 10 ++++++++++
>>>>>>    src/gallium/drivers/radeon/r600_pipe_common.c |  2 +-
>>>>>>    src/gallium/drivers/radeon/r600_pipe_common.h |  1 +
>>>>>>    src/gallium/drivers/radeonsi/si_pipe.c        | 11 +++++++++++
>>>>>>    src/gallium/include/pipe/p_screen.h           |  3 +++
>>>>>>    src/mesa/state_tracker/st_context.c           |  2 ++
>>>>>>    6 files changed, 28 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/src/gallium/drivers/r600/r600_pipe.c
>>>>>> b/src/gallium/drivers/r600/r600_pipe.c
>>>>>> index 5290f40..bdd8c0a 100644
>>>>>> --- a/src/gallium/drivers/r600/r600_pipe.c
>>>>>> +++ b/src/gallium/drivers/r600/r600_pipe.c
>>>>>> @@ -600,6 +600,7 @@ static void r600_destroy_screen(struct pipe_screen*
>>>>>> pscreen)
>>>>>>                   compute_memory_pool_delete(rscreen->global_pool);
>>>>>>           }
>>>>>>
>>>>>> +       disk_cache_destroy(rscreen->b.b.disk_shader_cache);
>>>>>>           r600_destroy_common_screen(&rscreen->b);
>>>>>>    }
>>>>>>
>>>>>> @@ -633,6 +634,15 @@ struct pipe_screen *r600_screen_create(struct
>>>>>> radeon_winsys *ws)
>>>>>>                   return NULL;
>>>>>>           }
>>>>>>
>>>>>> +       uint32_t mesa_timestamp;
>>>>>> +       if (disk_cache_get_function_timestamp(r600_screen_create,
>>>>>> &mesa_timestamp)) {
>>>>>> +               char *timestamp_str;
>>>>>> +               if (asprintf(&timestamp_str, "%u", mesa_timestamp) !=
>>>>>> -1)
>>>>>> {
>>>>>> +                       rscreen->b.b.disk_shader_cache =
>>>>>> disk_cache_create(r600_get_chip_name(&rscreen->b), timestamp_str);
>>>>>> +                       free(timestamp_str);
>>>>>> +               }
>>>>>> +       }
>>>>>> +
>>>>>>           if (rscreen->b.info.chip_class >= EVERGREEN) {
>>>>>>                   rscreen->b.b.is_format_supported =
>>>>>> evergreen_is_format_supported;
>>>>>>           } else {
>>>>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
>>>>>> b/src/gallium/drivers/radeon/r600_pipe_common.c
>>>>>> index 95a6a48..4e5582f 100644
>>>>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>>>>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>>>>>> @@ -722,7 +722,7 @@ static const char* r600_get_device_vendor(struct
>>>>>> pipe_screen* pscreen)
>>>>>>           return "AMD";
>>>>>>    }
>>>>>>
>>>>>> -static const char* r600_get_chip_name(struct r600_common_screen
>>>>>> *rscreen)
>>>>>> +const char* r600_get_chip_name(struct r600_common_screen *rscreen)
>>>>>>    {
>>>>>>           switch (rscreen->info.family) {
>>>>>>           case CHIP_R600: return "AMD R600";
>>>>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
>>>>>> b/src/gallium/drivers/radeon/r600_pipe_common.h
>>>>>> index 6eff9aa..0449d4d 100644
>>>>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.h
>>>>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
>>>>>> @@ -765,6 +765,7 @@ void radeon_save_cs(struct radeon_winsys *ws,
>>>>>> struct
>>>>>> radeon_winsys_cs *cs,
>>>>>>                       struct radeon_saved_cs *saved);
>>>>>>    void radeon_clear_saved_cs(struct radeon_saved_cs *saved);
>>>>>>    bool r600_check_device_reset(struct r600_common_context *rctx);
>>>>>> +const char* r600_get_chip_name(struct r600_common_screen *rscreen);
>>>>>>
>>>>>>    /* r600_gpu_load.c */
>>>>>>    void r600_gpu_load_kill_thread(struct r600_common_screen *rscreen);
>>>>>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
>>>>>> b/src/gallium/drivers/radeonsi/si_pipe.c
>>>>>> index 853d850..0bb95b1 100644
>>>>>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>>>>>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>>>>>> @@ -712,6 +712,7 @@ static void si_destroy_screen(struct pipe_screen*
>>>>>> pscreen)
>>>>>>                   }
>>>>>>           }
>>>>>>           pipe_mutex_destroy(sscreen->shader_parts_mutex);
>>>>>> +       disk_cache_destroy(sscreen->b.b.disk_shader_cache);
>>>>>>           si_destroy_shader_cache(sscreen);
>>>>>>           r600_destroy_common_screen(&sscreen->b);
>>>>>>    }
>>>>>> @@ -801,6 +802,16 @@ struct pipe_screen *radeonsi_screen_create(struct
>>>>>> radeon_winsys *ws)
>>>>>>                   return NULL;
>>>>>>           }
>>>>>>
>>>>>> +       uint32_t mesa_timestamp, llvm_timestamp;
>>>>>> +       if (disk_cache_get_function_timestamp(radeonsi_screen_create,
>>>>>> &mesa_timestamp) &&
>>>>>> +
>>>>>> disk_cache_get_function_timestamp(LLVMInitializeAMDGPUTargetInfo,
>>>>>> &llvm_timestamp)) {
>>>>>> +               char *timestamp_str;
>>>>>> +               if (asprintf(&timestamp_str, "%u_%u", mesa_timestamp,
>>>>>> llvm_timestamp) != -1) {
>>>>>> +                       sscreen->b.b.disk_shader_cache =
>>>>>> disk_cache_create(r600_get_chip_name(&sscreen->b), timestamp_str);
>>>>>
>>>>> Can you please at least make some effort to get close to 80 chars per
>>>>> line even though it's not strictly required in this driver?
>>>>>
>>>>>> +                       free(timestamp_str);
>>>>>> +               }
>>>>>> +       }
>>>>>> +
>>>>>>           si_handle_env_var_force_family(sscreen);
>>>>>>
>>>>>>           if (!debug_get_bool_option("RADEON_DISABLE_PERFCOUNTERS",
>>>>>> false))
>>>>>> diff --git a/src/gallium/include/pipe/p_screen.h
>>>>>> b/src/gallium/include/pipe/p_screen.h
>>>>>> index b6203f1..6fd527f 100644
>>>>>> --- a/src/gallium/include/pipe/p_screen.h
>>>>>> +++ b/src/gallium/include/pipe/p_screen.h
>>>>>> @@ -42,6 +42,7 @@
>>>>>>    #include "pipe/p_format.h"
>>>>>>    #include "pipe/p_defines.h"
>>>>>>    #include "pipe/p_video_enums.h"
>>>>>> +#include "util/disk_cache.h"
>>>>>>
>>>>>>
>>>>>>
>>>>>> @@ -318,6 +319,8 @@ struct pipe_screen {
>>>>>>       const void *(*get_compiler_options)(struct pipe_screen *screen,
>>>>>>                                          enum pipe_shader_ir ir,
>>>>>>                                          unsigned shader);
>>>>>> +
>>>>>> +   struct disk_cache *disk_shader_cache;
>>>>>>    };
>>>>>
>>>>> Changes to gallium/include/pipe must be separate commits. We don't
>>>>> change this interface unless there is a very good reason to do so. You
>>>>> might have to get Brian's or Roland's ack if you wanna pursue this
>>>>> change, though the likelihood of acceptance is pretty low on this one.
>>>>>
>>>>> However you can ignore all the above, because there is a different
>>>>> issue to discuss. If I understand it correctly, this only caches TGSI.
>>>>> What's the point in putting the cache into the driver in that case?
>>>>
>>>> I was trying to land the glsl ir and tgsi pieces and avoid getting stuck
>>>> discussing the GCN cache patch at this point, there will defiantly be a
>>>> GCN
>>>> cache.
>>>>>
>>>>> If
>>>>> the goal is to have a GLSL->TGSI cache, it should live in st/mesa. If
>>>>> the goal is to have a GLSL->GCN cache, it might live in the driver.
>>>>> However, even that is not necessary if you just expose shader binaries
>>>>> to st/mesa, in which case st/mesa can query binaries from the driver
>>>>> and the GLSL->GCN cache can live entirely in st/mesa. If you need the
>>>>> lib timestamp in st/mesa, you can also add a query to pipe_screen
>>>>> returning the lib timestamp only. Then you'll have both native shader
>>>>> binaries and the driver compiler timestamp exposed to st/mesa. So I'd
>>>>> like to understand why the cache is in the driver.
>>>>
>>>> This might work ok for radeonsi as you don't really have to worry about
>>>> variants, but what about other drivers? It would seem odd to add
>>>> something
>>>> to the st if its only used by a single driver wouldn't it? Happy to be
>>>> wrong
>>>> about this, but this was my thought process here.
>>>
>>> OK. That's a fair point. Nouveau and radeonsi don't need a cache in
>>> the driver,


Not sure what you're saying here. Having the GCN in the cache would 
certainly help.

That said, implementing this in the state tracker via an interface like 
below makes sense to me.


>>> but other drivers like r600g might. However, r600g
>>> compilation is so much faster than radeonsi, so it might not be that
>>> important. Also given the rate of r600g development, I don't see a
>>> native shader cache happening there. What other drivers are there
>>> really? We have freedreno and vc4 as the other two actively developed
>>> drivers. So you can ask Rob or Eric if they need a shader cache inside
>>> the driver, and if not, well, there is no point in putting the cache
>>> instance into drivers.
>>
>> I asked Eric about this a while ago and while he hadn't thought about it too
>> much he was thinking he would cache NIR. This is an interesting approach and
>> something that I avoided in i965 because caching nir seemed painful. Anyway
>> I'll point them to this thread for their comments.
>>
>>>
>>>> Also its not clear to me where I can store the binary in the st, that is
>>>> acceptable and accessible by the driver, any advice on this would be
>>>> appreciated.
>>>
>>> The st is never accessible from the driver.
>>
>>
>> Sure what I mean is if we load the radeonsi/nouveau binary in the st at link
>> time, how would you pass that down to the driver?
>
> I would add an interface for creating a shader from a binary. So we have this:
> - pipe_screen::get_compiler_timestamp

I think this should return a string (and have a slightly different name) 
so we have the freedom to add additional information when required. I'm 
thinking GPU type (though that's also in the screen name) and R600_DEBUG 
flags that affect compilation.


> - pipe_context::get_shader_binary
> - pipe_context::create_shader_from_binary

Makes sense, but there's an open question: How do we store (optimized) 
variants in the cache? I admit they're not as important for the raw 
loading time, but it would be nice not to have the compiler threads busy 
during the first seconds of a game, either.

We don't have to solve that all at once, but it would be nice to have an 
interface that can be extended easily.

Cheers,
Nicolai


>
> Marek
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>



More information about the mesa-dev mailing list