[Mesa-dev] [PATCH 2/2] radeonsi: drop few partial flushes when uploading bindless descriptors

Samuel Pitoiset samuel.pitoiset at gmail.com
Tue Jul 4 10:51:17 UTC 2017



On 07/03/2017 09:22 PM, Nicolai Hähnle wrote:
> On 03.07.2017 18:33, Samuel Pitoiset wrote:
>>
>>
>> On 07/03/2017 06:16 PM, Samuel Pitoiset wrote:
>>>
>>>
>>> On 07/03/2017 06:09 PM, Nicolai Hähnle wrote:
>>>> On 03.07.2017 18:03, Nicolai Hähnle wrote:
>>>>> On 29.06.2017 21:59, Samuel Pitoiset wrote:
>>>>>> Only emit partial flushes when the underlying shader stages
>>>>>> are using bindless samplers or images.
>>>>>>
>>>>>> This gets rid of 4% of partial flushes in the DOW3 benchmark.
>>>>>
>>>>> Do those flushes still trigger during play?
>>>>>
>>>>> I'd think it should be possible to eliminate these partial flushes 
>>>>> entirely. The logic is that you only ever need the wait in the 
>>>>> first place when you're overriding a previously used descriptor.
>>>>>
>>>>> So what you could do is add a counter (per context, I think?) which 
>>>>> is incremented whenever a flush happens, and then release 
>>>>> descriptors lazily based on that.
>>>>>
>>>>> Alternatively (maybe easier, not sure): all shaders are guaranteed 
>>>>> to have finished when the fence of a submit fires, so use that as 
>>>>> the guard for lazily releasing descriptors.
>>>>
>>>> Looking at the other patch, it seems the flushes are triggered for 
>>>> descriptor changes for buffer invalidations? In that case, forget 
>>>> what I wrote.
>>>
>>> Yes, most of them are for buffer invalidations.
>>
>> One thing we can do though is to scan all shaders in order to know if 
>> they use bindless texture buffers. Maybe this can help for removing 
>> more partial flushes, what do you think?
> 
> Having the descriptors is proof that *some* shaders use texture buffers. 
> If it's only a small minority of shaders, then that might help. I don't 
> know how DOW3 uses it, so it's hard to say.
> 
> Also, I think I have to retract my R-b for this patch. There's an issue 
> that occurred to me unfortunately (or fortunately, depending on how you 
> look at it). Consider a sequence like:

I would say, fortunately because introducing issues is definitely not a 
good idea. :)

> 
> 1. draw with bindless
> 2. draw without bindless
> 3. buffer invalidation that triggers descriptor upload
> 4. draw without bindless
> 
> At step 3, we would no longer do a shader partial flush with this patch. 
> However, shaders from the first draw might still be running!

You are right.

> 
> I'm not sure if we can fix that. Basically, whenever we started a draw 
> with bindless since the last (implied) gfx shader flush, we need to 
> issue a flush before updating descriptors. So you could track that with 
> a si_context bit that gets set when any shaders use bindless, and 
> cleared with an (implied) gfx shader flush. But I don't know how much 
> that really helps, and I don't think there's anything else we can do. 
> Even crazy schemes like manually counting context rolls wouldn't help, 
> because we can't avoid invalidating the L1K$.

Yes, it's tricky to fix it correctly. And I don't see real performance 
improvements with this patch anyway.

Samuel.

> 
> We're entering Vulkan-like territory where we might have to ask Feral to 
> improve their buffer handling.
> 
> Cheers,
> Nicolai
> 
> 
>>
>>>
>>>>
>>>> Cheers,
>>>> Nicolai
>>>>
>>>>
>>>>>
>>>>> Anyway, this patch is a nice first step.
>>>>>
>>>>> Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>>>>
>>>>>
>>>>>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
>>>>>> ---
>>>>>>   src/gallium/drivers/radeonsi/si_descriptors.c | 18 
>>>>>> ++++++++++++++++--
>>>>>>   1 file changed, 16 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
>>>>>> b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>> index 88f7dcee959..7d8b3670887 100644
>>>>>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>> @@ -1934,14 +1934,28 @@ static void 
>>>>>> si_upload_bindless_descriptor(struct si_context *sctx,
>>>>>>   static void si_upload_bindless_descriptors(struct si_context *sctx)
>>>>>>   {
>>>>>> +    unsigned shader_uses_bindless_mask;
>>>>>> +
>>>>>>       if (!sctx->bindless_descriptors_dirty)
>>>>>>           return;
>>>>>>       /* Wait for graphics/compute to be idle before updating the 
>>>>>> resident
>>>>>>        * descriptors directly in memory, in case the GPU is using 
>>>>>> them.
>>>>>>        */
>>>>>> -    sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
>>>>>> -             SI_CONTEXT_CS_PARTIAL_FLUSH;
>>>>>> +    sctx->b.flags |= SI_CONTEXT_CS_PARTIAL_FLUSH;
>>>>>> +
>>>>>> +    /* To avoid unnecessary partial flushes, check which shader 
>>>>>> stages are
>>>>>> +     * using bindless samplers or images.
>>>>>> +     */
>>>>>> +    shader_uses_bindless_mask = 
>>>>>> sctx->shader_uses_bindless_samplers_mask |
>>>>>> +                    sctx->shader_uses_bindless_images_mask;
>>>>>> +
>>>>>> +    if (shader_uses_bindless_mask & (1 << PIPE_SHADER_FRAGMENT)) {
>>>>>> +        sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH;
>>>>>> +    } else if (shader_uses_bindless_mask) {
>>>>>> +        sctx->b.flags |= SI_CONTEXT_VS_PARTIAL_FLUSH;
>>>>>> +    }
>>>>>> +
>>>>>>       si_emit_cache_flush(sctx);
>>>>>>       util_dynarray_foreach(&sctx->resident_tex_handles,
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
> 
> 


More information about the mesa-dev mailing list