[Mesa-dev] [PATCH 2/2] radeonsi: drop few partial flushes when uploading bindless descriptors
Samuel Pitoiset
samuel.pitoiset at gmail.com
Tue Jul 4 10:51:17 UTC 2017
On 07/03/2017 09:22 PM, Nicolai Hähnle wrote:
> On 03.07.2017 18:33, Samuel Pitoiset wrote:
>>
>>
>> On 07/03/2017 06:16 PM, Samuel Pitoiset wrote:
>>>
>>>
>>> On 07/03/2017 06:09 PM, Nicolai Hähnle wrote:
>>>> On 03.07.2017 18:03, Nicolai Hähnle wrote:
>>>>> On 29.06.2017 21:59, Samuel Pitoiset wrote:
>>>>>> Only emit partial flushes when the underlying shader stages
>>>>>> are using bindless samplers or images.
>>>>>>
>>>>>> This gets rid of 4% of partial flushes in the DOW3 benchmark.
>>>>>
>>>>> Do those flushes still trigger during play?
>>>>>
>>>>> I'd think it should be possible to eliminate these partial flushes
>>>>> entirely. The logic is that you only ever need the wait in the
>>>>> first place when you're overriding a previously used descriptor.
>>>>>
>>>>> So what you could do is add a counter (per context, I think?) which
>>>>> is incremented whenever a flush happens, and then release
>>>>> descriptors lazily based on that.
>>>>>
>>>>> Alternatively (maybe easier, not sure): all shaders are guaranteed
>>>>> to have finished when the fence of a submit fires, so use that as
>>>>> the guard for lazily releasing descriptors.
>>>>
>>>> Looking at the other patch, it seems the flushes are triggered for
>>>> descriptor changes for buffer invalidations? In that case, forget
>>>> what I wrote.
>>>
>>> Yes, most of them are for buffer invalidations.
>>
>> One thing we can do though is to scan all shaders in order to know if
>> they use bindless texture buffers. Maybe this can help for removing
>> more partial flushes, what do you think?
>
> Having the descriptors is proof that *some* shaders use texture buffers.
> If it's only a small minority of shaders, then that might help. I don't
> know how DOW3 uses it, so it's hard to say.
>
> Also, I think I have to retract my R-b for this patch. There's an issue
> that occurred to me unfortunately (or fortunately, depending on how you
> look at it). Consider a sequence like:
I would say, fortunately because introducing issues is definitely not a
good idea. :)
>
> 1. draw with bindless
> 2. draw without bindless
> 3. buffer invalidation that triggers descriptor upload
> 4. draw without bindless
>
> At step 3, we would no longer do a shader partial flush with this patch.
> However, shaders from the first draw might still be running!
You are right.
>
> I'm not sure if we can fix that. Basically, whenever we started a draw
> with bindless since the last (implied) gfx shader flush, we need to
> issue a flush before updating descriptors. So you could track that with
> a si_context bit that gets set when any shaders use bindless, and
> cleared with an (implied) gfx shader flush. But I don't know how much
> that really helps, and I don't think there's anything else we can do.
> Even crazy schemes like manually counting context rolls wouldn't help,
> because we can't avoid invalidating the L1K$.
Yes, it's tricky to fix it correctly. And I don't see real performance
improvements with this patch anyway.
Samuel.
>
> We're entering Vulkan-like territory where we might have to ask Feral to
> improve their buffer handling.
>
> Cheers,
> Nicolai
>
>
>>
>>>
>>>>
>>>> Cheers,
>>>> Nicolai
>>>>
>>>>
>>>>>
>>>>> Anyway, this patch is a nice first step.
>>>>>
>>>>> Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>>>>
>>>>>
>>>>>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
>>>>>> ---
>>>>>> src/gallium/drivers/radeonsi/si_descriptors.c | 18
>>>>>> ++++++++++++++++--
>>>>>> 1 file changed, 16 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>> b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>> index 88f7dcee959..7d8b3670887 100644
>>>>>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>> @@ -1934,14 +1934,28 @@ static void
>>>>>> si_upload_bindless_descriptor(struct si_context *sctx,
>>>>>> static void si_upload_bindless_descriptors(struct si_context *sctx)
>>>>>> {
>>>>>> + unsigned shader_uses_bindless_mask;
>>>>>> +
>>>>>> if (!sctx->bindless_descriptors_dirty)
>>>>>> return;
>>>>>> /* Wait for graphics/compute to be idle before updating the
>>>>>> resident
>>>>>> * descriptors directly in memory, in case the GPU is using
>>>>>> them.
>>>>>> */
>>>>>> - sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
>>>>>> - SI_CONTEXT_CS_PARTIAL_FLUSH;
>>>>>> + sctx->b.flags |= SI_CONTEXT_CS_PARTIAL_FLUSH;
>>>>>> +
>>>>>> + /* To avoid unnecessary partial flushes, check which shader
>>>>>> stages are
>>>>>> + * using bindless samplers or images.
>>>>>> + */
>>>>>> + shader_uses_bindless_mask =
>>>>>> sctx->shader_uses_bindless_samplers_mask |
>>>>>> + sctx->shader_uses_bindless_images_mask;
>>>>>> +
>>>>>> + if (shader_uses_bindless_mask & (1 << PIPE_SHADER_FRAGMENT)) {
>>>>>> + sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH;
>>>>>> + } else if (shader_uses_bindless_mask) {
>>>>>> + sctx->b.flags |= SI_CONTEXT_VS_PARTIAL_FLUSH;
>>>>>> + }
>>>>>> +
>>>>>> si_emit_cache_flush(sctx);
>>>>>> util_dynarray_foreach(&sctx->resident_tex_handles,
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>
>
More information about the mesa-dev
mailing list