[Mesa-dev] [PATCH v3 5/7] radeonsi: Implement DCC fast clear.

Bas Nieuwenhuizen bas at basnieuwenhuizen.nl
Fri Oct 23 10:06:44 PDT 2015


On Fri, Oct 23, 2015 at 4:57 PM, Marek Olšák <maraeo at gmail.com> wrote:
> On Fri, Oct 23, 2015 at 1:57 PM, Bas Nieuwenhuizen
> <bas at basnieuwenhuizen.nl> wrote:
>> On Fri, Oct 23, 2015 at 1:52 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>> On Fri, Oct 23, 2015 at 1:30 PM, Bas Nieuwenhuizen
>>> <bas at basnieuwenhuizen.nl> wrote:
>>>> On Fri, Oct 23, 2015 at 12:50 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>> On Fri, Oct 23, 2015 at 12:17 PM, Bas Nieuwenhuizen
>>>>> <bas at basnieuwenhuizen.nl> wrote:
>>>>>> On Thu, Oct 22, 2015 at 12:12 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>> index 5548cba3..a277fa5 100644
>>>>>>>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>> @@ -234,7 +234,7 @@ static void si_set_sampler_views(struct pipe_context *ctx,
>>>>>>>>                         } else {
>>>>>>>>                                 samplers->depth_texture_mask &= ~(1 << slot);
>>>>>>>>                         }
>>>>>>>> -                       if (rtex->cmask.size || rtex->fmask.size) {
>>>>>>>> +                       if (rtex->cmask.size || rtex->fmask.size || rtex->surface.dcc_enabled) {
>>>>>>>>                                 samplers->compressed_colortex_mask |= 1 << slot;
>>>>>>>
>>>>>>> I'd like this flag to be set only when dirty_level_mask is non-zero.
>>>>>>> Setting this for all textures that have DCC is quite expensive in draw
>>>>>>> calls.
>>>>>>
>>>>>> I think this code is incorrect even without considering DCC. If we do
>>>>>> a fast clear on a surface which allocates a cmask and then use that
>>>>>> surface as a texture without calling set_sampler_views in between
>>>>>> (because it was bound before) we get a stale compressed_colortex_mask.
>>>>>>
>>>>>> Some testing shows that this can be triggered using OpenGL, although
>>>>>> the GL_ARB_texture_barrier extension may be needed to make the result
>>>>>> not undefined per the specification.
>>>>>
>>>>> In that case, we should decompress in texture_barrier and not in draw calls.
>>>>>
>>>>> Marek
>>>>
>>>>
>>>> texture_barrier does not need to be called, though; only the language
>>>> changes from that extension might be needed.
>>>>
>>>> Basically the test is
>>>>
>>>> fbo1, fbo2 framebuffers with 1 color buffer each:
>>>>
>>>> bind fbo2 as texture
>>>> clear fbo1 using shader
>>>> bind fbo1 as texture
>>>> clear fbo2 using shader
>>>> clear fbo1 using clear (which results in cmask being allocated for fbo1)
>>
>>>> bind fbo2 as texture
>>>> copy fbo2 to fbo1 using copy shader (which wrongly does not decompress fbo1)
>>
>> My apologies, these two lines should just be a copy of fbo1 to fbo2,
>> which does need to eliminate the cmask fast clear.
>
> That sounds like a texture barrier is required.
>
> Marek

I think it is valid even without ARB_texture_barrier, as the only place
where we could have a rendering feedback loop is the clear. The shader
clears and the copy do not have the same fbo bound as a texture, and
therefore cannot form a render feedback loop.
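
To make that argument concrete, here is a minimal sketch in C that walks
the (corrected) test sequence and flags every operation whose render
target is also the currently bound texture. The struct, enum and
function names are hypothetical and only model the test; they are not
GL or Mesa API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the test: two color buffers, one texture binding. */
enum buffer_id { FBO1_COLOR = 1, FBO2_COLOR = 2 };

struct render_op {
    const char *name;
    enum buffer_id target;        /* color buffer written by the operation */
    enum buffer_id bound_texture; /* texture bound when the operation runs */
};

/* The four writes in the corrected sequence, with the binding in effect. */
static const struct render_op test_sequence[] = {
    { "clear fbo1 using shader", FBO1_COLOR, FBO2_COLOR },
    { "clear fbo2 using shader", FBO2_COLOR, FBO1_COLOR },
    { "clear fbo1 using clear",  FBO1_COLOR, FBO1_COLOR },
    { "copy fbo1 to fbo2",       FBO2_COLOR, FBO1_COLOR },
};

#define NUM_OPS (sizeof(test_sequence) / sizeof(test_sequence[0]))

/* True when the operation writes the buffer that is also bound as texture. */
static bool binding_overlaps(const struct render_op *op)
{
    assert(op != NULL);
    return op->target == op->bound_texture;
}

/* Number of operations that are even candidates for a feedback loop. */
static size_t count_binding_overlaps(void)
{
    size_t count = 0;
    for (size_t i = 0; i < NUM_OPS; i++)
        if (binding_overlaps(&test_sequence[i]))
            count++;
    return count;
}
```

Under this model only "clear fbo1 using clear" overlaps with the bound
texture; the shader clears and the copy always write the other buffer.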

I am not sure whether a clear counts as a GL rendering operation. If it
does not, we have no render feedback loop. If it does, it is still not a
render feedback loop, as the active fragment and vertex shaders (the
clear shader) do not contain instructions that sample from that
texture.
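
Separately from the spec question, the driver-side problem quoted
earlier in this thread (a cmask allocated after the texture was already
bound) can be modeled with a toy example. This is only a sketch: the
toy_* types and functions are hypothetical stand-ins whose field names
echo the quoted diff, and the refresh_mask flag is just one assumed
remedy for illustration, not the fix proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_SLOTS 16

/* Toy stand-in for a texture; cmask_size is 0 until a fast clear happens. */
struct toy_texture {
    unsigned cmask_size;
};

/* Toy stand-in for the per-stage sampler state. */
struct toy_samplers {
    struct toy_texture *views[TOY_NUM_SLOTS];
    uint32_t compressed_colortex_mask;
};

/* Bind-time check: the mask bit reflects the cmask state at bind time only. */
static void toy_set_sampler_view(struct toy_samplers *s, unsigned slot,
                                 struct toy_texture *tex)
{
    assert(slot < TOY_NUM_SLOTS);
    s->views[slot] = tex;
    if (tex && tex->cmask_size)
        s->compressed_colortex_mask |= 1u << slot;
    else
        s->compressed_colortex_mask &= ~(1u << slot);
}

/* Fast clear allocates a cmask. With refresh_mask set (an assumed remedy),
 * every slot that already has the texture bound gets its mask bit updated. */
static void toy_fast_clear(struct toy_samplers *s, struct toy_texture *tex,
                           bool refresh_mask)
{
    tex->cmask_size = 4096; /* arbitrary non-zero size */
    if (!refresh_mask)
        return;
    for (unsigned slot = 0; slot < TOY_NUM_SLOTS; slot++)
        if (s->views[slot] == tex)
            s->compressed_colortex_mask |= 1u << slot;
}

/* Would a draw consider the texture in this slot for decompression? */
static bool toy_draw_would_decompress(const struct toy_samplers *s,
                                      unsigned slot)
{
    assert(slot < TOY_NUM_SLOTS);
    return (s->compressed_colortex_mask >> slot) & 1u;
}

/* Bind first, fast clear afterwards, then ask what the draw would see. */
static bool toy_run_scenario(bool refresh_mask)
{
    struct toy_texture fbo1_color = { 0 };
    struct toy_samplers samplers = { 0 };

    toy_set_sampler_view(&samplers, 0, &fbo1_color);
    toy_fast_clear(&samplers, &fbo1_color, refresh_mask);
    return toy_draw_would_decompress(&samplers, 0);
}
```

With refresh_mask false the mask stays stale and the draw would skip the
decompress; with it true the bit is set even though set_sampler_views
was never called again.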

- Bas

