[Mesa-dev] [PATCH v3 5/7] radeonsi: Implement DCC fast clear.

Marek Olšák maraeo at gmail.com
Fri Oct 23 14:34:49 PDT 2015


On Fri, Oct 23, 2015 at 7:06 PM, Bas Nieuwenhuizen
<bas at basnieuwenhuizen.nl> wrote:
> On Fri, Oct 23, 2015 at 4:57 PM, Marek Olšák <maraeo at gmail.com> wrote:
>> On Fri, Oct 23, 2015 at 1:57 PM, Bas Nieuwenhuizen
>> <bas at basnieuwenhuizen.nl> wrote:
>>> On Fri, Oct 23, 2015 at 1:52 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>> On Fri, Oct 23, 2015 at 1:30 PM, Bas Nieuwenhuizen
>>>> <bas at basnieuwenhuizen.nl> wrote:
>>>>> On Fri, Oct 23, 2015 at 12:50 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>> On Fri, Oct 23, 2015 at 12:17 PM, Bas Nieuwenhuizen
>>>>>> <bas at basnieuwenhuizen.nl> wrote:
>>>>>>> On Thu, Oct 22, 2015 at 12:12 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>>>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>>> index 5548cba3..a277fa5 100644
>>>>>>>>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>>> @@ -234,7 +234,7 @@ static void si_set_sampler_views(struct pipe_context *ctx,
>>>>>>>>>                         } else {
>>>>>>>>>                                 samplers->depth_texture_mask &= ~(1 << slot);
>>>>>>>>>                         }
>>>>>>>>> -                       if (rtex->cmask.size || rtex->fmask.size) {
>>>>>>>>> +                       if (rtex->cmask.size || rtex->fmask.size || rtex->surface.dcc_enabled) {
>>>>>>>>>                                 samplers->compressed_colortex_mask |= 1 << slot;
>>>>>>>>
>>>>>>>> I'd like this flag to be set only when dirty_level_mask is non-zero.
>>>>>>>> Setting this for all textures that have DCC is quite expensive in draw
>>>>>>>> calls.
>>>>>>>
>>>>>>> I think this code is incorrect even without considering DCC. If we do
>>>>>>> a fast clear on a surface which allocates a cmask and then use that
>>>>>>> surface as a texture without calling set_sampler_views in between
>>>>>>> (because it was bound before) we get a stale compressed_colortex_mask.
>>>>>>>
>>>>>>> Some testing shows that this can be triggered using OpenGL, although
>>>>>>> the GL_ARB_texture_barrier extension may be needed to make the result
>>>>>>> not undefined per the specification.
>>>>>>
>>>>>> In that case, we should decompress in texture_barrier and not in draw calls.
>>>>>>
>>>>>> Marek
>>>>>
>>>>>
>>>>> texture_barrier does not need to be called, though; the language
>>>>> changes might be needed.
>>>>>
>>>>> Basically the test is
>>>>>
>>>>> fbo1, fbo2 framebuffers with 1 color buffer each:
>>>>>
>>>>> bind fbo2 as texture
>>>>> clear fbo1 using shader
>>>>> bind fbo1 as texture
>>>>> clear fbo2 using shader
>>>>> clear fbo1 using clear (which results in cmask being allocated for fbo1)
>>>
>>>>> bind fbo2 as texture
>>>>> copy fbo2 to fbo1 using copy shader (which wrongly does not decompress fbo1)
>>>
>>> My apologies, these two lines should just be a copy of fbo1 to fbo2,
>>> which does need to eliminate the cmask fast clear.
>>
>> That sounds like a texture barrier is required.
>>
>> Marek
>
> I think it is valid even without ARB_texture_barrier, as the only place
> where we could have a rendering feedback loop is the clear. The shader
> clears and the copy do not have the same fbo as texture and therefore
> no render feedback loop.
>
> I am not sure if a clear qualifies as a GL rendering operation. If it
> is not, we have no render feedback loop. If it is, it is still not a
> render feedback loop as the active fragment and vertex shaders (the
> clear shader) do not contain instructions that sample from that
> texture.

The texture barrier ensures that the previous writes are visible to
the next read of the texture. The previous reads are irrelevant.
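
To make the stale-mask scenario from earlier in the thread concrete, here is
a minimal toy model in C. It is only a sketch: every name in it (toy_texture,
toy_samplers, toy_set_sampler_view, toy_fast_clear, toy_texture_barrier) is
hypothetical and is not the radeonsi code. It assumes the compressed mask is
computed only at bind time, and that a barrier hook could recompute it from
the current state of the bound textures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_SLOTS 8

/* Hypothetical stand-in for a texture; not struct r600_texture. */
struct toy_texture {
    bool has_cmask; /* becomes true when a fast clear allocates a CMASK */
};

/* Hypothetical stand-in for the per-stage sampler view state. */
struct toy_samplers {
    struct toy_texture *views[TOY_NUM_SLOTS];
    uint32_t compressed_mask; /* one bit per slot that needs a decompress */
};

/* Bind-time update: the mask reflects the texture state at bind time only. */
static void toy_set_sampler_view(struct toy_samplers *s, unsigned slot,
                                 struct toy_texture *tex)
{
    s->views[slot] = tex;
    if (tex != NULL && tex->has_cmask)
        s->compressed_mask |= 1u << slot;
    else
        s->compressed_mask &= ~(1u << slot);
}

/* A fast clear allocates a CMASK but does not touch any sampler state. */
static void toy_fast_clear(struct toy_texture *tex)
{
    tex->has_cmask = true;
}

/* Assumed barrier hook: recompute the mask from the bound textures so
 * that earlier writes (here, the fast clear) are seen by the next read. */
static void toy_texture_barrier(struct toy_samplers *s)
{
    s->compressed_mask = 0;
    for (unsigned slot = 0; slot < TOY_NUM_SLOTS; slot++) {
        if (s->views[slot] != NULL && s->views[slot]->has_cmask)
            s->compressed_mask |= 1u << slot;
    }
}
```

In this model, binding a texture, then fast clearing it, leaves
compressed_mask at zero for that slot until either the view is rebound or
toy_texture_barrier() runs. Whether the real driver should refresh the mask
in texture_barrier or elsewhere is the open question in this thread.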

Marek

