[Mesa-dev] [PATCH 3/8] radeonsi: Enable DCC.

Marek Olšák maraeo at gmail.com
Thu Sep 24 10:24:50 PDT 2015


On Thu, Sep 24, 2015 at 2:15 PM, Bas Nieuwenhuizen
<bas at basnieuwenhuizen.nl> wrote:
> Hi Marek,
>
> Thanks for the review and my apologies, it seems I underestimated the
> potential for regressions in this series.
>
> See below for some comments in reaction to the review.
>
> For a v2 is it preferred that I rebase it to master, or keep basing it on the
> old version? There are some function renames that impact this series.

We'd like to merge this eventually, so rebasing would be nice.

>
> On Thursday, September 24, 2015 01:36:31 AM Marek Olšák wrote:
>> On Fri, Sep 4, 2015 at 9:47 PM, Bas Nieuwenhuizen
>>
>> <bas at basnieuwenhuizen.nl> wrote:
>> > The flags to be enabled in the control registers have been taken from
>> > Catalyst traces.
>> >
>> > Signed-off-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
>> > ---
>> >
>> >  src/gallium/drivers/radeon/r600_pipe_common.h |  1 +
>> >  src/gallium/drivers/radeon/r600_texture.c     |  2 ++
>> >  src/gallium/drivers/radeon/r600d_common.h     |  1 +
>> >  src/gallium/drivers/radeonsi/si_blit.c        | 16 ++++++++++++----
>> >  src/gallium/drivers/radeonsi/si_pipe.c        |  2 ++
>> >  src/gallium/drivers/radeonsi/si_pipe.h        |  1 +
>> >  src/gallium/drivers/radeonsi/si_state.c       | 27
>> >  +++++++++++++++++++++++---- src/gallium/drivers/radeonsi/sid.h
>> >   |  1 +
>> >  8 files changed, 43 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
>> > b/src/gallium/drivers/radeon/r600_pipe_common.h index f05318f..dac270e
>> > 100644
>> > --- a/src/gallium/drivers/radeon/r600_pipe_common.h
>> > +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
>> > @@ -244,6 +244,7 @@ struct r600_surface {
>> >
>> >         unsigned cb_color_dim;          /* EG only */
>> >         unsigned cb_color_pitch;        /* EG and later */
>> >         unsigned cb_color_slice;        /* EG and later */
>> >
>> > +       unsigned cb_dcc_base;           /* VI and later */
>> >
>> >         unsigned cb_color_attrib;       /* EG and later */
>> >         unsigned cb_dcc_control;        /* VI and later */
>> >         unsigned cb_color_fmask;        /* CB_COLORn_FMASK (EG and later)
>> >         or CB_COLORn_FRAG (r600) */>
>> > diff --git a/src/gallium/drivers/radeon/r600_texture.c
>> > b/src/gallium/drivers/radeon/r600_texture.c index 46e735e..017f5e7 100644
>> > --- a/src/gallium/drivers/radeon/r600_texture.c
>> > +++ b/src/gallium/drivers/radeon/r600_texture.c
>> > @@ -495,6 +495,8 @@ static void vi_texture_alloc_dcc_separate(struct
>> > r600_common_screen *rscreen,>
>> >         r600_screen_clear_buffer(rscreen, &rtex->dcc_buffer->b.b, 0,
>> >         rtex->surface.dcc_size,>
>> >                                  0xFFFFFFFF, true);
>> >
>> > +
>> > +       rtex->cb_color_info |= VI_S_028C70_DCC_ENABLE(1);
>> >
>> >  }
>> >
>> >  static unsigned r600_texture_get_htile_size(struct r600_common_screen
>> >  *rscreen,>
>> > diff --git a/src/gallium/drivers/radeon/r600d_common.h
>> > b/src/gallium/drivers/radeon/r600d_common.h index 115042d..a3d182c 100644
>> > --- a/src/gallium/drivers/radeon/r600d_common.h
>> > +++ b/src/gallium/drivers/radeon/r600d_common.h
>> > @@ -202,6 +202,7 @@
>> >
>> >  #define   EG_S_028C70_FAST_CLEAR(x)                       (((x) & 0x1) <<
>> >  17) #define   SI_S_028C70_FAST_CLEAR(x)                       (((x) &
>> >  0x1) << 13)>
>> > +#define   VI_S_028C70_DCC_ENABLE(x)                       (((x) & 0x1) <<
>> > 28)>
>> >  /*CIK+*/
>> >  #define R_0300FC_CP_STRMOUT_CNTL                    0x0300FC
>> >
>> > diff --git a/src/gallium/drivers/radeonsi/si_blit.c
>> > b/src/gallium/drivers/radeonsi/si_blit.c index 13bb457..98913e5 100644
>> > --- a/src/gallium/drivers/radeonsi/si_blit.c
>> > +++ b/src/gallium/drivers/radeonsi/si_blit.c
>> > @@ -264,7 +264,7 @@ static void si_blit_decompress_color(struct
>> > pipe_context *ctx,>
>> >                 return;
>> >
>> >         for (level = first_level; level <= last_level; level++) {
>> >
>> > -               if (!(rtex->dirty_level_mask & (1 << level)))
>> > +               if (!(rtex->dirty_level_mask & (1 << level)) &&
>> > !(rtex->dcc_compressed_level_mask & (1 << level)))>
>> >                         continue;
>> >
>> >                 /* The smaller the mipmap level, the less layers there are
>> >
>> > @@ -274,6 +274,7 @@ static void si_blit_decompress_color(struct
>> > pipe_context *ctx,>
>> >                 for (layer = first_layer; layer <= checked_last_layer;
>> >                 layer++) {
>> >
>> >                         struct pipe_surface *cbsurf, surf_tmpl;
>> >
>> > +                       void * custom_blend;
>> >
>> >                         surf_tmpl.format = rtex->resource.b.b.format;
>> >                         surf_tmpl.u.tex.level = level;
>> >
>> > @@ -281,10 +282,17 @@ static void si_blit_decompress_color(struct
>> > pipe_context *ctx,>
>> >                         surf_tmpl.u.tex.last_layer = layer;
>> >                         cbsurf = ctx->create_surface(ctx,
>> >                         &rtex->resource.b.b, &surf_tmpl);
>> >
>> > +                       if(rtex->fmask.size) {
>> > +                               custom_blend =
>> > sctx->custom_blend_decompress; +                       } else
>> > if(rtex->dcc_buffer) {
>> > +                               /* also eliminates the fast clear if
>> > necessary */ +                               custom_blend =
>> > sctx->custom_blend_dcc_decompress; +                       } else {
>> > +                               custom_blend =
>> > sctx->custom_blend_fastclear; +                       }
>>
>> The order of these is wrong. DCC decompression should be first, which
>> also automatically invokes FMASK and CMASK decompression. Next should
>> be FMASK decompression, which automatically invokes CMASK
>> decompression.
>>
>> There are also 2 things that need to be changed in this file or
>> r600_texture.c:
>>
>> 1) CMASK fast clear isn't possible with DCC and shouldn't be used. It
>> should be disallowed in the commit that
>> enables DCC.
>
> Right, I have been looking at this since I found some related corruption last
> week. However, I come to a slightly different conclusion about the clear values
> than in your other mail:
>
> 0x00 (0, 0, 0, 0)
> 0x40 (0, 0, 0, 1)
> 0x80 (1, 1, 1, 0)
> 0xc0 (1, 1, 1, 1)
>
> and 0x20 for a general CMASK-like clear that uses the colorbuffer clear color.

No. CB always uses the clear color registers no matter what the DCC
buffer contains, so the only way to check if the clear color is set
correctly is to read it using texture units.


> Furthermore, it seems I can eliminate these with
> V_028808_CB_ELIMINATE_FAST_CLEAR. I need to test though whether this works in
> all cases a CMASK clear would work in the non-DCC case and if there are any
> caveats.

You're wasting time with CMASK fast clear. It's not officially
supported with DCC.


>
> The corruption seems to occur if I have a CMASK (the CMASK itself is fine) with
> cleared entries, i.e. uninitialized of cleared CMASK gives corruption, but set
> to 0xFF seems to be okay and fglrx actually does that during my limited tests.

Initializing CMASK to 0xFF for 1 sample and to 0xCC for multisample is
a requirement for using DCC, but if CMASK must be equal to 0xFF all
the time, why bother even enabling it for 1 sample? For multisampling,
CMASK does more than fast clear, so it should still be enabled, but
never cleared.


>
>> 2) Hardware MSAA resolves can't have DCC enabled in the destination.
>> There are 2 options:
>> a) use shader-based resolving
>> b) disable DCC and release the DCC buffer before using a hw resolve
>
> A third option would be to do the resolve with DCC disabled and then set the
> DCC buffer to the uncompressed state.
>
> I estimate that both options 2 and 3 will be hard to do in a nice way and I am
> not sure about the performance of the first option, so this may take some time.

We already have to do shader-based resolve in the majority of cases
(hw resolve cannot resolve into a scanout buffer), so I don't think
there will be any negative impact.

>
>> > +
>> >
>> >                         si_blitter_begin(ctx, SI_DECOMPRESS);
>> >
>> > -                       util_blitter_custom_color(sctx->blitter, cbsurf,
>> > -                               rtex->fmask.size ?
>> > sctx->custom_blend_decompress : -
>> >          sctx->custom_blend_fastclear); +
>> > util_blitter_custom_color(sctx->blitter, cbsurf, custom_blend);>
>> >                         si_blitter_end(ctx);
>> >
>> >                         pipe_surface_reference(&cbsurf, NULL);
>> >
>> > diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
>> > b/src/gallium/drivers/radeonsi/si_pipe.c index 7dbb2e3..a5525ac 100644
>> > --- a/src/gallium/drivers/radeonsi/si_pipe.c
>> > +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>> > @@ -67,6 +67,8 @@ static void si_destroy_context(struct pipe_context
>> > *context)>
>> >                 sctx->b.b.delete_blend_state(&sctx->b.b,
>> >                 sctx->custom_blend_decompress);
>> >
>> >         if (sctx->custom_blend_fastclear)
>> >
>> >                 sctx->b.b.delete_blend_state(&sctx->b.b,
>> >                 sctx->custom_blend_fastclear);
>> >
>> > +       if (sctx->custom_blend_dcc_decompress)
>> > +               sctx->b.b.delete_blend_state(&sctx->b.b,
>> > sctx->custom_blend_dcc_decompress);>
>> >         util_unreference_framebuffer_state(&sctx->framebuffer.state);
>> >
>> >         if (sctx->blitter)
>> >
>> > diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
>> > b/src/gallium/drivers/radeonsi/si_pipe.h index d9c7871..66c66c8 100644
>> > --- a/src/gallium/drivers/radeonsi/si_pipe.h
>> > +++ b/src/gallium/drivers/radeonsi/si_pipe.h
>> > @@ -159,6 +159,7 @@ struct si_context {
>> >
>> >         void                            *custom_blend_resolve;
>> >         void                            *custom_blend_decompress;
>> >         void                            *custom_blend_fastclear;
>> >
>> > +       void                            *custom_blend_dcc_decompress;
>> >
>> >         void                            *pstipple_sampler_state;
>> >         struct si_screen                *screen;
>> >         struct pipe_fence_handle        *last_gfx_fence;
>> >
>> > diff --git a/src/gallium/drivers/radeonsi/si_state.c
>> > b/src/gallium/drivers/radeonsi/si_state.c index faef2ee..5c9c866 100644
>> > --- a/src/gallium/drivers/radeonsi/si_state.c
>> > +++ b/src/gallium/drivers/radeonsi/si_state.c
>> > @@ -1915,8 +1915,19 @@ static void si_initialize_color_surface(struct
>> > si_context *sctx,>
>> >         surf->cb_color_info = color_info;
>> >         surf->cb_color_attrib = color_attrib;
>> >
>> > -       if (sctx->b.chip_class >= VI)
>> > -               surf->cb_dcc_control =
>> > S_028C78_OVERWRITE_COMBINER_DISABLE(1);
>> OVERWRITE_COMBINER_DISABLE must be set if both blending and MSAA are
>> enabled. Otherwise, corruption can occur. I'm waiting for an answer if
>> we can use CB_DCC_CONTROL.OVERWRITE_COMBINER_DISABLE instead. Until
>> then, it would be nice to add a "TODO" in the commit message.

BTW, I've received a confirmation that we can use:
CB_DCC_CONTROL.OVERWRITE_COMBINER_DISABLE = 1
if DCC, MSAA and blending are enabled on any of the colorbuffers. It's
required to prevent some corruption.


>>
>> > +       if (sctx->b.chip_class >= VI) {
>> > +
>>
>> Useless empty line.
>>
>> > +               surf->cb_dcc_control = S_028C78_KEY_CLEAR_ENABLE(1) |
>> > +
>> > S_028C78_MAX_UNCOMPRESSED_BLOCK_SIZE(2) | +
>> >        S_028C78_INDEPENDENT_64B_BLOCKS(1);
>> KEY_CLEAR_ENABLE isn't supported on VI as far as I know. Why do you
>> think it should be set?
>
> I have not been able to find out what it does, and not supporting it would be a
> very good reason why. I set it because fglrx does it and I guessed deviating
> from fglrx only increases the probability of regressions.
>
> What would be the preferred action in such a case in general ? Keep or remove?

I'd remove it and see if there are no regressions.

Marek


More information about the mesa-dev mailing list