[Mesa-dev] [PATCH 3/8] radeonsi: Enable DCC.

Bas Nieuwenhuizen bas at basnieuwenhuizen.nl
Thu Sep 24 12:47:26 PDT 2015


On Thursday, September 24, 2015 07:24:50 PM Marek Olšák wrote:
> On Thu, Sep 24, 2015 at 2:15 PM, Bas Nieuwenhuizen
> 
> <bas at basnieuwenhuizen.nl> wrote:
> > Hi Marek,
> > 
> > Thanks for the review and my apologies, it seems I underestimated the
> > potential for regressions in this series.
> > 
> > See below for some comments in reaction to the review.
> > 
> > For a v2 is it preferred that I rebase it to master, or keep basing it on
> > the old version? There are some function renames that impact this series.
> We'd like to merge this eventually, so rebasing would be nice.
> 
> > On Thursday, September 24, 2015 01:36:31 AM Marek Olšák wrote:
> >> On Fri, Sep 4, 2015 at 9:47 PM, Bas Nieuwenhuizen
> >> 
> >> <bas at basnieuwenhuizen.nl> wrote:
> >> > The flags to be enabled in the control registers have been taken from
> >> > Catalyst traces.
> >> > 
> >> > Signed-off-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
> >> > ---
> >> > 
> >> >  src/gallium/drivers/radeon/r600_pipe_common.h |  1 +
> >> >  src/gallium/drivers/radeon/r600_texture.c     |  2 ++
> >> >  src/gallium/drivers/radeon/r600d_common.h     |  1 +
> >> >  src/gallium/drivers/radeonsi/si_blit.c        | 16 ++++++++++++----
> >> >  src/gallium/drivers/radeonsi/si_pipe.c        |  2 ++
> >> >  src/gallium/drivers/radeonsi/si_pipe.h        |  1 +
> >> >  src/gallium/drivers/radeonsi/si_state.c       | 27
> >> >  +++++++++++++++++++++++---- src/gallium/drivers/radeonsi/sid.h
> >> >  
> >> >   |  1 +
> >> >  
> >> >  8 files changed, 43 insertions(+), 8 deletions(-)
> >> > 
> >> > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
> >> > b/src/gallium/drivers/radeon/r600_pipe_common.h index f05318f..dac270e
> >> > 100644
> >> > --- a/src/gallium/drivers/radeon/r600_pipe_common.h
> >> > +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
> >> > @@ -244,6 +244,7 @@ struct r600_surface {
> >> > 
> >> >         unsigned cb_color_dim;          /* EG only */
> >> >         unsigned cb_color_pitch;        /* EG and later */
> >> >         unsigned cb_color_slice;        /* EG and later */
> >> > 
> >> > +       unsigned cb_dcc_base;           /* VI and later */
> >> > 
> >> >         unsigned cb_color_attrib;       /* EG and later */
> >> >         unsigned cb_dcc_control;        /* VI and later */
> >> >         unsigned cb_color_fmask;        /* CB_COLORn_FMASK (EG and
> >> >         later)
> >> >         or CB_COLORn_FRAG (r600) */>
> >> > 
> >> > diff --git a/src/gallium/drivers/radeon/r600_texture.c
> >> > b/src/gallium/drivers/radeon/r600_texture.c index 46e735e..017f5e7
> >> > 100644
> >> > --- a/src/gallium/drivers/radeon/r600_texture.c
> >> > +++ b/src/gallium/drivers/radeon/r600_texture.c
> >> > @@ -495,6 +495,8 @@ static void vi_texture_alloc_dcc_separate(struct
> >> > r600_common_screen *rscreen,>
> >> > 
> >> >         r600_screen_clear_buffer(rscreen, &rtex->dcc_buffer->b.b, 0,
> >> >         rtex->surface.dcc_size,>
> >> >         
> >> >                                  0xFFFFFFFF, true);
> >> > 
> >> > +
> >> > +       rtex->cb_color_info |= VI_S_028C70_DCC_ENABLE(1);
> >> > 
> >> >  }
> >> >  
> >> >  static unsigned r600_texture_get_htile_size(struct r600_common_screen
> >> >  *rscreen,>
> >> > 
> >> > diff --git a/src/gallium/drivers/radeon/r600d_common.h
> >> > b/src/gallium/drivers/radeon/r600d_common.h index 115042d..a3d182c
> >> > 100644
> >> > --- a/src/gallium/drivers/radeon/r600d_common.h
> >> > +++ b/src/gallium/drivers/radeon/r600d_common.h
> >> > @@ -202,6 +202,7 @@
> >> > 
> >> >  #define   EG_S_028C70_FAST_CLEAR(x)                       (((x) & 0x1)
> >> >  <<
> >> >  17) #define   SI_S_028C70_FAST_CLEAR(x)                       (((x) &
> >> >  0x1) << 13)>
> >> > 
> >> > +#define   VI_S_028C70_DCC_ENABLE(x)                       (((x) & 0x1)
> >> > <<
> >> > 28)>
> >> > 
> >> >  /*CIK+*/
> >> >  #define R_0300FC_CP_STRMOUT_CNTL                    0x0300FC
> >> > 
> >> > diff --git a/src/gallium/drivers/radeonsi/si_blit.c
> >> > b/src/gallium/drivers/radeonsi/si_blit.c index 13bb457..98913e5 100644
> >> > --- a/src/gallium/drivers/radeonsi/si_blit.c
> >> > +++ b/src/gallium/drivers/radeonsi/si_blit.c
> >> > @@ -264,7 +264,7 @@ static void si_blit_decompress_color(struct
> >> > pipe_context *ctx,>
> >> > 
> >> >                 return;
> >> >         
> >> >         for (level = first_level; level <= last_level; level++) {
> >> > 
> >> > -               if (!(rtex->dirty_level_mask & (1 << level)))
> >> > +               if (!(rtex->dirty_level_mask & (1 << level)) &&
> >> > !(rtex->dcc_compressed_level_mask & (1 << level)))>
> >> > 
> >> >                         continue;
> >> >                 
> >> >                 /* The smaller the mipmap level, the less layers there
> >> >                 are
> >> > 
> >> > @@ -274,6 +274,7 @@ static void si_blit_decompress_color(struct
> >> > pipe_context *ctx,>
> >> > 
> >> >                 for (layer = first_layer; layer <= checked_last_layer;
> >> >                 layer++) {
> >> >                 
> >> >                         struct pipe_surface *cbsurf, surf_tmpl;
> >> > 
> >> > +                       void * custom_blend;
> >> > 
> >> >                         surf_tmpl.format = rtex->resource.b.b.format;
> >> >                         surf_tmpl.u.tex.level = level;
> >> > 
> >> > @@ -281,10 +282,17 @@ static void si_blit_decompress_color(struct
> >> > pipe_context *ctx,>
> >> > 
> >> >                         surf_tmpl.u.tex.last_layer = layer;
> >> >                         cbsurf = ctx->create_surface(ctx,
> >> >                         &rtex->resource.b.b, &surf_tmpl);
> >> > 
> >> > +                       if(rtex->fmask.size) {
> >> > +                               custom_blend =
> >> > sctx->custom_blend_decompress; +                       } else
> >> > if(rtex->dcc_buffer) {
> >> > +                               /* also eliminates the fast clear if
> >> > necessary */ +                               custom_blend =
> >> > sctx->custom_blend_dcc_decompress; +                       } else {
> >> > +                               custom_blend =
> >> > sctx->custom_blend_fastclear; +                       }
> >> 
> >> The order of these is wrong. DCC decompression should be first, which
> >> also automatically invokes FMASK and CMASK decompression. Next should
> >> be FMASK decompression, which automatically invokes CMASK
> >> decompression.
> >> 
> >> There are also 2 things that need to be changed in this file or
> >> r600_texture.c:
> >> 
> >> 1) CMASK fast clear isn't possible with DCC and shouldn't be used. It
> >> should be disallowed in the commit that
> >> enables DCC.
> > 
> > Right, I have been looking at this since I found some related corruption
> > last week. However, I come to a slightly different conclusion about the
> > clear values than in your other mail:
> > 
> > 0x00 (0, 0, 0, 0)
> > 0x40 (0, 0, 0, 1)
> > 0x80 (1, 1, 1, 0)
> > 0xc0 (1, 1, 1, 1)
> > 
> > and 0x20 for a general CMASK-like clear that uses the colorbuffer clear
> > color.
> No. CB always uses the clear color registers no matter what the DCC
> buffer contains, so the only way to check if the clear color is set
> correctly is to read it using texture units.
> 
> > Furthermore, it seems I can eliminate these with
> > V_028808_CB_ELIMINATE_FAST_CLEAR. I need to test though whether this works
> > in all cases a CMASK clear would work in the non-DCC case and if there
> > are any caveats.
> 
> You're wasting time with CMASK fast clear. It's not officially
> supported with DCC.


So for fast clears other than the four simple colors, do we need to use 
DCC_DECOMPRESS instead of ELIMINATE_FAST_CLEAR, or is anything other than 
those four colors not supported for a fast clear with DCC?

The reason I am interested in using ELIMINATE_FAST_CLEAR is that it is faster 
than the decompress and fglrx actually uses it. i.e. if I do

loop:
   clear
   one draw
read texture from shader

I get something like the following in the CS:

set CMASK to 0xFF
loop:
  set DCC to 0x20
  draw with DCC, CMASK enabled and CB_COLOR_CONTROL=0xcc0010 (i.e.CB_NORMAL)
draw with DCC, CMASK enabled and CB_COLOR_CONTROL=0xcc0020 
(i.e.CB_ELIMINATE_FAST_CLEAR)
other draws etc.

So I would suppose that this is actually stable enough to be used. I would be 
fine with implementing an alternative for this series and revisiting this issue 
at a later date.

Bas
 
> > The corruption seems to occur if I have a CMASK (the CMASK itself is fine)
> > with cleared entries, i.e. uninitialized of cleared CMASK gives
> > corruption, but set to 0xFF seems to be okay and fglrx actually does that
> > during my limited tests.
> Initializing CMASK to 0xFF for 1 sample and to 0xCC for multisample is
> a requirement for using DCC, but if CMASK must be equal to 0xFF all
> the time, why bother even enabling it for 1 sample? For multisampling,
> CMASK does more than fast clear, so it should still be enabled, but
> never cleared.
> 
> >> 2) Hardware MSAA resolves can't have DCC enabled in the destination.
> >> There are 2 options:
> >> a) use shader-based resolving
> >> b) disable DCC and release the DCC buffer before using a hw resolve
> > 
> > A third option would be to do the resolve with DCC disabled and then set
> > the DCC buffer to the uncompressed state.
> > 
> > I estimate that both options 2 and 3 will be hard to do in a nice way and
> > I am not sure about the performance of the first option, so this may take
> > some time.
> We already have to do shader-based resolve in the majority of cases
> (hw resolve cannot resolve into a scanout buffer), so I don't think
> there will be any negative impact.
> 
> >> > +
> >> > 
> >> >                         si_blitter_begin(ctx, SI_DECOMPRESS);
> >> > 
> >> > -                       util_blitter_custom_color(sctx->blitter,
> >> > cbsurf,
> >> > -                               rtex->fmask.size ?
> >> > sctx->custom_blend_decompress : -
> >> > 
> >> >          sctx->custom_blend_fastclear); +
> >> > 
> >> > util_blitter_custom_color(sctx->blitter, cbsurf, custom_blend);>
> >> > 
> >> >                         si_blitter_end(ctx);
> >> >                         
> >> >                         pipe_surface_reference(&cbsurf, NULL);
> >> > 
> >> > diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
> >> > b/src/gallium/drivers/radeonsi/si_pipe.c index 7dbb2e3..a5525ac 100644
> >> > --- a/src/gallium/drivers/radeonsi/si_pipe.c
> >> > +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> >> > @@ -67,6 +67,8 @@ static void si_destroy_context(struct pipe_context
> >> > *context)>
> >> > 
> >> >                 sctx->b.b.delete_blend_state(&sctx->b.b,
> >> >                 sctx->custom_blend_decompress);
> >> >         
> >> >         if (sctx->custom_blend_fastclear)
> >> >         
> >> >                 sctx->b.b.delete_blend_state(&sctx->b.b,
> >> >                 sctx->custom_blend_fastclear);
> >> > 
> >> > +       if (sctx->custom_blend_dcc_decompress)
> >> > +               sctx->b.b.delete_blend_state(&sctx->b.b,
> >> > sctx->custom_blend_dcc_decompress);>
> >> > 
> >> >         util_unreference_framebuffer_state(&sctx->framebuffer.state);
> >> >         
> >> >         if (sctx->blitter)
> >> > 
> >> > diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
> >> > b/src/gallium/drivers/radeonsi/si_pipe.h index d9c7871..66c66c8 100644
> >> > --- a/src/gallium/drivers/radeonsi/si_pipe.h
> >> > +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> >> > @@ -159,6 +159,7 @@ struct si_context {
> >> > 
> >> >         void                            *custom_blend_resolve;
> >> >         void                            *custom_blend_decompress;
> >> >         void                            *custom_blend_fastclear;
> >> > 
> >> > +       void                            *custom_blend_dcc_decompress;
> >> > 
> >> >         void                            *pstipple_sampler_state;
> >> >         struct si_screen                *screen;
> >> >         struct pipe_fence_handle        *last_gfx_fence;
> >> > 
> >> > diff --git a/src/gallium/drivers/radeonsi/si_state.c
> >> > b/src/gallium/drivers/radeonsi/si_state.c index faef2ee..5c9c866 100644
> >> > --- a/src/gallium/drivers/radeonsi/si_state.c
> >> > +++ b/src/gallium/drivers/radeonsi/si_state.c
> >> > @@ -1915,8 +1915,19 @@ static void si_initialize_color_surface(struct
> >> > si_context *sctx,>
> >> > 
> >> >         surf->cb_color_info = color_info;
> >> >         surf->cb_color_attrib = color_attrib;
> >> > 
> >> > -       if (sctx->b.chip_class >= VI)
> >> > -               surf->cb_dcc_control =
> >> > S_028C78_OVERWRITE_COMBINER_DISABLE(1);
> >> 
> >> OVERWRITE_COMBINER_DISABLE must be set if both blending and MSAA are
> >> enabled. Otherwise, corruption can occur. I'm waiting for an answer if
> >> we can use CB_DCC_CONTROL.OVERWRITE_COMBINER_DISABLE instead. Until
> >> then, it would be nice to add a "TODO" in the commit message.
> 
> BTW, I've received a confirmation that we can use:
> CB_DCC_CONTROL.OVERWRITE_COMBINER_DISABLE = 1
> if DCC, MSAA and blending are enabled on any of the colorbuffers. It's
> required to prevent some corruption.
> 
> >> > +       if (sctx->b.chip_class >= VI) {
> >> > +
> >> 
> >> Useless empty line.
> >> 
> >> > +               surf->cb_dcc_control = S_028C78_KEY_CLEAR_ENABLE(1) |
> >> > +
> >> > S_028C78_MAX_UNCOMPRESSED_BLOCK_SIZE(2) | +
> >> > 
> >> >        S_028C78_INDEPENDENT_64B_BLOCKS(1);
> >> 
> >> KEY_CLEAR_ENABLE isn't supported on VI as far as I know. Why do you
> >> think it should be set?
> > 
> > I have not been able to find out what it does, and not supporting it would
> > be a very good reason why. I set it because fglrx does it and I guessed
> > deviating from fglrx only increases the probability of regressions.
> > 
> > What would be the preferred action in such a case in general ? Keep or
> > remove?
> I'd remove it and see if there are no regressions.
> 
> Marek



More information about the mesa-dev mailing list