[Mesa-dev] [PATCH 1/4] r600g: avoid redundant DB register updates
Marc Dietrich
marvin24 at gmx.de
Fri Apr 28 18:19:41 UTC 2017
On Friday, 28 April 2017, 16:53:55 CEST, Dieter Nützel wrote:
> I'm running this, too.
> But alone. 4/4 didn't apply any longer ;-)
>
> NO glitches on NI/Turks XT (6670).
>
> I tested 'Heaven' and 'Valley' with the previous patch version as well.
> The 'Heaven' GPU hang (wireframe/tessellation) is OLD; it has been there
> for ages.
> So:
>
> Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Dieter, your card is HD5000/Evergreen, while mine (RS880) is similar to
HD4200/r600. I'm OK with the patch being applied only to the Evergreen+
generations if the glitches on r600 can't be fixed.
Marc
>
> Dieter
>
> On 28.04.2017 09:57, Marc Dietrich wrote:
> > Hi Constantine,
> >
> >
> > On Thursday, 27 April 2017, 21:04:37 CEST, Constantine Kharlamov wrote:
> >> Could you please try this patch? The change: I'm now setting
> >> dirty_zsbuf in r600_bind_blend_state_internal() as well. That was the
> >> difference between radeonsi and r600 for CB updates, and my guess is
> >> that it might be relevant to DB updates as well.
> >
> > OK, the crash is gone and I get 2-3 fps more :-)
> >
> > But I see some rendering glitches and:
> >
> > radeon 0000:01:05.0: r600_cs_track_validate_db:696 htile surface too small 20480 for 262144 (256 256)
> > radeon 0000:01:05.0: r600_packet3_check:1724 invalid cmd stream
> > [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
> > [the same three messages repeat 11 more times]
> >
> >
> > Marc
> >
> >> ---
> >>
> >>  src/gallium/drivers/r600/evergreen_state.c   | 76 +++++++++++++++-------------
> >>  src/gallium/drivers/r600/r600_blit.c         |  1 +
> >>  src/gallium/drivers/r600/r600_hw_context.c   |  1 +
> >>  src/gallium/drivers/r600/r600_pipe.h         |  1 +
> >>  src/gallium/drivers/r600/r600_state.c        | 52 ++++++++++---------
> >>  src/gallium/drivers/r600/r600_state_common.c |  2 +
> >>  6 files changed, 73 insertions(+), 60 deletions(-)
> >>
> >> diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
> >> index 19ad504097..7d84e92250 100644
> >> --- a/src/gallium/drivers/r600/evergreen_state.c
> >> +++ b/src/gallium/drivers/r600/evergreen_state.c
> >> @@ -1426,6 +1426,7 @@ static void evergreen_set_framebuffer_state(struct pipe_context *ctx,
> >>  		R600_CONTEXT_FLUSH_AND_INV_DB_META |
> >>  		R600_CONTEXT_INV_TEX_CACHE;
> >>
> >> +	rctx->framebuffer.dirty_zsbuf |= rctx->framebuffer.state.zsbuf != state->zsbuf;
> >>  	util_copy_framebuffer_state(&rctx->framebuffer.state, state);
> >>
> >>  	/* Colorbuffers. */
> >>
> >> @@ -1746,45 +1747,47 @@ static void evergreen_emit_framebuffer_state(struct r600_context *rctx, struct r
> >>  		radeon_set_context_reg(cs, R_028E50_CB_COLOR8_INFO + (i - 8) * 0x1C, 0);
> >>
> >>  	/* ZS buffer. */
> >> -	if (state->zsbuf) {
> >> -		struct r600_surface *zb = (struct r600_surface*)state->zsbuf;
> >> -		unsigned reloc = radeon_add_to_buffer_list(&rctx->b,
> >> -				&rctx->b.gfx,
> >> -				(struct r600_resource*)state->zsbuf->texture,
> >> -				RADEON_USAGE_READWRITE,
> >> -				zb->base.texture->nr_samples > 1 ?
> >> -					RADEON_PRIO_DEPTH_BUFFER_MSAA :
> >> -					RADEON_PRIO_DEPTH_BUFFER);
> >> -
> >> -		radeon_set_context_reg(cs, R_028008_DB_DEPTH_VIEW, zb->db_depth_view);
> >> -
> >> -		radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 8);
> >> -		radeon_emit(cs, zb->db_z_info); /* R_028040_DB_Z_INFO */
> >> -		radeon_emit(cs, zb->db_stencil_info); /* R_028044_DB_STENCIL_INFO */
> >> -		radeon_emit(cs, zb->db_depth_base); /* R_028048_DB_Z_READ_BASE */
> >> -		radeon_emit(cs, zb->db_stencil_base); /* R_02804C_DB_STENCIL_READ_BASE */
> >> -		radeon_emit(cs, zb->db_depth_base); /* R_028050_DB_Z_WRITE_BASE */
> >> -		radeon_emit(cs, zb->db_stencil_base); /* R_028054_DB_STENCIL_WRITE_BASE */
> >> -		radeon_emit(cs, zb->db_depth_size); /* R_028058_DB_DEPTH_SIZE */
> >> -		radeon_emit(cs, zb->db_depth_slice); /* R_02805C_DB_DEPTH_SLICE */
> >> -
> >> -		radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_028048_DB_Z_READ_BASE */
> >> -		radeon_emit(cs, reloc);
> >> +	if (rctx->framebuffer.dirty_zsbuf) {
> >> +		if (state->zsbuf) {
> >> +			struct r600_surface *zb = (struct r600_surface*)state->zsbuf;
> >> +			unsigned reloc = radeon_add_to_buffer_list(&rctx->b,
> >> +					&rctx->b.gfx,
> >> +					(struct r600_resource*)state->zsbuf->texture,
> >> +					RADEON_USAGE_READWRITE,
> >> +					zb->base.texture->nr_samples > 1 ?
> >> +						RADEON_PRIO_DEPTH_BUFFER_MSAA :
> >> +						RADEON_PRIO_DEPTH_BUFFER);
> >> +
> >> +			radeon_set_context_reg(cs, R_028008_DB_DEPTH_VIEW, zb->db_depth_view);
> >> +
> >> +			radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 8);
> >> +			radeon_emit(cs, zb->db_z_info); /* R_028040_DB_Z_INFO */
> >> +			radeon_emit(cs, zb->db_stencil_info); /* R_028044_DB_STENCIL_INFO */
> >> +			radeon_emit(cs, zb->db_depth_base); /* R_028048_DB_Z_READ_BASE */
> >> +			radeon_emit(cs, zb->db_stencil_base); /* R_02804C_DB_STENCIL_READ_BASE */
> >> +			radeon_emit(cs, zb->db_depth_base); /* R_028050_DB_Z_WRITE_BASE */
> >> +			radeon_emit(cs, zb->db_stencil_base); /* R_028054_DB_STENCIL_WRITE_BASE */
> >> +			radeon_emit(cs, zb->db_depth_size); /* R_028058_DB_DEPTH_SIZE */
> >> +			radeon_emit(cs, zb->db_depth_slice); /* R_02805C_DB_DEPTH_SLICE */
> >> +
> >> +			radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_028048_DB_Z_READ_BASE */
> >> +			radeon_emit(cs, reloc);
> >>
> >> -		radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_02804C_DB_STENCIL_READ_BASE */
> >> -		radeon_emit(cs, reloc);
> >> +			radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_02804C_DB_STENCIL_READ_BASE */
> >> +			radeon_emit(cs, reloc);
> >>
> >> -		radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_028050_DB_Z_WRITE_BASE */
> >> -		radeon_emit(cs, reloc);
> >> +			radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_028050_DB_Z_WRITE_BASE */
> >> +			radeon_emit(cs, reloc);
> >>
> >> -		radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_028054_DB_STENCIL_WRITE_BASE */
> >> -		radeon_emit(cs, reloc);
> >> -	} else if (rctx->screen->b.info.drm_minor >= 18) {
> >> -		/* DRM 2.6.18 allows the INVALID format to disable depth/stencil.
> >> -		 * Older kernels are out of luck. */
> >> -		radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 2);
> >> -		radeon_emit(cs, S_028040_FORMAT(V_028040_Z_INVALID)); /* R_028040_DB_Z_INFO */
> >> -		radeon_emit(cs, S_028044_FORMAT(V_028044_STENCIL_INVALID)); /* R_028044_DB_STENCIL_INFO */
> >> +			radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); /* R_028054_DB_STENCIL_WRITE_BASE */
> >> +			radeon_emit(cs, reloc);
> >> +		} else if (rctx->screen->b.info.drm_minor >= 18) {
> >> +			/* DRM 2.6.18 allows the INVALID format to disable depth/stencil.
> >> +			 * Older kernels are out of luck. */
> >> +			radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 2);
> >> +			radeon_emit(cs, S_028040_FORMAT(V_028040_Z_INVALID)); /* R_028040_DB_Z_INFO */
> >> +			radeon_emit(cs, S_028044_FORMAT(V_028044_STENCIL_INVALID)); /* R_028044_DB_STENCIL_INFO */
> >> +		}
> >>  	}
> >>
> >>  	/* Framebuffer dimensions. */
> >>
> >> @@ -1806,6 +1809,7 @@ static void evergreen_emit_framebuffer_state(struct r600_context *rctx, struct r
> >>  		cayman_emit_msaa_config(cs, rctx->framebuffer.nr_samples,
> >>  					rctx->ps_iter_samples, 0, sc_mode_cntl_1);
> >>  	}
> >> +	rctx->framebuffer.dirty_zsbuf = false;
> >>  }
> >>
> >>  static void evergreen_emit_polygon_offset(struct r600_context *rctx, struct r600_atom *a)
> >> diff --git a/src/gallium/drivers/r600/r600_blit.c b/src/gallium/drivers/r600/r600_blit.c
> >> index c52492e8c2..de87ed8650 100644
> >> --- a/src/gallium/drivers/r600/r600_blit.c
> >> +++ b/src/gallium/drivers/r600/r600_blit.c
> >> @@ -452,6 +452,7 @@ static void r600_clear(struct pipe_context *ctx, unsigned buffers,
> >>  				r600_mark_atom_dirty(rctx, &rctx->db_state.atom);
> >>  			}
> >>  			rctx->db_misc_state.htile_clear = true;
> >> +			rctx->framebuffer.dirty_zsbuf = true;
> >>  			r600_mark_atom_dirty(rctx, &rctx->db_misc_state.atom);
> >>  		}
> >>  	}
> >>
> >> diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c
> >> index 4511ce0c01..c85e346307 100644
> >> --- a/src/gallium/drivers/r600/r600_hw_context.c
> >> +++ b/src/gallium/drivers/r600/r600_hw_context.c
> >> @@ -374,6 +374,7 @@ void r600_begin_new_cs(struct r600_context *ctx)
> >>
> >>  	assert(!ctx->b.gfx.cs->prev_dw);
> >>  	ctx->b.initial_gfx_cs_size = ctx->b.gfx.cs->current.cdw;
> >> +	ctx->framebuffer.dirty_zsbuf = true;
> >>  }
> >>
> >>  void r600_emit_pfp_sync_me(struct r600_context *rctx)
> >>
> >> diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h
> >> index e1715e8628..aeba1f2635 100644
> >> --- a/src/gallium/drivers/r600/r600_pipe.h
> >> +++ b/src/gallium/drivers/r600/r600_pipe.h
> >> @@ -190,6 +190,7 @@ struct r600_framebuffer {
> >>  	bool is_msaa_resolve;
> >>  	bool dual_src_blend;
> >>  	bool do_update_surf_dirtiness;
> >> +	bool dirty_zsbuf;
> >>  };
> >>
> >>  struct r600_sample_mask {
> >>
> >> diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c
> >> index fc93eb02ad..a80156bab2 100644
> >> --- a/src/gallium/drivers/r600/r600_state.c
> >> +++ b/src/gallium/drivers/r600/r600_state.c
> >> @@ -1096,6 +1096,7 @@ static void r600_set_framebuffer_state(struct pipe_context *ctx,
> >>  	/* Set the new state. */
> >>  	util_copy_framebuffer_state(&rctx->framebuffer.state, state);
> >>
> >> +	rctx->framebuffer.dirty_zsbuf |= rctx->framebuffer.state.zsbuf != state->zsbuf;
> >>  	rctx->framebuffer.export_16bpc = state->nr_cbufs != 0;
> >>  	rctx->framebuffer.cb0_is_integer = state->nr_cbufs && state->cbufs[0] &&
> >>  			util_format_is_pure_integer(state->cbufs[0]->format);
> >>
> >> @@ -1430,33 +1431,35 @@ static void r600_emit_framebuffer_state(struct r600_context *rctx, struct r600_a
> >>  	}
> >>
> >>  	/* Zbuffer. */
> >> -	if (state->zsbuf) {
> >> -		struct r600_surface *surf = (struct r600_surface*)state->zsbuf;
> >> -		unsigned reloc = radeon_add_to_buffer_list(&rctx->b,
> >> -				&rctx->b.gfx,
> >> -				(struct r600_resource*)state->zsbuf->texture,
> >> -				RADEON_USAGE_READWRITE,
> >> -				surf->base.texture->nr_samples > 1 ?
> >> -					RADEON_PRIO_DEPTH_BUFFER_MSAA :
> >> -					RADEON_PRIO_DEPTH_BUFFER);
> >> -
> >> -		radeon_set_context_reg_seq(cs, R_028000_DB_DEPTH_SIZE, 2);
> >> -		radeon_emit(cs, surf->db_depth_size); /* R_028000_DB_DEPTH_SIZE */
> >> -		radeon_emit(cs, surf->db_depth_view); /* R_028004_DB_DEPTH_VIEW */
> >> -		radeon_set_context_reg_seq(cs, R_02800C_DB_DEPTH_BASE, 2);
> >> -		radeon_emit(cs, surf->db_depth_base); /* R_02800C_DB_DEPTH_BASE */
> >> -		radeon_emit(cs, surf->db_depth_info); /* R_028010_DB_DEPTH_INFO */
> >> +	if (rctx->framebuffer.dirty_zsbuf) {
> >> +		if (state->zsbuf) {
> >> +			struct r600_surface *surf = (struct r600_surface*)state->zsbuf;
> >> +			unsigned reloc = radeon_add_to_buffer_list(&rctx->b,
> >> +					&rctx->b.gfx,
> >> +					(struct r600_resource*)state->zsbuf->texture,
> >> +					RADEON_USAGE_READWRITE,
> >> +					surf->base.texture->nr_samples > 1 ?
> >> +						RADEON_PRIO_DEPTH_BUFFER_MSAA :
> >> +						RADEON_PRIO_DEPTH_BUFFER);
> >> +
> >> +			radeon_set_context_reg_seq(cs, R_028000_DB_DEPTH_SIZE, 2);
> >> +			radeon_emit(cs, surf->db_depth_size); /* R_028000_DB_DEPTH_SIZE */
> >> +			radeon_emit(cs, surf->db_depth_view); /* R_028004_DB_DEPTH_VIEW */
> >> +			radeon_set_context_reg_seq(cs, R_02800C_DB_DEPTH_BASE, 2);
> >> +			radeon_emit(cs, surf->db_depth_base); /* R_02800C_DB_DEPTH_BASE */
> >> +			radeon_emit(cs, surf->db_depth_info); /* R_028010_DB_DEPTH_INFO */
> >>
> >> -		radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
> >> -		radeon_emit(cs, reloc);
> >> +			radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
> >> +			radeon_emit(cs, reloc);
> >>
> >> -		radeon_set_context_reg(cs, R_028D34_DB_PREFETCH_LIMIT, surf->db_prefetch_limit);
> >> +			radeon_set_context_reg(cs, R_028D34_DB_PREFETCH_LIMIT, surf->db_prefetch_limit);
> >>
> >> -		sbu |= SURFACE_BASE_UPDATE_DEPTH;
> >> -	} else if (rctx->screen->b.info.drm_minor >= 18) {
> >> -		/* DRM 2.6.18 allows the INVALID format to disable depth/stencil.
> >> -		 * Older kernels are out of luck. */
> >> -		radeon_set_context_reg(cs, R_028010_DB_DEPTH_INFO, S_028010_FORMAT(V_028010_DEPTH_INVALID));
> >> +			sbu |= SURFACE_BASE_UPDATE_DEPTH;
> >> +		} else if (rctx->screen->b.info.drm_minor >= 18) {
> >> +			/* DRM 2.6.18 allows the INVALID format to disable depth/stencil.
> >> +			 * Older kernels are out of luck. */
> >> +			radeon_set_context_reg(cs, R_028010_DB_DEPTH_INFO, S_028010_FORMAT(V_028010_DEPTH_INVALID));
> >> +		}
> >>  	}
> >>
> >>  	/* SURFACE_BASE_UPDATE */
> >>
> >> @@ -1484,6 +1487,7 @@ static void r600_emit_framebuffer_state(struct r600_context *rctx, struct r600_a
> >>  	}
> >>
> >>  	r600_emit_msaa_state(rctx, rctx->framebuffer.nr_samples);
> >> +	rctx->framebuffer.dirty_zsbuf = false;
> >>  }
> >>
> >>  static void r600_set_min_samples(struct pipe_context *ctx, unsigned min_samples)
> >> diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
> >> index 7b52be36cd..78d2cab705 100644
> >> --- a/src/gallium/drivers/r600/r600_state_common.c
> >> +++ b/src/gallium/drivers/r600/r600_state_common.c
> >> @@ -187,6 +187,7 @@ static void r600_bind_blend_state_internal(struct r600_context *rctx,
> >>  	}
> >>  	if (update_cb) {
> >>  		r600_mark_atom_dirty(rctx, &rctx->cb_misc_state.atom);
> >> +		rctx->framebuffer.dirty_zsbuf = true;
> >>  	}
> >>  	if (rctx->framebuffer.dual_src_blend != blend->dual_src_blend) {
> >>  		rctx->framebuffer.dual_src_blend = blend->dual_src_blend;
> >> @@ -1733,6 +1734,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
> >>  	if (unlikely(dirty_tex_counter != rctx->b.last_dirty_tex_counter)) {
> >>  		rctx->b.last_dirty_tex_counter = dirty_tex_counter;
> >>  		r600_mark_atom_dirty(rctx, &rctx->framebuffer.atom);
> >> +		rctx->framebuffer.dirty_zsbuf = true;
> >>  		rctx->framebuffer.do_update_surf_dirtiness = true;
> >>  	}
> >
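
The dirty-flag idea behind the quoted patch can be illustrated with a minimal, standalone C sketch. This is not Mesa code: every name below (zs_surface, fb_state, fb_set_zsbuf, fb_begin_new_cs, fb_emit, demo_db_emits) is a hypothetical stand-in, and a counter simulates the radeon_emit() sequence that writes the DB registers. It only shows the general pattern of skipping an emit when the bound depth/stencil surface has not changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a depth/stencil surface (not a Mesa type). */
struct zs_surface {
    unsigned base;
    unsigned size;
};

/* Hypothetical stand-in for the framebuffer tracking state. */
struct fb_state {
    const struct zs_surface *zsbuf; /* currently bound depth/stencil surface */
    bool dirty_zsbuf;               /* true when DB registers must be re-emitted */
    unsigned db_emits;              /* counts simulated DB register writes */
};

/* Bind a surface. The comparison has to run BEFORE the stored pointer is
 * overwritten; afterwards both sides would always be equal. */
static void fb_set_zsbuf(struct fb_state *fb, const struct zs_surface *zs)
{
    fb->dirty_zsbuf |= (fb->zsbuf != zs);
    fb->zsbuf = zs;
}

/* A new command stream starts from unknown hardware state: force a re-emit. */
static void fb_begin_new_cs(struct fb_state *fb)
{
    fb->dirty_zsbuf = true;
}

/* Emit framebuffer state; the DB part is written only when flagged dirty. */
static void fb_emit(struct fb_state *fb)
{
    if (fb->dirty_zsbuf) {
        fb->db_emits++;          /* stands in for the register writes */
        fb->dirty_zsbuf = false; /* state is now in sync */
    }
}

/* Runs a short bind/emit sequence and returns the number of DB emits. */
static unsigned demo_db_emits(void)
{
    static const struct zs_surface a = { 0x1000, 256 }, b = { 0x2000, 256 };
    struct fb_state fb = { NULL, false, 0 };

    fb_begin_new_cs(&fb);
    fb_set_zsbuf(&fb, &a);
    fb_emit(&fb);          /* emitted: new CS and new surface */
    fb_emit(&fb);          /* skipped: nothing changed */
    fb_set_zsbuf(&fb, &a);
    fb_emit(&fb);          /* skipped: same surface bound again */
    fb_set_zsbuf(&fb, &b);
    fb_emit(&fb);          /* emitted: surface changed */

    assert(!fb.dirty_zsbuf);
    return fb.db_emits;
}
```

In this sketch, four emit calls result in two simulated DB writes. For comparison with the quoted diff: the evergreen_state.c hunk places the dirty_zsbuf comparison before util_copy_framebuffer_state(), while the r600_state.c hunk places it after that call. Whether that ordering is related to the glitches reported on r600 is not established in this thread.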