[Mesa-dev] [PATCH 10/12] genX/cmd_buffer: Enable rendering to HiZ
Jason Ekstrand
jason at jlekstrand.net
Sat Sep 3 00:07:13 UTC 2016
On Fri, Sep 2, 2016 at 3:16 PM, Chad Versace <chadversary at chromium.org>
wrote:
> On Wed 31 Aug 2016, Nanley Chery wrote:
> > From: Chad Versace <chad.versace at intel.com>
> >
> > Nanley Chery:
> > (rebase)
> > - Resolve conflicts with new anv_batch_emit macro
> > (amend)
> > - Remove wip! tag and handle a QPitch TODO
> > - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
> > - Only use HiZ for single-subpass renderpasses
> > - Emit the HiZ instruction before the stencil instruction to follow the
> > optimized clear sequence specified in the PRMs
> > - Don't modify clear params
> > - Enable resolves when a HiZ buffer is used to ensure depth buffer
> validity
> >
> > Provides an FPS increase of ~15% on the Sascha triangle and multisampling
> > demos.
> >
> > Signed-off-by: Nanley Chery <nanley.g.chery at intel.com>
> > ---
> > src/intel/vulkan/gen8_cmd_buffer.c | 4 ++++
> > src/intel/vulkan/genX_cmd_buffer.c | 41 ++++++++++++++++++++++++++++++
> ++++----
> > 2 files changed, 41 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c
> b/src/intel/vulkan/gen8_cmd_buffer.c
> > index 4f27350..7f65fe2 100644
> > --- a/src/intel/vulkan/gen8_cmd_buffer.c
> > +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> > @@ -414,6 +414,10 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer
> *cmd_buffer, enum anv_hz_op op)
> > if (iview == NULL || !anv_image_has_hiz(iview->image))
> > return;
> >
> > + /* FIXME: Implement multi-subpass HiZ */
> > + if (cmd_buffer->state.pass->subpass_count > 1)
> > + return;
> > +
> > const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
> > const bool full_surface_op =
> > cmd_state->render_area.extent.width ==
> iview->extent.width &&
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> b/src/intel/vulkan/genX_cmd_buffer.c
> > index 95ed5f2..349d2a4 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -1040,6 +1040,7 @@ cmd_buffer_emit_depth_stencil(struct
> anv_cmd_buffer *cmd_buffer)
> > anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
> > const struct anv_image *image = iview ? iview->image : NULL;
> > const bool has_depth = image && (image->aspects &
> VK_IMAGE_ASPECT_DEPTH_BIT);
> > + const bool has_hiz = image != NULL && anv_image_has_hiz(image);
> > const bool has_stencil =
> > image && (image->aspects & VK_IMAGE_ASPECT_STENCIL_BIT);
>
> >
> > @@ -1052,7 +1053,12 @@ cmd_buffer_emit_depth_stencil(struct
> anv_cmd_buffer *cmd_buffer)
> > db.SurfaceType = SURFTYPE_2D;
> > db.DepthWriteEnable = true;
> > db.StencilWriteEnable = has_stencil;
> > - db.HierarchicalDepthBufferEnable = false;
> > +
> > + if (cmd_buffer->state.pass->subpass_count == 1) {
> > + db.HierarchicalDepthBufferEnable = has_hiz;
> > + } else {
> > + anv_finishme("Multiple-subpass HiZ not implemented");
> > + }
> >
> > db.SurfaceFormat = isl_surf_get_depth_format(&device->isl_dev,
> >
> &image->depth_surface.isl);
> > @@ -1104,6 +1110,34 @@ cmd_buffer_emit_depth_stencil(struct
> anv_cmd_buffer *cmd_buffer)
> > }
> > }
> >
> > + if (has_hiz) {
>
> Note: This codepath is hit sometimes when
> 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false.
> Specifically, when subpass_count > 1. It's weird, but I doubt it causes
> any harm. After all, all the surface data programmed by
> 3DSTATE_HIER_BUFFER is valid here regardless of the value of
> HierarchicalDepthBufferEnable.
>
> > + anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER),
> hdb) {
> > + hdb.HierarchicalDepthBufferObjectControlState = GENX(MOCS);
> > + hdb.SurfacePitch = image->hiz_surface.isl.row_pitch - 1;
> > + hdb.SurfaceBaseAddress = (struct anv_address) {
> > + .bo = image->bo,
> > + .offset = image->offset + image->hiz_surface.offset,
> > + };
> > +#if GEN_GEN >= 8
> > + /* From the SKL PRM Vol2a:
> > + *
> > + * The interpretation of this field is dependent on Surface
> Type
> > + * as follows:
> > + * - SURFTYPE_1D: distance in pixels between array slices
> > + * - SURFTYPE_2D/CUBE: distance in rows between array slices
> > + * - SURFTYPE_3D: distance in rows between R - slices
> > + *
> > + * ISL implements HiZ surfaces for 1D depth buffers as 2D.
> Therefore
> > + * the depth buffer needs to be checked for the dimension.
> > + */
> > + hdb.SurfaceQPitch =
> > + image->depth_surface.isl.dim == ISL_SURF_DIM_1D ?
> > + isl_surf_get_array_pitch_el(&image->hiz_surface.isl) >>
> 2 :
> > + isl_surf_get_array_pitch_el_rows(&image->hiz_surface.isl)
> >> 2;
> > +#endif
> > + }
> > + }
>
> Here I expected to see:
>
> } else {
> /* Disable hierarchial depth buffers. */
> anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER),
> hz);
> }
>
> I'm not insisting that you *should* do that; just that I expected it.
>
> Why? If the current depth surface has no hiz surface, then any
> previously emitted 3DSTATE_HIER_DEPTH_BUFFER remains active. That
> instruction persists across render passes. If the HiZ surface referenced
> by the stale 3DSTATE_HIER_DEPTH_BUFFER no longer exists, what happens?
> Does the GPU attempt to prefetch that surface before observing
> 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, therefore
> performing an illegal memory access? Does the GPU first observe that
> 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, and so
> ignore the active (but stale) 3DSTATE_HIER_DEPTH_BUFFER instruction,
> doesn't fetch the stale surface memory, and all is good? Or something
> completely different that we haven't thought of?
>
> I don't know what the GPU does in that case. So it makes me feel safer
> if we emit a zero-filled 3DSTATE_HIER_DEPTH_BUFFER when HiZ is disabled.
> After all, that's what i965 does. Maybe it's unneeded. But if
> a zero-filled 3DSTATE_HIER_DEPTH_BUFFER *is* needed, the only symptom
> would be rare and difficult-to-reproduce rendering bugs and GPU hangs.
> And I don't want to risk that.
>
I completely agree. On older hardware, DEPTH_BUFFER, HIER_DEPTH_BUFFER,
and CLEAR_PARAMS were required to be emitted as a block. I think newer
hardware may be a bit more forgiving, but I share Chad's reservations and
would rather we just disable it.
> > /* Emit 3DSTATE_STENCIL_BUFFER */
> > if (has_stencil) {
> > anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER),
> sb) {
> > @@ -1126,9 +1160,6 @@ cmd_buffer_emit_depth_stencil(struct
> anv_cmd_buffer *cmd_buffer)
> > anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER),
> sb);
> > }
> >
> > - /* Disable hierarchial depth buffers. */
> > - anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER),
> hz);
> > -
> > /* Clear the clear params. */
> > anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
>
> > }
> > @@ -1164,6 +1195,7 @@ void genX(CmdBeginRenderPass)(
> > genX(flush_pipeline_select_3d)(cmd_buffer);
> >
> > genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
> > + genX(cmd_buffer_do_hz_op)(cmd_buffer, ANV_HZ_OP_HIZ_RESOLVE);
> > anv_cmd_buffer_clear_subpass(cmd_buffer);
> > }
> >
> > @@ -1185,6 +1217,7 @@ void genX(CmdEndRenderPass)(
> > {
> > ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
> >
> > + genX(cmd_buffer_do_hz_op)(cmd_buffer, ANV_HZ_OP_DEPTH_RESOLVE);
> > anv_cmd_buffer_resolve_subpass(cmd_buffer);
> >
> > #ifndef NDEBUG
> > --
> > 2.9.3
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160902/9c368170/attachment.html>
More information about the mesa-dev
mailing list