[Mesa-dev] [PATCH 10/12] genX/cmd_buffer: Enable rendering to HiZ

Chad Versace chadversary at chromium.org
Fri Sep 2 22:16:21 UTC 2016


On Wed 31 Aug 2016, Nanley Chery wrote:
> From: Chad Versace <chad.versace at intel.com>
> 
> Nanley Chery:
> (rebase)
>  - Resolve conflicts with new anv_batch_emit macro
> (amend)
>  - Remove wip! tag and handle a QPitch TODO
>  - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
>  - Only use HiZ for single-subpass renderpasses
>  - Emit the HiZ instruction before the stencil instruction to follow the
>    optimized clear sequence specified in the PRMs
>  - Don't modify clear params
>  - Enable resolves when a HiZ buffer is used to ensure depth buffer validity
> 
> Provides an FPS increase of ~15% on the Sascha triangle and multisampling
> demos.
> 
> Signed-off-by: Nanley Chery <nanley.g.chery at intel.com>
> ---
>  src/intel/vulkan/gen8_cmd_buffer.c |  4 ++++
>  src/intel/vulkan/genX_cmd_buffer.c | 41 ++++++++++++++++++++++++++++++++++----
>  2 files changed, 41 insertions(+), 4 deletions(-)
> 
> diff --git a/src/intel/vulkan/gen8_cmd_buffer.c b/src/intel/vulkan/gen8_cmd_buffer.c
> index 4f27350..7f65fe2 100644
> --- a/src/intel/vulkan/gen8_cmd_buffer.c
> +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> @@ -414,6 +414,10 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer, enum anv_hz_op op)
>     if (iview == NULL || !anv_image_has_hiz(iview->image))
>        return;
>  
> +   /* FIXME: Implement multi-subpass HiZ */
> +   if (cmd_buffer->state.pass->subpass_count > 1)
> +      return;
> +
>     const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
>     const bool full_surface_op =
>               cmd_state->render_area.extent.width == iview->extent.width &&
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
> index 95ed5f2..349d2a4 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -1040,6 +1040,7 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
>        anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
>     const struct anv_image *image = iview ? iview->image : NULL;
>     const bool has_depth = image && (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT);
> +   const bool has_hiz = image != NULL && anv_image_has_hiz(image);
>     const bool has_stencil =
>        image && (image->aspects & VK_IMAGE_ASPECT_STENCIL_BIT);

>  
> @@ -1052,7 +1053,12 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
>           db.SurfaceType                   = SURFTYPE_2D;
>           db.DepthWriteEnable              = true;
>           db.StencilWriteEnable            = has_stencil;
> -         db.HierarchicalDepthBufferEnable = false;
> +
> +         if (cmd_buffer->state.pass->subpass_count == 1) {
> +            db.HierarchicalDepthBufferEnable = has_hiz;
> +         } else {
> +            anv_finishme("Multiple-subpass HiZ not implemented");
> +         }
>  
>           db.SurfaceFormat = isl_surf_get_depth_format(&device->isl_dev,
>                                                        &image->depth_surface.isl);
> @@ -1104,6 +1110,34 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
>        }
>     }
>  
> +   if (has_hiz) {

Note: This codepath is hit sometimes when
3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false.
Specifically, when subpass_count > 1. It's weird, but I doubt it causes
any harm. After all, all the surface data programmed by
3DSTATE_HIER_BUFFER is valid here regardless of the value of
HierarchicalDepthBufferEnable.

> +      anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hdb) {
> +         hdb.HierarchicalDepthBufferObjectControlState = GENX(MOCS);
> +         hdb.SurfacePitch = image->hiz_surface.isl.row_pitch - 1;
> +         hdb.SurfaceBaseAddress = (struct anv_address) {
> +            .bo = image->bo,
> +            .offset = image->offset + image->hiz_surface.offset,
> +         };
> +#if GEN_GEN >= 8
> +         /* From the SKL PRM Vol2a:
> +          *
> +          *    The interpretation of this field is dependent on Surface Type
> +          *    as follows:
> +          *    - SURFTYPE_1D: distance in pixels between array slices
> +          *    - SURFTYPE_2D/CUBE: distance in rows between array slices
> +          *    - SURFTYPE_3D: distance in rows between R - slices
> +          *
> +          * ISL implements HiZ surfaces for 1D depth buffers as 2D. Therefore
> +          * the depth buffer needs to be checked for the dimension.
> +          */
> +         hdb.SurfaceQPitch =
> +            image->depth_surface.isl.dim == ISL_SURF_DIM_1D ?
> +               isl_surf_get_array_pitch_el(&image->hiz_surface.isl) >> 2 :
> +               isl_surf_get_array_pitch_el_rows(&image->hiz_surface.isl) >> 2;
> +#endif
> +      }
> +   }

Here I expected to see:

   } else {
       /* Disable hierarchial depth buffers. */
       anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hz);
   }

I'm not insisting that you *should* do that; just that I expected it.

Why? If the current depth surface has no hiz surface, then any
previously emitted 3DSTATE_HIER_DEPTH_BUFFER remains active. That
instruction persists across render passes. If the HiZ surface referenced
by the stale 3DSTATE_HIER_DEPTH_BUFFER no longer exists, what happens?
Does the GPU attempt to prefetch that surface before observing
3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, therefore
performing an illegal memory access? Does the GPU first observe that
3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, and so
ignore the active (but stale) 3DSTATE_HIER_DEPTH_BUFFER instruction,
doesn't fetch the stale surface memory, and all is good? Or something
completely different that we haven't thought of?

I don't know what the GPU does in that case. So it makes me feel safer
if we emit a zero-filled 3DSTATE_HIER_DEPTH_BUFFER when HiZ is disabled.
After all, that's what i965 does. Maybe it's unneeded. But if
a zero-filled 3DSTATE_HIER_DEPTH_BUFFER *is* needed, the only symptom
would be rare and difficult-to-reproduce rendering bugs and GPU hangs.
And I don't want to risk that.

>     /* Emit 3DSTATE_STENCIL_BUFFER */
>     if (has_stencil) {
>        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb) {
> @@ -1126,9 +1160,6 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
>        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
>     }
>  
> -   /* Disable hierarchial depth buffers. */
> -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hz);
> -
>     /* Clear the clear params. */
>     anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);

>  }
> @@ -1164,6 +1195,7 @@ void genX(CmdBeginRenderPass)(
>     genX(flush_pipeline_select_3d)(cmd_buffer);
>  
>     genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
> +   genX(cmd_buffer_do_hz_op)(cmd_buffer, ANV_HZ_OP_HIZ_RESOLVE);
>     anv_cmd_buffer_clear_subpass(cmd_buffer);
>  }
>  
> @@ -1185,6 +1217,7 @@ void genX(CmdEndRenderPass)(
>  {
>     ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
>  
> +   genX(cmd_buffer_do_hz_op)(cmd_buffer, ANV_HZ_OP_DEPTH_RESOLVE);
>     anv_cmd_buffer_resolve_subpass(cmd_buffer);
>  
>  #ifndef NDEBUG
> -- 
> 2.9.3
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list