[Mesa-dev] [PATCH 10/12] genX/cmd_buffer: Enable rendering to HiZ

Nanley Chery nanleychery at gmail.com
Mon Sep 19 20:49:09 UTC 2016


On Fri, Sep 02, 2016 at 03:16:21PM -0700, Chad Versace wrote:
> On Wed 31 Aug 2016, Nanley Chery wrote:
> > From: Chad Versace <chad.versace at intel.com>
> > 
> > Nanley Chery:
> > (rebase)
> >  - Resolve conflicts with new anv_batch_emit macro
> > (amend)
> >  - Remove wip! tag and handle a QPitch TODO
> >  - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
> >  - Only use HiZ for single-subpass renderpasses
> >  - Emit the HiZ instruction before the stencil instruction to follow the
> >    optimized clear sequence specified in the PRMs
> >  - Don't modify clear params
> >  - Enable resolves when a HiZ buffer is used to ensure depth buffer validity
> > 
> > Provides an FPS increase of ~15% on the Sascha triangle and multisampling
> > demos.
> > 
> > Signed-off-by: Nanley Chery <nanley.g.chery at intel.com>
> > ---
> >  src/intel/vulkan/gen8_cmd_buffer.c |  4 ++++
> >  src/intel/vulkan/genX_cmd_buffer.c | 41 ++++++++++++++++++++++++++++++++++----
> >  2 files changed, 41 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c b/src/intel/vulkan/gen8_cmd_buffer.c
> > index 4f27350..7f65fe2 100644
> > --- a/src/intel/vulkan/gen8_cmd_buffer.c
> > +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> > @@ -414,6 +414,10 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer, enum anv_hz_op op)
> >     if (iview == NULL || !anv_image_has_hiz(iview->image))
> >        return;
> >  
> > +   /* FIXME: Implement multi-subpass HiZ */
> > +   if (cmd_buffer->state.pass->subpass_count > 1)
> > +      return;
> > +
> >     const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
> >     const bool full_surface_op =
> >               cmd_state->render_area.extent.width == iview->extent.width &&
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
> > index 95ed5f2..349d2a4 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -1040,6 +1040,7 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
> >        anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
> >     const struct anv_image *image = iview ? iview->image : NULL;
> >     const bool has_depth = image && (image->aspects & VK_IMAGE_ASPECT_DEPTH_BIT);
> > +   const bool has_hiz = image != NULL && anv_image_has_hiz(image);
> >     const bool has_stencil =
> >        image && (image->aspects & VK_IMAGE_ASPECT_STENCIL_BIT);
> 
> >  
> > @@ -1052,7 +1053,12 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
> >           db.SurfaceType                   = SURFTYPE_2D;
> >           db.DepthWriteEnable              = true;
> >           db.StencilWriteEnable            = has_stencil;
> > -         db.HierarchicalDepthBufferEnable = false;
> > +
> > +         if (cmd_buffer->state.pass->subpass_count == 1) {
> > +            db.HierarchicalDepthBufferEnable = has_hiz;
> > +         } else {
> > +            anv_finishme("Multiple-subpass HiZ not implemented");
> > +         }
> >  
> >           db.SurfaceFormat = isl_surf_get_depth_format(&device->isl_dev,
> >                                                        &image->depth_surface.isl);
> > @@ -1104,6 +1110,34 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
> >        }
> >     }
> >  
> > +   if (has_hiz) {
> 
> Note: This codepath is hit sometimes when
> 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false.
> Specifically, when subpass_count > 1. It's weird, but I doubt it causes
> any harm. After all, all the surface data programmed by
> 3DSTATE_HIER_BUFFER is valid here regardless of the value of
> HierarchicalDepthBufferEnable.
> 
> > +      anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hdb) {
> > +         hdb.HierarchicalDepthBufferObjectControlState = GENX(MOCS);
> > +         hdb.SurfacePitch = image->hiz_surface.isl.row_pitch - 1;
> > +         hdb.SurfaceBaseAddress = (struct anv_address) {
> > +            .bo = image->bo,
> > +            .offset = image->offset + image->hiz_surface.offset,
> > +         };
> > +#if GEN_GEN >= 8
> > +         /* From the SKL PRM Vol2a:
> > +          *
> > +          *    The interpretation of this field is dependent on Surface Type
> > +          *    as follows:
> > +          *    - SURFTYPE_1D: distance in pixels between array slices
> > +          *    - SURFTYPE_2D/CUBE: distance in rows between array slices
> > +          *    - SURFTYPE_3D: distance in rows between R - slices
> > +          *
> > +          * ISL implements HiZ surfaces for 1D depth buffers as 2D. Therefore
> > +          * the depth buffer needs to be checked for the dimension.
> > +          */
> > +         hdb.SurfaceQPitch =
> > +            image->depth_surface.isl.dim == ISL_SURF_DIM_1D ?
> > +               isl_surf_get_array_pitch_el(&image->hiz_surface.isl) >> 2 :
> > +               isl_surf_get_array_pitch_el_rows(&image->hiz_surface.isl) >> 2;
> > +#endif
> > +      }
> > +   }
> 
> Here I expected to see:
> 
>    } else {
>        /* Disable hierarchial depth buffers. */
>        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hz);
>    }
> 
> I'm not insisting that you *should* do that; just that I expected it.
> 
> Why? If the current depth surface has no hiz surface, then any
> previously emitted 3DSTATE_HIER_DEPTH_BUFFER remains active. That
> instruction persists across render passes. If the HiZ surface referenced
> by the stale 3DSTATE_HIER_DEPTH_BUFFER no longer exists, what happens?
> Does the GPU attempt to prefetch that surface before observing
> 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, therefore
> performing an illegal memory access? Does the GPU first observe that
> 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, and so
> ignore the active (but stale) 3DSTATE_HIER_DEPTH_BUFFER instruction,
> doesn't fetch the stale surface memory, and all is good? Or something
> completely different that we haven't thought of?
> 

Is there any documentation that mentions prefetching surfaces? I was
only able to find some on state and instruction prefetching.

I'd expect that the GPU would avoid accessing a region of memory that
it knows to be undefined. There are some programming notes that agree
with this expectation. For example, the depth resolve and HiZ
ambiguation state fields in the BDW+ PRMs say:

  If Hierarchical Depth Buffer Enable is disabled, enabling this field
  will have no effect.

Nonetheless, the overhead of the extra packet should be negligible so
I can emit the empty packet if you'd like.

Nanley 

> I don't know what the GPU does in that case. So it makes me feel safer
> if we emit a zero-filled 3DSTATE_HIER_DEPTH_BUFFER when HiZ is disabled.
> After all, that's what i965 does. Maybe it's unneeded. But if
> a zero-filled 3DSTATE_HIER_DEPTH_BUFFER *is* needed, the only symptom
> would be rare and difficult-to-reproduce rendering bugs and GPU hangs.
> And I don't want to risk that.
> 
> >     /* Emit 3DSTATE_STENCIL_BUFFER */
> >     if (has_stencil) {
> >        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb) {
> > @@ -1126,9 +1160,6 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
> >        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
> >     }
> >  
> > -   /* Disable hierarchial depth buffers. */
> > -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_HIER_DEPTH_BUFFER), hz);
> > -
> >     /* Clear the clear params. */
> >     anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
> 
> >  }
> > @@ -1164,6 +1195,7 @@ void genX(CmdBeginRenderPass)(
> >     genX(flush_pipeline_select_3d)(cmd_buffer);
> >  
> >     genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
> > +   genX(cmd_buffer_do_hz_op)(cmd_buffer, ANV_HZ_OP_HIZ_RESOLVE);
> >     anv_cmd_buffer_clear_subpass(cmd_buffer);
> >  }
> >  
> > @@ -1185,6 +1217,7 @@ void genX(CmdEndRenderPass)(
> >  {
> >     ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);
> >  
> > +   genX(cmd_buffer_do_hz_op)(cmd_buffer, ANV_HZ_OP_DEPTH_RESOLVE);
> >     anv_cmd_buffer_resolve_subpass(cmd_buffer);
> >  
> >  #ifndef NDEBUG
> > -- 
> > 2.9.3
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list