[Mesa-dev] [PATCH v3 09/10] anv: Enable fast depth clears

Jason Ekstrand jason at jlekstrand.net
Thu Oct 27 03:14:42 UTC 2016


On Wed, Oct 26, 2016 at 3:47 PM, Nanley Chery <nanleychery at gmail.com> wrote:

> On Thu, Oct 06, 2016 at 03:21:54PM -0700, Nanley Chery wrote:
> > Provides an FPS increase of ~30% on the Sascha triangle and multisampling
> > demos.
>
> After attempting to enable fast depth clears in more areas, I noticed
> something possibly worth sharing. Most of the performance gain from this
> patch isn't due to the delta in fast depth clear's performance compared
> to that of a regular depth clear with HiZ enabled. Most of the gain
> comes from avoiding meta's suboptimal interaction with the
> HiZ buffer, which, if I understand correctly, was as follows:
>         1. HiZ Resolve
>         2. Depth clear with HiZ enabled
>         3. Depth Resolve
>
> A similar increase in performance arose for slow depth clears once the
> meta path was replaced with blorp at commit
> d823f92970447859c4891728da4e48f0c9bc0044. It seems like the actual
> fast vs slow depth clear delta is only roughly 1%. I obtained that
> figure by simply looking at the FPS counter across several runs.
>

Cool!  Thanks for digging into this!  I'm glad that our theory (about
non-HiZ clears being fast when HiZ is enabled) wasn't crazy.  If we can
avoid using HiZ clears, that should make some of the tracking issues easier
and we should be able to enable texture-with-hiz fairly easily.
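
For readers following the thread, the sequence Nanley lists can be modelled
roughly as below. This is only a sketch: the enum and array names are
hypothetical labels, not Mesa identifiers, and the single-operation "direct"
path is an assumption drawn from his description of the blorp and fast-clear
results rather than from the driver source.

```c
#include <stddef.h>

/* Hypothetical labels for the depth operations named in the thread.
 * These are not Mesa identifiers. */
enum depth_op {
   OP_HIZ_RESOLVE,
   OP_DEPTH_CLEAR,
   OP_DEPTH_RESOLVE,
};

/* The old meta path as described above: the clear is bracketed by
 * two resolves. */
static const enum depth_op meta_clear_ops[] = {
   OP_HIZ_RESOLVE,
   OP_DEPTH_CLEAR,
   OP_DEPTH_RESOLVE,
};

/* Assumption: a blorp slow clear or a HiZ fast clear only needs the
 * clear itself. */
static const enum depth_op direct_clear_ops[] = {
   OP_DEPTH_CLEAR,
};

/* Count the operations in a sequence that are not the clear itself,
 * i.e. the extra resolve passes. */
static size_t
count_resolves(const enum depth_op *ops, size_t n)
{
   size_t resolves = 0;
   for (size_t i = 0; i < n; i++) {
      if (ops[i] != OP_DEPTH_CLEAR)
         resolves++;
   }
   return resolves;
}
```

Under this model the meta path pays for two extra resolves per clear and the
direct path pays for none, which would be consistent with most of the gain
coming from dropping the resolves rather than from the clear itself.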


> - Nanley
>
> >
> > Signed-off-by: Nanley Chery <nanley.g.chery at intel.com>
> > Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v2)
> >
> > ---
> > v3. Emit required clear_params packet (Chad)
> >     Share clear_params code path IVB+ (Jason)
> >
> >  src/intel/vulkan/anv_pass.c        | 13 +++++++++++++
> >  src/intel/vulkan/genX_cmd_buffer.c | 24 ++++++++++++++++++++++--
> >  2 files changed, 35 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> > index 69c3c7e..595c2ea 100644
> > --- a/src/intel/vulkan/anv_pass.c
> > +++ b/src/intel/vulkan/anv_pass.c
> > @@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity(
> >      VkRenderPass                                renderPass,
> >      VkExtent2D*                                 pGranularity)
> >  {
> > +   ANV_FROM_HANDLE(anv_render_pass, pass, renderPass);
> > +
> > +   /* This granularity satisfies HiZ fast clear alignment requirements
> > +    * for all sample counts.
> > +    */
> > +   for (unsigned i = 0; i < pass->subpass_count; ++i) {
> > +      if (pass->subpasses[i].depth_stencil_attachment !=
> > +          VK_ATTACHMENT_UNUSED) {
> > +         *pGranularity = (VkExtent2D) { .width = 8, .height = 4 };
> > +         return;
> > +      }
> > +   }
> > +
> >     *pGranularity = (VkExtent2D) { 1, 1 };
> >  }
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
> > index ed6a109..4089fc7 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -1318,8 +1318,27 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
> >        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
> >     }
> >
> > -   /* Clear the clear params. */
> > -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
> > +   /* From the IVB PRM Vol2P1, 11.5.5.4 3DSTATE_CLEAR_PARAMS:
> > +    *
> > +    *    3DSTATE_CLEAR_PARAMS must always be programmed in the along with
> > +    *    the other Depth/Stencil state commands(i.e. 3DSTATE_DEPTH_BUFFER,
> > +    *    3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)
> > +    *
> > +    * Testing also shows that some variant of this restriction may exist HSW+.
> > +    * On BDW+, it is not possible to emit 2 of these packets consecutively when
> > +    * both have DepthClearValueValid set. An analysis of such state programming
> > +    * on SKL showed that the GPU doesn't register the latter packet's clear
> > +    * value.
> > +    */
> > +   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp) {
> > +      if (has_hiz) {
> > +         cp.DepthClearValueValid = true;
> > +         const uint32_t ds =
> > +            cmd_buffer->state.subpass->depth_stencil_attachment;
> > +         cp.DepthClearValue =
> > +            cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth;
> > +      }
> > +   }
> >  }
> >
> >  static void
> > @@ -1332,6 +1351,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
> >
> >     cmd_buffer_emit_depth_stencil(cmd_buffer);
> >     genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_HIZ_RESOLVE);
> > +   genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_DEPTH_CLEAR);
> >
> >     anv_cmd_buffer_clear_subpass(cmd_buffer);
> >  }
> > --
> > 2.10.0
> >
>
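
The granularity hunk in anv_pass.c above can be exercised in isolation with a
small stand-alone model. This is a sketch under stated assumptions: struct
extent2d stands in for VkExtent2D, ATTACHMENT_UNUSED stands in for
VK_ATTACHMENT_UNUSED, and rect_is_aligned() is a hypothetical helper that is
not part of the patch. Only the 8x4 and 1x1 values come from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for VkExtent2D so the sketch builds without Vulkan headers. */
struct extent2d {
   uint32_t width;
   uint32_t height;
};

/* Hypothetical marker playing the role of VK_ATTACHMENT_UNUSED. */
#define ATTACHMENT_UNUSED UINT32_MAX

/* Mirrors the loop in the patch: if any subpass uses a depth/stencil
 * attachment, report 8x4 so that render areas meet the HiZ fast clear
 * alignment; otherwise report 1x1. */
static struct extent2d
render_area_granularity(const uint32_t *subpass_ds_attachments,
                        uint32_t subpass_count)
{
   for (uint32_t i = 0; i < subpass_count; i++) {
      if (subpass_ds_attachments[i] != ATTACHMENT_UNUSED)
         return (struct extent2d) { .width = 8, .height = 4 };
   }
   return (struct extent2d) { .width = 1, .height = 1 };
}

/* Hypothetical helper: checks that a render-area rectangle sits on the
 * granularity grid. It deliberately ignores the case where the
 * rectangle ends at the framebuffer edge. */
static bool
rect_is_aligned(uint32_t x, uint32_t y, uint32_t w, uint32_t h,
                struct extent2d g)
{
   return x % g.width == 0 && y % g.height == 0 &&
          w % g.width == 0 && h % g.height == 0;
}
```

An application that rounds its render area to the reported granularity would
pass rect_is_aligned() for every pass that has a depth attachment, which is
the case the fast clear path depends on.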