[Mesa-dev] [PATCH v3 09/10] anv: Enable fast depth clears

Nanley Chery nanleychery at gmail.com
Wed Oct 26 22:47:38 UTC 2016


On Thu, Oct 06, 2016 at 03:21:54PM -0700, Nanley Chery wrote:
> Provides an FPS increase of ~30% on the Sascha triangle and multisampling
> demos.

After attempting to enable fast depth clears in more areas, I noticed
something possibly worth sharing. Most of the performance gain from this
patch isn't due to the difference in performance between a fast depth
clear and a regular depth clear with HiZ enabled. Most of the gain
comes from avoiding meta's suboptimal interaction with the HiZ buffer,
which, if I understand correctly, was as follows:
	1. HiZ Resolve
	2. Depth clear with HiZ enabled
	3. Depth Resolve

A similar increase in performance arose for slow depth clears once the
meta path was replaced with blorp in commit
d823f92970447859c4891728da4e48f0c9bc0044. The actual fast vs. slow
depth clear delta seems to be only roughly 1%. I obtained that figure
by simply watching the FPS counter across several runs.
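
As a rough sketch of how such a delta could be computed from per-run FPS
readings: average each set of runs, then take the relative difference.
The helper names below are hypothetical and not part of Mesa, and no
measured values from this patch are implied.

```c
#include <stddef.h>

/* Hypothetical helper: arithmetic mean of n FPS readings.
 * Returns 0.0 for an empty set so callers never divide by zero here. */
static double
fps_mean(const double *runs, size_t n)
{
   if (n == 0)
      return 0.0;

   double sum = 0.0;
   for (size_t i = 0; i < n; ++i)
      sum += runs[i];

   return sum / (double)n;
}

/* Hypothetical helper: percentage change of candidate relative to
 * baseline. The caller must pass a non-zero baseline. */
static double
fps_percent_delta(double baseline, double candidate)
{
   return (candidate - baseline) / baseline * 100.0;
}
```

Feeding the mean of the slow-clear runs in as the baseline and the mean
of the fast-clear runs as the candidate gives the percentage delta.
Averaging over several runs only smooths run-to-run noise; it does not
replace proper profiling.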

- Nanley

> 
> Signed-off-by: Nanley Chery <nanley.g.chery at intel.com>
> Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v2)
> 
> ---
> v3. Emit required clear_params packet (Chad)
>     Share clear_params code path IVB+ (Jason)
> 
>  src/intel/vulkan/anv_pass.c        | 13 +++++++++++++
>  src/intel/vulkan/genX_cmd_buffer.c | 24 ++++++++++++++++++++++--
>  2 files changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> index 69c3c7e..595c2ea 100644
> --- a/src/intel/vulkan/anv_pass.c
> +++ b/src/intel/vulkan/anv_pass.c
> @@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity(
>      VkRenderPass                                renderPass,
>      VkExtent2D*                                 pGranularity)
>  {
> +   ANV_FROM_HANDLE(anv_render_pass, pass, renderPass);
> +
> +   /* This granularity satisfies HiZ fast clear alignment requirements
> +    * for all sample counts.
> +    */
> +   for (unsigned i = 0; i < pass->subpass_count; ++i) {
> +      if (pass->subpasses[i].depth_stencil_attachment !=
> +          VK_ATTACHMENT_UNUSED) {
> +         *pGranularity = (VkExtent2D) { .width = 8, .height = 4 };
> +         return;
> +      }
> +   }
> +
>     *pGranularity = (VkExtent2D) { 1, 1 };
>  }
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
> index ed6a109..4089fc7 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -1318,8 +1318,27 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer *cmd_buffer)
>        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
>     }
>  
> -   /* Clear the clear params. */
> -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
> +   /* From the IVB PRM Vol2P1, 11.5.5.4 3DSTATE_CLEAR_PARAMS:
> +    *
> +    *    3DSTATE_CLEAR_PARAMS must always be programmed in the along with
> +    *    the other Depth/Stencil state commands(i.e. 3DSTATE_DEPTH_BUFFER,
> +    *    3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)
> +    *
> +    * Testing also shows that some variant of this restriction may exist HSW+.
> +    * On BDW+, it is not possible to emit 2 of these packets consecutively when
> +    * both have DepthClearValueValid set. An analysis of such state programming
> +    * on SKL showed that the GPU doesn't register the latter packet's clear
> +    * value.
> +    */
> +   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp) {
> +      if (has_hiz) {
> +         cp.DepthClearValueValid = true;
> +         const uint32_t ds =
> +            cmd_buffer->state.subpass->depth_stencil_attachment;
> +         cp.DepthClearValue =
> +            cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth;
> +      }
> +   }
>  }
>  
>  static void
> @@ -1332,6 +1351,7 @@ genX(cmd_buffer_set_subpass)(struct anv_cmd_buffer *cmd_buffer,
>  
>     cmd_buffer_emit_depth_stencil(cmd_buffer);
>     genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_HIZ_RESOLVE);
> +   genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_DEPTH_CLEAR);
>  
>     anv_cmd_buffer_clear_subpass(cmd_buffer);
>  }
> -- 
> 2.10.0
> 
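
For context on the 8x4 render area granularity reported in the patch
above: Vulkan treats a render area as optimal when its offset is a
multiple of the granularity and its extent is either a multiple of the
granularity or reaches the framebuffer edge. Below is a minimal sketch of
that check from the application side. The struct and function names are
hypothetical stand-ins so the example builds without the Vulkan headers;
none of them come from anv.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkExtent2D and VkRect2D. */
struct extent2d { uint32_t width, height; };
struct rect2d   { int32_t x, y; uint32_t width, height; };

/* Returns true if 'area' meets the alignment rule for granularity 'gran'
 * inside a framebuffer of size 'fb':
 *  - the offset is a multiple of the granularity, and
 *  - each extent is a multiple of the granularity or ends exactly at the
 *    framebuffer edge.
 */
static bool
render_area_is_aligned(struct rect2d area, struct extent2d gran,
                       struct extent2d fb)
{
   if (area.x < 0 || area.y < 0)
      return false;

   const uint32_t x = (uint32_t)area.x;
   const uint32_t y = (uint32_t)area.y;

   if (x % gran.width != 0 || y % gran.height != 0)
      return false;

   const bool width_ok  = area.width % gran.width == 0 ||
                          x + area.width == fb.width;
   const bool height_ok = area.height % gran.height == 0 ||
                          y + area.height == fb.height;

   return width_ok && height_ok;
}
```

With a granularity of { 8, 4 }, a render area at offset (4, 0) would fail
this check, while one at offset (8, 4) that extends to the framebuffer
edge would pass even if the framebuffer size is not a multiple of 8x4.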

