[Mesa-dev] [PATCH v2 17/24] anv: Use blorp_ccs_ambiguate instead of fast-clears

Nanley Chery nanleychery at gmail.com
Tue Jan 30 22:54:40 UTC 2018


On Fri, Jan 19, 2018 at 03:47:34PM -0800, Jason Ekstrand wrote:
> Even though the blorp pass looks a bit on the sketchy side, the end
> result in the Vulkan driver is very nice.  Instead of having this weird
> case where you do a fast clear and then maybe have to resolve, we just
> do the ambiguate and are done with it.  The ambiguate does exactly what
> we want of setting all the CCS values to 0 which puts it inot the
                                                           ^
							   in
> pass-through state.
> 
> This should also improve performance a bit in certain cases.  For
> instance, if we did a transition from UNDEFINED to GENERAL for a surface
> that doesn't have CCS enabled all the time, we would end up doing a
> fast-clear and then a full resolve which ends up touching every byte in
> the main surface as well as the CCS.  With the ambiguate pass, that
> transition only touches the CCS.
> ---
>  src/intel/vulkan/anv_blorp.c       |  5 ++++
>  src/intel/vulkan/genX_cmd_buffer.c | 54 +++++++++-----------------------------
>  2 files changed, 17 insertions(+), 42 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 05efc6d..3698543 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -1792,6 +1792,11 @@ anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
>                          surf.surf->format, isl_to_blorp_fast_clear_op(ccs_op));
>        break;
>     case ISL_AUX_OP_AMBIGUATE:
> +      for (uint32_t a = 0; a < layer_count; a++) {
> +         const uint32_t layer = base_layer + a;
> +         blorp_ccs_ambiguate(&batch, &surf, level, layer);
> +      }
> +      break;
>     default:
>        unreachable("Unsupported CCS operation");
>     }
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
> index 77fdadf..9e2eba3 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -486,15 +486,6 @@ init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
>     uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
>     enum isl_aux_usage aux_usage = image->planes[plane].aux_usage;
>  
> -   /* The resolve flag should updated to signify that fast-clear/compression
> -    * data needs to be removed when leaving the undefined layout. Such data
> -    * may need to be removed if it would cause accesses to the color buffer
> -    * to return incorrect data. The fast clear data in CCS_D buffers should
> -    * be removed because CCS_D isn't enabled all the time.
> -    */
> -   genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level,
> -                                 aux_usage == ISL_AUX_USAGE_NONE);
> -
>     /* The fast clear value dword(s) will be copied into a surface state object.
>      * Ensure that the restrictions of the fields in the dword(s) are followed.
>      *
> @@ -677,10 +668,9 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
>        for (unsigned level = base_level; level < last_level_num; level++)
>           init_fast_clear_state_entry(cmd_buffer, image, aspect, level);
>  
> -      /* Initialize the aux buffers to enable correct rendering. This operation
> -       * requires up to two steps: one to rid the aux buffer of data that may
> -       * cause GPU hangs, and another to ensure that writes done without aux
> -       * will be visible to reads done with aux.
> +      /* Initialize the aux buffers to enable correct rendering.  In order to
> +       * ensure that things such as storage images work correctly, aux buffers
> +       * are initialized to the pass-through state.

Only CCS is initialized to the pass-through state while MCS is
fast-cleared. We may also want to update the comment below since we're
no longer fast-clearing CCS.

>         *
>         * Having an aux buffer with invalid data is possible for CCS buffers
>         * SKL+ and for MCS buffers with certain sample counts (2x and 8x). One
> @@ -692,16 +682,18 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
>         * We don't have any data to show that this is a problem, but we want to
>         * avoid causing difficult-to-debug problems.
>         */
> -      if (GEN_GEN >= 9 && image->samples == 1) {
> +      if (image->samples == 1) {
>           for (uint32_t l = 0; l < level_count; l++) {
>              const uint32_t level = base_level + l;
>              const uint32_t level_layer_count =
>                 MIN2(layer_count, anv_image_aux_layers(image, aspect, level));
>              anv_image_ccs_op(cmd_buffer, image, aspect, level,
>                               base_layer, level_layer_count,
> -                             ISL_AUX_OP_FAST_CLEAR, false);
> +                             ISL_AUX_OP_AMBIGUATE, false);
> +            genX(set_image_needs_resolve)(cmd_buffer, image,
> +                                          aspect, level, false);
>           }
> -      } else if (image->samples > 1) {
> +      } else {
>           if (image->samples == 4 || image->samples == 16) {
>              anv_perf_warn(cmd_buffer->device->instance, image,
>                            "Doing a potentially unnecessary fast-clear to "
> @@ -712,32 +704,10 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
>           anv_image_mcs_op(cmd_buffer, image, aspect,
>                            base_layer, layer_count,
>                            ISL_AUX_OP_FAST_CLEAR, false);
> -      }
> -      /* At this point, some elements of the CCS buffer may have the fast-clear
> -       * bit-arrangement. As the user writes to a subresource, we need to have
> -       * the associated CCS elements enter the ambiguated state. This enables
> -       * reads (implicit or explicit) to reflect the user-written data instead
> -       * of the clear color. The only time such elements will not change their
> -       * state as described above, is in a final layout that doesn't have CCS
> -       * enabled. In this case, we must force the associated CCS buffers of the
> -       * specified range to enter the ambiguated state in advance.
> -       */
> -      if (image->samples == 1 &&
> -          image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E &&
> -          final_layout != VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL) {
> -         /* The CCS_D buffer may not be enabled in the final layout. Call this
> -          * function again with a initial layout of COLOR_ATTACHMENT_OPTIMAL
> -          * to perform a resolve.
> -          */
> -          anv_perf_warn(cmd_buffer->device->instance, image,
> -                        "Performing an additional resolve for CCS_D layout "
> -                        "transition. Consider always leaving it on or "
> -                        "performing an ambiguation pass.");
> -         transition_color_buffer(cmd_buffer, image, aspect,
> -                                 base_level, level_count,
> -                                 base_layer, layer_count,
> -                                 VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
> -                                 final_layout);

Nice to be rid of this.

> +         for (unsigned level = base_level; level < last_level_num; level++) {
> +            genX(set_image_needs_resolve)(cmd_buffer, image,
> +                                          aspect, level, true);
> +         }

Why set needs resolve true here?

-Nanley

>        }
>        return;
>     }
> -- 
> 2.5.0.400.gff86faf
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list