<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jan 5, 2018 at 4:34 PM, Nanley Chery <span dir="ltr"><<a href="mailto:nanleychery@gmail.com" target="_blank">nanleychery@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Mon, Nov 27, 2017 at 07:06:03PM -0800, Jason Ekstrand wrote:<br>
</span><span class="">> This makes us start tracking two bits per level for aux to describe both<br>
> whether or not something is fast-cleared and whether or not it is<br>
> compressed as opposed to a single "needs resolve" bit.  The previous<br>
> scheme only worked because we assumed that CCS_E compressed images would<br>
> always be read with CCS_E enabled and so they didn't need any sort of<br>
> resolve if there was no fast clear color.<br>
><br>
> The assumptions of the previous scheme held because the one case where<br>
> we do need a full resolve when CCS_E is enabled is for window-system<br>
> images.  Since we only ever allowed X-tiled window-system images, CCS<br>
> was entirely disabled on gen9+ and we never got CCS_E.  With the advent<br>
> of Y-tiled window-system buffers, we now need to properly support doing<br>
> a full resolve images marked CCS_E.  This requires us to track things<br>
> more granularly because, if the client does two back-to-back transitions<br>
> where we first do a partial resolve and then a full resolve, we need<br>
> both resolves to happen.<br>
<br>
</span>This commit message says that this patch is necessary to do a full<br>
resolve with CCS_E, why is this so? Weren't the requirements fulfilled<br>
with the "anv/cmd_buffer: Generalize transition_color_buffer" patch<br>
(and my comment about disabling predication applied)?<span class="HOEnZb"><font color="#888888"><br></font></span></blockquote><div><br></div><div>Yes, we could just disable predication for full resolves on CCS_E but is that going to lead to extra resolves?  Also, as per comments about clear colors, this is going to have to be revised to force a resolve even if the clear color is zero.  In particular, if we have CCS_E with a zero clear color and we're transitioning it to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR and the image has I915_FORMAT_MOD_Y_TILED_CCS then we need to force a partial resolve.  It's kind-of a mess and I don't see a good way to do it without either tracking it all or having a bunch of redundant resolves.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="HOEnZb"><font color="#888888">
-Nanley<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
> ---<br>
>  src/intel/vulkan/anv_private.h     |  10 +--<br>
>  src/intel/vulkan/genX_cmd_<wbr>buffer.c | 158 ++++++++++++++++++++++++++++++<wbr>-------<br>
>  2 files changed, 135 insertions(+), 33 deletions(-)<br>
><br>
> diff --git a/src/intel/vulkan/anv_<wbr>private.h b/src/intel/vulkan/anv_<wbr>private.h<br>
> index f805246..e875705 100644<br>
> --- a/src/intel/vulkan/anv_<wbr>private.h<br>
> +++ b/src/intel/vulkan/anv_<wbr>private.h<br>
> @@ -2476,16 +2476,16 @@ anv_fast_clear_state_entry_<wbr>size(const struct anv_device *device)<br>
>  {<br>
>     assert(device);<br>
>     /* Entry contents:<br>
> -    *   +-----------------------------<wbr>---------------+<br>
> -    *   | clear value dword(s) | needs resolve dword |<br>
> -    *   +-----------------------------<wbr>---------------+<br>
> +    *   +-----------------------------<wbr>----------------+<br>
> +    *   | clear value dword(s) | needs resolve dwords |<br>
> +    *   +-----------------------------<wbr>----------------+<br>
>      */<br>
><br>
> -   /* Ensure that the needs resolve dword is in fact dword-aligned to enable<br>
> +   /* Ensure that the needs resolve dwords are in fact dword-aligned to enable<br>
>      * GPU memcpy operations.<br>
>      */<br>
>     assert(device->isl_dev.ss.<wbr>clear_value_size % 4 == 0);<br>
> -   return device->isl_dev.ss.clear_<wbr>value_size + 4;<br>
> +   return device->isl_dev.ss.clear_<wbr>value_size + 8;<br>
>  }<br>
><br>
>  static inline struct anv_address<br>
> diff --git a/src/intel/vulkan/genX_cmd_<wbr>buffer.c b/src/intel/vulkan/genX_cmd_<wbr>buffer.c<br>
> index 6aeffa3..2d47179 100644<br>
> --- a/src/intel/vulkan/genX_cmd_<wbr>buffer.c<br>
> +++ b/src/intel/vulkan/genX_cmd_<wbr>buffer.c<br>
> @@ -404,6 +404,22 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,<br>
>                         0, 0, 1, hiz_op);<br>
>  }<br>
><br>
> +/** Bitfield representing the needed resolves.<br>
> + *<br>
> + * This bitfield represents the kinds of compression we may or may not have in<br>
> + * a given image.  The ANV_IMAGE_NEEDS_* can be ANDed with the ANV_IMAGE_HAS<br>
> + * bits to determine when a given resolve actually needs to happen.<br>
> + *<br>
> + * For convenience of dealing with MI_PREDICATE, we blow these two bits out<br>
> + * into two dwords in the image meatadata page.<br>
> + */<br>
> +enum anv_image_resolve_bits {<br>
> +   ANV_IMAGE_HAS_FAST_CLEAR_BIT           = 0x1,<br>
> +   ANV_IMAGE_HAS_COMPRESSION_BIT          = 0x2,<br>
> +   ANV_IMAGE_NEEDS_PARTIAL_<wbr>RESOLVE_BITS   = 0x1,<br>
> +   ANV_IMAGE_NEEDS_FULL_RESOLVE_<wbr>BITS      = 0x3,<br>
> +};<br>
> +<br>
>  #define MI_PREDICATE_SRC0  0x2400<br>
>  #define MI_PREDICATE_SRC1  0x2408<br>
><br>
> @@ -411,23 +427,62 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,<br>
>   * performed properly.<br>
>   */<br>
>  static void<br>
> -set_image_needs_resolve(<wbr>struct anv_cmd_buffer *cmd_buffer,<br>
> -                        const struct anv_image *image,<br>
> -                        VkImageAspectFlagBits aspect,<br>
> -                        unsigned level, bool needs_resolve)<br>
> +set_image_needs_resolve_bits(<wbr>struct anv_cmd_buffer *cmd_buffer,<br>
> +                             const struct anv_image *image,<br>
> +                             VkImageAspectFlagBits aspect,<br>
> +                             unsigned level,<br>
> +                             enum anv_image_resolve_bits set_bits)<br>
>  {<br>
>     assert(cmd_buffer && image);<br>
>     assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_<wbr>ANV);<br>
>     assert(level < anv_image_aux_levels(image, aspect));<br>
><br>
> -   /* The HW docs say that there is no way to guarantee the completion of<br>
> -    * the following command. We use it nevertheless because it shows no<br>
> -    * issues in testing is currently being used in the GL driver.<br>
> -    */<br>
> -   anv_batch_emit(&cmd_buffer-><wbr>batch, GENX(MI_STORE_DATA_IMM), sdi) {<br>
> -      sdi.Address = anv_image_get_needs_resolve_<wbr>addr(cmd_buffer->device,<br>
> -                                                     image, aspect, level);<br>
> -      sdi.ImmediateData = needs_resolve;<br>
> +   const struct anv_address resolve_flag_addr =<br>
> +      anv_image_get_needs_resolve_<wbr>addr(cmd_buffer->device,<br>
> +                                       image, aspect, level);<br>
> +<br>
> +   if (set_bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {<br>
> +      anv_batch_emit(&cmd_buffer-><wbr>batch, GENX(MI_STORE_DATA_IMM), sdi) {<br>
> +         sdi.Address = resolve_flag_addr;<br>
> +         sdi.ImmediateData = 1;<br>
> +      }<br>
> +   }<br>
> +   if (set_bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {<br>
> +      anv_batch_emit(&cmd_buffer-><wbr>batch, GENX(MI_STORE_DATA_IMM), sdi) {<br>
> +         sdi.Address = resolve_flag_addr;<br>
> +         sdi.Address.offset += 4;<br>
> +         sdi.ImmediateData = 1;<br>
> +      }<br>
> +   }<br>
> +}<br>
> +<br>
> +static void<br>
> +clear_image_needs_resolve_<wbr>bits(struct anv_cmd_buffer *cmd_buffer,<br>
> +                               const struct anv_image *image,<br>
> +                               VkImageAspectFlagBits aspect,<br>
> +                               unsigned level,<br>
> +                               enum anv_image_resolve_bits clear_bits)<br>
> +{<br>
> +   assert(cmd_buffer && image);<br>
> +   assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_<wbr>ANV);<br>
> +   assert(level < anv_image_aux_levels(image, aspect));<br>
> +<br>
> +   const struct anv_address resolve_flag_addr =<br>
> +      anv_image_get_needs_resolve_<wbr>addr(cmd_buffer->device,<br>
> +                                       image, aspect, level);<br>
> +<br>
> +   if (clear_bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {<br>
> +      anv_batch_emit(&cmd_buffer-><wbr>batch, GENX(MI_STORE_DATA_IMM), sdi) {<br>
> +         sdi.Address = resolve_flag_addr;<br>
> +         sdi.ImmediateData = 0;<br>
> +      }<br>
> +   }<br>
> +   if (clear_bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {<br>
> +      anv_batch_emit(&cmd_buffer-><wbr>batch, GENX(MI_STORE_DATA_IMM), sdi) {<br>
> +         sdi.Address = resolve_flag_addr;<br>
> +         sdi.Address.offset += 4;<br>
> +         sdi.ImmediateData = 0;<br>
> +      }<br>
>     }<br>
>  }<br>
><br>
> @@ -435,7 +490,8 @@ static void<br>
>  load_needs_resolve_predicate(<wbr>struct anv_cmd_buffer *cmd_buffer,<br>
>                               const struct anv_image *image,<br>
>                               VkImageAspectFlagBits aspect,<br>
> -                             unsigned level)<br>
> +                             unsigned level,<br>
> +                             enum anv_image_resolve_bits bits)<br>
>  {<br>
>     assert(cmd_buffer && image);<br>
>     assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_<wbr>ANV);<br>
> @@ -450,9 +506,27 @@ load_needs_resolve_predicate(<wbr>struct anv_cmd_buffer *cmd_buffer,<br>
>      */<br>
>     emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1    , 0);<br>
>     emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1 + 4, 0);<br>
> -   emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0    , 0);<br>
> -   emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4,<br>
> -            <a href="http://resolve_flag_addr.bo" rel="noreferrer" target="_blank">resolve_flag_addr.bo</a>, resolve_flag_addr.offset);<br>
> +<br>
> +   /* Conditionally load the two dwords into the high and low portions of<br>
> +    * MI_PREDICATE_SRC0.  This effectively ANDs the bits passed into this<br>
> +    * function with the logical bits stored in the metadata page.  Because<br>
> +    * they're split out with one bit per dword, we don't need to use any<br>
> +    * sort of MI math.<br>
> +    */<br>
> +   if (bits & ANV_IMAGE_HAS_FAST_CLEAR_BIT) {<br>
> +      emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0,<br>
> +               <a href="http://resolve_flag_addr.bo" rel="noreferrer" target="_blank">resolve_flag_addr.bo</a>, resolve_flag_addr.offset);<br>
> +   } else {<br>
> +      emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0, 0);<br>
> +   }<br>
> +<br>
> +   if (bits & ANV_IMAGE_HAS_COMPRESSION_BIT) {<br>
> +      emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4,<br>
> +               <a href="http://resolve_flag_addr.bo" rel="noreferrer" target="_blank">resolve_flag_addr.bo</a>, resolve_flag_addr.offset + 4);<br>
> +   } else {<br>
> +      emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4, 0);<br>
> +   }<br>
> +<br>
>     anv_batch_emit(&cmd_buffer-><wbr>batch, GENX(MI_PREDICATE), mip) {<br>
>        mip.LoadOperation    = LOAD_LOADINV;<br>
>        mip.CombineOperation = COMBINE_SET;<br>
> @@ -467,6 +541,18 @@ genX(cmd_buffer_mark_image_<wbr>written)(struct anv_cmd_buffer *cmd_buffer,<br>
>                                      enum isl_aux_usage aux_usage,<br>
>                                      unsigned level)<br>
>  {<br>
> +   /* The only compression types with more than just fast-clears are MCS,<br>
> +    * CCS_E, and HiZ.  With HiZ we just trust the layout and don't actually<br>
> +    * track the current fast-clear and compression state.  This leaves us<br>
> +    * with just MCS and CCS_E.<br>
> +    */<br>
> +   if (aux_usage != ISL_AUX_USAGE_CCS_E &&<br>
> +       aux_usage != ISL_AUX_USAGE_MCS)<br>
> +      return;<br>
> +<br>
> +   set_image_needs_resolve_bits(<wbr>cmd_buffer, image,<br>
> +                                VK_IMAGE_ASPECT_COLOR_BIT, level,<br>
> +                                ANV_IMAGE_HAS_COMPRESSION_BIT)<wbr>;<br>
>  }<br>
><br>
>  static void<br>
> @@ -488,8 +574,11 @@ init_fast_clear_state_entry(<wbr>struct anv_cmd_buffer *cmd_buffer,<br>
>      * to return incorrect data. The fast clear data in CCS_D buffers should<br>
>      * be removed because CCS_D isn't enabled all the time.<br>
>      */<br>
> -   set_image_needs_resolve(cmd_<wbr>buffer, image, aspect, level,<br>
> -                           aux_usage == ISL_AUX_USAGE_NONE);<br>
> +   if (aux_usage == ISL_AUX_USAGE_NONE) {<br>
> +      set_image_needs_resolve_bits(<wbr>cmd_buffer, image, aspect, level, ~0);<br>
> +   } else {<br>
> +      clear_image_needs_resolve_<wbr>bits(cmd_buffer, image, aspect, level, ~0);<br>
> +   }<br>
><br>
>     /* The fast clear value dword(s) will be copied into a surface state object.<br>
>      * Ensure that the restrictions of the fields in the dword(s) are followed.<br>
> @@ -812,12 +901,25 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,<br>
>           layer_count = MIN2(layer_count, anv_image_aux_layers(image, aspect, level));<br>
>        }<br>
><br>
> -      load_needs_resolve_predicate(<wbr>cmd_buffer, image, aspect, level);<br>
> +      enum anv_image_resolve_bits resolve_bits;<br>
> +      switch (resolve_op) {<br>
> +      case ISL_AUX_OP_FULL_RESOLVE:<br>
> +         resolve_bits = ANV_IMAGE_NEEDS_FULL_RESOLVE_<wbr>BITS;<br>
> +         break;<br>
> +      case ISL_AUX_OP_PARTIAL_RESOLVE:<br>
> +         resolve_bits = ANV_IMAGE_NEEDS_PARTIAL_<wbr>RESOLVE_BITS;<br>
> +         break;<br>
> +      default:<br>
> +         unreachable("Invalid resolve op");<br>
> +      }<br>
> +      load_needs_resolve_predicate(<wbr>cmd_buffer, image, aspect, level,<br>
> +                                   resolve_bits);<br>
><br>
>        anv_image_ccs_op(cmd_buffer, image, aspect, level,<br>
>                         base_layer, layer_count, resolve_op, true);<br>
><br>
> -      set_image_needs_resolve(cmd_<wbr>buffer, image, aspect, level, false);<br>
> +      clear_image_needs_resolve_<wbr>bits(cmd_buffer, image, aspect,<br>
> +                                     level, resolve_bits);<br>
>     }<br>
><br>
>     cmd_buffer->state.pending_<wbr>pipe_bits |=<br>
> @@ -2992,15 +3094,15 @@ cmd_buffer_subpass_sync_fast_<wbr>clear_values(struct anv_cmd_buffer *cmd_buffer)<br>
>               * will match what's in every RENDER_SURFACE_STATE object when it's<br>
>               * being used for sampling.<br>
>               */<br>
> -            set_image_needs_resolve(cmd_<wbr>buffer, iview->image,<br>
> -                                    VK_IMAGE_ASPECT_COLOR_BIT,<br>
> -                                    iview->planes[0].isl.base_<wbr>level,<br>
> -                                    false);<br>
> +            clear_image_needs_resolve_<wbr>bits(cmd_buffer, iview->image,<br>
> +                                           VK_IMAGE_ASPECT_COLOR_BIT,<br>
> +                                           iview->planes[0].isl.base_<wbr>level,<br>
> +                                           ANV_IMAGE_HAS_FAST_CLEAR_BIT);<br>
>           } else {<br>
> -            set_image_needs_resolve(cmd_<wbr>buffer, iview->image,<br>
> -                                    VK_IMAGE_ASPECT_COLOR_BIT,<br>
> -                                    iview->planes[0].isl.base_<wbr>level,<br>
> -                                    true);<br>
> +            set_image_needs_resolve_bits(<wbr>cmd_buffer, iview->image,<br>
> +                                         VK_IMAGE_ASPECT_COLOR_BIT,<br>
> +                                         iview->planes[0].isl.base_<wbr>level,<br>
> +                                         ANV_IMAGE_HAS_FAST_CLEAR_BIT);<br>
>           }<br>
>        } else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {<br>
>           /* The attachment may have been fast-cleared in a previous render<br>
> --<br>
> 2.5.0.400.gff86faf<br>
><br>
</div></div><div class="HOEnZb"><div class="h5">> ______________________________<wbr>_________________<br>
> mesa-dev mailing list<br>
> <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
</div></div></blockquote></div><br></div></div>