[Mesa-dev] [PATCH v3 20/24] anv/cmd_buffer: Rework aux tracking

Jason Ekstrand jason at jlekstrand.net
Wed Feb 7 19:27:38 UTC 2018


On Wed, Feb 7, 2018 at 9:59 AM, Nanley Chery <nanleychery at gmail.com> wrote:

> On Tue, Feb 06, 2018 at 05:57:26PM -0800, Jason Ekstrand wrote:
> > On Tue, Feb 6, 2018 at 5:09 PM, Nanley Chery <nanleychery at gmail.com>
> wrote:
> >
> > > On Mon, Feb 05, 2018 at 06:16:26PM -0800, Jason Ekstrand wrote:
> > > > This commit completely reworks aux tracking.  This includes a number of
> > > > somewhat distinct changes:
> > > >
> > > >  1) Since we are no longer fast-clearing multiple slices, we only need
> > > >     to track one fast clear color and one fast clear type.
> > > >
> > > >  2) We store two bits for fast clear instead of one to let us
> > > >     distinguish between zero and non-zero fast clear colors.  This is
> > > >     needed so that we can do full resolves when transitioning to
> > > >     PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear
> > > >     values in all sorts of places we wouldn't normally.
> > > >
> > > >  3) We now track compression state as a boolean separate from fast clear
> > > >     type and this is tracked at a per-slice granularity.
> > > >
> > > > The previous scheme had some issues when it came to individual slices of
> > > > a multi-LOD image.  In particular, we only tracked "needs resolve"
> > > > per-LOD, but you could do a vkCmdPipelineBarrier that would only resolve
> > > > a portion of the image and would set "needs resolve" to false anyway.
> > > > Also, any transition from an undefined layout would reset the clear
> > > > color for the entire LOD regardless of whether or not there was some
> > > > clear color on some other slice.
> > > >
> > > > As far as full/partial resolves go, the assumptions of the previous
> > > > scheme held because the one case where we do need a full resolve when
> > > > CCS_E is enabled is for window-system images.  Since we only ever
> > > > allowed X-tiled window-system images, CCS was entirely disabled on gen9+
> > > > and we never got CCS_E.  With the advent of Y-tiled window-system
> > > > buffers, we now need to properly support doing a full resolve of images
> > > > marked CCS_E.
> > > >
> > > > v2 (Jason Ekstrand):
> > > >  - Fix a bug in the compressed flag offset calculation
> > > >  - Treat 3D images as multi-slice for the purposes of resolve tracking
> > > >
> > > > Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
> > > > ---
> > > >  src/intel/vulkan/anv_blorp.c       |   3 +-
> > > >  src/intel/vulkan/anv_image.c       | 100 ++++++-----
> > > >  src/intel/vulkan/anv_private.h     |  60 ++++---
> > > >  src/intel/vulkan/genX_cmd_buffer.c | 340 +++++++++++++++++++++++++++----------
> > > >  4 files changed, 345 insertions(+), 158 deletions(-)
> > > >
> > > > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > > > index 497ae6f..fc3b717 100644
> > > > --- a/src/intel/vulkan/anv_blorp.c
> > > > +++ b/src/intel/vulkan/anv_blorp.c
> > > > @@ -1758,8 +1758,7 @@ anv_image_ccs_op(struct anv_cmd_buffer *cmd_buffer,
> > > >         * particular value and don't care about format or clear value.
> > > >         */
> > > >        const struct anv_address clear_color_addr =
> > > > -         anv_image_get_clear_color_addr(cmd_buffer->device, image,
> > > > -                                        aspect, level);
> > > > +         anv_image_get_clear_color_addr(cmd_buffer->device, image, aspect);
> > > >        surf.clear_color_addr = anv_to_blorp_address(clear_color_addr);
> > > >     }
> > > >
> > > > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> > > > index 11942d0..011e952 100644
> > > > --- a/src/intel/vulkan/anv_image.c
> > > > +++ b/src/intel/vulkan/anv_image.c
> > > > @@ -190,46 +190,54 @@ all_formats_ccs_e_compatible(const struct gen_device_info *devinfo,
> > > >   * fast-clear values in non-trivial cases (e.g., outside of a render pass in
> > > >   * which a fast clear has occurred).
> > > >   *
> > > > - * For the purpose of discoverability, the algorithm used to manage this buffer
> > > > - * is described here. A clear value in this buffer is updated when a fast clear
> > > > - * is performed on a subresource. One of two synchronization operations is
> > > > - * performed in order for a following memory access to use the fast-clear
> > > > - * value:
> > > > - *    a. Copy the value from the buffer to the surface state object used for
> > > > - *       reading. This is done implicitly when the value is the clear value
> > > > - *       predetermined to be the default in other surface state objects. This
> > > > - *       is currently only done explicitly for the operation below.
> > > > - *    b. Do (a) and use the surface state object to resolve the subresource.
> > > > - *       This is only done during layout transitions for decent performance.
> > > > + * In order to avoid having multiple clear colors for a single plane of an
> > > > + * image (hence a single RENDER_SURFACE_STATE), we only allow fast-clears on
> > > > + * the first slice (level 0, layer 0).  At the time of our testing (Jan 17,
> > > > + * 2018), there were no known applications which would benefit from fast-
> > > > + * clearing more than just the first slice.
> > > >   *
> > > > - * With the above scheme, we can fast-clear whenever the hardware allows except
> > > > - * for two cases in which synchronization becomes impossible or undesirable:
> > > > - *    * The subresource is in the GENERAL layout and is cleared to a value
> > > > - *      other than the special default value.
> > > > + * The fast clear portion of the image is laid out in the following order:
> > > >   *
> > > > - *      Performing a synchronization operation in order to read from the
> > > > - *      subresource is undesirable in this case. Firstly, b) is not an option
> > > > - *      because a layout transition isn't required between a write and read of
> > > > - *      an image in the GENERAL layout. Secondly, it's undesirable to do a)
> > > > - *      explicitly because it would require large infrastructural changes. The
> > > > - *      Vulkan API supports us in deciding not to optimize this layout by
> > > > - *      stating that using this layout may cause suboptimal performance. NOTE:
> > > > - *      the auxiliary buffer must always be enabled to support a) implicitly.
> > > > + *  * 1 or 4 dwords (depending on hardware generation) for the clear color
> > > > + *  * 1 dword for the anv_fast_clear_type of the clear color
> > > > + *  * On gen9+, 1 dword per level and layer of the image (3D levels count
> > > > + *    multiple layers) in level-major order for compression state.
> > > >   *
> > > > + * For the purpose of discoverability, the algorithm used to manage
> > > > + * compression and fast-clears is described here:
> > > >   *
> > > > - *    * For the given miplevel, only some of the layers are cleared at once.
> > > > + *  * On a transition from UNDEFINED or PREINITIALIZED to a defined layout,
> > > > + *    all of the values in the fast clear portion of the image are initialized
> > > > + *    to default values.
> > > >   *
> > > > - *      If the user clears each layer to a different value, then tries to
> > > > - *      render to multiple layers at once, we have no ability to perform a
> > > > - *      synchronization operation in between. a) is not helpful because the
> > > > - *      object can only hold one clear value. b) is not an option because a
> > > > - *      layout transition isn't required in this case.
> > > > + *  * On fast-clear, the clear value is written into surface state and also
> > > > + *    into the buffer and the fast clear type is set appropriately.  Both
> > > > + *    setting the fast-clear value in the buffer and setting the fast-clear
> > > > + *    type happen from the GPU using MI commands.
> > >
> > > There's no mention about setting the compression dwords during
> > > fast-clear operations or on render target writes.
> > >
> >
> > I've added another bullet:
> >
> >  *  * Whenever a render or blorp operation is performed with CCS_E, we call
> >  *    anv_cmd_buffer_mark_image_written to set the compression state to 1.
> >
> >
>
> That'll work. I think we should replace
> anv_cmd_buffer_mark_image_written with
> genX(cmd_buffer_mark_image_written) though, since the former's a wrapper
> that only gets called for blorp operations.
>

Done.
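
For anyone following along, the call site ends up looking roughly like the
sketch below at the end of a draw or blorp write that targets a CCS_E (or
MCS) surface.  The argument order is inferred from the hunks further down,
so treat it as an illustration rather than the final signature:

   /* Sketch: mark every slice written by this operation as compressed so a
    * later layout transition knows a resolve may be needed.
    */
   genX(cmd_buffer_mark_image_written)(cmd_buffer, image,
                                       VK_IMAGE_ASPECT_COLOR_BIT,
                                       ISL_AUX_USAGE_CCS_E,
                                       level, base_layer, layer_count);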


> > > > + *
> > > > + *  * On pipeline barrier transitions, the worst-case transition is computed
> > > > + *    from the image layouts.  The command streamer inspects the fast clear
> > > > + *    type and compression state dwords and constructs a predicate.  The
> > > > + *    worst-case resolve is performed with the given predicate and the fast
> > > > + *    clear and compression state is set accordingly.
> > > > + *
> > > > + * See anv_layout_to_aux_usage and anv_layout_to_fast_clear_type functions for
> > > > + * details on exactly what is allowed in what layouts.
> > > > + *
> > > > + * On gen7-9, we do not have a concept of indirect clear colors in hardware.
> > > > + * In order to deal with this, we have to do some clear color management.
> > > > + *
> > > > + *  * For LOAD_OP_LOAD at the top of a renderpass, we have to copy the clear
> > > > + *    value from the buffer into the surface state with MI commands.
> > > > + *
> > > > + *  * For any blorp operations, we pass the address to the clear value into
> > > > + *    blorp and it knows to copy the clear color.
> > > >   */
> > > >  static void
> > > > -add_fast_clear_state_buffer(struct anv_image *image,
> > > > -                            VkImageAspectFlagBits aspect,
> > > > -                            uint32_t plane,
> > > > -                            const struct anv_device *device)
> > > > +add_aux_state_tracking_buffer(struct anv_image *image,
> > > > +                              VkImageAspectFlagBits aspect,
> > > > +                              uint32_t plane,
> > > > +                              const struct anv_device *device)
> > >
> > > The comment referencing this function in genX_cmd_buffer.c needs to be
> > > updated.
> > >
> >
> > I don't see that comment anywhere.
> >
> >
>
> Maybe you delete it in a future series? This is the one I have in mind:
>
> /* We only allow fast clears in the GENERAL layout if the auxiliary
>  * buffer is always enabled and the fast-clear value is all 0's. See
>  * add_fast_clear_state_buffer() for more information.
>  */
>

Right.  Somehow my grepping missed it.  Fixed.


> > > >  {
> > > >     assert(image && device);
> > > >     assert(image->planes[plane].aux_surface.isl.size > 0 &&
> > > > @@ -251,20 +259,24 @@ add_fast_clear_state_buffer(struct anv_image
> > > *image,
> > > >               (image->planes[plane].offset +
> image->planes[plane].size));
> > > >     }
> > > >
> > > > -   const unsigned entry_size = anv_fast_clear_state_entry_
> size(device);
> > > > -   /* There's no padding between entries, so ensure that they're
> always
> > > a
> > > > -    * multiple of 32 bits in order to enable GPU memcpy operations.
> > > > -    */
> > > > -   assert(entry_size % 4 == 0);
> > > > +   /* Clear color and fast clear type */
> > > > +   unsigned state_size = device->isl_dev.ss.clear_value_size + 4;
> > > >
> > > > -   const unsigned plane_state_size =
> > > > -      entry_size * anv_image_aux_levels(image, aspect);
> > > > +   /* We only need to track compression on CCS_E surfaces. */
> > > > +   if (image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E) {
> > > > +      if (image->type == VK_IMAGE_TYPE_3D) {
> > > > +         for (uint32_t l = 0; l < image->levels; l++)
> > > > +            state_size += anv_minify(image->extent.depth, l) * 4;
> > > > +      } else {
> > > > +         state_size += image->levels * image->array_size * 4;
> > > > +      }
> > > > +   }
> > > >
> > > >     image->planes[plane].fast_clear_state_offset =
> > > >        image->planes[plane].offset + image->planes[plane].size;
> > > >
> > > > -   image->planes[plane].size += plane_state_size;
> > > > -   image->size += plane_state_size;
> > > > +   image->planes[plane].size += state_size;
> > > > +   image->size += state_size;
> > > >  }
> > > >
> > > >  /**
> > > > @@ -437,7 +449,7 @@ make_surface(const struct anv_device *dev,
> > > >              }
> > > >
> > > >              add_surface(image, &image->planes[plane].aux_surface,
> > > plane);
> > > > -            add_fast_clear_state_buffer(image, aspect, plane, dev);
> > > > +            add_aux_state_tracking_buffer(image, aspect, plane,
> dev);
> > > >
> > > >              /* For images created without MUTABLE_FORMAT_BIT set, we
> > > know that
> > > >               * they will always be used with the original format.
> In
> > > > @@ -461,7 +473,7 @@ make_surface(const struct anv_device *dev,
> > > >                                   &image->planes[plane].aux_
> > > surface.isl);
> > > >        if (ok) {
> > > >           add_surface(image, &image->planes[plane].aux_surface,
> plane);
> > > > -         add_fast_clear_state_buffer(image, aspect, plane, dev);
> > > > +         add_aux_state_tracking_buffer(image, aspect, plane, dev);
> > > >           image->planes[plane].aux_usage = ISL_AUX_USAGE_MCS;
> > > >        }
> > > >     }
> > > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> > > private.h
> > > > index 5f82702..d38dd9e 100644
> > > > --- a/src/intel/vulkan/anv_private.h
> > > > +++ b/src/intel/vulkan/anv_private.h
> > > > @@ -2533,50 +2533,58 @@ anv_image_aux_layers(const struct anv_image *
> > > const image,
> > > >     }
> > > >  }
> > > >
> > > > -static inline unsigned
> > > > -anv_fast_clear_state_entry_size(const struct anv_device *device)
> > > > -{
> > > > -   assert(device);
> > > > -   /* Entry contents:
> > > > -    *   +--------------------------------------------+
> > > > -    *   | clear value dword(s) | needs resolve dword |
> > > > -    *   +--------------------------------------------+
> > > > -    */
> > > > -
> > > > -   /* Ensure that the needs resolve dword is in fact dword-aligned
> to
> > > enable
> > > > -    * GPU memcpy operations.
> > > > -    */
> > > > -   assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> > > > -   return device->isl_dev.ss.clear_value_size + 4;
> > > > -}
> > > > -
> > > >  static inline struct anv_address
> > > >  anv_image_get_clear_color_addr(const struct anv_device *device,
> > > >                                 const struct anv_image *image,
> > > > -                               VkImageAspectFlagBits aspect,
> > > > -                               unsigned level)
> > > > +                               VkImageAspectFlagBits aspect)
> > > >  {
> > > > +   assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> > > > +
> > > >     uint32_t plane = anv_image_aspect_to_plane(image->aspects,
> aspect);
> > > >     return (struct anv_address) {
> > > >        .bo = image->planes[plane].bo,
> > > >        .offset = image->planes[plane].bo_offset +
> > > > -                image->planes[plane].fast_clear_state_offset +
> > > > -                anv_fast_clear_state_entry_size(device) * level,
> > > > +                image->planes[plane].fast_clear_state_offset,
> > > >     };
> > > >  }
> > > >
> > > >  static inline struct anv_address
> > > > -anv_image_get_needs_resolve_addr(const struct anv_device *device,
> > > > -                                 const struct anv_image *image,
> > > > -                                 VkImageAspectFlagBits aspect,
> > > > -                                 unsigned level)
> > > > +anv_image_get_fast_clear_type_addr(const struct anv_device *device,
> > > > +                                   const struct anv_image *image,
> > > > +                                   VkImageAspectFlagBits aspect)
> > > >  {
> > > >     struct anv_address addr =
> > > > -      anv_image_get_clear_color_addr(device, image, aspect, level);
> > > > +      anv_image_get_clear_color_addr(device, image, aspect);
> > > >     addr.offset += device->isl_dev.ss.clear_value_size;
> > > >     return addr;
> > > >  }
> > > >
> > > > +static inline struct anv_address
> > > > +anv_image_get_compression_state_addr(const struct anv_device
> *device,
> > > > +                                     const struct anv_image *image,
> > > > +                                     VkImageAspectFlagBits aspect,
> > > > +                                     uint32_t level, uint32_t
> > > array_layer)
> > > > +{
> > > > +   assert(level < anv_image_aux_levels(image, aspect));
> > > > +   assert(array_layer < anv_image_aux_layers(image, aspect, level));
> > > > +   UNUSED uint32_t plane = anv_image_aspect_to_plane(
> image->aspects,
> > > aspect);
> > > > +   assert(image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E);
> > > > +
> > > > +   struct anv_address addr =
> > > > +      anv_image_get_fast_clear_type_addr(device, image, aspect);
> > > > +   addr.offset += 4; /* Go past the fast clear type */
> > > > +
> > > > +   if (image->type == VK_IMAGE_TYPE_3D) {
> > > > +      for (uint32_t l = 0; l < level; l++)
> > > > +         addr.offset += anv_minify(image->extent.depth, l) * 4;
> > > > +   } else {
> > > > +      addr.offset += level * image->array_size * 4;
> > > > +   }
> > > > +   addr.offset += array_layer * 4;
> > > > +
> > > > +   return addr;
> > > > +}
> > > > +
> > > >  /* Returns true if a HiZ-enabled depth buffer can be sampled from.
> */
> > > >  static inline bool
> > > >  anv_can_sample_with_hiz(const struct gen_device_info * const
> devinfo,
> > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > > b/src/intel/vulkan/genX_cmd_buffer.c
> > > > index e1a4d95..4c75e0c 100644
> > > > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > > > @@ -407,27 +407,45 @@ transition_depth_buffer(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > >  #define MI_PREDICATE_SRC0  0x2400
> > > >  #define MI_PREDICATE_SRC1  0x2408
> > > >
> > > > -/* Manages the state of an color image subresource to ensure
> resolves
> > > are
> > > > - * performed properly.
> > > > - */
> > > >  static void
> > > > -genX(set_image_needs_resolve)(struct anv_cmd_buffer *cmd_buffer,
> > > > -                        const struct anv_image *image,
> > > > -                        VkImageAspectFlagBits aspect,
> > > > -                        unsigned level, bool needs_resolve)
> > > > +set_image_fast_clear_state(struct anv_cmd_buffer *cmd_buffer,
> > > > +                           const struct anv_image *image,
> > > > +                           VkImageAspectFlagBits aspect,
> > > > +                           enum anv_fast_clear_type fast_clear)
> > > >  {
> > > > -   assert(cmd_buffer && image);
> > > > -   assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> > > > -   assert(level < anv_image_aux_levels(image, aspect));
> > > > -
> > > > -   /* The HW docs say that there is no way to guarantee the
> completion
> > > of
> > > > -    * the following command. We use it nevertheless because it
> shows no
> > > > -    * issues in testing is currently being used in the GL driver.
> > > > -    */
> > > >     anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM),
> sdi) {
> > > > -      sdi.Address = anv_image_get_needs_resolve_
> > > addr(cmd_buffer->device,
> > > > -                                                     image, aspect,
> > > level);
> > > > -      sdi.ImmediateData = needs_resolve;
> > > > +      sdi.Address = anv_image_get_fast_clear_type_
> > > addr(cmd_buffer->device,
> > > > +                                                       image,
> aspect);
> > > > +      sdi.ImmediateData = fast_clear;
> > > > +   }
> > > > +}
> > > > +
> > > > +static void
> > > > +set_image_compressed_bit(struct anv_cmd_buffer *cmd_buffer,
> > > > +                         const struct anv_image *image,
> > > > +                         VkImageAspectFlagBits aspect,
> > > > +                         uint32_t level,
> > > > +                         uint32_t base_layer, uint32_t layer_count,
> > > > +                         bool compressed)
> > > > +{
> > > > +   /* We only have CCS_E on gen9+ */
> > > > +   if (GEN_GEN < 9)
> > > > +      return;
> > > > +
> > >
> > > I'm also fine with dropping this if statement.
> > >
> >
> > Ok, we'll drop it.
> >
> >
> > > > +   uint32_t plane = anv_image_aspect_to_plane(image->aspects,
> aspect);
> > > > +
> > > > +   /* We only have compression tracking for CCS_E */
> > > > +   if (image->planes[plane].aux_usage != ISL_AUX_USAGE_CCS_E)
> > > > +      return;
> > > > +
> > > > +   for (uint32_t a = 0; a < layer_count; a++) {
> > > > +      uint32_t layer = base_layer + a;
> > > > +      anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM),
> sdi)
> > > {
> > > > +         sdi.Address = anv_image_get_compression_
> > > state_addr(cmd_buffer->device,
> > > > +                                                            image,
> > > aspect,
> > > > +                                                            level,
> > > layer);
> > > > +         sdi.ImmediateData = compressed ? UINT32_MAX : 0;
> > > > +      }
> > > >     }
> > > >  }
> > > >
> > > > @@ -451,32 +469,176 @@ mi_alu(uint32_t opcode, uint32_t operand1,
> > > uint32_t operand2)
> > > >  #define CS_GPR(n) (0x2600 + (n) * 8)
> > > >
> > > >  static void
> > > > -genX(load_needs_resolve_predicate)(struct anv_cmd_buffer
> *cmd_buffer,
> > > > -                                   const struct anv_image *image,
> > > > -                                   VkImageAspectFlagBits aspect,
> > > > -                                   unsigned level)
> > > > +anv_cmd_predicated_ccs_resolve(struct anv_cmd_buffer *cmd_buffer,
> > > > +                               const struct anv_image *image,
> > > > +                               VkImageAspectFlagBits aspect,
> > > > +                               uint32_t level, uint32_t array_layer,
> > > > +                               enum isl_aux_op resolve_op,
> > > > +                               enum anv_fast_clear_type
> > > fast_clear_supported)
> > > >  {
> > > > -   assert(cmd_buffer && image);
> > > > -   assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> > > > -   assert(level < anv_image_aux_levels(image, aspect));
> > > > +   struct anv_address fast_clear_type_addr =
> > > > +      anv_image_get_fast_clear_type_addr(cmd_buffer->device, image,
> > > aspect);
> > > > +
> > > > +#if GEN_GEN >= 9
> > > > +   const uint32_t plane = anv_image_aspect_to_plane(image->aspects,
> > > aspect);
> > > > +   const bool decompress =
> > > > +      resolve_op == ISL_AUX_OP_FULL_RESOLVE &&
> > > > +      image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E;
> > > > +
> > > > +   /* This function shouldn't get called if it isn't going to do
> > > anything */
> > > > +   assert(decompress || fast_clear_supported < ANV_FAST_CLEAR_ANY);
> > > > +
> > > > +   if (level == 0 && array_layer == 0) {
> > > > +      /* This is the complex case because we have to worry about
> > > dealing with
> > > > +       * the fast clear color.  Unfortunately, it's also the common
> > > case.
> > > > +       */
> > > > +
> > > > +      /* Poor-man's register allocation */
> > > > +      int next_reg = MI_ALU_REG0;
> > > > +      int pred_reg = -1;
> > > > +
> > > > +      /* Needed for ALU operations */
> > > > +      uint32_t *dw;
> > > > +
> > > > +      const int image_fc = next_reg++;
> > > > +      anv_batch_emit(&cmd_buffer->batch,
> GENX(MI_LOAD_REGISTER_MEM),
> > > lrm) {
> > > > +         lrm.RegisterAddress  = CS_GPR(image_fc);
> > > > +         lrm.MemoryAddress    = fast_clear_type_addr;
> > > > +      }
> > > > +      emit_lri(&cmd_buffer->batch, CS_GPR(image_fc) + 4, 0);
> > > > +
> > > > +      if (fast_clear_supported < ANV_FAST_CLEAR_ANY) {
> > > > +         /* We need to compute (fast_clear_supported <
> > > image->fast_clear).
> > > > +          * We do this by subtracting and storing the carry bit.
> > > > +          */
> > > > +         const int fc_imm = next_reg++;
> > > > +         emit_lri(&cmd_buffer->batch, CS_GPR(fc_imm),
> > > fast_clear_supported);
> > > > +         emit_lri(&cmd_buffer->batch, CS_GPR(fc_imm) + 4, 0);
> > > > +
> > > > +         assert(pred_reg == -1);
> > > > +         pred_reg = next_reg++;
> > > > +
> > > > +         dw = anv_batch_emitn(&cmd_buffer->batch, 5,
> GENX(MI_MATH));
> > > > +         dw[1] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCA, fc_imm);
> > > > +         dw[2] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCB, image_fc);
> > > > +         dw[3] = mi_alu(MI_ALU_SUB, 0, 0);
> > > > +         dw[4] = mi_alu(MI_ALU_STORE, pred_reg, MI_ALU_CF);
> > > > +      }
> > > > +
> > > > +      if (decompress) {
> > > > +         /* If we're doing a full resolve, we need the compression
> > > state */
> > > > +         struct anv_address compression_state_addr =
> > > > +            anv_image_get_compression_
> state_addr(cmd_buffer->device,
> > > image,
> > > > +                                                 aspect, level,
> > > array_layer);
> > > > +         if (pred_reg == -1) {
> > > > +            pred_reg = next_reg++;
> > > > +            anv_batch_emit(&cmd_buffer->batch,
> > > GENX(MI_LOAD_REGISTER_MEM), lrm) {
> > > > +               lrm.RegisterAddress  = CS_GPR(pred_reg);
> > > > +               lrm.MemoryAddress    = compression_state_addr;
> > > > +            }
> > >
> >
> > In this case, we don't have a fast-clear predicate.  This is because "if
> > (fast_clear_supported < ANV_FAST_CLEAR_ANY)" failed so we didn't compute
> > one.  Instead, we treat the fast-clear predicate as being 0, and ORing
> > with 0 just gives the original value.
> >
> >
> >
> > > > +         } else {
> > > > +            /* OR the compression state into the predicate.  The
> > > compression
> > > > +             * state is already in 0/~0 form.
> > > > +             */
> > >
> > > The OR is a result of this function's structure and wouldn't be needed
> > > if the control flow in this function was different right?
> > >
> >
> > In this case, we have both a fast-clear predicate and a compression
> > predicate.  We have to OR them together.
> >
> > I'm not sure what you mean about control-flow in this function being
> > different.  Yes, I'm probably over-optimizing a bit to avoid unneeded CS
> > operations.
> >
> >
>
> By that, I'm referring to the organization of the if's and else's in
> this function - though that still isn't very exact/clear. Sorry for the
> confusion.
>
> It seems like the OR is one more CS operation than needed. It's always
> the case that if we have some fast-clear data, the compression state
> is set as well.
>

I'm not sure if that's always true or not.  I think it is in practice since
we only fast clear at the top of render passes and we always call
mark_image_written at the top of the render pass too.  However, if the
first use of a fast-cleared image is an input attachment, this may not be
the case.  Also, if we ever try to do something clever like you suggested
with PS depth count, this won't be the case.  In GL, it's easy because all
you have to do is a fast-clear and some rendering as sRGB and you get into
the PARTIAL_CLEAR state.  In any case, I don't know that we want them to be
coupled.


> Here's how I understand the states could look:
>  FC | COMP
> ==========
>  0  | 0
>  0  | 1
>  1  | 1
>  2  | 1
>
> If that's the case we only have to check the compression_state when a
> decompress is called for. If we moved the "if (decompress) {}" block of
> the function higher we wouldn't need to OR the compression predicate
> with the fast clear predicate.
>

Ugh... This is subtle...  If the above table is accurate (which, as I said
above, I'm not convinced it is), then I think your statement there is
correct.

I'm a bit torn.  On the one hand, this code is complicated and we could
simplify it if we're willing to make the assumption above.  On the other
hand, there are a bunch of subtle interactions here and I don't like subtle.


> > > > +            const int image_comp = next_reg++;
> > > > +            anv_batch_emit(&cmd_buffer->batch,
> > > GENX(MI_LOAD_REGISTER_MEM), lrm) {
> > > > +               lrm.RegisterAddress  = CS_GPR(image_comp);
> > > > +               lrm.MemoryAddress    = compression_state_addr;
> > > > +            }
> > > > +
> > > > +            dw = anv_batch_emitn(&cmd_buffer->batch, 5,
> GENX(MI_MATH));
> > > > +            dw[1] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCA, pred_reg);
> > > > +            dw[2] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCB, image_comp);
> > > > +            dw[3] = mi_alu(MI_ALU_OR, 0, 0);
> > > > +            dw[4] = mi_alu(MI_ALU_STORE, pred_reg, MI_ALU_ACCU);
> > > > +         }
> > > > +
> > > > +         anv_batch_emit(&cmd_buffer->batch,
> GENX(MI_STORE_DATA_IMM),
> > > sdi) {
> > > > +            sdi.Address = compression_state_addr;
> > > > +            sdi.ImmediateData = 0;
> > > > +         }
> > > > +      }
> > > >
> > > > -   const struct anv_address resolve_flag_addr =
> > > > -      anv_image_get_needs_resolve_addr(cmd_buffer->device,
> > > > -                                       image, aspect, level);
> > > > +      /* Store the predicate */
> > > > +      assert(pred_reg != -1);
> > > > +      emit_lrr(&cmd_buffer->batch, MI_PREDICATE_SRC0,
> CS_GPR(pred_reg));
> > > >
> > > > -   /* Make the pending predicated resolve a no-op if one is not
> needed.
> > > > -    * predicate = do_resolve = resolve_flag != 0;
> > > > +      /* If the predicate is true, we want to write 0 to the fast
> clear
> > > type
> > > > +       * and, if it's false, leave it alone.  We can do this by
> writing
> > > > +       *
> > > > +       * clear_type = clear_type & ~predicate;
> > > > +       */
> > > > +      dw = anv_batch_emitn(&cmd_buffer->batch, 5, GENX(MI_MATH));
> > > > +      dw[1] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCA, image_fc);
> > > > +      dw[2] = mi_alu(MI_ALU_LOADINV, MI_ALU_SRCB, pred_reg);
> > > > +      dw[3] = mi_alu(MI_ALU_AND, 0, 0);
> > > > +      dw[4] = mi_alu(MI_ALU_STORE, image_fc, MI_ALU_ACCU);
> > > > +
> > > > +      anv_batch_emit(&cmd_buffer->batch,
> GENX(MI_STORE_REGISTER_MEM),
> > > srm) {
> > > > +         srm.RegisterAddress  = CS_GPR(image_fc);
> > > > +         srm.MemoryAddress    = fast_clear_type_addr;
> > > > +      }
> > > > +   } else if (decompress) {
> > > > +      /* We're trying to get rid of compression but we don't care
> about
> > > fast
> > > > +       * clears so all we need is the compression predicate.
> > > > +       */
> > > > +      assert(resolve_op == ISL_AUX_OP_FULL_RESOLVE);
> > > > +      struct anv_address compression_state_addr =
> > > > +         anv_image_get_compression_state_addr(cmd_buffer->device,
> > > image,
> > > > +                                              aspect, level,
> > > array_layer);
> > > > +      anv_batch_emit(&cmd_buffer->batch,
> GENX(MI_LOAD_REGISTER_MEM),
> > > lrm) {
> > > > +         lrm.RegisterAddress  = MI_PREDICATE_SRC0;
> > > > +         lrm.MemoryAddress    = compression_state_addr;
> > > > +      }
> > > > +      anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM),
> sdi)
> > > {
> > > > +         sdi.Address = compression_state_addr;
> > > > +         sdi.ImmediateData = 0;
> > > > +      }
>

If we want them decoupled, we probably want to do "compressed |= predicate"
here.
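
In MI terms, that would be something like the sketch below, reusing the
MI_MATH idiom from earlier in the function.  The register choice and exactly
where this would live are guesses, not part of the patch:

   /* Sketch of "compressed |= predicate": load the compression dword, OR the
    * 0/~0 predicate into it, and write it back instead of unconditionally
    * storing 0.  Assumes pred_reg, next_reg, dw, and compression_state_addr
    * are in scope as in the level == 0 path.
    */
   const int image_comp = next_reg++;
   anv_batch_emit(&cmd_buffer->batch, GENX(MI_LOAD_REGISTER_MEM), lrm) {
      lrm.RegisterAddress  = CS_GPR(image_comp);
      lrm.MemoryAddress    = compression_state_addr;
   }

   dw = anv_batch_emitn(&cmd_buffer->batch, 5, GENX(MI_MATH));
   dw[1] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCA, image_comp);
   dw[2] = mi_alu(MI_ALU_LOAD, MI_ALU_SRCB, pred_reg);
   dw[3] = mi_alu(MI_ALU_OR, 0, 0);
   dw[4] = mi_alu(MI_ALU_STORE, image_comp, MI_ALU_ACCU);

   anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM), srm) {
      srm.RegisterAddress  = CS_GPR(image_comp);
      srm.MemoryAddress    = compression_state_addr;
   }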


> > > > +   } else {
> > > > +      /* In this case, we're trying to do a partial resolve on a
> slice
> > > that
> > > > +       * doesn't have clear color.  There's nothing to do.
> > > > +       */
> > > > +      return;
> > > > +   }
> > > > +
> > > > +#else /* GEN_GEN <= 8 */
> > > > +   assert(resolve_op == ISL_AUX_OP_FULL_RESOLVE);
> > > > +   assert(fast_clear_supported != ANV_FAST_CLEAR_ANY);
> > > > +
> > > > +   /* We don't support fast clears on anything other than the first
> > > slice. */
> > > > +   if (level > 0 || array_layer > 0)
> > > > +      return;
> > > > +
> > > > +   /* On gen8, we don't have a concept of default clear colors
> because
> > > we
> > > > +    * can't sample from CCS surfaces.  It's enough to just load the
> > > fast clear
> > > > +    * state into the predicate register.
> > > >      */
> > > > +   anv_batch_emit(&cmd_buffer->batch, GENX(MI_LOAD_REGISTER_MEM),
> lrm)
> > > {
> > > > +      lrm.RegisterAddress  = MI_PREDICATE_SRC0;
> > > > +      lrm.MemoryAddress    = fast_clear_type_addr;
> > > > +   }
> > > > +   anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM),
> sdi) {
> > > > +      sdi.Address          = fast_clear_type_addr;
> > > > +      sdi.ImmediateData    = 0;
> > > > +   }
> > > > +#endif
> > > > +
> > > > +   /* We use the first half of src0 for the actual predicate.  Set
> the
> > > second
> > > > +    * half of src0 and all of src1 to 0 as the predicate operation
> will
> > > be
> > > > +    * doing an implicit src0 != src1.
> > > > +    */
> > > > +   emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4, 0);
> > > >     emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1    , 0);
> > > >     emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC1 + 4, 0);
> > > > -   emit_lri(&cmd_buffer->batch, MI_PREDICATE_SRC0    , 0);
> > > > -   emit_lrm(&cmd_buffer->batch, MI_PREDICATE_SRC0 + 4,
> > > > -            resolve_flag_addr.bo, resolve_flag_addr.offset);
> > > > +
> > > >     anv_batch_emit(&cmd_buffer->batch, GENX(MI_PREDICATE), mip) {
> > > >        mip.LoadOperation    = LOAD_LOADINV;
> > > >        mip.CombineOperation = COMBINE_SET;
> > > >        mip.CompareOperation = COMPARE_SRCS_EQUAL;
> > > >     }
> > > > +
> > > > +   anv_image_ccs_op(cmd_buffer, image, aspect, level,
> > > > +                    array_layer, 1, resolve_op, true);
> > > >  }
> > > >
> > > >  void
> > > > @@ -490,17 +652,30 @@ genX(cmd_buffer_mark_image_written)(struct
> > > anv_cmd_buffer *cmd_buffer,
> > > >  {
> > > >     /* The aspect must be exactly one of the image aspects. */
> > > >     assert(_mesa_bitcount(aspect) == 1 && (aspect & image->aspects));
> > > > +
> > > > +   /* The only compression types with more than just fast-clears are
> > > MCS,
> > > > +    * CCS_E, and HiZ.  With HiZ we just trust the layout and don't
> > > actually
> > > > +    * track the current fast-clear and compression state.  This
> leaves
> > > us
> > > > +    * with just MCS and CCS_E.
> > > > +    */
> > > > +   if (aux_usage != ISL_AUX_USAGE_CCS_E &&
> > > > +       aux_usage != ISL_AUX_USAGE_MCS)
> > > > +      return;
> > > > +
> > > > +   set_image_compressed_bit(cmd_buffer, image, aspect,
> > > > +                            level, base_layer, layer_count, true);
> > > >  }
> > > >
> > > >  static void
> > > > -init_fast_clear_state_entry(struct anv_cmd_buffer *cmd_buffer,
> > > > -                            const struct anv_image *image,
> > > > -                            VkImageAspectFlagBits aspect,
> > > > -                            unsigned level)
> > > > +init_fast_clear_color(struct anv_cmd_buffer *cmd_buffer,
> > > > +                      const struct anv_image *image,
> > > > +                      VkImageAspectFlagBits aspect)
> > > >  {
> > > >     assert(cmd_buffer && image);
> > > >     assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> > > > -   assert(level < anv_image_aux_levels(image, aspect));
> > > > +
> > > > +   set_image_fast_clear_state(cmd_buffer, image, aspect,
> > > > +                              ANV_FAST_CLEAR_NONE);
> > > >
> > > >     uint32_t plane = anv_image_aspect_to_plane(image->aspects,
> aspect);
> > > >     enum isl_aux_usage aux_usage = image->planes[plane].aux_usage;
> > > > @@ -517,7 +692,7 @@ init_fast_clear_state_entry(struct
> anv_cmd_buffer
> > > *cmd_buffer,
> > > >      * values in the clear value dword(s).
> > > >      */
> > > >     struct anv_address addr =
> > > > -      anv_image_get_clear_color_addr(cmd_buffer->device, image,
> > > aspect, level);
> > > > +      anv_image_get_clear_color_addr(cmd_buffer->device, image,
> > > aspect);
> > > >     unsigned i = 0;
> > > >     for (; i < cmd_buffer->device->isl_dev.ss.clear_value_size; i
> += 4)
> > > {
> > > >        anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM),
> sdi)
> > > {
> > > > @@ -558,19 +733,17 @@ genX(copy_fast_clear_dwords)(struct
> > > anv_cmd_buffer *cmd_buffer,
> > > >                               struct anv_state surface_state,
> > > >                               const struct anv_image *image,
> > > >                               VkImageAspectFlagBits aspect,
> > > > -                             unsigned level,
> > > >                               bool copy_from_surface_state)
> > > >  {
> > > >     assert(cmd_buffer && image);
> > > >     assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> > > > -   assert(level < anv_image_aux_levels(image, aspect));
> > > >
> > > >     struct anv_bo *ss_bo =
> > > >        &cmd_buffer->device->surface_state_pool.block_pool.bo;
> > > >     uint32_t ss_clear_offset = surface_state.offset +
> > > >        cmd_buffer->device->isl_dev.ss.clear_value_offset;
> > > >     const struct anv_address entry_addr =
> > > > -      anv_image_get_clear_color_addr(cmd_buffer->device, image,
> > > aspect, level);
> > > > +      anv_image_get_clear_color_addr(cmd_buffer->device, image,
> > > aspect);
> > > >     unsigned copy_size = cmd_buffer->device->isl_dev.
> > > ss.clear_value_size;
> > > >
> > > >     if (copy_from_surface_state) {
> > > > @@ -660,18 +833,6 @@ transition_color_buffer(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > >     if (base_layer >= anv_image_aux_layers(image, aspect,
> base_level))
> > > >        return;
> > > >
> > > > -   /* A transition of a 3D subresource works on all slices at a
> time. */
> > > > -   if (image->type == VK_IMAGE_TYPE_3D) {
> > > > -      base_layer = 0;
> > > > -      layer_count = anv_minify(image->extent.depth, base_level);
> > > > -   }
> > > > -
> > > > -   /* We're interested in the subresource range subset that has aux
> > > data. */
> > > > -   level_count = MIN2(level_count, anv_image_aux_levels(image,
> aspect)
> > > - base_level);
> > > > -   layer_count = MIN2(layer_count,
> > > > -                      anv_image_aux_layers(image, aspect,
> base_level) -
> > > base_layer);
> > > > -   last_level_num = base_level + level_count;
> > > > -
> > > >     assert(image->tiling == VK_IMAGE_TILING_OPTIMAL);
> > > >
> > > >     if (initial_layout == VK_IMAGE_LAYOUT_UNDEFINED ||
> > > > @@ -684,8 +845,8 @@ transition_color_buffer(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > >         *
> > > >         * Initialize the relevant clear buffer entries.
> > > >         */
> > > > -      for (unsigned level = base_level; level < last_level_num;
> level++)
> > > > -         init_fast_clear_state_entry(cmd_buffer, image, aspect,
> level);
> > > > +      if (base_level == 0 && base_layer == 0)
> > > > +         init_fast_clear_color(cmd_buffer, image, aspect);
> > > >
> > > >        /* Initialize the aux buffers to enable correct rendering.  In
> > > order to
> > > >         * ensure that things such as storage images work correctly,
> aux
> > > buffers
> > > > @@ -723,13 +884,18 @@ transition_color_buffer(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > >        if (image->samples == 1) {
> > > >           for (uint32_t l = 0; l < level_count; l++) {
> > > >              const uint32_t level = base_level + l;
> > > > -            const uint32_t level_layer_count =
> > > > +            uint32_t level_layer_count =
> > > >                 MIN2(layer_count, anv_image_aux_layers(image, aspect,
> > > level));
> > > > +
> > > >              anv_image_ccs_op(cmd_buffer, image, aspect, level,
> > > >                               base_layer, level_layer_count,
> > > >                               ISL_AUX_OP_AMBIGUATE, false);
> > >
> > > On gen8, we're doing extra ambiguates when the level is greater than 0.
> > >
> >
> > Only if we never render to it with CCS_D enabled on miplevels > 0.  I'm not
> > sure if that's guaranteed by the current code.
> >
>
> You're right. Could you put a perf_warn for rendering with CCS_D enabled
> on miplevels > 0 for gen8 in the patch,
> "anv: Only fast clear single-slice images"?
>

We already warn for everything that isn't just single-slice rendering to
the first slice.  We could make a special warning for gen8+ but I don't
really see a reason.

--Jason


> -Nanley
>
> > --Jason
> >
> >
> > > -Nanley
> > >
> > > > -            genX(set_image_needs_resolve)(cmd_buffer, image,
> > > > -                                          aspect, level, false);
> > > > +
> > > > +            if (image->planes[plane].aux_usage ==
> ISL_AUX_USAGE_CCS_E)
> > > {
> > > > +               set_image_compressed_bit(cmd_buffer, image, aspect,
> > > > +                                        level, base_layer,
> > > level_layer_count,
> > > > +                                        false);
> > > > +            }
> > > >           }
> > > >        } else {
> > > >           if (image->samples == 4 || image->samples == 16) {
> > > > @@ -812,19 +978,17 @@ transition_color_buffer(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > >     cmd_buffer->state.pending_pipe_bits |=
> > > >        ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT |
> ANV_PIPE_CS_STALL_BIT;
> > > >
> > > > -   for (uint32_t level = base_level; level < last_level_num;
> level++) {
> > > > +   for (uint32_t l = 0; l < level_count; l++) {
> > > > +      uint32_t level = base_level + l;
> > > > +      uint32_t level_layer_count =
> > > > +         MIN2(layer_count, anv_image_aux_layers(image, aspect,
> level));
> > > >
> > > > -      /* The number of layers changes at each 3D miplevel. */
> > > > -      if (image->type == VK_IMAGE_TYPE_3D) {
> > > > -         layer_count = MIN2(layer_count, anv_image_aux_layers(image,
> > > aspect, level));
> > > > +      for (uint32_t a = 0; a < level_layer_count; a++) {
> > > > +         uint32_t array_layer = base_layer + a;
> > > > +         anv_cmd_predicated_ccs_resolve(cmd_buffer, image, aspect,
> > > > +                                        level, array_layer,
> resolve_op,
> > > > +                                        final_fast_clear);
> > > >        }
> > > > -
> > > > -      genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect,
> > > level);
> > > > -
> > > > -      anv_image_ccs_op(cmd_buffer, image, aspect, level,
> > > > -                       base_layer, layer_count, resolve_op, true);
> > > > -
> > > > -      genX(set_image_needs_resolve)(cmd_buffer, image, aspect,
> level,
> > > false);
> > > >     }
> > > >
> > > >     cmd_buffer->state.pending_pipe_bits |=
> > > > @@ -1488,12 +1652,20 @@ void genX(CmdPipelineBarrier)(
> > > >              anv_image_expand_aspects(image, range->aspectMask);
> > > >           uint32_t aspect_bit;
> > > >
> > > > +         uint32_t base_layer, layer_count;
> > > > +         if (image->type == VK_IMAGE_TYPE_3D) {
> > > > +            base_layer = 0;
> > > > +            layer_count = anv_minify(image->extent.depth,
> > > range->baseMipLevel);
> > > > +         } else {
> > > > +            base_layer = range->baseArrayLayer;
> > > > +            layer_count = anv_get_layerCount(image, range);
> > > > +         }
> > > > +
> > > >           anv_foreach_image_aspect_bit(aspect_bit, image,
> > > color_aspects) {
> > > >              transition_color_buffer(cmd_buffer, image, 1UL <<
> > > aspect_bit,
> > > >                                      range->baseMipLevel,
> > > >                                      anv_get_levelCount(image,
> range),
> > > > -                                    range->baseArrayLayer,
> > > > -                                    anv_get_layerCount(image,
> range),
> > > > +                                    base_layer, layer_count,
> > > >                                      pImageMemoryBarriers[i].
> oldLayout,
> > > >                                      pImageMemoryBarriers[i].
> newLayout);
> > > >           }
> > > > @@ -3203,28 +3375,26 @@ cmd_buffer_subpass_sync_fast_
> clear_values(struct
> > > anv_cmd_buffer *cmd_buffer)
> > > >           genX(copy_fast_clear_dwords)(cmd_buffer,
> > > att_state->color.state,
> > > >                                        iview->image,
> > > >                                        VK_IMAGE_ASPECT_COLOR_BIT,
> > > > -                                      iview->planes[0].isl.base_
> level,
> > > >                                        true /* copy from ss */);
> > > >
> > > >           /* Fast-clears impact whether or not a resolve will be
> > > necessary. */
> > > > -         if (iview->image->planes[0].aux_usage ==
> ISL_AUX_USAGE_CCS_E
> > > &&
> > > > -             att_state->clear_color_is_zero) {
> > > > +         if (att_state->clear_color_is_zero) {
> > > >              /* This image always has the auxiliary buffer enabled.
> We
> > > can mark
> > > >               * the subresource as not needing a resolve because the
> > > clear color
> > > >               * will match what's in every RENDER_SURFACE_STATE
> object
> > > when it's
> > > >               * being used for sampling.
> > > >               */
> > > > -            genX(set_image_needs_resolve)(cmd_buffer, iview->image,
> > > > -                                          VK_IMAGE_ASPECT_COLOR_BIT,
> > > > -                                          iview->planes[0].isl.base_
> > > level,
> > > > -                                          false);
> > > > +            set_image_fast_clear_state(cmd_buffer, iview->image,
> > > > +                                       VK_IMAGE_ASPECT_COLOR_BIT,
> > > > +
>  ANV_FAST_CLEAR_DEFAULT_VALUE);
> > > >           } else {
> > > > -            genX(set_image_needs_resolve)(cmd_buffer, iview->image,
> > > > -                                          VK_IMAGE_ASPECT_COLOR_BIT,
> > > > -                                          iview->planes[0].isl.base_
> > > level,
> > > > -                                          true);
> > > > +            set_image_fast_clear_state(cmd_buffer, iview->image,
> > > > +                                       VK_IMAGE_ASPECT_COLOR_BIT,
> > > > +                                       ANV_FAST_CLEAR_ANY);
> > > >           }
> > > > -      } else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD) {
> > > > +      } else if (rp_att->load_op == VK_ATTACHMENT_LOAD_OP_LOAD &&
> > > > +                 iview->planes[0].isl.base_level == 0 &&
> > > > +                 iview->planes[0].isl.base_array_layer == 0) {
> > > >           /* The attachment may have been fast-cleared in a previous
> > > render
> > > >            * pass and the value is needed now. Update the surface
> > > state(s).
> > > >            *
> > > > @@ -3233,7 +3403,6 @@ cmd_buffer_subpass_sync_fast_
> clear_values(struct
> > > anv_cmd_buffer *cmd_buffer)
> > > >           genX(copy_fast_clear_dwords)(cmd_buffer,
> > > att_state->color.state,
> > > >                                        iview->image,
> > > >                                        VK_IMAGE_ASPECT_COLOR_BIT,
> > > > -                                      iview->planes[0].isl.base_
> level,
> > > >                                        false /* copy to ss */);
> > > >
> > > >           if (need_input_attachment_state(rp_att) &&
> > > > @@ -3241,7 +3410,6 @@ cmd_buffer_subpass_sync_fast_
> clear_values(struct
> > > anv_cmd_buffer *cmd_buffer)
> > > >              genX(copy_fast_clear_dwords)(cmd_buffer,
> > > att_state->input.state,
> > > >                                           iview->image,
> > > >                                           VK_IMAGE_ASPECT_COLOR_BIT,
> > > > -                                         iview->planes[0].isl.base_
> > > level,
> > > >                                           false /* copy to ss */);
> > > >           }
> > > >        }
> > > > --
> > > > 2.5.0.400.gff86faf
> > > >