[Mesa-dev] [PATCH 16/22] anv/cmd_buffer: Add transition_color_buffer()

Wed May 10 22:29:17 UTC 2017

On Wed, May 10, 2017 at 02:00:19PM -0700, Jason Ekstrand wrote:
> On Wed, May 10, 2017 at 1:35 PM, Nanley Chery <nanleychery at gmail.com> wrote:
> 
> > On Wed, May 10, 2017 at 12:38:50PM -0700, Jason Ekstrand wrote:
> > > On Wed, May 10, 2017 at 11:30 AM, Nanley Chery <nanleychery at gmail.com>
> > > wrote:
> > >
> > > > On Wed, May 03, 2017 at 02:24:19PM -0700, Jason Ekstrand wrote:
> > > > > On Thu, Apr 27, 2017 at 11:32 AM, Nanley Chery <
> > nanleychery at gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Signed-off-by: Nanley Chery <nanley.g.chery at intel.com>
> > > > > > ---
> > > > > >  src/intel/vulkan/genX_cmd_buffer.c | 93
> > > > ++++++++++++++++++++++++++++++
> > > > > > ++++++++
> > > > > >  1 file changed, 93 insertions(+)
> > > > > >
> > > > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> > > > > > b/src/intel/vulkan/genX_cmd_buffer.c
> > > > > > index d5cc358aec..1ae0c3256e 100644
> > > > > > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > > > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > > > > > @@ -454,6 +454,99 @@ genX(transfer_clear_value)(struct
> > anv_cmd_buffer
> > > > *
> > > > > > const cmd_buffer,
> > > > > >     }
> > > > > >  }
> > > > > >
> > > > > > +/* Transitions a color buffer from one layout to another. */
> > > > > > +static void
> > > > > > +transition_color_buffer(struct anv_cmd_buffer * const cmd_buffer,
> > > > > > +                        const struct anv_image * const image,
> > > > > > +                        const uint32_t base_level, uint32_t
> > > > level_count,
> > > > > > +                        const uint32_t base_layer, uint32_t
> > > > layer_count,
> > > > > > +                        const VkImageLayout initial_layout,
> > > > > > +                        const VkImageLayout final_layout)
> > > > > > +{
> > > > > > +   assert(cmd_buffer && image);
> > > > > > +
> > > > > > +   /* This must be a color image. */
> > > > > > +   assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
> > > > > > +
> > > > > > +   /* Only color buffers with CCS need resolving.  */
> > > > > > +   if (image->aux_surface.isl.size == 0 || image->samples > 1)
> > > > > > +      return;
> > > > > > +
> > > > > > +   /* Don't transition this subresource range if it lacks
> > auxiliary
> > > > data.
> > > > > > */
> > > > > > +   if (base_level >= anv_color_aux_levels(image) ||
> > > > > > +       base_layer >= anv_color_aux_layers(image, base_level))
> > > > > > +      return;
> > > > > > +
> > > > > > +   /* The undefined layout indicates that the user doesn't care
> > about
> > > > the
> > > > > > +    * data that's currently in the buffer. The pre-initialized
> > layout
> > > > is
> > > > > > +    * equivalent to the undefined layout for optimally-tiled
> > images.
> > > > > > +    *
> > > > > > +    * We can only skip the resolve for CCS_E images in this layout
> > > > > > because it
> > > > > > +    * is enabled outside of render passes. This allows previously
> > > > > > fast-cleared
> > > > > > +    * and undefined buffers to be defined with transfer
> > operations.
> > > > > > +    */
> > > > > > +   const bool is_ccs_e = image->aux_usage == ISL_AUX_USAGE_CCS_E;
> > > > > > +   const bool undef_layout = initial_layout ==
> > > > VK_IMAGE_LAYOUT_UNDEFINED
> > > > > > ||
> > > > > > +                             initial_layout == VK_IMAGE_LAYOUT_
> > > > > > PREINITIALIZED;
> > > > > > +   if (is_ccs_e && undef_layout)
> > > > > > +      return;
> > > > > > +
> > > > > > +   /* A resolve isn't necessary when transitioning from a layout
> > that
> > > > > > doesn't
> > > > > > +    * have fast-clear data or to a layout that will be aware of
> > the
> > > > > > fast-clear
> > > > > > +    * value.
> > > > > > +    */
> > > > > > +   const bool maybe_fast_cleared = undef_layout || initial_layout
> > ==
> > > > > > +                                   VK_IMAGE_LAYOUT_COLOR_
> > > > > > ATTACHMENT_OPTIMAL;
> > > > > > +   if (!maybe_fast_cleared || final_layout ==
> > > > > > +       VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL)
> > > > > > +      return;
> > > > > > +
> > > > > > +   /* Determine the optimal resolve operation. For now we only
> > need to
> > > > > > resolve
> > > > > > +    * the clear color.
> > > > > > +    */
> > > > > > +   const enum blorp_fast_clear_op op = is_ccs_e ?
> > > > > > +
> >  BLORP_FAST_CLEAR_OP_RESOLVE_
> > > > PARTIAL
> > > > > > :
> > > > > > +
> >  BLORP_FAST_CLEAR_OP_RESOLVE_
> > > > FULL;
> > > > > > +
> > > > > > +   /* The actual range that will be transitioned is limited by the
> > > > number
> > > > > > of
> > > > > > +    * subresources that have auxiliary data.
> > > > > > +    */
> > > > > > +   level_count = MIN2(level_count, anv_color_aux_levels(image));
> > > > > > +
> > > > > > +   /* A write cache flush with an end-of-pipe sync is required
> > between
> > > > > > +    * rendering, clearing, and resolving operations. Perform a
> > flush
> > > > of
> > > > > > the
> > > > > > +    * write cache before and after the resolve operation to meet
> > this
> > > > > > +    * requirement.
> > > > > > +    */
> > > > > >
> > > > >
> > > > > anv_blorp.c has a more detailed comment about this.  I think I'd
> > rather
> > > > we
> > > > > copy it here so that it doesn't get lost when we delete
> > > > > ccs_resolve_attachments.
> > > > >
> > > > > Also, I have no idea whether we need to do it per-fast-clear-op or
> > once
> > > > > before we do fast-clear ops and once afterwards.  I think this is
> > > > probably
> > > > > fine.
> > > > >
> > > > >
> > > >
> > > > Contrary to the comment in anv_blorp, I found documentation that
> > > > provides us more information as to what's needed. How about I rewrite
> > my
> > > > comment to this instead?
> > > >
> > > >    From the Sky Lake PRM Vol. 7, "MCS Buffer for Render Target(s)":
> > > >
> > > >       Any transition from any value in {Clear, Render, Resolve} to a
> > > >       different value in {Clear, Render, Resolve} requires end of pipe
> > > >       synchronization.
> > > >
> > > >    Perform a flush of the write cache before and after the resolve
> > > >    operation to meet this requirement.
> > > >
> > >
> > > That PRM citation is fine and maybe better than the one I used (though
> > they
> > > say about the same thing).  The part I was hoping to preserve is
> > >
> > >     * This comment is a bit cryptic and doesn't really tell you what's
> > going
> > >     * or what's really needed.  It appears that fast clear ops are not
> > >     * properly synchronized with other drawing.  We need to use a
> > > PIPE_CONTROL
> > >     * to ensure that the contents of the previous draw hit the render
> > target
> > >     * before we resolve and then use a second PIPE_CONTROL after the
> > resolve
> > >     * to ensure that it is completed before any additional drawing
> > occurs.
> > >
> > > Yes, the PRM text tells you that an end of pipe sync is needed, but it
> > > doesn't tell you where.  It's empirical evidence from hunting down bugs
> > > that tells us we need it on both sides.
> > >
> >
> > How does the PRM text not tell you where the end of pipe sync is needed?
> > It states that we must perform the sync between any value in {Clear,
> > Render, Resolve} to a different value in {Clear, Render, Resolve}. This
> > means that we're left with two options:
> >
> > 1. Track every clear, render, resolve operation and perform the sync
> >    in between as described (should be easy in GL).
> > 2. Track none of the operations and perform the sync around 2 of the
> >    three operations (only feasible option in Vulkan).
> >
> 
> Maybe it does contain all of the data.  However, even reading the PRM bit
> in context, it's still seems cryptic.  What does it mean by "different
> value"?  Is it talking about surface values or values in the bits in
> 3DSTATE_PS?

The comment is using set notation (curly braces) to describe a set of
all operations that can occur on a CCS buffer. So by "different value in
{[...]}", it's referring to an element in the set consisting of {Clear,
Render, Resolve}.

> How about:
> 
> 
>    From the Sky Lake PRM Vol. 7, "MCS Buffer for Render Target(s)":
> 
>       Any transition from any value in {Clear, Render, Resolve} to a
>       different value in {Clear, Render, Resolve} requires end of pipe
>       synchronization.
> 
>    In other words, fast clear ops are not properly synchronized with other
>    drawing.  We need to use a PIPE_CONTROL to ensure that the contents of
>    the previous draw hit the render target before we resolve and again
> afterwards
>    to ensure that the resolve is complete before we do any more regular
> drawing.
> 
> In general, I'd rather the comment describe what's going on in the hardware
> and why we need to do the flush rather than just "the PRM says so"
> 

I think your hypothesis on the hardware's behaviour follows naturally
from the specified requirement. I'd like to change the wording because
it's seemingly claiming to be a fact and I don't know if this is
actually what's going on in the hardware. What do you think of this:

    From the Sky Lake PRM Vol. 7, "MCS Buffer for Render Target(s)":

       Any transition from any value in {Clear, Render, Resolve} to a
       different value in {Clear, Render, Resolve} requires end of pipe
       synchronization.

    We perform a flush of the write cache before and after the clear and
    resolve operations to meet this requirement.

    Unlike other drawing, it seems that fast clear operations are not
    properly synchronized. The first PIPE_CONTROL here likely ensures
    that the contents of the previous render or clear hit the render
    target before we resolve and the second likely ensures that the
    resolve is complete before we do any more rendering or clearing.

Changelog:
 * Acknowledge that this is a proposed explanation of hardware behavior.
 * Acknowledge that fast clear ops don't seem to be properly
   synchronized with themselves (clears and resolves).
 * Use "clear" and/or "render" instead of "draw" where appropriate.