Mesa (main): isl,docs: Add a chapter on AUX state tracking

GitLab Mirror gitlab-mirror at kemper.freedesktop.org
Thu Jun 24 14:21:03 UTC 2021


Module: Mesa
Branch: main
Commit: b8030ab1ea91cf4ba9ebac80d8d5aebd41f3bf80
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b8030ab1ea91cf4ba9ebac80d8d5aebd41f3bf80

Author: Jason Ekstrand <jason at jlekstrand.net>
Date:   Fri Jun 18 16:34:44 2021 -0500

isl,docs: Add a chapter on AUX state tracking

We also update and improve the docs in isl.h which get pulled into this
new chapter.

Acked-by: Luis Strano <luis.strano at intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>

---

 docs/isl/aux-surf-comp.rst |  89 +++++++++
 docs/isl/index.rst         |   1 +
 src/intel/isl/isl.h        | 473 +++++++++++++++++++++++++++++----------------
 3 files changed, 394 insertions(+), 169 deletions(-)

diff --git a/docs/isl/aux-surf-comp.rst b/docs/isl/aux-surf-comp.rst
new file mode 100644
index 00000000000..6a101b01142
--- /dev/null
+++ b/docs/isl/aux-surf-comp.rst
@@ -0,0 +1,89 @@
+Auxiliary surface compression
+=============================
+
+Most lossless image compression on Intel hardware, be that CCS, MCS, or HiZ,
+works by way of some chunk of auxiliary data (often a surface) which is used
+together with the main surface to provide compression.  Even though this means
+more memory is allocated, the scheme allows us to reduce our over-all memory
+bandwidth since the auxiliary data is much smaller than the main surface.
+
+The simplest example of this is single-sample fast clears
+(:cpp:enumerator:`isl_aux_usage::ISL_AUX_USAGE_CCS_D`) on Ivy Bridge through
+Broadwell and later.  For this scheme, the auxiliary surface stores a single
+bit for each cache-line-pair in the main surface.  If that bit is set, then the
+entire cache line pair contains only the clear color as provided in the
+``RENDER_SURFACE_STATE`` for the image.  If the bit is unset, then it's not
+clear and you should look at the main surface.  Since a cache line is 64B, this
+yields a scale-down factor of 1:1024.
+
+Even the simple fast-clear scheme saves us bandwidth in two places.  The first
+is when we go to clear the surface.  If we're doing a full-surface clear or
+clearing to the same color that was used to clear before, we don't have to
+touch the main surface at all.  All we have to do is record the clear color and
+smash the aux data to ``0xff``.  The hardware then knows to ignore whatever is
+in the main surface and look at the clear color instead.  The second is when we
+go to render.  Say we're doing some color blending.  Instead of the blend unit
+having to read back actual surface contents to blend with, it looks at the
+clear bit and blends with the clear color recorded with the surface state
+instead.  Depending on the geometry and cache utilization, this can save as
+much as one whole read of the surface worth of bandwidth.
+
+The difficulty with a scheme like this comes when we want to do something else
+with that surface.  What happens if the sampler doesn't support this fast-clear
+scheme (it doesn't on IVB)?  In that case, we have to do a *resolve* where we
+run a special pipeline that reads the auxiliary data and applies it to the main
+surface.  In the case of fast clears, this means that, for every 1 bit in the
+auxiliary surface, the corresponding pair of cache lines in the main surface
+gets filled with the clear color.  At the end of the resolve operation, the
+main surface contents are the actual contents of the surface.
+
+Types of surface compression
+----------------------------
+
+Intel hardware has several different compression schemes that all work along
+similar lines:
+
+.. doxygenenum:: isl_aux_usage
+.. doxygenfunction:: isl_aux_usage_has_fast_clears
+.. doxygenfunction:: isl_aux_usage_has_compression
+.. doxygenfunction:: isl_aux_usage_has_hiz
+.. doxygenfunction:: isl_aux_usage_has_mcs
+.. doxygenfunction:: isl_aux_usage_has_ccs
+
+Creating auxiliary surfaces
+---------------------------
+
+Each type of data compression requires some type of auxiliary data on the side.
+For most, this involves a second auxiliary surface.  ISL provides helpers for
+creating each of these types of surfaces:
+
+.. doxygenfunction:: isl_surf_get_hiz_surf
+.. doxygenfunction:: isl_surf_get_mcs_surf
+.. doxygenfunction:: isl_surf_supports_ccs
+.. doxygenfunction:: isl_surf_get_ccs_surf
+
+Compression state tracking
+--------------------------
+
+All of the Intel auxiliary surface compression schemes share a common concept
+of a main surface which may or may not contain correct up-to-date data and some
+auxiliary data which says how to interpret it.  The main surface is divided
+into blocks of some fixed size and some smaller block in the auxiliary data
+controls how that main surface block is to be interpreted.  We then have to do
+resolves depending on the different HW units which need to interact with a
+given surface.
+
+To help drivers keep track of what all is going on and when resolves need to be
+inserted, ISL provides a finite state machine which tracks the current state of
+the main surface and auxiliary data and their relationship to each other.  The
+states are encoded with the :cpp:enum:`isl_aux_state` enum.  ISL also provides
+helper functions for operating the state machine and determining what aux op
+(if any) is required to get to the right state for a given operation.
+
+.. doxygenenum:: isl_aux_state
+.. doxygenfunction:: isl_aux_state_has_valid_primary
+.. doxygenfunction:: isl_aux_state_has_valid_aux
+.. doxygenenum:: isl_aux_op
+.. doxygenfunction:: isl_aux_prepare_access
+.. doxygenfunction:: isl_aux_state_transition_aux_op
+.. doxygenfunction:: isl_aux_state_transition_write
diff --git a/docs/isl/index.rst b/docs/isl/index.rst
index 8ebf7ddbac8..f2f619f9926 100644
--- a/docs/isl/index.rst
+++ b/docs/isl/index.rst
@@ -12,6 +12,7 @@ Chery.
    units
    formats
    tiling
+   aux-surf-comp
    ccs
    hiz
 
diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 49a6c90744b..d393846700d 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -694,82 +694,155 @@ enum isl_dim_layout {
    ISL_DIM_LAYOUT_GFX9_1D,
 };
 
+/**
+ * Enumerates the different forms of auxiliary surface compression
+ */
 enum isl_aux_usage {
    /** No Auxiliary surface is used */
    ISL_AUX_USAGE_NONE,
 
-   /** The primary surface is a depth surface and the auxiliary surface is HiZ */
+   /** Hierarchical depth compression
+    *
+    * First introduced on Iron Lake, this compression scheme compresses depth
+    * surfaces by storing alternate forms of the depth value in a HiZ surface.
+    * Possible (not all) compressed forms include:
+    *
+    *  - An uncompressed "look at the main surface" value
+    *
+    *  - A special value indicating that the main surface data should be
+    *    ignored and considered to contain the clear value.
+    *
+    *  - The depth for the entire main-surface block as a plane equation
+    *
+    *  - The minimum/maximum depth for the main-surface block
+    *
+    * This second one isn't helpful for getting exact depth values but can
+    * still substantially accelerate depth testing if the specified range is
+    * sufficiently small.
+    */
    ISL_AUX_USAGE_HIZ,
 
-   /** The auxiliary surface is an MCS
+   /** Multisampled color compression
+    *
+    * Introduced on Ivy Bridge, this compression scheme compresses
+    * multisampled color surfaces by storing a mapping from samples to planes
+    * in the MCS surface, allowing for de-duplication of identical samples.
+    * The MCS value of all 1's is reserved to indicate that the pixel contains
+    * the clear color. Exact details about the data stored in the MCS and how
+    * it maps samples to slices is documented in the PRMs.
     *
     * @invariant isl_surf::samples > 1
     */
    ISL_AUX_USAGE_MCS,
 
-   /** The auxiliary surface is a fast-clear-only compression surface
+   /** Single-sampled fast-clear-only color compression
     *
+    * Introduced on Ivy Bridge, this compression scheme compresses
+    * single-sampled color surfaces by storing a bit for each cache line pair
+    * in the main surface in the CCS which indicates that the corresponding
+    * pair of cache lines in the main surface only contains the clear color.
+    * On Skylake, this is increased to two bits per cache line pair with 0x0
+    * meaning resolved and 0x3 meaning clear.
+    *
+    * @invariant The surface is a color surface
     * @invariant isl_surf::samples == 1
     */
    ISL_AUX_USAGE_CCS_D,
 
-   /** The auxiliary surface provides full lossless color compression
+   /** Single-sample lossless color compression
+    *
+    * Introduced on Skylake, this compression scheme compresses single-sampled
+    * color surfaces by storing a 2-bit value for each cache line pair in the
+    * main surface which says how the corresponding pair of cache lines in the
+    * main surface are to be interpreted.  Valid CCS values include:
+    *
+    *  - `0x0`: Indicates that the corresponding pair of cache lines in the
+    *    main surface contain valid color data
+    *
+    *  - `0x1`: Indicates that the corresponding pair of cache lines in the
+    *    main surface contain compressed color data.  Typically, the
+    *    compressed data fits in one of the two cache lines.
+    *
+    *  - `0x3`: Indicates that the corresponding pair of cache lines in the
+    *    main surface should be ignored.  Those cache lines should be
+    *    considered to contain the clear color.
+    *
+    * Starting with Tigerlake, each CCS value is 4 bits per cache line pair in
+    * the main surface.
     *
+    * @invariant The surface is a color surface
     * @invariant isl_surf::samples == 1
     */
    ISL_AUX_USAGE_CCS_E,
 
-   /** The auxiliary surface provides full lossless color compression on
-    *  Gfx12.
+   /** Single-sample lossless color compression on Tigerlake
     *
+    * This is identical to ISL_AUX_USAGE_CCS_E except it also encodes the
+    * Tigerlake quirk about regular render writes possibly fast-clearing
+    * blocks in the surface.
+    *
+    * @invariant The surface is a color surface
     * @invariant isl_surf::samples == 1
     */
    ISL_AUX_USAGE_GFX12_CCS_E,
 
-   /** The auxiliary surface provides full lossless media color compression
+   /** Media color compression
+    *
+    * Used by the media engine on Tigerlake and above.  This compression form
+    * is typically not produced by 3D drivers but they need to be able to
+    * consume it in order to get end-to-end compression when the image comes
+    * from media decode.
     *
+    * @invariant The surface is a color surface
     * @invariant isl_surf::samples == 1
     */
    ISL_AUX_USAGE_MC,
 
-   /** The auxiliary surface is a HiZ surface operating in write-through mode
-    *  and CCS is also enabled
+   /** Combined HiZ+CCS in write-through mode
     *
-    * In this mode, the HiZ and CCS surfaces act as a single fused compression
-    * surface where resolves (but not ambiguates) operate on both surfaces at
-    * the same time.  In this mode, the HiZ surface operates in write-through
-    * mode where it is only used for accelerating depth testing and not for
-    * actual compression.  The CCS-compressed surface contains valid data at
-    * all times.
+    * In this mode, introduced on Tigerlake, the HiZ and CCS surfaces act as a
+    * single fused compression surface where resolves (but not ambiguates)
+    * operate on both surfaces at the same time.  In this mode, the HiZ
+    * surface operates in write-through mode where it is only used for
+    * accelerating depth testing and not for actual compression.  The
+    * CCS-compressed surface contains valid data at all times.
     *
+    * @invariant The surface is a color surface
     * @invariant isl_surf::samples == 1
     */
    ISL_AUX_USAGE_HIZ_CCS_WT,
 
-   /** The auxiliary surface is a HiZ surface with and CCS is also enabled
+   /** Combined HiZ+CCS without write-through
     *
-    * In this mode, the HiZ and CCS surfaces act as a single fused compression
-    * surface where resolves (but not ambiguates) operate on both surfaces at
-    * the same time.  In this mode, full HiZ compression is enabled and the
-    * CCS-compressed main surface may not contain valid data.  The only way to
-    * read the surface outside of the depth hardware is to do a full resolve
-    * which resolves both HiZ and CCS so the surface is in the pass-through
-    * state.
+    * In this mode, introduced on Tigerlake, the HiZ and CCS surfaces act as a
+    * single fused compression surface where resolves (but not ambiguates)
+    * operate on both surfaces at the same time.  In this mode, full HiZ
+    * compression is enabled and the CCS-compressed main surface may not
+    * contain valid data.  The only way to read the surface outside of the
+    * depth hardware is to do a full resolve which resolves both HiZ and CCS
+    * so the surface is in the pass-through state.
+    *
+    * @invariant The surface is a depth surface
     */
    ISL_AUX_USAGE_HIZ_CCS,
 
-   /** The auxiliary surface is an MCS and CCS is also enabled
+   /** Combined MCS+CCS without write-through
     *
-    * In this mode, we have fused MCS+CCS compression where the MCS is used
-    * for fast-clears and "identical samples" compression just like on Gfx7-11
-    * but each plane is then CCS compressed.
+    * In this mode, introduced on Tigerlake, we have fused MCS+CCS compression
+    * where the MCS is used for fast-clears and "identical samples"
+    * compression just like on Gfx7-11 but each plane is then CCS compressed.
     *
+    * @invariant The surface is a depth surface
     * @invariant isl_surf::samples > 1
     */
    ISL_AUX_USAGE_MCS_CCS,
 
-   /** CCS auxiliary data is used to compress a stencil buffer
+   /** Stencil compression
+    *
+    * Introduced on Tigerlake, this is similar to CCS_E only used to compress
+    * stencil surfaces.
     *
+    * @invariant The surface is a stencil surface
     * @invariant isl_surf::samples == 1
     */
    ISL_AUX_USAGE_STC_CCS,
@@ -780,112 +853,58 @@ enum isl_aux_usage {
  *
  * For any given auxiliary surface compression format (HiZ, CCS, or MCS), any
  * given slice (lod + array layer) can be in one of the seven states described
- * by this enum.  Draw and resolve operations may cause the slice to change
- * from one state to another.  The six valid states are:
- *
- *    1) Clear:  In this state, each block in the auxiliary surface contains a
- *       magic value that indicates that the block is in the clear state.  If
- *       a block is in the clear state, it's values in the primary surface are
- *       ignored and the color of the samples in the block is taken either the
- *       RENDER_SURFACE_STATE packet for color or 3DSTATE_CLEAR_PARAMS for
- *       depth.  Since neither the primary surface nor the auxiliary surface
- *       contains the clear value, the surface can be cleared to a different
- *       color by simply changing the clear color without modifying either
- *       surface.
- *
- *    2) Partial Clear:  In this state, each block in the auxiliary surface
- *       contains either the magic clear or pass-through value.  See Clear and
- *       Pass-through for more details.
- *
- *    3) Compressed w/ Clear:  In this state, neither the auxiliary surface
- *       nor the primary surface has a complete representation of the data.
- *       Instead, both surfaces must be used together or else rendering
- *       corruption may occur.  Depending on the auxiliary compression format
- *       and the data, any given block in the primary surface may contain all,
- *       some, or none of the data required to reconstruct the actual sample
- *       values.  Blocks may also be in the clear state (see Clear) and have
- *       their value taken from outside the surface.
- *
- *    4) Compressed w/o Clear:  This state is identical to the state above
- *       except that no blocks are in the clear state.  In this state, all of
- *       the data required to reconstruct the final sample values is contained
- *       in the auxiliary and primary surface and the clear value is not
- *       considered.
- *
- *    5) Resolved:  In this state, the primary surface contains 100% of the
- *       data.  The auxiliary surface is also valid so the surface can be
- *       validly used with or without aux enabled.  The auxiliary surface may,
- *       however, contain non-trivial data and any update to the primary
- *       surface with aux disabled will cause the two to get out of sync.
- *
- *    6) Pass-through:  In this state, the primary surface contains 100% of the
- *       data and every block in the auxiliary surface contains a magic value
- *       which indicates that the auxiliary surface should be ignored and the
- *       only the primary surface should be considered.  Updating the primary
- *       surface without aux works fine and can be done repeatedly in this
- *       mode.  Writing to a surface in pass-through mode with aux enabled may
- *       cause the auxiliary buffer to contain non-trivial data and no longer
- *       be in the pass-through state.
- *
- *    7) Aux Invalid:  In this state, the primary surface contains 100% of the
- *       data and the auxiliary surface is completely bogus.  Any attempt to
- *       use the auxiliary surface is liable to result in rendering
- *       corruption.  The only thing that one can do to re-enable aux once
- *       this state is reached is to use an ambiguate pass to transition into
- *       the pass-through state.
- *
- * Drawing with or without aux enabled may implicitly cause the surface to
- * transition between these states.  There are also four types of auxiliary
- * compression operations which cause an explicit transition which are
- * described by the isl_aux_op enum below.
+ * by this enum. Drawing with or without aux enabled may implicitly cause the
+ * surface to transition between these states.  There are also four types of
+ * auxiliary compression operations which cause an explicit transition which
+ * are described by the isl_aux_op enum below.
  *
  * Not all operations are valid or useful in all states.  The diagram below
  * contains a complete description of the states and all valid and useful
  * transitions except clear.
  *
- *   Draw w/ Aux
- *   +----------+
- *   |          |
- *   |       +-------------+    Draw w/ Aux     +-------------+
- *   +------>| Compressed  |<-------------------|    Clear    |
- *           |  w/ Clear   |----->----+         |             |
- *           +-------------+          |         +-------------+
- *                  |  /|\            |            |   |
- *                  |   |             |            |   |
- *                  |   |             +------<-----+   |  Draw w/
- *                  |   |             |                | Clear Only
- *                  |   |      Full   |                |   +----------+
- *          Partial |   |     Resolve |               \|/  |          |
- *          Resolve |   |             |         +-------------+       |
- *                  |   |             |         |   Partial   |<------+
- *                  |   |             |         |    Clear    |<----------+
- *                  |   |             |         +-------------+           |
- *                  |   |             |                |                  |
- *                  |   |             +------>---------+  Full            |
- *                  |   |                              | Resolve          |
- *   Draw w/ aux    |   |   Partial Fast Clear         |                  |
- *   +----------+   |   +--------------------------+   |                  |
- *   |          |  \|/                             |  \|/                 |
- *   |       +-------------+    Full Resolve    +-------------+           |
- *   +------>| Compressed  |------------------->|  Resolved   |           |
- *           |  w/o Clear  |<-------------------|             |           |
- *           +-------------+    Draw w/ Aux     +-------------+           |
- *                 /|\                             |   |                  |
- *                  |  Draw                        |   |  Draw            |
- *                  | w/ Aux                       |   | w/o Aux          |
- *                  |            Ambiguate         |   |                  |
- *                  |   +--------------------------+   |                  |
- *   Draw w/o Aux   |   |                              |   Draw w/o Aux   |
- *   +----------+   |   |                              |   +----------+   |
- *   |          |   |  \|/                            \|/  |          |   |
- *   |       +-------------+     Ambiguate      +-------------+       |   |
- *   +------>|    Pass-    |<-------------------|     Aux     |<------+   |
- *   +------>|   through   |                    |   Invalid   |           |
- *   |       +-------------+                    +-------------+           |
- *   |          |   |                                                     |
- *   +----------+   +-----------------------------------------------------+
- *     Draw w/                       Partial Fast Clear
- *    Clear Only
+ *     Draw w/ Aux
+ *     +----------+
+ *     |          |
+ *     |       +-------------+    Draw w/ Aux     +-------------+
+ *     +------>| Compressed  |<-------------------|    Clear    |
+ *             |  w/ Clear   |----->----+         |             |
+ *             +-------------+          |         +-------------+
+ *                    |  /|\            |            |   |
+ *                    |   |             |            |   |
+ *                    |   |             +------<-----+   |  Draw w/
+ *                    |   |             |                | Clear Only
+ *                    |   |      Full   |                |   +----------+
+ *            Partial |   |     Resolve |               \|/  |          |
+ *            Resolve |   |             |         +-------------+       |
+ *                    |   |             |         |   Partial   |<------+
+ *                    |   |             |         |    Clear    |<----------+
+ *                    |   |             |         +-------------+           |
+ *                    |   |             |                |                  |
+ *                    |   |             +------>---------+  Full            |
+ *                    |   |                              | Resolve          |
+ *     Draw w/ aux    |   |   Partial Fast Clear         |                  |
+ *     +----------+   |   +--------------------------+   |                  |
+ *     |          |  \|/                             |  \|/                 |
+ *     |       +-------------+    Full Resolve    +-------------+           |
+ *     +------>| Compressed  |------------------->|  Resolved   |           |
+ *             |  w/o Clear  |<-------------------|             |           |
+ *             +-------------+    Draw w/ Aux     +-------------+           |
+ *                   /|\                             |   |                  |
+ *                    |  Draw                        |   |  Draw            |
+ *                    | w/ Aux                       |   | w/o Aux          |
+ *                    |            Ambiguate         |   |                  |
+ *                    |   +--------------------------+   |                  |
+ *     Draw w/o Aux   |   |                              |   Draw w/o Aux   |
+ *     +----------+   |   |                              |   +----------+   |
+ *     |          |   |  \|/                            \|/  |          |   |
+ *     |       +-------------+     Ambiguate      +-------------+       |   |
+ *     +------>|    Pass-    |<-------------------|     Aux     |<------+   |
+ *     +------>|   through   |                    |   Invalid   |           |
+ *     |       +-------------+                    +-------------+           |
+ *     |          |   |                                                     |
+ *     +----------+   +-----------------------------------------------------+
+ *       Draw w/                       Partial Fast Clear
+ *      Clear Only
  *
  *
  * While the above general theory applies to all forms of auxiliary
@@ -893,63 +912,135 @@ enum isl_aux_usage {
  * on all compression types.  However, each of the auxiliary states and
  * operations can be fairly easily mapped onto the above diagram:
  *
- * HiZ:     Hierarchical depth compression is capable of being in any of the
- *          states above.  Hardware provides three HiZ operations: "Depth
- *          Clear", "Depth Resolve", and "HiZ Resolve" which map to "Fast
- *          Clear", "Full Resolve", and "Ambiguate" respectively.  The
- *          hardware provides no HiZ partial resolve operation so the only way
- *          to get into the "Compressed w/o Clear" state is to render with HiZ
- *          when the surface is in the resolved or pass-through states.
+ * **HiZ:** Hierarchical depth compression is capable of being in any of
+ * the states above.  Hardware provides three HiZ operations: "Depth
+ * Clear", "Depth Resolve", and "HiZ Resolve" which map to "Fast Clear",
+ * "Full Resolve", and "Ambiguate" respectively.  The hardware provides no
+ * HiZ partial resolve operation so the only way to get into the
+ * "Compressed w/o Clear" state is to render with HiZ when the surface is
+ * in the resolved or pass-through states.
  *
- * MCS:     Multisample compression is technically capable of being in any of
- *          the states above except that most of them aren't useful.  Both the
- *          render engine and the sampler support MCS compression and, apart
- *          from clear color, MCS is format-unaware so we leave the surface
- *          compressed 100% of the time.  The hardware provides no MCS
- *          operations.
+ * **MCS:** Multisample compression is technically capable of being in any of
+ * the states above except that most of them aren't useful.  Both the render
+ * engine and the sampler support MCS compression and, apart from clear color,
+ * MCS is format-unaware so we leave the surface compressed 100% of the time.
+ * The hardware provides no MCS operations.
  *
- * CCS_D:   Single-sample fast-clears (also called CCS_D in ISL) are one of
- *          the simplest forms of compression since they don't do anything
- *          beyond clear color tracking.  They really only support three of
- *          the six states: Clear, Partial Clear, and Pass-through.  The
- *          only CCS_D operation is "Resolve" which maps to a full resolve
- *          followed by an ambiguate.
+ * **CCS_D:** Single-sample fast-clears (also called CCS_D in ISL) are one of
+ * the simplest forms of compression since they don't do anything beyond clear
+ * color tracking.  They really only support three of the six states: Clear,
+ * Partial Clear, and Pass-through.  The only CCS_D operation is "Resolve"
+ * which maps to a full resolve followed by an ambiguate.
  *
- * CCS_E:   Single-sample render target compression (also called CCS_E in ISL)
- *          is capable of being in almost all of the above states.  THe only
- *          exception is that it does not have separate resolved and pass-
- *          through states.  Instead, the CCS_E full resolve operation does
- *          both a resolve and an ambiguate so it goes directly into the
- *          pass-through state.  CCS_E also provides fast clear and partial
- *          resolve operations which work as described above.
+ * **CCS_E:** Single-sample render target compression (also called CCS_E in
+ * ISL) is capable of being in almost all of the above states.  THe only
+ * exception is that it does not have separate resolved and pass- through
+ * states.  Instead, the CCS_E full resolve operation does both a resolve and
+ * an ambiguate so it goes directly into the pass-through state.  CCS_E also
+ * provides fast clear and partial resolve operations which work as described
+ * above.
  *
- *          While it is technically possible to perform a CCS_E ambiguate, it
- *          is not provided by Sky Lake hardware so we choose to avoid the aux
- *          invalid state.  If the aux invalid state were determined to be
- *          useful, a CCS ambiguate could be done by carefully rendering to
- *          the CCS and filling it with zeros.
+ * @note
+ * The state machine above isn't quite correct for CCS on TGL.  There is a HW
+ * bug (or feature, depending on who you ask) which can cause blocks to enter
+ * the fast-clear state as a side-effect of a regular draw call.  This means
+ * that a draw in the resolved or compressed without clear states takes you to
+ * the compressed with clear state, not the compressed without clear state.
  */
 enum isl_aux_state {
 #ifdef IN_UNIT_TEST
    ISL_AUX_STATE_ASSERT,
 #endif
+   /** Clear
+    *
+    * In this state, each block in the auxiliary surface contains a magic
+    * value that indicates that the block is in the clear state.  If a block
+    * is in the clear state, its values in the primary surface are ignored
+    * and the color of the samples in the block is taken either the
+    * RENDER_SURFACE_STATE packet for color or 3DSTATE_CLEAR_PARAMS for depth.
+    * Since neither the primary surface nor the auxiliary surface contains the
+    * clear value, the surface can be cleared to a different color by simply
+    * changing the clear color without modifying either surface.
+    */
    ISL_AUX_STATE_CLEAR,
+
+   /** Partial Clear
+    *
+    * In this state, each block in the auxiliary surface contains either the
+    * magic clear or pass-through value.  See Clear and Pass-through for more
+    * details.
+    */
    ISL_AUX_STATE_PARTIAL_CLEAR,
+
+   /** Compressed with clear color
+    *
+    * In this state, neither the auxiliary surface nor the primary surface has
+    * a complete representation of the data. Instead, both surfaces must be
+    * used together or else rendering corruption may occur.  Depending on the
+    * auxiliary compression format and the data, any given block in the
+    * primary surface may contain all, some, or none of the data required to
+    * reconstruct the actual sample values.  Blocks may also be in the clear
+    * state (see Clear) and have their value taken from outside the surface.
+    */
    ISL_AUX_STATE_COMPRESSED_CLEAR,
+
+   /** Compressed without clear color
+    *
+    * This state is identical to the state above except that no blocks are in
+    * the clear state.  In this state, all of the data required to reconstruct
+    * the final sample values is contained in the auxiliary and primary
+    * surface and the clear value is not considered.
+    */
    ISL_AUX_STATE_COMPRESSED_NO_CLEAR,
+
+   /** Resolved
+    *
+    * In this state, the primary surface contains 100% of the data.  The
+    * auxiliary surface is also valid so the surface can be validly used with
+    * or without aux enabled.  The auxiliary surface may, however, contain
+    * non-trivial data and any update to the primary surface with aux disabled
+    * will cause the two to get out of sync.
+    */
    ISL_AUX_STATE_RESOLVED,
+
+   /** Pass-through
+    *
+    * In this state, the primary surface contains 100% of the data and every
+    * block in the auxiliary surface contains a magic value which indicates
+    * that the auxiliary surface should be ignored and only the primary
+    * surface should be considered.  In this mode, the primary surface can
+    * safely be written with ISL_AUX_USAGE_NONE or by something that ignores
+    * compression such as the blit/copy engine or a CPU map and it will stay
+    * in the pass-through state.  Writing to a surface in pass-through mode
+    * with aux enabled may cause the auxiliary to be updated to contain
+    * non-trivial data and it will no longer be in the pass-through state.
+    * Likely, it will end up compressed, with or without clear color.
+    */
    ISL_AUX_STATE_PASS_THROUGH,
+
+   /** Aux Invalid
+    *
+    * In this state, the primary surface contains 100% of the data and the
+    * auxiliary surface is completely bogus.  Any attempt to use the auxiliary
+    * surface is liable to result in rendering corruption.  The only thing
+    * that one can do to re-enable aux once this state is reached is to use an
+    * ambiguate pass to transition into the pass-through state.
+    */
    ISL_AUX_STATE_AUX_INVALID,
 };
 
-/**
- * Enum which describes explicit aux transition operations.
+/** Enum describing explicit aux transition operations
+ *
+ * These operations are used to transition from one isl_aux_state to another.
+ * Even though a draw does transition the state machine, it's not included in
+ * this enum as it's something of a special case.
  */
 enum isl_aux_op {
 #ifdef IN_UNIT_TEST
    ISL_AUX_OP_ASSERT,
 #endif
 
+   /** Do nothing */
    ISL_AUX_OP_NONE,
 
    /** Fast Clear
@@ -1927,9 +2018,10 @@ isl_tiling_from_i915_tiling(uint32_t tiling);
  * Return an isl_aux_op needed to enable an access to occur in an
  * isl_aux_state suitable for the isl_aux_usage.
  *
- * NOTE: If the access will invalidate the main surface, this function should
- *       not be called and the isl_aux_op of NONE should be used instead.
- *       Otherwise, an extra (but still lossless) ambiguate may occur.
+ * @note
+ * If the access will invalidate the main surface, this function should not be
+ * called and the isl_aux_op of NONE should be used instead. Otherwise, an
+ * extra (but still lossless) ambiguate may occur.
  *
  * @invariant initial_state is possible with an isl_aux_usage compatible with
  *            the given usage. Two usages are compatible if it's possible to
@@ -1956,9 +2048,10 @@ isl_aux_state_transition_aux_op(enum isl_aux_state initial_state,
 /**
  * Return the isl_aux_state entered after performing a write.
  *
- * NOTE: full_surface should be true if the write covers the entire
- *       slice. Setting it to false in this case will still result in a
- *       correct (but imprecise) aux state.
+ * @note
+ * full_surface should be true if the write covers the entire slice. Setting
+ * it to false in this case will still result in a correct (but imprecise) aux
+ * state.
  *
  * @invariant if usage is not ISL_AUX_USAGE_NONE, then initial_state is
  *            possible with the given usage.
@@ -2210,27 +2303,69 @@ void
 isl_surf_get_tile_info(const struct isl_surf *surf,
                        struct isl_tile_info *tile_info);
 
+/**
+ * @param[in]  surf              The main surface
+ * @param[in]  hiz_or_mcs_surf   HiZ or MCS surface associated with the main
+ *                               surface
+ * @returns true if the given surface supports CCS.
+ */
 bool
 isl_surf_supports_ccs(const struct isl_device *dev,
                       const struct isl_surf *surf,
                       const struct isl_surf *hiz_or_mcs_surf);
 
+/** Constructs a HiZ surface for the given main surface.
+ *
+ * @param[in]  surf     The main surface
+ * @param[out] hiz_surf The HiZ surface to populate on success
+ * @returns false if the main surface cannot support HiZ.
+ */
 bool
 isl_surf_get_hiz_surf(const struct isl_device *dev,
                       const struct isl_surf *surf,
                       struct isl_surf *hiz_surf);
 
+/** Constructs a MCS for the given main surface.
+ *
+ * @param[in]  surf     The main surface
+ * @param[out] mcs_surf The MCS to populate on success
+ * @returns false if the main surface cannot support MCS.
+ */
 bool
 isl_surf_get_mcs_surf(const struct isl_device *dev,
                       const struct isl_surf *surf,
                       struct isl_surf *mcs_surf);
 
+/** Constructs a CCS for the given main surface.
+ *
+ * @note
+ * Starting with Tigerlake, the CCS is no longer really a surface.  It's not
+ * laid out as an independent surface and isn't referenced by
+ * RENDER_SURFACE_STATE::"Auxiliary Surface Base Address" like other auxiliary
+ * compression surfaces.  It's a blob of memory that's a 1:256 scale-down from
+ * the main surfaced that's attached side-band via a second set of page
+ * tables.
+ *
+ * @par
+ * In spite of this, it's sometimes useful to think of it as being a linear
+ * buffer-like surface, at least for the purposes of allocation.  When invoked
+ * on Tigerlake or later, this function still works and produces such a linear
+ * surface.
+ *
+ * @param[in]  surf              The main surface
+ * @param[in]  hiz_or_mcs_surf   HiZ or MCS surface associated with the main
+ *                               surface
+ * @param[out] ccs_surf          The CCS to populate on success
+ * @param row_pitch_B:           The row pitch for the CCS in bytes or 0 if
+ *                               ISL should calculate the row pitch.
+ * @returns false if the main surface cannot support CCS.
+ */
 bool
 isl_surf_get_ccs_surf(const struct isl_device *dev,
                       const struct isl_surf *surf,
                       const struct isl_surf *hiz_or_mcs_surf,
                       struct isl_surf *ccs_surf,
-                      uint32_t row_pitch_B /**< Ignored if 0 */);
+                      uint32_t row_pitch_B);
 
 #define isl_surf_fill_state(dev, state, ...) \
    isl_surf_fill_state_s((dev), (state), \



More information about the mesa-commit mailing list