[Mesa-dev] [PATCH 11/15] i965/msaa: Properly handle sliced layout for Gen7.
Kenneth Graunke
kenneth at whitecape.org
Wed May 23 09:47:21 PDT 2012
On 05/23/2012 08:01 AM, Paul Berry wrote:
> On 22 May 2012 09:21, Paul Berry <stereotype441 at gmail.com
> <mailto:stereotype441 at gmail.com>> wrote:
>
> On 22 May 2012 08:19, Kenneth Graunke <kenneth at whitecape.org
> <mailto:kenneth at whitecape.org>> wrote:
>
> On 05/11/2012 11:03 AM, Paul Berry wrote:
>
> Starting in Gen7, there are two possible layouts for MSAA
> surfaces:
>
> - Interleaved, in which additional samples are accommodated
> by scaling
> up the width and height of the surface. This is the only
> layout
> available in Gen6. On Gen7 it is used for depth and stencil
> surfaces only.
>
> - Sliced, in which the surface is stored as a 2D array, with
> array
> slice n containing all pixel data for sample n. On Gen7
> this layout
> is used for color surfaces.
>
> The "Sliced" layout has an additional requirement: it must
> be used in
> ARYSPC_LOD0 mode, which means that the surface doesn't leave
> any extra
> room between array slices for miplevels other than 0.
>
> This patch modifies the surface allocation functions to use the
> correct layout when allocating MSAA surfaces in Gen7, and to
> set the
> array offsets properly when using ARYSPC_LOD0 mode. It also
> modifies
> the code that populates SURFACE_STATE structures to ensure that
> ARYSPC_LOD0 mode is selected in the appropriate circumstances.
> ---
> src/mesa/drivers/dri/i965/brw___blorp.cpp | 1 +
> src/mesa/drivers/dri/i965/brw___blorp.h | 5 +
> src/mesa/drivers/dri/i965/brw___tex_layout.c | 10 ++-
> src/mesa/drivers/dri/i965/brw___wm_surface_state.c | 3 +-
> src/mesa/drivers/dri/i965/__gen7_blorp.cpp | 2 +
> src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c | 10 ++
> src/mesa/drivers/dri/intel/__intel_mipmap_tree.c | 102
> ++++++++++++++++-----
> src/mesa/drivers/dri/intel/__intel_mipmap_tree.h | 23
> +++++-
> src/mesa/drivers/dri/intel/__intel_tex_image.c | 3 +-
> src/mesa/drivers/dri/intel/__intel_tex_validate.c | 3 +-
> 10 files changed, 134 insertions(+), 28 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/__brw_blorp.cpp
> b/src/mesa/drivers/dri/i965/__brw_blorp.cpp
> index f6aff44..84f92b4 100644
> --- a/src/mesa/drivers/dri/i965/__brw_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/__brw_blorp.cpp
> @@ -58,6 +58,7 @@ brw_blorp_surface_info::set(__struct
> intel_mipmap_tree *mt,
> {
> brw_blorp_mip_info::set(mt, level, layer);
> this->num_samples = mt->num_samples;
> + this->array_spacing_lod0 = mt->array_spacing_lod0;
>
> if (mt->format == MESA_FORMAT_S8) {
> /* The miptree is a W-tiled stencil buffer. Surface
> states can't be set
> diff --git a/src/mesa/drivers/dri/i965/__brw_blorp.h
> b/src/mesa/drivers/dri/i965/__brw_blorp.h
> index 8dabf2c..3e37081 100644
> --- a/src/mesa/drivers/dri/i965/__brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/__brw_blorp.h
> @@ -99,6 +99,11 @@ public:
> bool map_stencil_as_y_tiled;
>
> unsigned num_samples;
> +
> + /* Setting this flag indicates that the surface should
> be set up in
> + * ARYSPC_LOD0 mode. Ignored prior to Gen7.
> + */
> + bool array_spacing_lod0;
> };
>
>
> diff --git a/src/mesa/drivers/dri/i965/__brw_tex_layout.c
> b/src/mesa/drivers/dri/i965/__brw_tex_layout.c
> index 8bf1d3d..f742131 100644
> --- a/src/mesa/drivers/dri/i965/__brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/__brw_tex_layout.c
> @@ -49,7 +49,10 @@ brw_miptree_layout_texture___array(struct
> intel_context *intel,
>
> h0 = ALIGN(mt->height0, mt->align_h);
> h1 = ALIGN(minify(mt->height0), mt->align_h);
> - qpitch = (h0 + h1 + (intel->gen>= 7 ? 12 : 11) *
> mt->align_h);
> + if (mt->array_spacing_lod0)
> + qpitch = h0;
> + else
> + qpitch = (h0 + h1 + (intel->gen>= 7 ? 12 : 11) *
> mt->align_h);
> if (mt->compressed)
> qpitch /= 4;
>
> @@ -165,7 +168,10 @@ brw_miptree_layout(struct intel_context
> *intel, struct intel_mipmap_tree *mt)
> break;
>
> default:
> - i945_miptree_layout_2d(mt);
> + if (mt->num_samples> 0&& !mt->msaa_is_interleaved)
>
> + brw_miptree_layout_texture___array(intel, mt);
> + else
> + i945_miptree_layout_2d(mt);
> break;
> }
> DBG("%s: %dx%dx%d\n", __FUNCTION__,
> diff --git
> a/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
> index 849da85..6e745cf 100644
> --- a/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
> @@ -955,7 +955,8 @@ brw_update_renderbuffer___surface(struct
> brw_context *brw,
>
> intel_image->base.Base.Level,
> width, height, depth,
> true,
> - 0 /* num_samples */);
> + 0 /* num_samples */,
> + false /*
> msaa_is_interleaved */);
>
> intel_miptree_copy_teximage(__intel, intel_image,
> new_mt);
> intel_miptree_reference(&irb->__mt, intel_image->mt);
> diff --git a/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
> b/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
> index 3775eec..dfc2272 100644
> --- a/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
> @@ -149,6 +149,8 @@ gen7_blorp_emit_surface_state(__struct
> brw_context *brw,
>
> surf->ss0.surface_format = format;
> surf->ss0.surface_type = BRW_SURFACE_2D;
> + surf->ss0.surface_array___spacing =
> surface->array_spacing_lod0 ?
> + GEN7_SURFACE_ARYSPC_LOD0 : GEN7_SURFACE_ARYSPC_FULL;
>
> /* reloc */
> surf->ss1.base_addr = region->bo->offset; /* No tile
> offsets needed */
> diff --git
> a/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
> b/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
> index 5aa62bd..9a01ddf 100644
> --- a/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
> @@ -142,6 +142,10 @@ gen7_update_texture_surface(__struct
> gl_context *ctx, GLuint unit)
> return;
> }
>
> + /* We don't support MSAA for textures. */
> + assert(!mt->array_spacing___lod0);
> + assert(mt->num_samples == 0);
> +
>
> intel_miptree_get_dimensions___for_image(firstImage,&width,&__height,&depth);
>
> surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> @@ -296,6 +300,9 @@
> gen7_update_renderbuffer___surface(struct brw_context *brw,
> sizeof(*surf),
> 32,&brw->wm.surf_offset[unit])__;
>
> memset(surf, 0, sizeof(*surf));
>
> + /* Render targets can't use MSAA interleaved layout */
> + assert(!irb->mt->msaa_is___interleaved);
> +
> if (irb->mt->align_h == 4)
> surf->ss0.vertical_alignment = 1;
> if (irb->mt->align_w == 8)
> @@ -324,6 +331,9 @@
> gen7_update_renderbuffer___surface(struct brw_context *brw,
> }
>
> surf->ss0.surface_type = BRW_SURFACE_2D;
> + surf->ss0.surface_array___spacing =
> irb->mt->array_spacing_lod0 ?
> + GEN7_SURFACE_ARYSPC_LOD0 : GEN7_SURFACE_ARYSPC_FULL;
> +
> /* reloc */
> surf->ss1.base_addr =
> intel_renderbuffer_tile___offsets(irb,&tile_x,&tile_y);
> surf->ss1.base_addr += region->bo->offset; /* reloc */
> diff --git
> a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
> b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
> index 36d7b77..4132351 100644
> --- a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
> @@ -73,7 +73,8 @@ intel_miptree_create_internal(__struct
> intel_context *intel,
> GLuint height0,
> GLuint depth0,
> bool for_region,
> - GLuint num_samples)
> + GLuint num_samples,
> + bool msaa_is_interleaved)
> {
> struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1);
> int compress_byte = 0;
> @@ -95,8 +96,14 @@ intel_miptree_create_internal(__struct
> intel_context *intel,
> mt->cpp = compress_byte ? compress_byte :
> _mesa_get_format_bytes(mt->__format);
> mt->num_samples = num_samples;
> mt->compressed = compress_byte ? 1 : 0;
> + mt->msaa_is_interleaved = msaa_is_interleaved;
> mt->refcount = 1;
>
> + /* array_spacing_lod0 is only used for non-interleaved
> MSAA surfaces.
> + * TODO: can we use it elsewhere?
> + */
> + mt->array_spacing_lod0 = num_samples> 0&&
> !msaa_is_interleaved;
>
> +
> if (target == GL_TEXTURE_CUBE_MAP) {
> assert(depth0 == 1);
> mt->depth0 = 6;
> @@ -109,6 +116,8 @@ intel_miptree_create_internal(__struct
> intel_context *intel,
> (intel->must_use_separate___stencil ||
> (intel->has_separate_stencil&&
> intel->vtbl.is_hiz_depth___format(intel, format)))) {
> + /* MSAA stencil surfaces are always interleaved. */
> + bool msaa_is_interleaved = num_samples> 0;
> mt->stencil_mt = intel_miptree_create(intel,
> mt->target,
> MESA_FORMAT_S8,
> @@ -118,7 +127,8 @@ intel_miptree_create_internal(__struct
> intel_context *intel,
> mt->height0,
> mt->depth0,
> true,
> - num_samples);
> + num_samples,
> +
> msaa_is_interleaved);
> if (!mt->stencil_mt) {
> intel_miptree_release(&mt);
> return NULL;
> @@ -165,7 +175,8 @@ intel_miptree_create(struct
> intel_context *intel,
> GLuint height0,
> GLuint depth0,
> bool expect_accelerated_upload,
> - GLuint num_samples)
> + GLuint num_samples,
> + bool msaa_is_interleaved)
> {
> struct intel_mipmap_tree *mt;
> uint32_t tiling = I915_TILING_NONE;
> @@ -207,7 +218,7 @@ intel_miptree_create(struct
> intel_context *intel,
> mt = intel_miptree_create_internal(__intel, target, format,
> first_level,
> last_level, width0,
> height0, depth0,
> - false, num_samples);
> + false, num_samples,
> msaa_is_interleaved);
> /*
> * pitch == 0 || height == 0 indicates the null texture
> */
> @@ -243,7 +254,8 @@ intel_miptree_create_for___region(struct
> intel_context *intel,
> mt = intel_miptree_create_internal(__intel, target, format,
> 0, 0,
> region->width,
> region->height, 1,
> - true, 0 /* num_samples
> */);
> + true, 0 /* num_samples */,
> + false /*
> msaa_is_interleaved */);
> if (!mt)
> return mt;
>
> @@ -252,6 +264,31 @@
> intel_miptree_create_for___region(struct intel_context *intel,
> return mt;
> }
>
> +/**
> + * Determine whether the MSAA surface being created should
> use an interleaved
> + * layout or a sliced layout, based on the chip generation
> and the surface
> + * type.
> + */
> +static bool
> +msaa_format_is_interleaved(__struct intel_context *intel,
> gl_format format)
> +{
> + /* Prior to Gen7, all surfaces used interleaved layout. */
> + if (intel->gen< 7)
> + return true;
> +
> + /* In Gen7, interleaved layout is only used for depth
> and stencil
> + * buffers.
> + */
> + switch (_mesa_get_format_base_format(__format)) {
> + case GL_DEPTH_COMPONENT:
> + case GL_STENCIL_INDEX:
> + case GL_DEPTH_STENCIL:
> + return true;
> + default:
> + return false;
> + }
> +}
> +
> struct intel_mipmap_tree*
> intel_miptree_create_for___renderbuffer(struct
> intel_context *intel,
> gl_format format,
> @@ -260,25 +297,43 @@
> intel_miptree_create_for___renderbuffer(struct intel_context
> *intel,
> uint32_t num_samples)
> {
> struct intel_mipmap_tree *mt;
> -
> - /* Adjust width/height for MSAA. Note: since samples
> are interleaved in a
> - * complex pattern that repeats for every 2x2 block of
> pixels, we need to
> - * expand the width and height to even numbers before
> the width/height
> - * adjustment, otherwise some of the samples in the last
> row and column
> - * will fall outside the boundary of the texture.
> - */
> - if (num_samples> 4) {
> - num_samples = 8;
> - width = ALIGN(width, 2) * 4;
> - height = ALIGN(height, 2) * 2;
> - } else if (num_samples> 0) {
> - num_samples = 4;
> - width = ALIGN(width, 2) * 2;
> - height = ALIGN(height, 2) * 2;
> + uint32_t depth = 1;
> + bool msaa_is_interleaved = false;
> +
> + if (num_samples> 0) {
> + /* Adjust width/height/depth for MSAA */
> + msaa_is_interleaved =
> msaa_format_is_interleaved(__intel, format);
> + if (msaa_is_interleaved) {
> + /* Note: since samples are interleaved in a
> complex pattern that
> + * repeats for every 2x2 block of pixels, we need
> to expand the width
> + * and height to even numbers before the
> width/height adjustment,
> + * otherwise some of the samples in the last row
> and column will fall
> + * outside the boundary of the texture.
> + */
> + switch (num_samples) {
> + case 4:
> + width = ALIGN(width, 2) * 2;
> + height = ALIGN(height, 2) * 2;
> + break;
> + case 8:
> + width = ALIGN(width, 2) * 4;
> + height = ALIGN(height, 2) * 2;
> + break;
> + default:
> + /* num_samples should already have been
> quantized to 0, 4, or
> + * 8.
> + */
> + assert(false);
> + }
> + } else {
> + /* Non-interleaved */
> + depth = num_samples;
> + }
> }
>
> mt = intel_miptree_create(intel, GL_TEXTURE_2D, format,
> 0, 0,
> - width, height, 1, true,
> num_samples);
> + width, height, depth, true,
> num_samples,
> + msaa_is_interleaved);
>
> return mt;
> }
> @@ -552,6 +607,8 @@ intel_miptree_alloc_hiz(struct
> intel_context *intel,
> GLuint num_samples)
> {
> assert(mt->hiz_mt == NULL);
> + /* MSAA HiZ surfaces are always interleaved. */
> + bool msaa_is_interleaved = num_samples> 0;
> mt->hiz_mt = intel_miptree_create(intel,
> mt->target,
> MESA_FORMAT_X8_Z24,
> @@ -561,7 +618,8 @@ intel_miptree_alloc_hiz(struct
> intel_context *intel,
> mt->height0,
> mt->depth0,
> true,
> - num_samples);
> + num_samples,
> + msaa_is_interleaved);
>
> if (!mt->hiz_mt)
> return false;
> diff --git
> a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
> b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
> index ca1666d..3883d2b 100644
> --- a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
> @@ -172,6 +172,26 @@ struct intel_mipmap_tree
> GLuint num_samples;
> bool compressed;
>
> + /**
> + * For 1D array, 2D array, cube, and 2D multisampled
> surfaces on Gen7: true
> + * if the surface only contains LOD 0, and hence no
> space is for LOD's
> + * other than 0 in between array slices.
> + *
> + * Corresponds to the surface_array_spacing bit in
> gen7_surface_state.
> + */
> + bool array_spacing_lod0;
>
>
> It seems like we ought to use ARYSPC_LOD0 mode for all
> non-mipmapped renderbuffers and textures...it should save
> memory. If we did that, I think we could probably just use
> mt->first_level != mt->last_level rather than adding a new flag.
>
> Then again, I haven't thought through all the ramifications of
> that: I know there's some period of time where the user has
> defined LOD0, and you're not sure whether they're going to
> defined LOD1, 2, 3 too or just not do mipmapping. It would suck
> to allocate something in ARYSPC_LOD0/non-mipmap-capable mode and
> then have to redo it. But maybe we have to redo it already?
>
>
> I think you're right that we have to redo it anyhow, because even
> without using ARYSPC_LOD0, if we speculatively allocate a texture
> that just has LOD0, then we haven't reserved enough memory for the
> other mip levels of the last array slice.
>
> Your suggestion seems reasonable. The only thing I can think of
> that might get in the way is if there are restrictions in the bspec
> requiring us to disable ARYSPC_LOD0 at certain times. I'll try to
> dig through the bspec today and check that.
>
>
> We discussed this further in person yesterday, and decided that it would
> be better to wait and do it as a future optimization, on the grounds
> that changing the layout of every non-mipmapped array texture is a
> little more risk than we want to incur in the middle of this patch
> series. I've added it to a list of items to revisit once the bulk of
> MSAA is complete.
>
> I also seem to recall that you found a subtlety about ARYSPC_LOD0 in the
> bspec yesterday that we'll need to keep in mind when we do revisit this
> topic. Do you remember what that was? I've forgotten the details.
vol1a discussion of QPitch:
"If Surface Array Spacing is set to ARYSPC_FULL (note that the depth
buffer and stencil buffer have an implied value of ARYSPC_FULL):"
which suggested that maybe ARYSPC_LOD0 for depth/stencil buffers doesn't
work.
More information about the mesa-dev
mailing list