<p dir="ltr">On Jul 28, 2016 7:59 AM, "Pohjolainen, Topi" <<a href="mailto:topi.pohjolainen@intel.com">topi.pohjolainen@intel.com</a>> wrote:<br>
><br>
> On Tue, Jul 26, 2016 at 03:02:20PM -0700, Jason Ekstrand wrote:<br>
> > Since the dawn of time, blorp has used offsets directly to get at different<br>
> > mip levels and array slices of surfaces. This isn't really necessary since<br>
> > we can just use the base level/layer provided in the surface state. While<br>
> > it may have simplified blorp's original design, we haven't been using the<br>
> > blorp path for surface state on gen8 thanks to render compression and<br>
> > there's really no good need for it most of the time. This commit restricts<br>
> > such surface munging to the cases of fake W-tiling and fake interleaved<br>
> > multisampling.<br>
> > ---<br>
> > src/mesa/drivers/dri/i965/brw_blorp.c | 74 ++---------<br>
> > src/mesa/drivers/dri/i965/brw_blorp.h | 6 +<br>
> > src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 185 +++++++++++++++++++--------<br>
> > 3 files changed, 152 insertions(+), 113 deletions(-)<br>
> ><br>
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c<br>
> > index 64e507a..215f765 100644<br>
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.c<br>
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c<br>
> > @@ -90,7 +90,7 @@ get_image_offset_sa_gen6_stencil(const struct isl_surf *surf,<br>
> > *y_offset_sa = y;<br>
> > }<br>
> ><br>
> > -static void<br>
> > +void<br>
><br>
> I noticed that this is now only used in brw_blorp_blit.cpp, should we just<br>
> move it there instead?</p>
<p dir="ltr">It gets deleted entirely later.</p>
<p dir="ltr">> > blorp_get_image_offset_sa(struct isl_device *dev, const struct isl_surf *surf,<br>
> > uint32_t level, uint32_t layer,<br>
> > uint32_t *x_offset_sa,<br>
> > @@ -100,60 +100,11 @@ blorp_get_image_offset_sa(struct isl_device *dev, const struct isl_surf *surf,<br>
> > get_image_offset_sa_gen6_stencil(surf, level, layer,<br>
> > x_offset_sa, y_offset_sa);<br>
> > } else {<br>
> > - /* Using base_array_layer for Z in 3-D surfaces is a bit abusive, but it<br>
> > - * will go away soon enough.<br>
> > - */<br>
> > - uint32_t z = 0;<br>
> > - if (surf->dim == ISL_SURF_DIM_3D) {<br>
> > - z = layer;<br>
> > - layer = 0;<br>
> > - }<br>
> > -<br>
> > - isl_surf_get_image_offset_sa(surf, level, layer, z,<br>
> > + isl_surf_get_image_offset_sa(surf, level, layer, 0,<br>
> > x_offset_sa, y_offset_sa);<br>
> > }<br>
> > }<br>
> ><br>
> > -static void<br>
> > -surf_apply_level_layer_offsets(struct isl_device *dev, struct isl_surf *surf,<br>
> > - struct isl_view *view, uint32_t *byte_offset,<br>
> > - uint32_t *tile_x_sa, uint32_t *tile_y_sa)<br>
> > -{<br>
> > - /* This only makes sense for a single level and array slice */<br>
> > - assert(view->levels == 1 && view->array_len == 1);<br>
> > -<br>
> > - uint32_t x_offset_sa, y_offset_sa;<br>
> > - blorp_get_image_offset_sa(dev, surf, view->base_level,<br>
> > - view->base_array_layer,<br>
> > - &x_offset_sa, &y_offset_sa);<br>
> > -<br>
> > - isl_tiling_get_intratile_offset_sa(dev, surf->tiling, view->format,<br>
> > - surf->row_pitch, x_offset_sa, y_offset_sa,<br>
> > - byte_offset, tile_x_sa, tile_y_sa);<br>
> > -<br>
> > - /* Now that that's done, we have a very bare 2-D surface */<br>
> > - surf->dim = ISL_SURF_DIM_2D;<br>
> > - surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;<br>
> > -<br>
> > - surf->logical_level0_px.width =<br>
> > - minify(surf->logical_level0_px.width, view->base_level);<br>
> > - surf->logical_level0_px.height =<br>
> > - minify(surf->logical_level0_px.height, view->base_level);<br>
> > - surf->logical_level0_px.depth = 1;<br>
> > - surf->logical_level0_px.array_len = 1;<br>
> > - surf->levels = 1;<br>
> > -<br>
> > - /* Alignment doesn't matter since we have 1 miplevel and 1 array slice so<br>
> > - * just pick something that works for everybody.<br>
> > - */<br>
> > - surf->image_alignment_el = isl_extent3d(4, 4, 1);<br>
> > -<br>
> > - /* TODO: surf->physcal_level0_extent_sa? */<br>
><br>
> I was wondering about this in the introducing patch...<br>
><br>
> > -<br>
> > - view->base_level = 0;<br>
> > - view->base_array_layer = 0;<br>
> > -}<br>
> > -<br>
> > void<br>
> > brw_blorp_surface_info_init(struct brw_context *brw,<br>
> > struct brw_blorp_surface_info *info,<br>
> > @@ -191,8 +142,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,<br>
> > .format = ISL_FORMAT_UNSUPPORTED, /* Set later */<br>
> > .base_level = level,<br>
><br>
> This still prevents surf_convert_to_single_slice() from skipping the munging,<br>
> I think. Shouldn't we drop it also?</p>
<p dir="ltr">I'm confused. If it has a level other than zero, then we need the munging. The only reason to skip is that there are cases where we call surf_convert_to_single_slice() twice.</p>
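<p dir="ltr">To make the skip condition concrete, here is a minimal sketch of the "bail on the second call" idea. All <code>sketch_*</code> names are hypothetical stand-ins, not the real isl or blorp types. It compares the level and layer counts against 1, whereas the quoted patch compares them against 0; that choice is an assumption of this sketch, made because a converted surface has exactly one level and one layer.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the surface + view pair. */
enum sketch_dim { SKETCH_DIM_1D, SKETCH_DIM_2D, SKETCH_DIM_3D };

struct sketch_surf_view {
   enum sketch_dim dim;
   uint32_t surf_levels;      /* miplevels in the surface */
   uint32_t surf_array_len;   /* array slices in the surface */
   uint32_t base_level;       /* view: first miplevel */
   uint32_t base_array_layer; /* view: first array slice */
   uint32_t view_levels;      /* view: number of miplevels */
   uint32_t view_array_len;   /* view: number of array slices */
};

/* True unless the surface is already a bare 1-level, 1-layer 2-D surface. */
static bool
sketch_needs_single_slice_munging(const struct sketch_surf_view *s)
{
   /* Only meaningful for a single level and a single array slice. */
   assert(s->view_levels == 1 && s->view_array_len == 1);

   bool already_single = s->dim == SKETCH_DIM_2D &&
                         s->base_level == 0 && s->base_array_layer == 0 &&
                         s->surf_levels == 1 && s->surf_array_len == 1;
   return !already_single;
}

/* Collapses the description to one slice; a second call does nothing. */
static void
sketch_convert_to_single_slice(struct sketch_surf_view *s)
{
   if (!sketch_needs_single_slice_munging(s))
      return;

   s->dim = SKETCH_DIM_2D;
   s->surf_levels = 1;
   s->surf_array_len = 1;
   s->base_level = 0;
   s->base_array_layer = 0;
}
```

<p dir="ltr">A view with a non-zero base level always takes the munging path on the first call; only a surface that has already been converted is skipped.</p>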
<p dir="ltr">> > .levels = 1,<br>
> > - .base_array_layer = layer / layer_multiplier,<br>
> > - .array_len = 1,<br>
> > .channel_select = {<br>
> > ISL_CHANNEL_SELECT_RED,<br>
> > ISL_CHANNEL_SELECT_GREEN,<br>
> > @@ -201,12 +150,21 @@ brw_blorp_surface_info_init(struct brw_context *brw,<br>
> > },<br>
> > };<br>
> ><br>
> > - if (brw->gen >= 8 && !is_render_target && info->surf.dim == ISL_SURF_DIM_3D) {<br>
> > - /* On gen8+ we use actual 3-D textures so we need to pass the layer<br>
> > - * through to the sampler.<br>
> > + if (!is_render_target &&<br>
> > + (info->surf.dim == ISL_SURF_DIM_3D ||<br>
> > + info->surf.msaa_layout == ISL_MSAA_LAYOUT_ARRAY)) {<br>
> > + /* 3-D textures don't support base_array layer and neither do 2-D<br>
> > + * multisampled textures on IVB so we need to pass it through the<br>
> > + * sampler in those cases. These are also two cases where we are<br>
> > + * guaranteed that we won't be doing any funny surface hacks.<br>
> > */<br>
> > + info->view.base_array_layer = 0;<br>
> > + info->view.array_len = MAX2(info->surf.logical_level0_px.depth,<br>
> > + info->surf.logical_level0_px.array_len);<br>
> > info->z_offset = layer / layer_multiplier;<br>
> > } else {<br>
> > + info->view.base_array_layer = layer / layer_multiplier;<br>
> > + info->view.array_len = 1;<br>
> > info->z_offset = 0;<br>
> > }<br>
> ><br>
> > @@ -252,10 +210,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,<br>
> > break;<br>
> > }<br>
> > }<br>
> > -<br>
> > - surf_apply_level_layer_offsets(&brw->isl_dev, &info->surf, &info->view,<br>
> > - &info->bo_offset,<br>
> > - &info->tile_x_sa, &info->tile_y_sa);<br>
> > }<br>
> ><br>
> ><br>
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h<br>
> > index ec12dfe..706d53e 100644<br>
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.h<br>
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.h<br>
> > @@ -338,6 +338,12 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, struct nir_shader *nir,<br>
> > struct brw_blorp_prog_data *prog_data,<br>
> > unsigned *program_size);<br>
> ><br>
> > +void<br>
> > +blorp_get_image_offset_sa(struct isl_device *dev, const struct isl_surf *surf,<br>
> > + uint32_t level, uint32_t layer,<br>
> > + uint32_t *x_offset_sa,<br>
> > + uint32_t *y_offset_sa);<br>
> > +<br>
> > uint32_t<br>
> > brw_blorp_emit_surface_state(struct brw_context *brw,<br>
> > const struct brw_blorp_surface_info *surface,<br>
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp<br>
> > index a35cdb3..007c061 100644<br>
> > --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp<br>
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp<br>
> > @@ -1586,6 +1586,115 @@ swizzle_to_scs(GLenum swizzle)<br>
> > return (enum isl_channel_select)((swizzle + 4) & 7);<br>
> > }<br>
> ><br>
> > +static void<br>
> > +surf_convert_to_single_slice(struct brw_context *brw,<br>
> > + struct brw_blorp_surface_info *info)<br>
> > +{<br>
> > + /* This only makes sense for a single level and array slice */<br>
> > + assert(info->view.levels == 1 && info->view.array_len == 1);<br>
> > +<br>
> > + /* Just bail if we have nothing to do. */<br>
> > + if (info->surf.dim == ISL_SURF_DIM_2D &&<br>
> > + info->view.base_level == 0 && info->view.base_array_layer == 0 &&<br>
> > + info->surf.levels == 0 && info->surf.logical_level0_px.array_len == 0)<br>
> > + return;<br>
> > +<br>
> > + uint32_t x_offset_sa, y_offset_sa;<br>
> > + blorp_get_image_offset_sa(&brw->isl_dev, &info->surf, info->view.base_level,<br>
> > + info->view.base_array_layer,<br>
> > + &x_offset_sa, &y_offset_sa);<br>
> > +<br>
> > + isl_tiling_get_intratile_offset_sa(&brw->isl_dev, info->surf.tiling,<br>
> > + info->view.format, info->surf.row_pitch,<br>
> > + x_offset_sa, y_offset_sa,<br>
> > + &info->bo_offset,<br>
> > + &info->tile_x_sa, &info->tile_y_sa);<br>
> > +<br>
> > + /* TODO: Once this file gets converted to C, we shouls just use designated<br>
> > + * initializers.<br>
> > + */<br>
> > + struct isl_surf_init_info init_info = isl_surf_init_info();<br>
> > +<br>
> > + init_info.dim = ISL_SURF_DIM_2D;<br>
> > + init_info.format = ISL_FORMAT_R8_UINT;<br>
> > + init_info.width =<br>
> > + minify(info->surf.logical_level0_px.width, info->view.base_level);<br>
> > + init_info.height =<br>
> > + minify(info->surf.logical_level0_px.height, info->view.base_level);<br>
> > + init_info.depth = 1;<br>
> > + init_info.levels = 1;<br>
> > + init_info.array_len = 1;<br>
> > + init_info.samples = info->surf.samples;<br>
> > + init_info.min_pitch = info->surf.row_pitch;<br>
> > + init_info.usage = info->surf.usage;<br>
> > + init_info.tiling_flags = 1 << info->surf.tiling;<br>
> > +<br>
> > + isl_surf_init_s(&brw->isl_dev, &info->surf, &init_info);<br>
> > + assert(info->surf.row_pitch == init_info.min_pitch);<br>
> > +<br>
> > + /* The view is also different now. */<br>
> > + info->view.base_level = 0;<br>
> > + info->view.levels = 1;<br>
> > + info->view.base_array_layer = 0;<br>
> > + info->view.array_len = 1;<br>
> > +}<br>
> > +<br>
> > +static void<br>
> > +surf_fake_interleaved_msaa(struct brw_context *brw,<br>
> > + struct brw_blorp_surface_info *info)<br>
> > +{<br>
> > + assert(info->surf.msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED);<br>
> > +<br>
> > + /* First, we need to convert it to a simple 1-level 1-layer 2-D surface */<br>
> > + surf_convert_to_single_slice(brw, info);<br>
> > +<br>
> > + info->surf.logical_level0_px = info->surf.phys_level0_sa;<br>
> > + info->surf.samples = 1;<br>
> > + info->surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;<br>
> > +}<br>
> > +<br>
> > +static void<br>
> > +surf_retile_w_to_y(struct brw_context *brw,<br>
> > + struct brw_blorp_surface_info *info)<br>
> > +{<br>
> > + assert(info->surf.tiling == ISL_TILING_W);<br>
> > +<br>
> > + /* First, we need to convert it to a simple 1-level 1-layer 2-D surface */<br>
> > + surf_convert_to_single_slice(brw, info);<br>
> > +<br>
> > + /* On gen7+, we don't have interleaved multisampling for color render<br>
> > + * targets so we have to fake it.<br>
> > + *<br>
> > + * TODO: Are we sure we don't also need to fake it on gen6?<br>
> > + */<br>
> > + if (brw->gen > 6 && info->surf.msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {<br>
> > + info->surf.logical_level0_px = info->surf.phys_level0_sa;<br>
> > + info->surf.samples = 1;<br>
> > + info->surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;<br>
> > + }<br>
> > +<br>
> > + if (brw->gen == 6) {<br>
> > + /* Gen6 stencil buffers have a very large alignment coming in from the<br>
> > + * miptree. It's out-of-bounds for what the surface state can handle.<br>
> > + * Since we have a single layer and level, it doesn't really matter as<br>
> > + * long as we don't pass a bogus value into isl_surf_fill_state().<br>
> > + */<br>
> > + info->surf.image_alignment_el = isl_extent3d(4, 2, 1);<br>
> > + }<br>
> > +<br>
> > + /* Now that we've converted everything to a simple 2-D surface with only<br>
> > + * one miplevel, we can go about retiling it.<br>
> > + */<br>
> > + const unsigned x_align = 8, y_align = info->surf.samples != 0 ? 8 : 4;<br>
> > + info->surf.tiling = ISL_TILING_Y0;<br>
> > + info->surf.logical_level0_px.width =<br>
> > + ALIGN(info->surf.logical_level0_px.width, x_align) * 2;<br>
> > + info->surf.logical_level0_px.height =<br>
> > + ALIGN(info->surf.logical_level0_px.height, y_align) / 2;<br>
> > + info->tile_x_sa *= 2;<br>
> > + info->tile_y_sa /= 2;<br>
> > +}<br>
> > +<br>
> > /**<br>
> > * Note: if the src (or dst) is a 2D multisample array texture on Gen7+ using<br>
> > * INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, src_layer (dst_layer) is<br>
> > @@ -1782,7 +1891,10 @@ brw_blorp_blit_miptrees(struct brw_context *brw,<br>
> > /* For some texture types, we need to pass the layer through the sampler. */<br>
> > params.wm_inputs.src_z = params.src.z_offset;<br>
> ><br>
> > - if (brw->gen > 6 && dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {<br>
> > + if (brw->gen > 6 &&<br>
> > + params.dst.surf.msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {<br>
> > + assert(params.dst.surf.samples > 1);<br>
> > +<br>
> > /* We must expand the rectangle we send through the rendering pipeline,<br>
> > * to account for the fact that we are mapping the destination region as<br>
> > * single-sampled when it is in fact multisampled. We must also align<br>
> > @@ -1795,71 +1907,41 @@ brw_blorp_blit_miptrees(struct brw_context *brw,<br>
> > * If it's UMS, then we have no choice but to set up the rendering<br>
> > * pipeline as multisampled.<br>
> > */<br>
> > - assert(params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_INTERLEAVED);<br>
> > switch (params.dst.surf.samples) {<br>
> > case 2:<br>
> > params.x0 = ROUND_DOWN_TO(params.x0 * 2, 4);<br>
> > params.y0 = ROUND_DOWN_TO(params.y0, 4);<br>
> > params.x1 = ALIGN(params.x1 * 2, 4);<br>
> > params.y1 = ALIGN(params.y1, 4);<br>
> > - params.dst.surf.logical_level0_px.width *= 2;<br>
> > break;<br>
> > case 4:<br>
> > params.x0 = ROUND_DOWN_TO(params.x0 * 2, 4);<br>
> > params.y0 = ROUND_DOWN_TO(params.y0 * 2, 4);<br>
> > params.x1 = ALIGN(params.x1 * 2, 4);<br>
> > params.y1 = ALIGN(params.y1 * 2, 4);<br>
> > - params.dst.surf.logical_level0_px.width *= 2;<br>
> > - params.dst.surf.logical_level0_px.height *= 2;<br>
> > break;<br>
> > case 8:<br>
> > params.x0 = ROUND_DOWN_TO(params.x0 * 4, 8);<br>
> > params.y0 = ROUND_DOWN_TO(params.y0 * 2, 4);<br>
> > params.x1 = ALIGN(params.x1 * 4, 8);<br>
> > params.y1 = ALIGN(params.y1 * 2, 4);<br>
> > - params.dst.surf.logical_level0_px.width *= 4;<br>
> > - params.dst.surf.logical_level0_px.height *= 2;<br>
> > break;<br>
> > case 16:<br>
> > params.x0 = ROUND_DOWN_TO(params.x0 * 4, 8);<br>
> > params.y0 = ROUND_DOWN_TO(params.y0 * 4, 8);<br>
> > params.x1 = ALIGN(params.x1 * 4, 8);<br>
> > params.y1 = ALIGN(params.y1 * 4, 8);<br>
> > - params.dst.surf.logical_level0_px.width *= 4;<br>
> > - params.dst.surf.logical_level0_px.height *= 4;<br>
> > break;<br>
> > default:<br>
> > unreachable("Unrecognized sample count in brw_blorp_blit_params ctor");<br>
> > }<br>
> ><br>
> > - /* Gen7's rendering hardware only supports the IMS layout for depth and<br>
> > - * stencil render targets. Blorp always maps its destination surface as<br>
> > - * a color render target (even if it's actually a depth or stencil<br>
> > - * buffer). So if the destination is IMS, we'll have to map it as a<br>
> > - * single-sampled texture and interleave the samples ourselves.<br>
> > - */<br>
> > - params.dst.surf.samples = 1;<br>
> > - params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;<br>
> > + surf_fake_interleaved_msaa(brw, &amp;params.dst);<br>
> ><br>
> > wm_prog_key.use_kill = true;<br>
> > }<br>
> ><br>
> > if (params.dst.surf.tiling == ISL_TILING_W) {<br>
> > - /* We need to fake W-tiling with Y-tiling */<br>
> > - params.dst.surf.tiling = ISL_TILING_Y0;<br>
> > -<br>
> > - wm_prog_key.dst_tiled_w = true;<br>
><br>
> This is just moved further down, right?</p>
<p dir="ltr">I believe so, yes.</p>
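<p dir="ltr">For reference, the size and offset arithmetic that the quoted surf_retile_w_to_y() applies when faking W-tiling with Y-tiling can be sketched in isolation. The struct and function names below are hypothetical; only the constants and the doubling/halving mirror the quoted patch.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t
sketch_align_pot(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical container for the fields the retile step touches. */
struct sketch_retile {
   uint32_t width, height;   /* level-0 extent in pixels */
   uint32_t tile_x, tile_y;  /* intratile offsets in samples */
};

/* Width doubles and height halves when a W-tiled surface is viewed as
 * Y-tiled; the intratile offsets follow the same 2x / 0.5x scaling.
 */
static struct sketch_retile
sketch_retile_w_to_y(struct sketch_retile in, uint32_t samples)
{
   const uint32_t x_align = 8;
   const uint32_t y_align = samples != 0 ? 8 : 4; /* same test as the patch */

   struct sketch_retile out;
   out.width = sketch_align_pot(in.width, x_align) * 2;
   out.height = sketch_align_pot(in.height, y_align) / 2;
   out.tile_x = in.tile_x * 2;
   out.tile_y = in.tile_y / 2;
   return out;
}
```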
<p dir="ltr">> > -<br>
> > - if (params.dst.surf.samples > 1) {<br>
> > - /* If the destination surface is a W-tiled multisampled stencil<br>
> > - * buffer that we're mapping as Y tiled, then we need to arrange for<br>
> > - * the WM program to run once per sample rather than once per pixel,<br>
> > - * because the memory layout of related samples doesn't match between<br>
> > - * W and Y tiling.<br>
> > - */<br>
> > - wm_prog_key.persample_msaa_dispatch = true;<br>
> > - }<br>
> > -<br>
> > /* We must modify the rectangle we send through the rendering pipeline<br>
> > * (and the size and x/y offset of the destination surface), to account<br>
> > * for the fact that we are mapping it as Y-tiled when it is in fact<br>
> > @@ -1911,39 +1993,36 @@ brw_blorp_blit_miptrees(struct brw_context *brw,<br>
> > params.y0 = ROUND_DOWN_TO(params.y0, y_align) / 2;<br>
> > params.x1 = ALIGN(params.x1, x_align) * 2;<br>
> > params.y1 = ALIGN(params.y1, y_align) / 2;<br>
> > - params.dst.surf.logical_level0_px.width =<br>
> > - ALIGN(params.dst.surf.logical_level0_px.width, x_align) * 2;<br>
> > - params.dst.surf.logical_level0_px.height =<br>
> > - ALIGN(params.dst.surf.logical_level0_px.height, y_align) / 2;<br>
> > - params.dst.tile_x_sa *= 2;<br>
> > - params.dst.tile_y_sa /= 2;<br>
> > +<br>
> > + /* Retile the surface to Y-tiled */<br>
> > + surf_retile_w_to_y(brw, &amp;params.dst);<br>
> > +<br>
> > + wm_prog_key.dst_tiled_w = true;<br>
> > wm_prog_key.use_kill = true;<br>
> > +<br>
> > + if (params.dst.surf.samples > 1) {<br>
> > + /* If the destination surface is a W-tiled multisampled stencil<br>
> > + * buffer that we're mapping as Y tiled, then we need to arrange for<br>
> > + * the WM program to run once per sample rather than once per pixel,<br>
> > + * because the memory layout of related samples doesn't match between<br>
> > + * W and Y tiling.<br>
> > + */<br>
> > + wm_prog_key.persample_msaa_dispatch = true;<br>
> > + }<br>
> > }<br>
> ><br>
> > if (brw->gen < 8 && params.src.surf.tiling == ISL_TILING_W) {<br>
> > /* On Haswell and earlier, we have to fake W-tiled sources as Y-tiled.<br>
> > * Broadwell adds support for sampling from stencil.<br>
> > - */<br>
> > - params.src.surf.tiling = ISL_TILING_Y0;<br>
> > -<br>
> > - wm_prog_key.src_tiled_w = true;<br>
><br>
> Same as this?</p>
<p dir="ltr">Yes</p>
<p dir="ltr">> > -<br>
> > - /* We must modify the size and x/y offset of the source surface to<br>
> > - * account for the fact that we are mapping it as Y-tiled when it is in<br>
> > - * fact W tiled.<br>
> > *<br>
> > * See the comments above concerning x/y offset alignment for the<br>
> > * destination surface.<br>
> > *<br>
> > * TODO: what if this makes the texture size too large?<br>
> > */<br>
> > - const unsigned x_align = 8, y_align = params.src.surf.samples != 0 ? 8 : 4;<br>
> > - params.src.surf.logical_level0_px.width =<br>
> > - ALIGN(params.src.surf.logical_level0_px.width, x_align) * 2;<br>
> > - params.src.surf.logical_level0_px.height =<br>
> > - ALIGN(params.src.surf.logical_level0_px.height, y_align) / 2;<br>
> > - params.src.tile_x_sa *= 2;<br>
> > - params.src.tile_y_sa /= 2;<br>
> > + surf_retile_w_to_y(brw, &amp;params.src);<br>
> > +<br>
> > + wm_prog_key.src_tiled_w = true;<br>
> > }<br>
> ><br>
> > /* tex_samples and rt_samples are the sample counts that are set up in<br>
> > --<br>
> > 2.5.0.400.gff86faf<br>
> ><br>
> > _______________________________________________<br>
> > mesa-dev mailing list<br>
> > <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br></p>
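<p dir="ltr">The per-sample-count rectangle expansion in the quoted brw_blorp_blit_miptrees() hunk can also be written as a small table-driven helper. This is only a sketch: the names are hypothetical, and the scale and alignment values are copied from the switch statement in the quoted patch.</p>

```c
#include <assert.h>
#include <stdint.h>

struct sketch_rect { uint32_t x0, y0, x1, y1; };

/* Round v down to a multiple of a. */
static uint32_t
sketch_round_down_to(uint32_t v, uint32_t a)
{
   assert(a > 0);
   return v - (v % a);
}

/* Round v up to a multiple of a. */
static uint32_t
sketch_align_up(uint32_t v, uint32_t a)
{
   return sketch_round_down_to(v + a - 1, a);
}

/* Expands a destination rectangle for an interleaved-MSAA surface that is
 * mapped as single-sampled. Returns 0 on success, -1 for a sample count
 * the quoted switch does not handle.
 */
static int
sketch_expand_ims_rect(struct sketch_rect *r, uint32_t samples)
{
   uint32_t sx, sy, ax, ay; /* x/y scale and x/y alignment */

   switch (samples) {
   case 2:  sx = 2; sy = 1; ax = 4; ay = 4; break;
   case 4:  sx = 2; sy = 2; ax = 4; ay = 4; break;
   case 8:  sx = 4; sy = 2; ax = 8; ay = 4; break;
   case 16: sx = 4; sy = 4; ax = 8; ay = 8; break;
   default: return -1;
   }

   r->x0 = sketch_round_down_to(r->x0 * sx, ax);
   r->y0 = sketch_round_down_to(r->y0 * sy, ay);
   r->x1 = sketch_align_up(r->x1 * sx, ax);
   r->y1 = sketch_align_up(r->y1 * sy, ay);
   return 0;
}
```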