[Mesa-dev] [PATCH 8/8] i965: Rework gl_TessLevel*[] handling to use NIR compact arrays.

Jason Ekstrand jason at jlekstrand.net
Wed Jan 4 22:57:14 UTC 2017


Ok, given what you said below, I think everything is sane.

Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

Note that I didn't actually verify that all of the code you deleted is
going to work once we switch over to NIR handling it with varyings.

On Wed, Jan 4, 2017 at 2:12 PM, Kenneth Graunke <kenneth at whitecape.org>
wrote:

> On Wednesday, January 4, 2017 1:35:55 PM PST Jason Ekstrand wrote:
> > On Wed, Jan 4, 2017 at 3:07 AM, Kenneth Graunke <kenneth at whitecape.org>
> > wrote:
> >
> > > Treating everything as scalar arrays allows us to drop a bunch of
> > > special case input/output munging all throughout the backend.
> > > Instead, we just need to remap the TessLevel components to the
> > > appropriate patch URB header locations in remap_patch_urb_offsets().
> > >
> > > We also switch to treating the TES input versions of these as ordinary
> > > shader inputs rather than system values, as remap_patch_urb_offsets()
> > > just makes everything work out without special handling.
> > >
> > > This regresses one Piglit test:
> > > arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_
> > > SHADER-array-at-limit
> > >
> > > The compiler starts promoting the constant arrays assigned to
> gl_TessLevel*
> > > to uniform arrays.  Since the shader also has a uniform array that uses
> > > the maximum number of uniform components, this puts it over the uniform
> > > component limit enforced by the linker.  This is arguably a bug in the
> > > constant array promotion code (it should avoid pushing us over limits),
> > > but is unlikely to penalize any real application.
> > >
> > > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_context.c            |   2 +-
> > >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp           | 164
> > > +--------------------
> > >  src/mesa/drivers/dri/i965/brw_nir.c                |  74 +++++++++-
> > >  src/mesa/drivers/dri/i965/brw_nir.h                |   3 +-
> > >  .../drivers/dri/i965/brw_nir_tcs_workarounds.c     |   8 +-
> > >  src/mesa/drivers/dri/i965/brw_shader.cpp           |  61 +-------
> > >  src/mesa/drivers/dri/i965/brw_shader.h             |   5 -
> > >  src/mesa/drivers/dri/i965/brw_tcs.c                |   5 +-
> > >  src/mesa/drivers/dri/i965/brw_tes.c                |  17 +--
> > >  src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp         | 117
> +--------------
> > >  10 files changed, 92 insertions(+), 364 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> > > b/src/mesa/drivers/dri/i965/brw_context.c
> > > index 45490a0f5cf..22f872fe782 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > > @@ -672,7 +672,7 @@ brw_initialize_context_constants(struct
> brw_context
> > > *brw)
> > >     if (brw->gen >= 5 || brw->is_g4x)
> > >        ctx->Const.MaxClipPlanes = 8;
> > >
> > > -   ctx->Const.LowerTessLevel = true;
> > > +   ctx->Const.GLSLTessLevelsAsInputs = true;
> > >     ctx->Const.LowerTCSPatchVerticesIn = brw->gen >= 8;
> > >     ctx->Const.LowerTESPatchVerticesIn = true;
> > >     ctx->Const.PrimitiveRestartForPatches = true;
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > > index 2ed843bd03d..8f745dff440 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > > @@ -2520,78 +2520,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const
> > > fs_builder &bld,
> > >           bld.MOV(patch_handle,
> > >                   retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UD));
> > >
> > > -         if (imm_offset == 0) {
> > > -            /* This is a read of gl_TessLevelInner[], which lives in
> the
> > > -             * Patch URB header.  The layout depends on the domain.
> > > -             */
> > > -            dst.type = BRW_REGISTER_TYPE_F;
> > > -            switch (tcs_key->tes_primitive_mode) {
> > > -            case GL_QUADS: {
> > > -               /* DWords 3-2 (reversed) */
> > > -               fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 4);
> > > -
> > > -               inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp,
> > > patch_handle);
> > > -               inst->offset = 0;
> > > -               inst->mlen = 1;
> > > -               inst->size_written = 4 * REG_SIZE;
> > > -
> > > -               /* dst.xy = tmp.wz */
> > > -               bld.MOV(dst,                 offset(tmp, bld, 3));
> > > -               bld.MOV(offset(dst, bld, 1), offset(tmp, bld, 2));
> > > -               break;
> > > -            }
> > > -            case GL_TRIANGLES:
> > > -               /* DWord 4; hardcode offset = 1 and size_written =
> > > REG_SIZE */
> > > -               inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst,
> > > patch_handle);
> > > -               inst->offset = 1;
> > > -               inst->mlen = 1;
> > > -               inst->size_written = REG_SIZE;
> > > -               break;
> > > -            case GL_ISOLINES:
> > > -               /* All channels are undefined. */
> > > -               break;
> > > -            default:
> > > -               unreachable("Bogus tessellation domain");
> > > -            }
> > > -         } else if (imm_offset == 1) {
> > > -            /* This is a read of gl_TessLevelOuter[], which lives in
> the
> > > -             * Patch URB header.  The layout depends on the domain.
> > > -             */
> > > -            dst.type = BRW_REGISTER_TYPE_F;
> > > -
> > > -            fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 4);
> > > -            inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp,
> > > patch_handle);
> > > -            inst->offset = 1;
> > > -            inst->mlen = 1;
> > > -            inst->size_written = 4 * REG_SIZE;
> > > -
> > > -            /* Reswizzle: WZYX */
> > > -            fs_reg srcs[4] = {
> > > -               offset(tmp, bld, 3),
> > > -               offset(tmp, bld, 2),
> > > -               offset(tmp, bld, 1),
> > > -               offset(tmp, bld, 0),
> > > -            };
> > > -
> > > -            unsigned num_components;
> > > -            switch (tcs_key->tes_primitive_mode) {
> > > -            case GL_QUADS:
> > > -               num_components = 4;
> > > -               break;
> > > -            case GL_TRIANGLES:
> > > -               num_components = 3;
> > > -               break;
> > > -            case GL_ISOLINES:
> > > -               /* Isolines are not reversed; swizzle .zw -> .xy */
> > > -               srcs[0] = offset(tmp, bld, 2);
> > > -               srcs[1] = offset(tmp, bld, 3);
> > > -               num_components = 2;
> > > -               break;
> > > -            default:
> > > -               unreachable("Bogus tessellation domain");
> > > -            }
> > > -            bld.LOAD_PAYLOAD(dst, srcs, num_components, 0);
> > > -         } else {
> > > +         {
> > >              if (first_component != 0) {
> > >                 unsigned read_components =
> > >                    instr->num_components + first_component;
> > > @@ -2656,55 +2585,6 @@ fs_visitor::nir_emit_tcs_intrinsic(const
> > > fs_builder &bld,
> > >
> > >        if (indirect_offset.file != BAD_FILE) {
> > >           srcs[header_regs++] = indirect_offset;
> > > -      } else if (!is_passthrough_shader) {
> > > -         if (imm_offset == 0) {
> > > -            value.type = BRW_REGISTER_TYPE_F;
> > > -
> > > -            mask &= (1 << tesslevel_inner_components(
> tcs_key->tes_primitive_mode))
> > > - 1;
> > > -
> > > -            /* This is a write to gl_TessLevelInner[], which lives in
> the
> > > -             * Patch URB header.  The layout depends on the domain.
> > > -             */
> > > -            switch (tcs_key->tes_primitive_mode) {
> > > -            case GL_QUADS:
> > > -               /* gl_TessLevelInner[].xy lives at DWords 3-2
> (reversed).
> > > -                * We use an XXYX swizzle to reverse put .xy in the .wz
> > > -                * channels, and use a .zw writemask.
> > > -                */
> > > -               mask = writemask_for_backwards_vector(mask);
> > > -               swiz = BRW_SWIZZLE4(0, 0, 1, 0);
> > > -               break;
> > > -            case GL_TRIANGLES:
> > > -               /* gl_TessLevelInner[].x lives at DWord 4, so we set
> the
> > > -                * writemask to X and bump the URB offset by 1.
> > > -                */
> > > -               imm_offset = 1;
> > > -               break;
> > > -            case GL_ISOLINES:
> > > -               /* Skip; gl_TessLevelInner[] doesn't exist for
> isolines. */
> > > -               return;
> > > -            default:
> > > -               unreachable("Bogus tessellation domain");
> > > -            }
> > > -         } else if (imm_offset == 1) {
> > > -            /* This is a write to gl_TessLevelOuter[] which lives in
> the
> > > -             * Patch URB Header at DWords 4-7.  However, it's
> reversed, so
> > > -             * instead of .xyzw we have .wzyx.
> > > -             */
> > > -            value.type = BRW_REGISTER_TYPE_F;
> > > -
> > > -            mask &= (1 << tesslevel_outer_components(
> tcs_key->tes_primitive_mode))
> > > - 1;
> > > -
> > > -            if (tcs_key->tes_primitive_mode == GL_ISOLINES) {
> > > -               /* Isolines .xy should be stored in .zw, in order. */
> > > -               swiz = BRW_SWIZZLE4(0, 0, 0, 1);
> > > -               mask <<= 2;
> > > -            } else {
> > > -               /* Other domains are reversed; store .wzyx instead of
> > > .xyzw */
> > > -               swiz = BRW_SWIZZLE_WZYX;
> > > -               mask = writemask_for_backwards_vector(mask);
> > > -            }
> > > -         }
> > >        }
> > >
> > >        if (mask == 0)
> > > @@ -2851,48 +2731,6 @@ fs_visitor::nir_emit_tes_intrinsic(const
> > > fs_builder &bld,
> > >        }
> > >        break;
> > >
> > > -   case nir_intrinsic_load_tess_level_outer:
> > > -      /* When the TES reads gl_TessLevelOuter, we ensure that the
> patch
> > > header
> > > -       * appears as a push-model input.  So, we can simply use the
> ATTR
> > > file
> > > -       * rather than issuing URB read messages.  The data is stored
> in the
> > > -       * high DWords in reverse order - DWord 7 contains .x, DWord 6
> > > contains
> > > -       * .y, and so on.
> > > -       */
> > > -      switch (tes_prog_data->domain) {
> > > -      case BRW_TESS_DOMAIN_QUAD:
> > > -         for (unsigned i = 0; i < 4; i++)
> > > -            bld.MOV(offset(dest, bld, i), component(fs_reg(ATTR, 0),
> 7 -
> > > i));
> > > -         break;
> > > -      case BRW_TESS_DOMAIN_TRI:
> > > -         for (unsigned i = 0; i < 3; i++)
> > > -            bld.MOV(offset(dest, bld, i), component(fs_reg(ATTR, 0),
> 7 -
> > > i));
> > > -         break;
> > > -      case BRW_TESS_DOMAIN_ISOLINE:
> > > -         for (unsigned i = 0; i < 2; i++)
> > > -            bld.MOV(offset(dest, bld, i), component(fs_reg(ATTR, 0),
> 6 +
> > > i));
> > > -         break;
> > > -      }
> > > -      break;
> > > -
> > > -   case nir_intrinsic_load_tess_level_inner:
> > > -      /* When the TES reads gl_TessLevelInner, we ensure that the
> patch
> > > header
> > > -       * appears as a push-model input.  So, we can simply use the
> ATTR
> > > file
> > > -       * rather than issuing URB read messages.
> > > -       */
> > > -      switch (tes_prog_data->domain) {
> > > -      case BRW_TESS_DOMAIN_QUAD:
> > > -         bld.MOV(dest, component(fs_reg(ATTR, 0), 3));
> > > -         bld.MOV(offset(dest, bld, 1), component(fs_reg(ATTR, 0), 2));
> > > -         break;
> > > -      case BRW_TESS_DOMAIN_TRI:
> > > -         bld.MOV(dest, component(fs_reg(ATTR, 0), 4));
> > > -         break;
> > > -      case BRW_TESS_DOMAIN_ISOLINE:
> > > -         /* ignore - value is undefined */
> > > -         break;
> > > -      }
> > > -      break;
> > > -
> > >     case nir_intrinsic_load_input:
> > >     case nir_intrinsic_load_per_vertex_input: {
> > >        fs_reg indirect_offset = get_indirect_offset(instr);
> > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> > > b/src/mesa/drivers/dri/i965/brw_nir.c
> > > index 6f37e97a86f..46eeb1723b4 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > > @@ -141,9 +141,68 @@ remap_inputs_with_vue_map(nir_block *block, const
> > > struct brw_vue_map *vue_map)
> > >  }
> > >
> > >  static bool
> > > +remap_tess_levels(nir_builder *b, nir_intrinsic_instr *intr,
> > > +                  GLenum primitive_mode)
> > > +{
> > > +   const int location = nir_intrinsic_base(intr);
> > > +   const unsigned component = nir_intrinsic_component(intr);
> > > +   bool out_of_bounds;
> > > +
> > > +   if (location == VARYING_SLOT_TESS_LEVEL_INNER) {
> > > +      switch (primitive_mode) {
> > > +      case GL_QUADS:
> > > +         /* gl_TessLevelInner[0..1] lives at DWords 3-2 (reversed). */
> > > +         nir_intrinsic_set_base(intr, 0);
> > > +         nir_intrinsic_set_component(intr, 3 - component);
> > > +         out_of_bounds = false;
> > >
> >
> > What if component > 1?  I guess that's not really a problem but it is
> > out-of-bounds...
>
> The GLSL array is a float[2], so I guess I was assuming something would
> reject the out of bounds array access higher up in the compiler.
>
> Maybe that isn't a valid assumption though...
>
> > > +         break;
> > > +      case GL_TRIANGLES:
> > > +         /* gl_TessLevelInner[0] lives at DWord 4. */
> > > +         nir_intrinsic_set_base(intr, 1);
> > > +         out_of_bounds = component > 0;
> > > +         break;
> > > +      case GL_ISOLINES:
> > > +         out_of_bounds = true;
> > > +         break;
> > > +      default:
> > > +         unreachable("Bogus tessellation domain");
> > > +      }
> > > +   } else if (location == VARYING_SLOT_TESS_LEVEL_OUTER) {
> > > +      if (primitive_mode == GL_ISOLINES) {
> > > +         /* gl_TessLevelOuter[0..1] lives at DWords 6-7 (in order). */
> > > +         nir_intrinsic_set_base(intr, 1);
> > > +         nir_intrinsic_set_component(intr, 2 +
> > > nir_intrinsic_component(intr));
> > > +         out_of_bounds = component > 1;
> > > +      } else {
> > > +         /* Triangles use DWords 7-5 (reversed); Quads use 7-4
> (reversed)
> > > */
> > > +         nir_intrinsic_set_base(intr, 1);
> > > +         nir_intrinsic_set_component(intr, 3 -
> > > nir_intrinsic_component(intr));
> > > +         out_of_bounds = component == 3 && primitive_mode ==
> GL_TRIANGLES;
> > >
> > +      }
> > > +   } else {
> > > +      return false;
> > > +   }
> > > +
> > > +   if (out_of_bounds) {
> > > +      if (nir_intrinsic_infos[intr->intrinsic].has_dest) {
> > > +         b->cursor = nir_before_instr(&intr->instr);
> > > +         nir_ssa_def *undef = nir_ssa_undef(b, 1, 32);
> > > +         nir_ssa_def_rewrite_uses(&intr->dest.ssa,
> > > nir_src_for_ssa(undef));
> > > +      }
> > > +      nir_instr_remove(&intr->instr);
> > > +   }
> > > +
> > > +   return true;
> > > +}
> > > +
> > > +static bool
> > >  remap_patch_urb_offsets(nir_block *block, nir_builder *b,
> > > -                        const struct brw_vue_map *vue_map)
> > > +                        const struct brw_vue_map *vue_map,
> > > +                        GLenum tes_primitive_mode)
> > >  {
> > > +   const bool is_passthrough_tcs = b->shader->info->name &&
> > > +      strcmp(b->shader->info->name, "passthrough") == 0;
> > >
> >
> > This is gross...
> >
> > Also... Why?  What's so special about the passthrough that it doesn't
> need
> > tess level remaps?  I have a feeling there's some more general thing we
> > could be doing here.
>
> We've always been special casing it, so I'm not sure it's more gross
> than it was before...*shrug*.
>
> All TCS must write the patch header, which is two vec4 slots:
>
>    - gl_TessLevelOuter[4]
>    - gl_TessLevelInner[2]
>    - TR DS Cache Disable bit (DWord 0)
>
> For the passthrough shader, the inner/outer levels come from API state.
> So, I hooked up two built-in vec4 uniforms, and made their contents
> match the patch URB header exactly.  So, the shader doesn't have to
> do any remapping or shuffling...it just straight copies both vec4s.
> Simpler and more efficient.  But...a special case.
>
> > > +
> > >     nir_foreach_instr_safe(instr, block) {
> > >        if (instr->type != nir_instr_type_intrinsic)
> > >           continue;
> > > @@ -154,6 +213,11 @@ remap_patch_urb_offsets(nir_block *block,
> > > nir_builder *b,
> > >
> > >        if ((stage == MESA_SHADER_TESS_CTRL && is_output(intrin)) ||
> > >            (stage == MESA_SHADER_TESS_EVAL && is_input(intrin))) {
> > > +
> > > +         if (!is_passthrough_tcs &&
> > > +             remap_tess_levels(b, intrin, tes_primitive_mode))
> > > +            continue;
> > >
> >
> > Let's make sure I've got this right...
> >
> > We map everything from the varying identifiers to VUE slots.  For the
> case
> > of tesslevel, they will be assigned VUE slots 0 and 1 so the code below
> > will hever cause any other output to alias.  Then we rework stuff in
> > remap_tess_levels so that they map to the right locaion in 8-dword chunk
> at
> > the begining of the VUE.
>
> Yep, exactly.  I assigned VARYING_SLOT_TESS_LEVEL_INNER/OUTER to slots 0
> and 1, since both together live in those slots.  (Which is which doesn't
> actually matter.)  Here we special case those and take care of mapping
> the individual components to the right spot.
>
> > This function (remap_patch_urb_offsets) could really use a comment at the
> > top saying what it's doing.
>
> I suppose so.  It makes sense to me, but I wrote it...
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170104/49325cd7/attachment-0001.html>


More information about the mesa-dev mailing list