[Mesa-dev] [PATCH 4/6] i965/fs: Add support for ir_tg4

Kenneth Graunke kenneth at whitecape.org
Tue Apr 2 22:15:32 PDT 2013


On 03/31/2013 02:10 AM, Chris Forbes wrote:
> Lowers ir_tg4 (from textureGather and textureGatherOffset builtins) to
> SHADER_OPCODE_TG4.
>
> The usual post-sampling swizzle workaround can't work for ir_tg4,
> so avoid doing that:
>
> * For R/G/B/A swizzles use the hardware channel select (lives in the
>     same dword in the header as the texel offset), and then don't do
>     anything afterward in the shader.
> * For 0/1 swizzles blast the appropriate constant over all the output
>     channels in swizzle_result().
>
> Signed-off-by: Chris Forbes <chrisf at ijw.co.nz>
> ---
>   src/mesa/drivers/dri/i965/brw_fs.h           |  1 +
>   src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 55 ++++++++++++++++++++++++++++
>   2 files changed, 56 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index d9d17a2..bc93bdf 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -250,6 +250,7 @@ public:
>      void visit(ir_function *ir);
>      void visit(ir_function_signature *ir);
>
> +   uint32_t gather_channel(ir_texture *ir, int sampler);
>      void swizzle_result(ir_texture *ir, fs_reg orig_val, int sampler);
>
>      bool can_do_source_mods(fs_inst *inst);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 8556b56..2b77883 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1119,6 +1119,14 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
>         base_mrf--;
>      }
>
> +   if (ir->op == ir_tg4 && !header_present) {
> +      /* ir_tg4 needs to place its channel select in the header,
> +       * for interaction with ARB_texture_swizzle */
> +      header_present = true;
> +      mlen++;
> +      base_mrf--;
> +   }

I'm not a fan of duplicating this block.  Why not just change the one 
above's condition to:

    if (ir->op == ir_tg4 || (ir->offset && ir->op != ir_txf)) {

Feel free to keep the comment if you like.

> +
>      if (ir->shadow_comparitor) {
>         emit(MOV(fs_reg(MRF, base_mrf + mlen), shadow_c));
>         mlen += reg_width;
> @@ -1128,6 +1136,7 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
>      switch (ir->op) {
>      case ir_tex:
>      case ir_lod:
> +   case ir_tg4:
>         break;
>      case ir_txb:
>         emit(MOV(fs_reg(MRF, base_mrf + mlen), lod));
> @@ -1242,6 +1251,7 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
>      case ir_txf_ms: inst = emit(SHADER_OPCODE_TXF_MS, dst); break;
>      case ir_txs: inst = emit(SHADER_OPCODE_TXS, dst); break;
>      case ir_lod: inst = emit(SHADER_OPCODE_LOD, dst); break;
> +   case ir_tg4: inst = emit(SHADER_OPCODE_TG4, dst); break;
>      }
>      inst->base_mrf = base_mrf;
>      inst->mlen = mlen;
> @@ -1394,6 +1404,7 @@ fs_visitor::visit(ir_texture *ir)
>      switch (ir->op) {
>      case ir_tex:
>      case ir_lod:
> +   case ir_tg4:
>         break;
>      case ir_txb:
>         ir->lod_info.bias->accept(this);
> @@ -1416,6 +1427,8 @@ fs_visitor::visit(ir_texture *ir)
>         ir->lod_info.sample_index->accept(this);
>         sample_index = this->result;
>         break;
> +   default:
> +      assert(!"Unrecognized texture opcode");
>      };
>
>      /* Writemasking doesn't eliminate channels on SIMD8 texture
> @@ -1440,6 +1453,9 @@ fs_visitor::visit(ir_texture *ir)
>      if (ir->offset != NULL && ir->op != ir_txf)
>         inst->texture_offset = brw_texture_offset(ir->offset->as_constant());
>
> +   if (ir->op == ir_tg4)
> +      inst->texture_offset |= gather_channel(ir, sampler) << 16; // M0.2:16-17

Clever.  Not a bad approach, but perhaps we should rename the field to 
something more general.  Then again, message_header_bits isn't much 
better...

>      inst->sampler = sampler;
>
>      if (ir->shadow_comparitor)
> @@ -1460,6 +1476,24 @@ fs_visitor::visit(ir_texture *ir)
>   }
>
>   /**
> + * Set up the gather channel based on the swizzle, for gather4.
> + */
> +uint32_t
> +fs_visitor::gather_channel(ir_texture *ir, int sampler)
> +{
> +   int swiz = GET_SWZ(c->key.tex.swizzles[sampler], 0 /* red */);
> +   switch (swiz) {
> +      case SWIZZLE_X: return 0;
> +      case SWIZZLE_Y: return 1;
> +      case SWIZZLE_Z: return 2;
> +      case SWIZZLE_W: return 3;
> +      default:
> +         /* zero, one swizzles */
> +         return 0;
> +   }
> +}
> +
> +/**
>    * Swizzle the result of a texture result.  This is necessary for
>    * EXT_texture_swizzle as well as DEPTH_TEXTURE_MODE for shadow comparisons.
>    */
> @@ -1468,9 +1502,30 @@ fs_visitor::swizzle_result(ir_texture *ir, fs_reg orig_val, int sampler)
>   {
>      this->result = orig_val;
>
> +   /* txs isn't actually sampling the texture */
>      if (ir->op == ir_txs)
>         return;
>
> +   /* tg4 does the channel select in hardware for 'real' swizzles, but can't
> +    * do the degenerate ZERO/ONE cases, so we do them here:
> +    *
> +    * blast all the output channels with zero or one as appropriate
> +    */

So, if texture swizzling selects the channel...then zero/one are stupid.

It might be better to put this in visit(ir_texture *), and short-circuit 
the texturing operation altogether.  Why emit a gather4 only to throw 
away the results entirely?

> +   if (ir->op == ir_tg4) {
> +      int swiz = GET_SWZ(c->key.tex.swizzles[sampler], 0);
> +      if (swiz != SWIZZLE_ZERO && swiz != SWIZZLE_ONE)
> +         return;
> +
> +      fs_reg res = fs_reg(this, glsl_type::vec4_type);
> +      this->result = res;
> +
> +      for (int i=0; i<4; i++) {
> +         emit(MOV(res, fs_reg(swiz == SWIZZLE_ZERO ? 0.0f : 1.0f)));
> +         res.reg_offset++;
> +      }
> +      return;
> +   }
> +
>      if (ir->type == glsl_type::float_type) {
>         /* Ignore DEPTH_TEXTURE_MODE swizzling. */
>         assert(ir->sampler->type->sampler_shadow);


More information about the mesa-dev mailing list