[Mesa-dev] [PATCH] i965/vec4: Allow CSE on uniform-vec4 expansion MOVs.

Kenneth Graunke kenneth at whitecape.org
Tue Nov 11 09:48:55 PST 2014


On Friday, October 24, 2014 03:09:45 PM Matt Turner wrote:
> Three source instructions cannot directly source a packed vec4 (<0,4,1>
> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to
> both halves of a register.
> 
> If these uniform values are used by multiple three-source instructions,
> we'll emit multiple expansion moves, which we cannot combine in CSE
> (because CSE emits moves itself).
> 
> So emit a virtual instruction that we can CSE.
> 
> Sometimes we demote a uniform to to a pull constant after emitting an
> expansion move for it. In that case, recognize in opt_algebraic that if
> the .file of the new instruction is GRF then it's just a real move that
> we can copy propagate and such.
> 
> total instructions in shared programs: 5509805 -> 5499940 (-0.18%)
> instructions in affected programs:     348923 -> 339058 (-2.83%)
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h          | 1 +
>  src/mesa/drivers/dri/i965/brw_shader.cpp         | 2 ++
>  src/mesa/drivers/dri/i965/brw_vec4.cpp           | 7 +++++++
>  src/mesa/drivers/dri/i965/brw_vec4_cse.cpp       | 1 +
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 1 +
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 2 +-
>  6 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
> index a68d607..a6807ae 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -904,6 +904,7 @@ enum opcode {
>     SHADER_OPCODE_GEN7_SCRATCH_READ,
>  
>     VEC4_OPCODE_PACK_BYTES,
> +   VEC4_OPCODE_UNPACK_UNIFORM,
>  
>     FS_OPCODE_DDX,
>     FS_OPCODE_DDY,
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 87f2681..ea73dd7 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -457,6 +457,8 @@ brw_instruction_name(enum opcode op)
>  
>     case VEC4_OPCODE_PACK_BYTES:
>        return "pack_bytes";
> +   case VEC$_OPCODE_UNPACK_UNIFORM:

MONEY!     ^^^^^^^

I'm really not crazy about this patch...it's just another MOV operation, which 
can be CSE'd, but isn't considered everywhere else we handle MOV.  Such as 
copy propagation, for example.

I suppose this is OK, because you're only using it for "the following 
instruction can't handle <whatever>" MOVs, where you're not likely to be able 
to do things like copy propagation anyway.

Speaking of, you could probably use it for other workarounds too:
- resolve_ud_negate
- fix_math_operand
- emit_math1_gen6
- emit_math2_gen6

But I don't know if that would be helpful.

Although I'm not a fan, I don't have anything better to suggest, and your 
statistics clearly show a benefit.  It's probably fine.

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

> +      return "unpack_uniform";
>  
>     case FS_OPCODE_DDX:
>        return "ddx";
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 5029495..cb8658d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -717,6 +717,13 @@ vec4_visitor::opt_algebraic()
>  
>     foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
>        switch (inst->opcode) {
> +      case VEC4_OPCODE_UNPACK_UNIFORM:
> +         if (inst->src[0].file != UNIFORM) {
> +            inst->opcode = BRW_OPCODE_MOV;
> +            progress = true;
> +         }
> +         break;
> +
>        case BRW_OPCODE_ADD:
>  	 if (inst->src[1].is_zero()) {
>  	    inst->opcode = BRW_OPCODE_MOV;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> index 28c69ca..6ff6902 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> @@ -69,6 +69,7 @@ is_expression(const vec4_instruction *const inst)
>     case BRW_OPCODE_PLN:
>     case BRW_OPCODE_MAD:
>     case BRW_OPCODE_LRP:
> +   case VEC4_OPCODE_UNPACK_UNIFORM:
>        return true;
>     case SHADER_OPCODE_RCP:
>     case SHADER_OPCODE_RSQ:
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index 37c0806..6150d1d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1183,6 +1183,7 @@ vec4_generator::generate_code(const cfg_t *cfg)
>        }
>  
>        switch (inst->opcode) {
> +      case VEC4_OPCODE_UNPACK_UNIFORM:
>        case BRW_OPCODE_MOV:
>           brw_MOV(p, dst, src[0]);
>           break;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 3b279dc..e10972b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -302,7 +302,7 @@ vec4_visitor::fix_3src_operand(src_reg src)
>  
>     dst_reg expanded = dst_reg(this, glsl_type::vec4_type);
>     expanded.type = src.type;
> -   emit(MOV(expanded, src));
> +   emit(VEC4_OPCODE_UNPACK_UNIFORM, expanded, src);
>     return src_reg(expanded);
>  }
>  
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20141111/7fd9aa2a/attachment.sig>


More information about the mesa-dev mailing list