[Mesa-dev] [PATCH] i965/vec4: Allow CSE on uniform-vec4 expansion MOVs.
Kenneth Graunke
kenneth at whitecape.org
Tue Nov 11 09:48:55 PST 2014
On Friday, October 24, 2014 03:09:45 PM Matt Turner wrote:
> Three source instructions cannot directly source a packed vec4 (<0,4,1>
> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to
> both halves of a register.
>
> If these uniform values are used by multiple three-source instructions,
> we'll emit multiple expansion moves, which we cannot combine in CSE
> (because CSE emits moves itself).
>
> So emit a virtual instruction that we can CSE.
>
> Sometimes we demote a uniform to to a pull constant after emitting an
> expansion move for it. In that case, recognize in opt_algebraic that if
> the .file of the new instruction is GRF then it's just a real move that
> we can copy propagate and such.
>
> total instructions in shared programs: 5509805 -> 5499940 (-0.18%)
> instructions in affected programs: 348923 -> 339058 (-2.83%)
> ---
> src/mesa/drivers/dri/i965/brw_defines.h | 1 +
> src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++
> src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 +++++++
> src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 1 +
> src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 1 +
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +-
> 6 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
b/src/mesa/drivers/dri/i965/brw_defines.h
> index a68d607..a6807ae 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -904,6 +904,7 @@ enum opcode {
> SHADER_OPCODE_GEN7_SCRATCH_READ,
>
> VEC4_OPCODE_PACK_BYTES,
> + VEC4_OPCODE_UNPACK_UNIFORM,
>
> FS_OPCODE_DDX,
> FS_OPCODE_DDY,
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 87f2681..ea73dd7 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -457,6 +457,8 @@ brw_instruction_name(enum opcode op)
>
> case VEC4_OPCODE_PACK_BYTES:
> return "pack_bytes";
> + case VEC$_OPCODE_UNPACK_UNIFORM:
MONEY! ^^^^^^^
I'm really not crazy about this patch...it's just another MOV operation, which
can be CSE'd, but isn't considered everywhere else we handle MOV. Such as
copy propagation, for example.
I suppose this is OK, because you're only using it for "the following
instruction can't handle <whatever>" MOVs, where you're not likely to be able
to do things like copy propagation anyway.
Speaking of, you could probably use it for other workarounds too:
- resolve_ud_negate
- fix_math_operand
- emit_math1_gen6
- emit_math2_gen6
But I don't know if that would be helpful.
Although I'm not a fan, I don't have anything better to suggest, and your
statistics clearly show a benefit. It's probably fine.
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
> + return "unpack_uniform";
>
> case FS_OPCODE_DDX:
> return "ddx";
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 5029495..cb8658d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -717,6 +717,13 @@ vec4_visitor::opt_algebraic()
>
> foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
> switch (inst->opcode) {
> + case VEC4_OPCODE_UNPACK_UNIFORM:
> + if (inst->src[0].file != UNIFORM) {
> + inst->opcode = BRW_OPCODE_MOV;
> + progress = true;
> + }
> + break;
> +
> case BRW_OPCODE_ADD:
> if (inst->src[1].is_zero()) {
> inst->opcode = BRW_OPCODE_MOV;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> index 28c69ca..6ff6902 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
> @@ -69,6 +69,7 @@ is_expression(const vec4_instruction *const inst)
> case BRW_OPCODE_PLN:
> case BRW_OPCODE_MAD:
> case BRW_OPCODE_LRP:
> + case VEC4_OPCODE_UNPACK_UNIFORM:
> return true;
> case SHADER_OPCODE_RCP:
> case SHADER_OPCODE_RSQ:
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index 37c0806..6150d1d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1183,6 +1183,7 @@ vec4_generator::generate_code(const cfg_t *cfg)
> }
>
> switch (inst->opcode) {
> + case VEC4_OPCODE_UNPACK_UNIFORM:
> case BRW_OPCODE_MOV:
> brw_MOV(p, dst, src[0]);
> break;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 3b279dc..e10972b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -302,7 +302,7 @@ vec4_visitor::fix_3src_operand(src_reg src)
>
> dst_reg expanded = dst_reg(this, glsl_type::vec4_type);
> expanded.type = src.type;
> - emit(MOV(expanded, src));
> + emit(VEC4_OPCODE_UNPACK_UNIFORM, expanded, src);
> return src_reg(expanded);
> }
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20141111/7fd9aa2a/attachment.sig>
More information about the mesa-dev
mailing list