[Mesa-dev] [PATCH] i965/vec4: Allow CSE on uniform-vec4 expansion MOVs.

Matt Turner mattst88 at gmail.com
Tue Nov 11 09:56:22 PST 2014


On Tue, Nov 11, 2014 at 9:48 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Friday, October 24, 2014 03:09:45 PM Matt Turner wrote:
>> Three source instructions cannot directly source a packed vec4 (<0,4,1>
>> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to
>> both halves of a register.
>>
>> If these uniform values are used by multiple three-source instructions,
>> we'll emit multiple expansion moves, which we cannot combine in CSE
>> (because CSE emits moves itself).
>>
>> So emit a virtual instruction that we can CSE.
>>
>> Sometimes we demote a uniform to to a pull constant after emitting an
>> expansion move for it. In that case, recognize in opt_algebraic that if
>> the .file of the new instruction is GRF then it's just a real move that
>> we can copy propagate and such.
>>
>> total instructions in shared programs: 5509805 -> 5499940 (-0.18%)
>> instructions in affected programs:     348923 -> 339058 (-2.83%)
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h          | 1 +
>>  src/mesa/drivers/dri/i965/brw_shader.cpp         | 2 ++
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp           | 7 +++++++
>>  src/mesa/drivers/dri/i965/brw_vec4_cse.cpp       | 1 +
>>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 1 +
>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 2 +-
>>  6 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index a68d607..a6807ae 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -904,6 +904,7 @@ enum opcode {
>>     SHADER_OPCODE_GEN7_SCRATCH_READ,
>>
>>     VEC4_OPCODE_PACK_BYTES,
>> +   VEC4_OPCODE_UNPACK_UNIFORM,
>>
>>     FS_OPCODE_DDX,
>>     FS_OPCODE_DDY,
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index 87f2681..ea73dd7 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -457,6 +457,8 @@ brw_instruction_name(enum opcode op)
>>
>>     case VEC4_OPCODE_PACK_BYTES:
>>        return "pack_bytes";
>> +   case VEC$_OPCODE_UNPACK_UNIFORM:
>
> MONEY!     ^^^^^^^
>
> I'm really not crazy about this patch...it's just another MOV operation, which
> can be CSE'd, but isn't considered everywhere else we handle MOV.  Such as
> copy propagation, for example.
>
> I suppose this is OK, because you're only using it for "the following
> instruction can't handle <whatever>" MOVs, where you're not likely to be able
> to do things like copy propagation anyway.

That's exactly right. The only place it's emitted is from is
fix_3src_operand, which we know can't take these arguments directly so
copy propagation can't help us.

opt_algebraic changes the opcode to MOV if the source isn't a uniform,
like for an immediate. From there we can constant propagate it or
whatever.

> Speaking of, you could probably use it for other workarounds too:
> - resolve_ud_negate
> - fix_math_operand
> - emit_math1_gen6
> - emit_math2_gen6
>
> But I don't know if that would be helpful.
>
> Although I'm not a fan, I don't have anything better to suggest, and your
> statistics clearly show a benefit.  It's probably fine.

I really don't have any idea why. Short of adding a bunch of code to
CSE to rewrite operands rather than emitting another MOV (I don't know
how feasible that is, given the complexity of register coalescing) I
really don't know of a more elegant solution.

> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

Thanks!


More information about the mesa-dev mailing list