[Mesa-dev] [PATCH] i965/vec4: Allow CSE on uniform-vec4 expansion MOVs.
Matt Turner
mattst88 at gmail.com
Tue Nov 11 09:56:22 PST 2014
On Tue, Nov 11, 2014 at 9:48 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Friday, October 24, 2014 03:09:45 PM Matt Turner wrote:
>> Three source instructions cannot directly source a packed vec4 (<0,4,1>
>> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to
>> both halves of a register.
>>
>> If these uniform values are used by multiple three-source instructions,
>> we'll emit multiple expansion moves, which we cannot combine in CSE
>> (because CSE emits moves itself).
>>
>> So emit a virtual instruction that we can CSE.
>>
>> Sometimes we demote a uniform to to a pull constant after emitting an
>> expansion move for it. In that case, recognize in opt_algebraic that if
>> the .file of the new instruction is GRF then it's just a real move that
>> we can copy propagate and such.
>>
>> total instructions in shared programs: 5509805 -> 5499940 (-0.18%)
>> instructions in affected programs: 348923 -> 339058 (-2.83%)
>> ---
>> src/mesa/drivers/dri/i965/brw_defines.h | 1 +
>> src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++
>> src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 +++++++
>> src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 1 +
>> src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 1 +
>> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +-
>> 6 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index a68d607..a6807ae 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -904,6 +904,7 @@ enum opcode {
>> SHADER_OPCODE_GEN7_SCRATCH_READ,
>>
>> VEC4_OPCODE_PACK_BYTES,
>> + VEC4_OPCODE_UNPACK_UNIFORM,
>>
>> FS_OPCODE_DDX,
>> FS_OPCODE_DDY,
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index 87f2681..ea73dd7 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -457,6 +457,8 @@ brw_instruction_name(enum opcode op)
>>
>> case VEC4_OPCODE_PACK_BYTES:
>> return "pack_bytes";
>> + case VEC$_OPCODE_UNPACK_UNIFORM:
>
> MONEY! ^^^^^^^
>
> I'm really not crazy about this patch...it's just another MOV operation, which
> can be CSE'd, but isn't considered everywhere else we handle MOV. Such as
> copy propagation, for example.
>
> I suppose this is OK, because you're only using it for "the following
> instruction can't handle <whatever>" MOVs, where you're not likely to be able
> to do things like copy propagation anyway.
That's exactly right. The only place it's emitted is from is
fix_3src_operand, which we know can't take these arguments directly so
copy propagation can't help us.
opt_algebraic changes the opcode to MOV if the source isn't a uniform,
like for an immediate. From there we can constant propagate it or
whatever.
> Speaking of, you could probably use it for other workarounds too:
> - resolve_ud_negate
> - fix_math_operand
> - emit_math1_gen6
> - emit_math2_gen6
>
> But I don't know if that would be helpful.
>
> Although I'm not a fan, I don't have anything better to suggest, and your
> statistics clearly show a benefit. It's probably fine.
I really don't have any idea why. Short of adding a bunch of code to
CSE to rewrite operands rather than emitting another MOV (I don't know
how feasible that is, given the complexity of register coalescing) I
really don't know of a more elegant solution.
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
Thanks!
More information about the mesa-dev
mailing list