[Mesa-dev] [PATCH 8/9] mesa: Add partial constant propagation pass for Mesa IR
Ian Romanick
idr at freedesktop.org
Mon Aug 15 16:17:20 PDT 2011
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 08/15/2011 01:44 PM, Eric Anholt wrote:
> On Mon, 15 Aug 2011 12:02:42 -0700, "Ian Romanick" <idr at freedesktop.org> wrote:
>> From: Ian Romanick <ian.d.romanick at intel.com>
>>
>> This cleans up some code generated by the IR-to-Mesa pass for i915.
>> In particular, some shaders involving arrays of constant matrices
>> result in really bad code.
>
> I'm curious what sort of constructs led to this being needed at this
> level but not at GLSL IR level. I suspect that some of it (SEQ temp, a,
> a handling, for example) might be things that we should be doing in
> opt_algebraic and just failing to do. Then one comment below.
So... I wrote that commit message back in February, so some of the
problems may have been fixed since then. However, this pass does reduce
the OpenGL ES2 conformance test acos_float_frag_xvary from 109
instructions to 93. Looking at the instruction diffs, it appears that a
fair amount of the constant folding opportunities derive from changes
made by earlier Mesa IR optimization passes (e.g., CMP simplification).
For example, this sequence:
28: (expression bool all_equal (var_ref arr0) (constant int (0)) )
SEQ TEMP[30].x, TEMP[23].xxxx, CONST[2].xxxx;
29: (assign (x) (var_ref if_to_cond_assign_then) (expression bool all_equal (var_ref arr0) (constant int (0)) ) )
MOV TEMP[31], TEMP[30].xxxx;
30: (assign (var_ref if_to_cond_assign_then) (x) (var_ref a) (array_ref (var_ref asinValues) (constant int (0)) ) )
CMP TEMP[32], TEMP[30].-x-x-x-x, CONST[0].xxxx, TEMP[32];
31: (expression float + (array_ref (var_ref asinValues) (constant int (1)) ) (expression float neg (var_ref a) ) )
ADD TEMP[34].x, CONST[0].yyyy, TEMP[32].-x-x-x-x;
becomes
7: SEQ TEMP[0].x, TEMP[2].xxxx, CONST[2].xxxx;
8: ADD TEMP[1].x, CONST[0].yyyy, CONST[0].-x-x-x-x;
and the constant folding eliminates the ADD.
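To make the folding step concrete, here is a minimal sketch of evaluating
an ADD whose operands are both constants, including the swizzle and
negate modifiers seen in the sequence above. This is not the code from
the patch: struct operand, fetch() and fold_add() are hypothetical names
used only for illustration, and the real pass reads operands through its
own helpers.

```c
#include <assert.h>

/* Hypothetical stand-in for a Mesa IR source operand: the four floats
 * held by the register, a per-channel swizzle (0=x, 1=y, 2=z, 3=w),
 * and a flag that negates every channel. */
struct operand {
   const float *value;
   int swizzle[4];
   int negate;
};

/* Read the operand the way the instruction would see it, applying the
 * swizzle first and the negation second. */
static void
fetch(const struct operand *src, float out[4])
{
   int i;

   for (i = 0; i < 4; i++) {
      float v;

      assert(src->swizzle[i] >= 0 && src->swizzle[i] < 4);
      v = src->value[src->swizzle[i]];
      out[i] = src->negate ? -v : v;
   }
}

/* Evaluate ADD a, b at compile time when both operands are constants. */
static void
fold_add(const struct operand *a, const struct operand *b, float result[4])
{
   float av[4];
   float bv[4];
   int i;

   fetch(a, av);
   fetch(b, bv);

   for (i = 0; i < 4; i++)
      result[i] = av[i] + bv[i];
}
```

With operands shaped like CONST[0].yyyy and CONST[0].-x-x-x-x, every
channel of the result is the constant's y minus its x, which can then be
stored as a new constant and read with a MOV.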
We also emit sequences of DP, SEQ, SLT, etc. for some GLSL IR opcodes.
For example, ir_binop_any_equal(a,b) becomes:
SNE temp, a, b;
DP4 temp, temp, temp;
SLT temp, -temp, 0.0;
Before the earlier patches in this series, the SLT would have been a SEQ.
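As a rough illustration of what the three-instruction sequence computes,
here is a scalar C emulation. It is only a sketch: the function name
any_component_differs() is made up for this example, and it assumes the
usual Mesa IR semantics where SNE and SLT write 1.0 or 0.0 per channel.
As written, the sequence yields 1.0 when at least one channel of a and b
differs and 0.0 otherwise.

```c
#include <assert.h>

/* Emulate SNE / DP4 / SLT on two 4-component vectors. */
static float
any_component_differs(const float a[4], const float b[4])
{
   float temp[4];
   float dot = 0.0f;
   int i;

   /* SNE temp, a, b;  1.0 in each channel where a and b differ. */
   for (i = 0; i < 4; i++)
      temp[i] = (a[i] != b[i]) ? 1.0f : 0.0f;

   /* DP4 temp, temp, temp;  counts the channels that differ. */
   for (i = 0; i < 4; i++)
      dot += temp[i] * temp[i];

   /* A sum of squares is never negative. */
   assert(dot >= 0.0f);

   /* SLT temp, -temp, 0.0;  1.0 when the count is non-zero. */
   return (-dot < 0.0f) ? 1.0f : 0.0f;
}
```

When a and b are both constants, each of the three steps has constant
inputs, so the whole sequence can collapse to a single constant.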
I hacked up the optimizer to dump which instructions are (or could be)
optimized. Here are the results for a full piglit run (with ES2
conform). I looked at the code that generated some of these, and it's
not clear how they could be optimized at the GLSL IR level.
ADD: 408
CMP: 260
DP2: 21
DP3: 20
DP4: 280
MAD: 4
MUL: 19
RCP: 6
SEQ: 37
SEQ (same register): 2
SGE: 3
SGE (same register): 2
SGT: 10
SGT (same register): 1
SLE (same register): 4
SLT: 177
SLT (same register): 2
SNE: 604
SNE (same register): 57
TRUNC: 2
In any case, I have another version of this patch coming.
>> diff --git a/src/mesa/program/prog_opt_constant_fold.c b/src/mesa/program/prog_opt_constant_fold.c
>> new file mode 100644
>> index 0000000..2acd4f35
>> --- /dev/null
>> +++ b/src/mesa/program/prog_opt_constant_fold.c
>
>> + case OPCODE_DP2:
>> + case OPCODE_DP3:
>> + case OPCODE_DP4:
>> + if (src_regs_are_constant(inst, 2)) {
>> + float a[4];
>> + float b[4];
>> + float result;
>> +
>> + get_value(prog, &inst->SrcReg[0], a);
>> + get_value(prog, &inst->SrcReg[1], b);
>> +
>> + result = (a[0] * b[0]) + (a[1] * b[1])
>> + + (a[2] * b[2]) + (a[3] * b[3]);
>> +
>> + inst->Opcode = OPCODE_MOV;
>> + inst->SrcReg[0] = src_reg_for_float(prog, result);
>> + memset(& inst->SrcReg[1], 0, sizeof(inst->SrcReg[1]));
>> +
>> + progress = true;
>> + }
>> + break;
>
> This seems unlikely to be correct for DP2, DP3.
I think it happens to work because the swizzles of constants for these
opcodes put 0.0 in the unused slots. That is pretty fragile, though. I
can fix that.
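One way to remove the dependence on the unused slots, sketched under the
assumption that a[] and b[] are filled in by get_value() as in the
quoted patch. The helper fold_dot() and its components parameter are
hypothetical and are not part of the patch; the caller would pass 2 for
OPCODE_DP2, 3 for OPCODE_DP3, and 4 for OPCODE_DP4.

```c
#include <assert.h>

/* Dot product over only the channels the opcode actually reads, so the
 * result does not depend on what the swizzle left in the other slots. */
static float
fold_dot(const float a[4], const float b[4], unsigned components)
{
   float result = 0.0f;
   unsigned i;

   assert(components >= 2 && components <= 4);

   for (i = 0; i < components; i++)
      result += a[i] * b[i];

   return result;
}
```

With this shape the three cases can still share one block of code, and
only the component count varies by opcode.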
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/
iEYEARECAAYFAk5JqQAACgkQX1gOwKyEAw+TZwCfVQJcPHFNQrrCJwMFm7pJa3RC
6pYAnRId3mh/6axlUvbfAbF7b6vhrDsU
=KaNc
-----END PGP SIGNATURE-----