[Mesa-dev] [PATCH] i915: Optimize SEQ and SNE when two operands are uniforms
Eric Anholt
eric at anholt.net
Mon Aug 26 18:09:25 PDT 2013
Ian Romanick <idr at freedesktop.org> writes:
> From: Ian Romanick <ian.d.romanick at intel.com>
>
> SEQ and SNE are not native i915 instructions, so they each generate at
> least 3 instructions. If both operands are uniforms or constants, we
> get 5 instructions like:
>
> U[1] = MOV CONST[1]
> U[0].xyz = SGE CONST[0].xxxx, U[1]
> U[1] = MOV CONST[1].-x-y-z-w
> R[0].xyz = SGE CONST[0].-x-x-x-x, U[1]
> R[0].xyz = MUL R[0], U[0]
>
> This code is stupid. Instead of having the individual calls to
> i915_emit_arith generate the moves to utemps, do it in the caller. This
> results in code like:
>
> U[1] = MOV CONST[1]
> U[0].xyz = SGE CONST[0].xxxx, U[1]
> R[0].xyz = SGE CONST[0].-x-x-x-x, U[1].-x-y-z-w
> R[0].xyz = MUL R[0], U[0]
>
> This allows fs-temp-array-mat2-index-col-wr and
> fs-temp-array-mat2-index-row-wr to fit in hardware limits (instead of
> falling back to software rasterization).
>
> NOTE: Without pending patches to the piglit tests, these tests will now
> fail. This is an unrelated, pre-existing issue.
>
> Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
> ---
> src/mesa/drivers/dri/i915/i915_fragprog.c | 46 +++++++++++++++++++++++--------
> 1 file changed, 34 insertions(+), 12 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i915/i915_fragprog.c b/src/mesa/drivers/dri/i915/i915_fragprog.c
> index 930c2b8..c304d22 100644
> --- a/src/mesa/drivers/dri/i915/i915_fragprog.c
> +++ b/src/mesa/drivers/dri/i915/i915_fragprog.c
> @@ -817,23 +817,34 @@ upload_program(struct i915_fragment_program *p)
> flags = get_result_flags(inst);
> dst = get_result_vector(p, inst);
>
> + src0 = src_vector(p, &inst->SrcReg[0], program);
> + src1 = src_vector(p, &inst->SrcReg[1], program);
> +
> + if (GET_UREG_TYPE(src0) == REG_TYPE_CONST
> + && GET_UREG_TYPE(src1) == REG_TYPE_CONST) {
> + unsigned tmp = i915_get_utemp(p);
> +
> + i915_emit_arith(p, A0_MOV, tmp, A0_DEST_CHANNEL_ALL, 0,
> + src1, 0, 0);
> +
> + src1 = tmp;
> + }
I'd copy in the note from the commit message here (and below) that this
is just optimization to reduce instruction count. Other than that, this
seems like a reasonable solution to the problem.
Of course, we could potentially do this more generally by walking up our
instruction stream in emit_arith and finding if the const has already
been moved to a temp and reusing it. But if we were to block i915 fixes
on making i915 not suck in general, we'd never get anywhere, so:
Reviewed-by: Eric Anholt <eric at anholt.net>
(In the process of reviewing this, I noted at least 3 obvious things you
ought to do in the i915 compiler that aren't done. Sigh.)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130826/f10f3110/attachment-0001.pgp>
More information about the mesa-dev
mailing list