[Mesa-dev] [PATCH] i915: Optimize SEQ and SNE when two operands are uniforms

Eric Anholt eric at anholt.net
Mon Aug 26 18:09:25 PDT 2013


Ian Romanick <idr at freedesktop.org> writes:

> From: Ian Romanick <ian.d.romanick at intel.com>
>
> SEQ and SNE are not native i915 instructions, so they each generate at
> least 3 instructions.  If both operands are uniforms or constants, we
> get 5 instructions like:
>
>                 U[1] = MOV CONST[1]
>                 U[0].xyz = SGE CONST[0].xxxx, U[1]
>                 U[1] = MOV CONST[1].-x-y-z-w
>                 R[0].xyz = SGE CONST[0].-x-x-x-x, U[1]
>                 R[0].xyz = MUL R[0], U[0]
>
> This code is stupid.  Instead of having the individual calls to
> i915_emit_arith generate the moves to utemps, do it in the caller.  This
> results in code like:
>
>                 U[1] = MOV CONST[1]
>                 U[0].xyz = SGE CONST[0].xxxx, U[1]
>                 R[0].xyz = SGE CONST[0].-x-x-x-x, U[1].-x-y-z-w
>                 R[0].xyz = MUL R[0], U[0]
>
> This allows fs-temp-array-mat2-index-col-wr and
> fs-temp-array-mat2-index-row-wr to fit in hardware limits (instead of
> falling back to software rasterization).
>
> NOTE: Without pending patches to the piglit tests, these tests will now
> fail.  This is an unrelated, pre-existing issue.
>
> Signed-off-by: Ian Romanick <ian.d.romanick at intel.com>
> ---
>  src/mesa/drivers/dri/i915/i915_fragprog.c | 46 +++++++++++++++++++++++--------
>  1 file changed, 34 insertions(+), 12 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i915/i915_fragprog.c b/src/mesa/drivers/dri/i915/i915_fragprog.c
> index 930c2b8..c304d22 100644
> --- a/src/mesa/drivers/dri/i915/i915_fragprog.c
> +++ b/src/mesa/drivers/dri/i915/i915_fragprog.c
> @@ -817,23 +817,34 @@ upload_program(struct i915_fragment_program *p)
>  	 flags = get_result_flags(inst);
>  	 dst = get_result_vector(p, inst);
>  
> +         src0 = src_vector(p, &inst->SrcReg[0], program);
> +         src1 = src_vector(p, &inst->SrcReg[1], program);
> +
> +         if (GET_UREG_TYPE(src0) == REG_TYPE_CONST
> +             && GET_UREG_TYPE(src1) == REG_TYPE_CONST) {
> +            unsigned tmp = i915_get_utemp(p);
> +
> +            i915_emit_arith(p, A0_MOV, tmp, A0_DEST_CHANNEL_ALL, 0,
> +                            src1, 0, 0);
> +
> +            src1 = tmp;
> +         }

I'd copy in the note from the commit message here (and below) that this
is just optimization to reduce instruction count.  Other than that, this
seems like a reasonable solution to the problem.

Of course, we could potentially do this more generally by walking up our
instruction stream in emit_arith and finding if the const has already
been moved to a temp and reusing it.  But if we were to block i915 fixes
on making i915 not suck in general, we'd never get anywhere, so:

Reviewed-by: Eric Anholt <eric at anholt.net>

(In the process of reviewing this, I noted at least 3 obvious things you
ought to do in the i915 compiler that aren't done.  Sigh.)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 835 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20130826/f10f3110/attachment-0001.pgp>


More information about the mesa-dev mailing list