[Mesa-dev] [PATCH] nir: Fix output swizzle in get_mul_for_src

Jason Ekstrand jason at jlekstrand.net
Wed May 27 07:28:09 PDT 2015


Upon further consideration and actually seeing the patch, I think
using num_components would be better.  Calling it writemask isn't
really accurate since it isn't the writemask of the mul.  I thought
about calling it read_mask, but that isn't accurate either because it
isn't the set of mul components that get read.  It's only a
write/read mask when combined with the swizzle.  I guess we could call
it swizzle_mask, but that just seems strange.  Using the number of
components nicely side-steps the whole problem.  Since this pass
fundamentally requires SSA, I don't think that's a problem.
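
To make the suggestion concrete, here is a rough sketch of the swizzle
composition step driven by a component count instead of a mask.  It is
not the actual patch: compose_swizzle and its parameter names are
hypothetical, and it assumes that in SSA form the consumer reads
channels 0 through num_components - 1.  It also composes through a
temporary copy so that no entry is read after it has been overwritten;
that is a choice made for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of Mesa.
 *
 * "swizzle" holds the mapping accumulated so far (channel of the
 * intermediate value -> channel of the mul).  "src_swizzle" is the
 * swizzle the consumer applies when reading the intermediate value.
 * Only the first num_components entries are updated; the rest are
 * left untouched.
 */
static void
compose_swizzle(uint8_t swizzle[4], const uint8_t src_swizzle[4],
                unsigned num_components)
{
   uint8_t tmp[4];

   assert(num_components <= 4);

   /* Work from a copy so that earlier writes cannot affect later reads. */
   memcpy(tmp, swizzle, sizeof(tmp));

   for (unsigned i = 0; i < num_components; i++) {
      assert(src_swizzle[i] < 4);
      swizzle[i] = tmp[src_swizzle[i]];
   }
}
```

In this sketch the caller would pass the number of components of the
add's destination, so no mask from either instruction is involved.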
--Jason

On Wed, May 27, 2015 at 1:10 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
> When we compute the output swizzle we want to consider the writemask of the
> add operation, not the one from the multiplication.
> ---
>  src/glsl/nir/nir_opt_peephole_ffma.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c b/src/glsl/nir/nir_opt_peephole_ffma.c
> index b430eac..c895c22 100644
> --- a/src/glsl/nir/nir_opt_peephole_ffma.c
> +++ b/src/glsl/nir/nir_opt_peephole_ffma.c
> @@ -73,7 +73,8 @@ are_all_uses_fadd(nir_ssa_def *def)
>  }
>
>  static nir_alu_instr *
> -get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
> +get_mul_for_src(nir_alu_src *src, int writemask,
> +                uint8_t swizzle[4], bool *negate, bool *abs)
>  {
>     assert(src->src.is_ssa && !src->abs && !src->negate);
>
> @@ -85,16 +86,16 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
>     switch (alu->op) {
>     case nir_op_imov:
>     case nir_op_fmov:
> -      alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs);
> +      alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs);
>        break;
>
>     case nir_op_fneg:
> -      alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs);
> +      alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs);
>        *negate = !*negate;
>        break;
>
>     case nir_op_fabs:
> -      alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs);
> +      alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs);
>        *negate = false;
>        *abs = true;
>        break;
> @@ -116,7 +117,7 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
>        return NULL;
>
>     for (unsigned i = 0; i < 4; i++) {
> -      if (!(alu->dest.write_mask & (1 << i)))
> +      if (!(writemask & (1 << i)))
>           break;
>
>        swizzle[i] = swizzle[src->swizzle[i]];
> @@ -160,7 +161,8 @@ nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>           negate = false;
>           abs = false;
>
> -         mul = get_mul_for_src(&add->src[add_mul_src], swizzle, &negate, &abs);
> +         mul = get_mul_for_src(&add->src[add_mul_src], add->dest.write_mask,
> +                               swizzle, &negate, &abs);
>
>           if (mul != NULL)
>              break;
> --
> 1.9.1
>

