[Mesa-dev] [PATCH] llvmpipe: fix arguments order given to vec_andc

Roland Scheidegger sroland at vmware.com
Sun Jan 17 09:00:13 PST 2016


Am 17.01.2016 um 14:31 schrieb Oded Gabbay:
> This patch fixes a classic "confuse the enemy" bug.
> 
> _mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the
> arguments are opposite.
> 
> _mm_andnot_si128 performs "r = (~a) & b" while
> vec_andc performs "r = a & (~b)"
> 
> To make sure this error won't return in another place, I added a wrapper
> function, vec_andnot_si128, in u_pwr8.h, which makes the swap inside.
> 
> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
> ---
>  src/gallium/auxiliary/util/u_pwr8.h         | 6 ++++++
>  src/gallium/drivers/llvmpipe/lp_setup_tri.c | 2 +-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_pwr8.h b/src/gallium/auxiliary/util/u_pwr8.h
> index 1eca6d6..ffd9f92 100644
> --- a/src/gallium/auxiliary/util/u_pwr8.h
> +++ b/src/gallium/auxiliary/util/u_pwr8.h
> @@ -153,6 +153,12 @@ vec_mullo_epi32 (__m128i a, __m128i b)
>     return v;
>  }
>  
> +static inline __m128i
> +vec_andnot_si128 (__m128i a, __m128i b)
> +{
> +   return vec_andc (b, a);
> +}
> +
>  static inline void
>  transpose4_epi32(const __m128i * restrict a,
>                   const __m128i * restrict b,
> diff --git a/src/gallium/drivers/llvmpipe/lp_setup_tri.c b/src/gallium/drivers/llvmpipe/lp_setup_tri.c
> index aa24176..907129d 100644
> --- a/src/gallium/drivers/llvmpipe/lp_setup_tri.c
> +++ b/src/gallium/drivers/llvmpipe/lp_setup_tri.c
> @@ -556,7 +556,7 @@ do_triangle_ccw(struct lp_setup_context *setup,
>  
>        /* Calculate trivial reject values:
>         */
> -      eo = vec_sub_epi32(vec_andc(dcdy_neg_mask, dcdy),
> +      eo = vec_sub_epi32(vec_andnot_si128(dcdy_neg_mask, dcdy),
>                           vec_and(dcdx_neg_mask, dcdx));
>  
>        /* ei = _mm_sub_epi32(_mm_sub_epi32(dcdy, dcdx), eo); */
> 

Reviewed-by: Roland Scheidegger <sroland at vmware.com>

FWIW the order on the SSE intrinsic intuitively feels wrong to me, which
is why I end up doing it wrong
(http://lists.freedesktop.org/archives/mesa-dev/2016-January/105058.html) but
ah well...

Roland



More information about the mesa-dev mailing list