[Mesa-dev] [PATCH] i965/tiled_memcpy: realign rgba8_copy_aligned_dst stack in 32-bit builds

Chris Wilson chris at chris-wilson.co.uk
Tue Mar 20 20:50:39 UTC 2018


Quoting Scott D Phillips (2018-03-20 20:39:25)
> When building intel_tiled_memcpy for i686, the stack will only be
> 4-byte aligned. This isn't sufficient for SSE temporaries which
> require 16-byte alignment.  Use the force_align_arg_pointer
> function attribute in that case to ensure sufficient alignment.
> ---
>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> index 69306828d72..bd8bafbd2d7 100644
> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> @@ -42,6 +42,12 @@
>  #include <emmintrin.h>
>  #endif
>  
> +#if defined(__GNUC__) && defined(__i386__) && (defined(__SSSE3__) || defined(__SSE2__))
> +#define REALIGN __attribute__((force_align_arg_pointer))
> +#else
> +#define REALIGN
> +#endif

It would be a harmless no-op on x86-64 (or essential ;)

> +
>  #define FILE_DEBUG_FLAG DEBUG_TEXTURE
>  
>  #define ALIGN_DOWN(a, b) ROUND_DOWN_TO(a, b)
> @@ -156,7 +162,7 @@ rgba8_copy_16_aligned_src(void *dst, const void *src)
>  /**
>   * Copy RGBA to BGRA - swap R and B, with the destination 16-byte aligned.
>   */
> -static inline void *
> +static REALIGN inline void *
>  rgba8_copy_aligned_dst(void *dst, const void *src, size_t bytes)
>  {
>     assert(bytes == 0 || !(((uintptr_t)dst) & 0xf));

Hmm, if aligned_dst is spilling to stack, so would be aligned_src. As
these are supposed to be inlined (and constant folded), do you not want
to realign the callers instead? Perhaps with an explicit FLATTEN.
-Chris


More information about the mesa-dev mailing list