[Mesa-dev] [PATCH] gallivm: avx-512 changes for texel fetcher
Roland Scheidegger
sroland at vmware.com
Thu Jan 18 22:46:53 UTC 2018
Am 18.01.2018 um 20:26 schrieb Kyriazis, George:
>> On Jan 18, 2018, at 1:10 PM, Roland Scheidegger <sroland at vmware.com
>> <mailto:sroland at vmware.com>> wrote:
>>
>> Am 17.01.2018 um 23:33 schrieb George Kyriazis:
>>> The texture swizzle was not doing the right thing for avx512-style
>>> 16-wide loads.
>>>
>>> Special-case the post-load swizzle operations for avx512 so that we move
>>> the xyzw components correctly to the outputs.
>>>
>>> cc: Jose Fonseca <jfonseca at vmware.com <mailto:jfonseca at vmware.com>>
>>> ---
>>> src/gallium/auxiliary/gallivm/lp_bld_pack.c | 40
>>> +++++++++++++++++++++++++++--
>>> 1 file changed, 38 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_pack.c
>>> b/src/gallium/auxiliary/gallivm/lp_bld_pack.c
>>> index e8d4fcd..7879826 100644
>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_pack.c
>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_pack.c
>>> @@ -129,6 +129,31 @@ lp_build_const_unpack_shuffle_half(struct
>>> gallivm_state *gallivm,
>>> }
>>>
>>> /**
>>> + * Similar to lp_build_const_unpack_shuffle_half, but for AVX512
>>> + * See comment above lp_build_interleave2_half for more details.
>>> + */
>>> +static LLVMValueRef
>>> +lp_build_const_unpack_shuffle_16wide(struct gallivm_state *gallivm,
>>> + unsigned lo_hi)
>>> +{
>>> + LLVMValueRef elems[LP_MAX_VECTOR_LENGTH];
>>> + unsigned i, j;
>>> +
>>> + assert(lo_hi < 2);
>>> +
>>> + // for the following lo_hi setting, convert 0 -> f to:
>>> + // 0: 0 16 4 20 8 24 12 28 1 17 5 21 9 25 13 29
>>> + // 1: 2 18 6 22 10 26 14 30 3 19 7 23 11 27 15 31
>>> + for (i = 0; i < 16; i++) {
>>> + j = ((i&0x06)<<1) + ((i&1)<<4) + (i>>3) + (lo_hi<<1);
>>> +
>>> + elems[i] = lp_build_const_int32(gallivm, j);
>>> + }
>>> +
>>> + return LLVMConstVector(elems, 16);
>>> +}
>>> +
>>> +/**
>>> * Build shuffle vectors that match PACKxx (SSE) instructions or
>>> * VPERM (Altivec).
>>> */
>>> @@ -325,8 +350,8 @@ lp_build_interleave2(struct gallivm_state *gallivm,
>>> }
>>>
>>> /**
>>> - * Interleave vector elements but with 256 bit,
>>> - * treats it as interleave with 2 concatenated 128 bit vectors.
>>> + * Interleave vector elements but with 256 (or 512) bit,
>>> + * treats it as interleave with 2 concatenated 128 (or 256) bit vectors.
>>> *
>>> * This differs to lp_build_interleave2 as that function would do the
>>> following (for lo):
>>> * a0 b0 a1 b1 a2 b2 a3 b3, and this does not compile into an AVX
>>> unpack instruction.
>>> @@ -343,6 +368,14 @@ lp_build_interleave2(struct gallivm_state *gallivm,
>>> *
>>> * And interleave-hi would result in:
>>> * a2 b2 a3 b3 a6 b6 a7 b7
>>> + *
>>> + * For 512 bits, the following are true:
>>> + *
>>> + * Interleave-lo would result in (capital letters denote hex indices):
>>> + * a0 b0 a1 b1 a4 b4 a5 b5 a8 b8 a9 b9 aC bC aD bD
>>> + *
>>> + * Interleave-hi would result in:
>>> + * a2 b2 a3 b3 a6 b6 a7 b7 aA bA aB bB aE bE aF bF
>>> */
>>> LLVMValueRef
>>> lp_build_interleave2_half(struct gallivm_state *gallivm,
>>> @@ -354,6 +387,9 @@ lp_build_interleave2_half(struct gallivm_state
>>> *gallivm,
>>> if (type.length * type.width == 256) {
>>> LLVMValueRef shuffle =
>>> lp_build_const_unpack_shuffle_half(gallivm, type.length, lo_hi);
>>> return LLVMBuildShuffleVector(gallivm->builder, a, b, shuffle, "");
>>> + } else if ((type.length == 16) && (type.width == 32)) {
>>> + LLVMValueRef shuffle =
>>> lp_build_const_unpack_shuffle_16wide(gallivm, lo_hi);
>>> + return LLVMBuildShuffleVector(gallivm->builder, a, b, shuffle,
>>> "");
>> This is not really "interleave_half", more like "interleave_quarter"...
>> That said, avx512 certainly follows the same rules as avx256, so 128bit
>> pieces are treated independently. So maybe this should be renamed like
>> "interleave_native" or something like that.
>> Also, I believe it is definitely a mistake to restrict this to dword
>> interleaves here. You should handle all type widths, just like the
>> 256bit case can handle all widths.
>> And I'm not sure through which paths you reach this, but I'm not sure
>> why you don't need the corresponding unpack2_native and pack2_native
>> adjustments - it should not really be a special case, avx512 should
>> generally handle things like this (if you'd want to extend the gallivm
>> code to use avx512...). For that matter, the commit log and shortlog is
>> confusing, because this isn't directly related to texture fetching.
>>
>
> Thanks Roland,
>
> This check-in is a prerequisite for some avx-512 changes I am about to
> check in to the swr driver. I’ve hit this for piglit test
> vs-texelfetch-isampler1d and similar ones that try to fetch from a
> texture from a VS. You are right, there are probably other cases where
> this gets hit, but that was the place where I’ve hit it first. If you
> could recommend some tests that hit the other width cases, it would be
> great to fix them too; since didn’t know where those cases would be hit,
> I decided to make this “surgical” change.
Ok, so it is hit from the transpose_aos path. That should work alright,
since this only ever deals with dword elements.
I guess to see it hit from other paths you'd need some more proper
support for 512bit vectors, e.g. pack/unpack_native.
>
> My follow-on changes depend on this change for correctness, and I’d like
> to get the the other changes in before the mesa 18.0 date (tomorrow).
> AVX512 support is an ongoing effort on our side, so they will
> definitely be additional check-ins along the same lines.
>
> Thoughts on checking in as-is for 18.0, with upcoming changes slowly
> creeping in to fix additional issues, and make things more general?
I can live with that, it's your driver which might break,
llvmpipe shouldn't hit it :-).
As long as you fix up the commit message (should really mention the
transpose),
Reviewed-by: Roland Scheidegger <sroland at vmware.com>
> Thanks,
>
> George
>
>
>> Roland
>>
>>
>>
>>
>>
>>> } else {
>>> return lp_build_interleave2(gallivm, type, a, b, lo_hi);
>>> }
>>>
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org <mailto:mesa-dev at lists.freedesktop.org>
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=DwMGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0&m=WSTeuSUPvmSGYKAu9XgzHNSXvuo4qh2MJ2USAEQygWQ&s=4yOW4nVoM0041Ves_N6YfrGtHDSNgprm6JdNpLmzMKI&e=>
>
More information about the mesa-dev
mailing list