[Mesa-dev] [PATCH 3/6] gallivm: optimize gather a bit, by using supplied destination type

Roland Scheidegger sroland at vmware.com
Wed Jan 18 15:46:10 UTC 2017


On 18.01.2017 at 06:56, Dave Airlie wrote:
> On 12 December 2016 at 10:11,  <sroland at vmware.com> wrote:
>> From: Roland Scheidegger <sroland at vmware.com>
>>
>> By using a dst_type in the gather interface, gather has some more
>> knowledge about how values should be fetched.
>> E.g. if this is a 3x32bit fetch and dst_type is a 4x32bit vector, gather
>> will no longer do a ZExt of a 96bit scalar value to 128bit, but
>> just fetch the 96bit as a 3x32bit vector (this is still going to be
>> 2 loads of course, but the loads can be done directly to a simd vector
>> that way).
>> Also, we can now try to use the right int/float type. This should
>> make no difference really, since there are typically no domain transition
>> penalties for such simd loads. However, it actually does make a difference,
>> since llvm will use different shuffle lowering afterwards, so the caller
>> can use this to trick llvm into using sane shuffles afterwards (and yes,
>> llvm is really stupid there - nothing against using the shuffle
>> instruction from the correct domain, but not at the cost of doing 3 times
>> more shuffles; the case which actually matters is the refusal to use shufps
>> for integer values).
>> Also attempt to avoid things which look great on paper but which llvm
>> doesn't really handle well (e.g. fetching 3-element 8 bit and 16 bit
>> vectors, which is simply disastrous - I suspect the type legalizer is to
>> blame, trying to extend these vectors to 128bit types somehow, so these are
>> still fetched with scalars like before, which is suboptimal due to the ZExt).
>>
>> Remove the ability for truncation (no point, this is gather, not conversion)
>> as it is complex enough already.
>>
>> While here also implement not just the float, but also the 64bit avx2
>> gathers (disabled though since based on the theoretical numbers the benefit
>> just isn't there at all until Skylake at least).
> 
> Hi Roland,
> 
> This breaks the build on big endian machines.
> 
>   CC       gallivm/lp_bld_gather.lo
>   CC       gallivm/lp_bld_init.lo
> gallivm/lp_bld_gather.c: In function 'lp_build_gather_elem_vec':
> gallivm/lp_bld_gather.c:238:42: error: 'dst_elem_type' undeclared
> (first use in this function)
>                              LLVMConstInt(dst_elem_type,
>                                           ^
> gallivm/lp_bld_gather.c:238:42: note: each undeclared identifier is
> reported only once for each function it appears in
> gallivm/lp_bld_gather.c: In function 'lp_build_gather':
> 

Oh right. I thought I had actually hack-tested compilation for this, but
apparently not with the latest version...
I've pushed a trivial fix for this, though I have to say I'm not really
certain this change works correctly on big endian archs, even though I
tried to keep things the same...
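
For reference, the 3x32bit case from the quoted commit message can be
pictured in plain C roughly as follows. This is only a sketch, not the
gallivm code: vec4x32 and fetch_3x32 are hypothetical names standing in for
a 4x32bit simd register and the generated fetch, and it assumes the unused
fourth lane is simply zeroed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 4x32bit simd register. */
typedef struct {
   uint32_t lane[4];
} vec4x32;

/*
 * Fetch 96 bits (3x32bit) as two loads placed directly into vector lanes,
 * instead of building a 96bit scalar and zero-extending it to 128 bits.
 */
static vec4x32
fetch_3x32(const uint8_t *src)
{
   vec4x32 v;

   memcpy(&v.lane[0], src, 8);      /* first load: 64 bits, lanes 0 and 1 */
   memcpy(&v.lane[2], src + 8, 4);  /* second load: 32 bits, lane 2 */
   v.lane[3] = 0;                   /* assumption: unused lane is zeroed */

   return v;
}
```

Since each element is copied in the machine's native byte order, the lane
contents here do not depend on endianness; a scalar ZExt followed by a
bitcast to a vector does not have that property.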

Roland




