[Nouveau] [PATCH] nouveau: codegen: Take src swizzle into account on loads
Hans de Goede
hdegoede at redhat.com
Fri Apr 8 15:28:52 UTC 2016
Hi,
On 08-04-16 17:02, Ilia Mirkin wrote:
> On Fri, Apr 8, 2016 at 5:27 AM, Hans de Goede <hdegoede at redhat.com> wrote:
>> Hi,
>>
>> On 07-04-16 15:58, Ilia Mirkin wrote:
>>>
>>> That's wrong.
>>
>>
>> It used to work with the old RES[] code and if one cannot specify
>> a source swizzle, then how can I do something like
>>
>> LOAD TEMP[0].y, MEMORY[0], address
>>
>> And get the data at absolute global memory address "address" into TEMP[0].y
>> ?
>>
>> This is a must-have for llvm to be able to generate working TGSI code,
>> I do not see any way around this.
>>
>> AFAIK this is exactly what src-swizzling is for. Also note that
>> this commit does not change anything if no src-swizzling is specified,
>> in that case things work exactly as before.
>>
>>> The spec for the instruction needs to be clarified...
>>>
>>> The current nouveau impl is correct - only the .x of the address
>>> should be loaded, with up to 16 bytes read into the destination.
>>
>>
>> Ah note this is not about swizzling on the address, that indeed
>> makes no sense given how the addressing works for BUFFERS / MEMORY,
>> no this is about adding a swizlling postfix to the buffer / memory
>> resource specification, for example:
>>
>> LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]
>>
>> See the swizzling is done on the resource, not on the address, so
>> the swizzling specifies swizzling of the up to 16 bytes read from
>> address, it does not influence the address handling at all.
>>
>> I now see I made an error in my commit msg, it gives the following
>> example:
>>
>> LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0].x
>>
>> This clearly is wrong, the last TEMP[0].x is not even valid TGSI,
>> the correct example would be:
>>
>> LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0]
>
> I stand by my comment of "working as intended". But that doesn't mean
> the intent can't be changed :)
>
> For memory/buffers, LOAD takes the address at TEMP[0].x and loads 16
> bytes (4 words), and sticks them into the destination's .xyzw. If you
> happen to have a writemask, then only some of those are written out.
>
> It seems that you're trying to add additional meaning to the swizzle
> on the "memory" argument. However I don't believe that such a thing is
> defined. (And definitely not used anywhere, at least not on purpose.)
>
> Why does this cause you issues with LLVM-generated TGSI?
When dealing with non vector variables the llvm register allocator
will use TEMP[0].x then TEMP[0].y, etc.
When loading something from a global buffer it will calculate the
address to use, and store that in say TEMP[0].x, so it ends up
generating:
LOAD TEMP[0].y, MEMORY[0], TEMP[0]
Expecting the contents of TEMP[0].y to become the 32 bits of data
to which TEMP[0].x is pointing. But instead it will get the 32 bits of
data at address (TEMP[0].x + 4).
With the old RES[32767] code one could generate the following TGSI:
LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0]
And things would work fine since the .xxxx swizzling postfix would
be honored and when storing to y (the only component set in the dest-mask)
the x component at address (TEMP[0].x) would be loaded, rather then the
y component at (TEMP[0].y)
Note that another approach would be to not increment the address by
a 32 bit word for skipped (not set in destmask) components.
The way I see it either:
1) We see that LOAD does not deal with vectors, but with flat memory,
in which case skipping 4 bytes because x is not set in the destmask
does not make sense, as that is a vector thing todo.
2) LOAD is vector layout aware in which case supporting swizzling
makes sense.
Currently we have a weird hybrid which is rather cumbersome to
work with from a compiler pov.
Regards,
Hans
>
> -ilia
>
More information about the Nouveau
mailing list