[Mesa-dev] [PATCH] nouveau: codegen: Take src swizzle into account on loads
imirkin at alum.mit.edu
Fri Apr 8 15:45:23 UTC 2016
On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> When dealing with non-vector variables, the LLVM register allocator
> will use TEMP.x then TEMP.y, etc.
> When loading something from a global buffer it will calculate the
> address to use and store that in, say, TEMP.x, so it ends up emitting:
> LOAD TEMP.y, MEMORY, TEMP
> Expecting the contents of TEMP.y to become the 32 bits of data
> to which TEMP.x is pointing. But instead it will get the 32 bits of
> data at address (TEMP.x + 4).
> With the old RES code one could generate the following TGSI:
> LOAD TEMP.y, RES.xxxx, TEMP
> And things would work fine since the .xxxx swizzling postfix would
> be honored and when storing to y (the only component set in the dest-mask)
> the x component at address (TEMP.x) would be loaded, rather than the
> y component at (TEMP.y)
> Note that another approach would be to not increment the address by
> a 32 bit word for skipped (not set in destmask) components.
> The way I see it either:
> 1) We see that LOAD does not deal with vectors, but with flat memory,
> in which case skipping 4 bytes because x is not set in the destmask
> does not make sense, as that is a vector thing to do.
> 2) LOAD is vector layout aware in which case supporting swizzling
> makes sense.
> Currently we have a weird hybrid which is rather cumbersome to
> work with from a compiler pov.
And I guess LLVM never ends up generating any of the other "funny"
instructions like LIT and such. Well, I have no problem adding the
swizzling logic, i.e. the way that LOAD will now work (logically) is
that it will
(a) fetch 4 values from the coordinates provided (4 sequential dwords
from src1.x in the case of buffer/memory, RGBA colors from src1.xyz in
the case of images)
(b) swizzle them according to the swizzle on the MEMORY/BUFFER/IMAGE argument
(c) store that swizzled result into the destination based on the writemask
That would sound reasonable to me, and if I understand correctly, is
option 2 of your proposal. We'd need some docs updates and buy-in from
the other gallium driver developers.
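
To make steps (a)-(c) concrete, here is a rough Python model of the
logical behaviour for the buffer/memory case. It is only a sketch: the
names load, COMPONENTS, memory, address, swizzle and writemask are made
up for illustration and are not Mesa or nouveau codegen API, and a real
implementation would presumably fetch only the dwords it actually needs.

```python
import struct

# Hypothetical model of the proposed LOAD semantics on a flat buffer.
# None of these names come from Mesa; they exist only for illustration.
COMPONENTS = "xyzw"

def load(memory, address, swizzle="xyzw", writemask="xyzw", dest=None):
    """Return the destination vector after a logical LOAD."""
    dest = list(dest) if dest is not None else [0, 0, 0, 0]
    # (a) fetch 4 sequential little-endian dwords starting at src1.x
    fetched = struct.unpack_from("<4I", memory, address)
    # (b) apply the swizzle given on the MEMORY/BUFFER argument
    swizzled = [fetched[COMPONENTS.index(c)] for c in swizzle]
    # (c) write only the components enabled in the destination writemask
    for i, c in enumerate(COMPONENTS):
        if c in writemask:
            dest[i] = swizzled[i]
    return dest

# Eight dwords holding the values 10..17.
memory = struct.pack("<8I", *range(10, 18))

# LOAD TEMP.y, MEMORY, TEMP: without a swizzle, y receives the dword at address + 4.
print(load(memory, 0, swizzle="xyzw", writemask="y"))
# LOAD TEMP.y, MEMORY.xxxx, TEMP: y receives the dword at the address itself.
print(load(memory, 0, swizzle="xxxx", writemask="y"))
```

With the identity swizzle this reproduces the behaviour described in the
quoted message (TEMP.y gets the data at address + 4), while .xxxx gives
the dword at the address itself, as in the old RES case.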
STORE remains unchanged, as the MEMORY/etc is in the destination,
where there is a writemask, which is presently used and will remain