[Mesa-dev] [PATCH 2/2] [RFC] radv: add scratch support for spilling.

Nicolai Hähnle nhaehnle at gmail.com
Mon Oct 10 08:06:54 UTC 2016


On 10.10.2016 05:45, Dave Airlie wrote:
> On 10 October 2016 at 13:25, Dave Airlie <airlied at gmail.com> wrote:
>> From: Dave Airlie <airlied at redhat.com>
>>
>> This is a bit of a hack due to how llvm currently handles
>> spilling in its shader ABI. Currently the llvm amdgpu backend
>> uses relocations to patch the shader with the address of
>> the tmpring. The driver loads the shader and patches the
>> relocations.
>>
>> However, for vulkan this doesn't work so well for a few reasons:
>> a) when we build/load the shaders we aren't constructing the
>> command stream yet, and the same shader could be used in multiple
>> command streams.
>>
>> b) multiple command execution engines for compute shaders.
>>
>> So ideally we'd fix LLVM to understand the ABI convention, possibly
>> we'd fix it so user sgpr 0,1 are used (this hack uses 10/11).
>>
>> When this patch gets the shader back from llvm, it patches
>> the relocation dword to a nop, and patches the previous mov command
>> to move from SGPR 10 and 11. This usually works, as it seems the
>> SGPR loading of the spill stuff is always at the start of shaders,
>> so the 10/11 user sgprs haven't been trashed yet. I'm not 100%
>> sure this will work all the time, but for now this should allow
>> us to pass a bunch more CTS tests and make the Sascha computeshader
>> demo work.
>
> So I found a shader that this doesn't work so well with unfortunately, so
> while I'd like this as a temporary solution I probably need to start
> digging into llvm.
>
> My current plan is to add a flag to llvm to denote ability to spill to
> the address in userdata sgpr 0/1, have llvm preload those and use them
> instead.
>
> Now the other question I have is, should I be killing two user data sgprs for
> this purpose, or should we define a better ABI, so that the first descriptor
> in the buffer pointed to by these is the scratch buffer, and other things
> could be queued after it (like push constants and dynamic descriptors)?
>
> Bas, nha? Not sure if Matt is on this list.

Getting rid of the relocations would be nice for OpenGL as well. They 
have always felt like a bit of a hack.

For OpenGL, it's important that non-monolithic shaders work. In 
practice, this means that it must be possible to pass the scratch buffer 
pointer through the prolog shader into the main part.

It seems to me the simplest way to ensure this is to add an explicit 
function argument. Something like:

define amdgpu_XX TYPE @prolog(i64 inreg %spillptr, ....) "amdgpu-spill-ptr"="0" {
    ...
    %retval = insertvalue TYPE %prev, i64 %spillptr, 0
    ret TYPE %retval
}

amdgpu-spill-ptr would be a function attribute indicating which function 
argument (by position) holds the base pointer for spilling.

I also think that loading the spill pointer indirectly makes a lot of 
sense. After all, user data is scarce and most shaders don't spill. This 
could easily fit in the function attribute scheme. Something like 
"amdgpu-spill-ptr"="1:16" to indicate that the second function argument 
is a 64-bit pointer, and the spill pointer should be loaded from there 
with a 16-byte offset.
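
To make the attribute format concrete, here is a minimal C sketch of how a 
consumer might decode the value. It is only an illustration: the struct and 
function names (spill_ptr_attr, parse_spill_ptr_attr) are hypothetical and 
exist in neither LLVM nor Mesa, and the "ARG[:OFFSET]" grammar is just one 
reading of the proposal above, with ARG taken as a zero-based argument 
position.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical decoded form of the proposed "amdgpu-spill-ptr" value. */
struct spill_ptr_attr {
    unsigned arg_index;   /* zero-based position of the function argument */
    bool     indirect;    /* true: load the spill pointer from memory */
    uint32_t byte_offset; /* offset of that load; 0 when !indirect */
};

static bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/*
 * Parse "ARG" (argument holds the spill pointer itself) or "ARG:OFFSET"
 * (argument is a 64-bit pointer; the spill pointer is loaded from it at
 * OFFSET bytes). Returns false on malformed input.
 */
bool parse_spill_ptr_attr(const char *value, struct spill_ptr_attr *out)
{
    char *end;
    unsigned long arg, off;

    assert(value != NULL && out != NULL);

    /* strtoul() accepts leading whitespace and signs; reject those. */
    if (!is_digit(*value))
        return false;
    errno = 0;
    arg = strtoul(value, &end, 10);
    if (errno != 0 || arg > UINT_MAX)
        return false;

    out->arg_index = (unsigned)arg;
    out->indirect = false;
    out->byte_offset = 0;

    if (*end == '\0')
        return true;          /* direct form, e.g. "0" */
    if (*end != ':')
        return false;

    value = end + 1;
    if (!is_digit(*value))
        return false;
    errno = 0;
    off = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0' || off > UINT32_MAX)
        return false;

    out->indirect = true;     /* indirect form, e.g. "1:16" */
    out->byte_offset = (uint32_t)off;
    return true;
}
```

With this sketch, "0" decodes to a direct pointer in the first argument, 
and "1:16" to an indirect load through the second argument at byte offset 
16.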

FWIW, even with the indirect load, I think we should only load a 64-bit 
pointer instead of a full descriptor. This is because we should really 
set the size limit of the spill descriptor correctly to implement 
robustness semantics (this is relevant because the spill descriptor is 
also used for temporary arrays; not sure if Vulkan cares, but OpenGL 
definitely does).

Cheers,
Nicolai

