[Mesa-dev] [PATCH 5/6] radeonsi: use slot indexes for bindless handles

Marek Olšák maraeo at gmail.com
Mon Jul 17 17:07:50 UTC 2017


On Mon, Jul 17, 2017 at 4:35 AM, Samuel Pitoiset
<samuel.pitoiset at gmail.com> wrote:
>
>
> On 07/15/2017 02:54 AM, Marek Olšák wrote:
>>
>> On Wed, Jul 5, 2017 at 1:42 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>>>
>>> On 04.07.2017 15:05, Samuel Pitoiset wrote:
>>>>
>>>>
>>>> Using VRAM addresses as bindless handles is not a good idea because
>>>> we have to use LLVMIntToPtr, and the LLVM CSE pass can't optimize
>>>> the loads because it has no information about the pointer.
>>>>
>>>> Instead, use slot indexes like the existing descriptors.
>>>>
>>>> This improves performance with DOW3 by +7%.
>>>
>>>
>>>
>>> Wow.
>>>
>>> The thing is, burning a pair of user SGPRs for this seems a bit overkill,
>>> especially since it also hurts apps that don't use bindless at all.
>>>
>>> Do you have some examples of how LLVM fails here? Could we perhaps avoid
>>> most of the performance issues by casting 0 to an appropriate pointer
>>> type once, and then using the bindless handle as an index relative to
>>> that pointer?
>>
>>
>> The problem is inttoptr doesn't support noalias and LLVM passes assume
>> it's a generic pointer and therefore don't optimize it. radeonsi loads
>> descriptors before each use and relies on CSE to unify all equivalent
>> loads that are close to each other. Without CSE, the resulting code is
>> very bad.
>>
>> Another interesting aspect of having the bindless descriptor array in
>> user SGPRs is that we can do buffer invalidations easily by
>> reuploading the whole array. That, however, adds a lot of overhead,
>> because the array is usually huge (64 bytes * 1000 slots), so it's
>> usually worse than the current solution (partial flushes +
>> WRITE_DATA). The bindless array could be packed better though.
>> Textures need 12 dwords, images need 8 dwords, and buffers need 4
>> dwords. Right now, all slots have 16 dwords.
>>
>> Samuel, sorry I haven't had time to look at these patches yet.
>
>
> No worries, but are you fine with this solution? If yes, I will fix up patch
> 1.
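
For reference, the array sizes quoted above can be checked with a small
sketch. This is not radeonsi code: the enum, the table and the function
names are hypothetical, and the "packed" layout is only one possible
reading of "could be packed better". Only the dword counts (4, 8, 12, 16)
and the 64-byte slot size come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the dword counts come from the thread. */
enum desc_kind { DESC_BUFFER, DESC_IMAGE, DESC_TEXTURE, DESC_KIND_COUNT };

#define DWORD_BYTES       4u
#define FIXED_SLOT_DWORDS 16u /* current layout: every slot has 16 dwords */

static const uint32_t desc_dwords[DESC_KIND_COUNT] = {
    [DESC_BUFFER]  = 4,
    [DESC_IMAGE]   = 8,
    [DESC_TEXTURE] = 12,
};

/* Byte offset of a slot when the handle is a slot index (fixed 64-byte slots). */
size_t slot_byte_offset(uint32_t slot)
{
    return (size_t)slot * FIXED_SLOT_DWORDS * DWORD_BYTES;
}

/* Total size of the array with fixed-size slots. */
size_t fixed_array_bytes(size_t num_slots)
{
    return num_slots * FIXED_SLOT_DWORDS * DWORD_BYTES;
}

/* Total size if every descriptor used only the dwords it needs
 * (an assumed packing scheme, for comparison only). */
size_t packed_array_bytes(const size_t counts[DESC_KIND_COUNT])
{
    size_t total = 0;
    assert(counts != NULL);
    for (int k = 0; k < DESC_KIND_COUNT; k++)
        total += counts[k] * desc_dwords[k] * DWORD_BYTES;
    return total;
}
```

With 1000 fixed slots the array is 64000 bytes, which is what a full
reupload would have to copy. An array holding only buffers would need a
quarter of that under the assumed packing.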

Yes, I'm OK with the solution. The user SGPR usage should decrease
when we add support for 32-bit pointers (actually we'll just need one
opcode to work with 32-bit pointers: the 32-bit -> 64-bit address space
cast).
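
A rough sketch of why 32-bit pointers help: SGPRs are 32 bits wide, so a
64-bit pointer occupies a pair of them and a 32-bit pointer occupies one.
The helper names below are hypothetical, and the way the upper half of
the address is supplied (a fixed, known value) is an assumption made for
illustration; the thread does not specify it.

```c
#include <stdint.h>

#define SGPR_BITS 32u /* each SGPR holds one 32-bit value */

/* Number of user SGPRs needed to pass a pointer of the given width. */
unsigned user_sgprs_for_pointer(unsigned pointer_bits)
{
    return (pointer_bits + SGPR_BITS - 1) / SGPR_BITS;
}

/* Assumed form of the 32-bit -> 64-bit cast: combine the 32-bit pointer
 * with a known upper half to rebuild the full address. */
uint64_t extend_ptr32(uint32_t lo, uint32_t assumed_hi)
{
    return ((uint64_t)assumed_hi << 32) | lo;
}
```

Under these assumptions, each descriptor-array pointer passed in user
SGPRs drops from two registers to one.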

Marek

