[Mesa-dev] [PATCH 0/5] Volatile and invariant LDS memory ops

Nicolai Hähnle nhaehnle at gmail.com
Mon Nov 13 08:31:46 UTC 2017


On 11.11.2017 04:09, Marek Olšák wrote:
> On Fri, Nov 10, 2017 at 9:58 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>> On 10.11.2017 19:24, Connor Abbott wrote:
>>>
>>> On Fri, Nov 10, 2017 at 1:19 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>
>>>> On Fri, Nov 10, 2017 at 6:55 PM, Nicolai Hähnle <nhaehnle at gmail.com>
>>>> wrote:
>>>>>
>>>>> On 10.11.2017 18:43, Marek Olšák wrote:
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 10, 2017 at 2:09 AM, Connor Abbott <cwabbott0 at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Nov 9, 2017 at 7:17 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Nov 10, 2017 at 12:40 AM, Matt Arsenault <arsenm2 at gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Nov 10, 2017, at 07:41, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This fixes the TCS gl_ClipDistance piglit failure that was
>>>>>>>>>> uncovered
>>>>>>>>>> by a recent LLVM change. The solution is to set volatile on loads
>>>>>>>>>> and stores to enforce proper ordering.
>>>>>>>>>>
>>>>>>>>>> Please review.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Every LDS access certainly should not be volatile. This kills all
>>>>>>>>> optimizations, like formation of ds_read2_b32. What ordering issue
>>>>>>>>> are you
>>>>>>>>> having?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> It might be caused by inttoptr(NULL) that we do to declare LDS. There
>>>>>>>> is simply no ordering enforced, which is weird.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> As soon as you do inttoptr(NULL), you've generated a poison value (in
>>>>>>> LLVM legalese), so LLVM will assume that you never dereference it and
>>>>>>> optimize accordingly. I think a GEP instruction without the inbounds
>>>>>>> parameter set will get rid of the poison value, although I'm not sure
>>>>>>> about the case where the offset is known to be zero. At least, that's
>>>>>>> my reading of the langref text for the GEP instruction
>>>>>>> (https://llvm.org/docs/LangRef.html#id215). If zero is a valid address
>>>>>>> in LDS, then it sounds like LLVM needs to be fixed to disable this
>>>>>>> optimization for certain address spaces. On the other hand, if you're
>>>>>>> doing inttoptr(NULL) + offset, where "offset" is the result of a
>>>>>>> ptrtoint somewhere, you should be doing inttoptr(offset) instead, and
>>>>>>> then LLVM should never misbehave.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I don't think that using inttoptr before every load and store would be
>>>>>> better than volatile. The must be a better solution.
>>>>>
>>>>>
>>>>>
>>>>> Can't we just allocate the required LDS memory explicitly like we did
>>>>> for
>>>>> the LDS-based derivative computations?
>>>>>
>>>>> It may require shuffling around a bit how/when we calculate the required
>>>>> sizes, but it doesn't seem impossible.
>>>>
>>>>
>>>> We want to share the same declaration in TCS main and epilog parts.
>>>>
>>>> Does LLVM know that LDS declarations are pre-initialized?
>>>> Do sized LDS declarations affect SIMD-occupancy-based optimizations?
>>>> Because Mesa always declares 64kB of LDS and the real value is
>>>> determined at runtime.
>>>
>>>
>>> I don't know about the latter, but for the former, if you declare the
>>> LDS variable as having external linkage, LLVM should assume that it
>>> might be initialized beforehand -- exactly like a global non-static
>>> variable in C.
>>
>>
>> Makes sense.
>>
>> I don't think LLVM is really looking at LDS size too closely for anything,
>> since LDS is per-thread group. But it's been a while since I checked.
>>
>> So just declaring a 64/32 KB memory block and then potentially not using all
>> of it is probably fine and is probably the best short-term solution (if it
>> works).
>>
>> It's a good point though that "shuffling around the computation of the
>> required sizes" is potentially much more involved than I was thinking at
>> first. It looks like if we wanted to be perfectly honest with LLVM about
>> what's going on (and I believe we should, in the long run), we'd have to
>> teach it a notion of "per-thread LDS memory". That requires more thought.
> 
> We can't tell LLVM the size of LDS, because we don't know it - it's
> computed from 2 independent shaders (LS and HS, or ES and GS).

That's true, but purely as a short-term solution, can't we tell LLVM the 
size of LDS is 64KB anyway instead of the inttoptr(0), even if we're not 
going to use it all? What goes wrong in that approach?

Cheers,
Nicolai


> 
> Marek
> 


-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


More information about the mesa-dev mailing list