[Mesa-dev] [PATCH 0/5] Volatile and invariant LDS memory ops

Fri Nov 10 18:19:01 UTC 2017

On Fri, Nov 10, 2017 at 6:55 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 10.11.2017 18:43, Marek Olšák wrote:
>>
>> On Fri, Nov 10, 2017 at 2:09 AM, Connor Abbott <cwabbott0 at gmail.com>
>> wrote:
>>>
>>> On Thu, Nov 9, 2017 at 7:17 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>>
>>>> On Fri, Nov 10, 2017 at 12:40 AM, Matt Arsenault <arsenm2 at gmail.com>
>>>> wrote:
>>>>>
>>>>>
>>>>>> On Nov 10, 2017, at 07:41, Marek Olšák <maraeo at gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This fixes the TCS gl_ClipDistance piglit failure that was uncovered
>>>>>> by a recent LLVM change. The solution is to set volatile on loads
>>>>>> and stores to enforce proper ordering.
>>>>>>
>>>>>> Please review.
>>>>>>
>>>>>
>>>>>
>>>>> Every LDS access certainly should not be volatile. This kills all
>>>>> optimizations, like formation of ds_read2_b32. What ordering issue are you
>>>>> having?
>>>>
>>>>
>>>> It might be caused by inttoptr(NULL) that we do to declare LDS. There
>>>> is simply no ordering enforced, which is weird.
>>>
>>>
>>> As soon as you do inttoptr(NULL), you've generated a poison value (in
>>> LLVM legalese), so LLVM will assume that you never dereference it and
>>> optimize accordingly. I think a GEP instruction without the inbounds
>>> parameter set will get rid of the poison value, although I'm not sure
>>> about the case where the offset is known to be zero. At least, that's
>>> my reading of the langref text for the GEP instruction
>>> (https://llvm.org/docs/LangRef.html#id215). If zero is a valid address
>>> in LDS, then it sounds like LLVM needs to be fixed to disable this
>>> optimization for certain address spaces. On the other hand, if you're
>>> doing inttoptr(NULL) + offset, where "offset" is the result of a
>>> ptrtoint somewhere, you should be doing inttoptr(offset) instead, and
>>> then LLVM should never misbehave.
>>
>>
>> I don't think that using inttoptr before every load and store would be
>> better than volatile. The must be a better solution.
>
>
> Can't we just allocate the required LDS memory explicitly like we did for
> the LDS-based derivative computations?
>
> It may require shuffling around a bit how/when we calculate the required
> sizes, but it doesn't seem impossible.

We want to share the same declaration in TCS main and epilog parts.

Does LLVM know that LDS declarations are pre-initialized?
Do sized LDS declarations affect SIMD-occupancy-based optimizations?
Because Mesa always declares 64kB of LDS and the real value is
determined at runtime.

Marek