[Mesa-dev] [PATCH] nouveau: codegen: combineLd/St do not combine indirect loads
Ilia Mirkin
imirkin at alum.mit.edu
Thu Apr 21 15:15:00 UTC 2016
On Thu, Apr 21, 2016 at 10:46 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> Hi,
>
> On 21-04-16 16:28, Ilia Mirkin wrote:
>>
>> On Thu, Apr 21, 2016 at 9:55 AM, Hans de Goede <hdegoede at redhat.com>
>> wrote:
>>>
>>> combineLd/St would combine, i.e. :
>>>
>>> st u32 # g[$r2+0x0] $r2
>>> st u32 # g[$r2+0x4] $r3
>>>
>>> into:
>>>
>>> st u64 # g[$r2+0x0] $r2d
>>>
>>> But this is only valid if r2 contains an 8 byte aligned address,
>>> which is unknown.
>>>
>>> This commit checks for src0 dim 0 not being indirect when combining
>>> loads / stores as combining indirect loads / stores may break alignment
>>> rules.
>>
>>
>> I believe the assumption is that all indirect addresses are 16-byte
>> aligned. This works out for GL, I think. Although hm... I wonder what
>> happens if you have a
>>
>> layout (std430) buffer foo {
>> int x[16];
>> }
>>
>> And you access x[i], x[i+1], and i == 1. I think we end up doing a ton
>> of size-based validation which might avoid the problem.
>>
>> My concern is that now constbufs will get the same treatment, and for
>> constbufs the alignment is always 16 :(
>>
>> What do you think? Just drop those, or add extra conditionals to allow
>> it for constbufs?
>
>
> I'm not sure we've the alignment guarantee for constbufs, IIRC we lower
> const buf accesses to be indirect because we want to provide more then 8
> UBO-s,
> right ? So we read the offset from NVC0_CB_AUX_BUF_INFO and then end up with
> e.g.:
>
> ld u64 r2d c7[r1+0x0]
>
> Where r1 contains the offset of the user-buf. But what if the user is
> somehow
> indirectly accessing the userbuf, then we will have added that indirect
> offset
> to r1, and we can no longer assume that we can safely merge the loads
> without
> breaking alignment rules.
>
> I hope I'm making sense here, I'm still a bit unsure about the details how
> this all works.
Yes, that can happen, except... it can't. The addresses are 64-bit,
and this only happens on Kepler+, which limits constbuf loads to
64-bit at a time (no 128-bit loads for some unknown reason -- I've
tried to enable it, and it indeed doesn't work, even though nvdisasm
is perfectly happy with it). So ... it can't happen. But in principle
you're right :)
-ilia
More information about the mesa-dev
mailing list