[Mesa-dev] [PATCH] nouveau: codegen: combineLd/St do not combine indirect loads
Samuel Pitoiset
samuel.pitoiset at gmail.com
Thu Apr 21 15:09:59 UTC 2016
On 04/21/2016 04:46 PM, Hans de Goede wrote:
> Hi,
>
> On 21-04-16 16:28, Ilia Mirkin wrote:
>> On Thu, Apr 21, 2016 at 9:55 AM, Hans de Goede <hdegoede at redhat.com>
>> wrote:
>>> combineLd/St would combine, i.e. :
>>>
>>> st u32 # g[$r2+0x0] $r2
>>> st u32 # g[$r2+0x4] $r3
>>>
>>> into:
>>>
>>> st u64 # g[$r2+0x0] $r2d
>>>
>>> But this is only valid if r2 contains an 8 byte aligned address,
>>> which is unknown.
>>>
>>> This commit checks for src0 dim 0 not being indirect when combining
>>> loads / stores as combining indirect loads / stores may break alignment
>>> rules.
>>
>> I believe the assumption is that all indirect addresses are 16-byte
>> aligned. This works out for GL, I think. Although hm... I wonder what
>> happens if you have a
>>
>> layout (std430) buffer foo {
>> int x[16];
>> }
>>
>> And you access x[i], x[i+1], and i == 1. I think we end up doing a ton
>> of size-based validation which might avoid the problem.
>>
>> My concern is that now constbufs will get the same treatment, and for
>> constbufs the alignment is always 16 :(
>>
>> What do you think? Just drop those, or add extra conditionals to allow
>> it for constbufs?
>
> I'm not sure we've the alignment guarantee for constbufs, IIRC we lower
> const buf accesses to be indirect because we want to provide more then 8
> UBO-s,
> right ? So we read the offset from NVC0_CB_AUX_BUF_INFO and then end up
> with e.g.:
Right. This is because the launch descriptor used for compute shaders on
kepler only allows to set up 8 CBs. But OpenGL requires at least 14
UBOs, so the logic is to stick UBOs' information into the driver
constant buffer.
As you can, we do this dance for all UBOs because it's simpler that
testing if an UBO has been described in the launch descriptor or not (so
if it's mapped as c1[], c2[], etc).
The lowering pass should properly handle indirect UBO accesses (I did
write a piglit test for that and looked at blob). But I'm not sure if we
can break alignment here.
Do you have a simple shader that might hit the issue?
>
> ld u64 r2d c7[r1+0x0]
>
> Where r1 contains the offset of the user-buf. But what if the user is
> somehow
> indirectly accessing the userbuf, then we will have added that indirect
> offset
> to r1, and we can no longer assume that we can safely merge the loads
> without
> breaking alignment rules.
>
> I hope I'm making sense here, I'm still a bit unsure about the details how
> this all works.
>
> Regards,
>
> Hans
--
-Samuel
More information about the mesa-dev
mailing list