[Mesa-dev] [PATCH] nouveau: codegen: combineLd/St do not combine indirect loads

Samuel Pitoiset samuel.pitoiset at gmail.com
Thu Apr 21 15:09:59 UTC 2016



On 04/21/2016 04:46 PM, Hans de Goede wrote:
> Hi,
>
> On 21-04-16 16:28, Ilia Mirkin wrote:
>> On Thu, Apr 21, 2016 at 9:55 AM, Hans de Goede <hdegoede at redhat.com>
>> wrote:
>>> combineLd/St would combine, i.e. :
>>>
>>> st  u32 # g[$r2+0x0] $r2
>>> st  u32 # g[$r2+0x4] $r3
>>>
>>> into:
>>>
>>> st  u64 # g[$r2+0x0] $r2d
>>>
>>> But this is only valid if r2 contains an 8 byte aligned address,
>>> which is unknown.
>>>
>>> This commit checks for src0 dim 0 not being indirect when combining
>>> loads / stores as combining indirect loads / stores may break alignment
>>> rules.
>>
>> I believe the assumption is that all indirect addresses are 16-byte
>> aligned. This works out for GL, I think. Although hm... I wonder what
>> happens if you have a
>>
>> layout (std430) buffer foo {
>>    int x[16];
>> }
>>
>> And you access x[i], x[i+1], and i == 1. I think we end up doing a ton
>> of size-based validation which might avoid the problem.
>>
>> My concern is that now constbufs will get the same treatment, and for
>> constbufs the alignment is always 16 :(
>>
>> What do you think? Just drop those, or add extra conditionals to allow
>> it for constbufs?
>
> I'm not sure we've the alignment guarantee for constbufs, IIRC we lower
> const buf accesses to be indirect because we want to provide more then 8
> UBO-s,
> right ? So we read the offset from NVC0_CB_AUX_BUF_INFO and then end up
> with e.g.:

Right. This is because the launch descriptor used for compute shaders on 
kepler only allows to set up 8 CBs. But OpenGL requires at least 14 
UBOs, so the logic is to stick UBOs' information into the driver 
constant buffer.

As you can, we do this dance for all UBOs because it's simpler that 
testing if an UBO has been described in the launch descriptor or not (so 
if it's mapped as c1[], c2[], etc).

The lowering pass should properly handle indirect UBO accesses (I did 
write a piglit test for that and looked at blob). But I'm not sure if we 
can break alignment here.

Do you have a simple shader that might hit the issue?

>
> ld u64  r2d c7[r1+0x0]
>
> Where r1 contains the offset of the user-buf. But what if the user is
> somehow
> indirectly accessing the userbuf, then we will have added that indirect
> offset
> to r1, and we can no longer assume that we can safely merge the loads
> without
> breaking alignment rules.
>
> I hope I'm making sense here, I'm still a bit unsure about the details how
> this all works.
>
> Regards,
>
> Hans

-- 
-Samuel


More information about the mesa-dev mailing list