[Mesa-dev] FLAG-DAY: NIR derefs

Thu Mar 15 00:07:53 UTC 2018

On Wed, Mar 14, 2018 at 7:42 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> On Wed, Mar 14, 2018 at 5:05 PM, Rob Clark <robdclark at gmail.com> wrote:
>> On Wed, Mar 14, 2018 at 4:58 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>> FWIW, the way I imagined doing this was something like:
>>>
>>> 1. Add nir_deref_instr and nir_deref_type_pointer. At this point, just
>>> make everything assert if the base deref isn't a nir_deref_var. This
>>> will be a bit of a flag-day, but also very mechanical. It'll also help
>>> us catch cases where we don't handle new-style derefs later.
>>> 2. Add a pass to flatten nir_deref_type_pointer into
>>> nir_deref_type_var if possible (i.e. if there's a clear chain up to
>>> the base variable without any phi nodes or whatever). This should
>>> always be possible for GLSL, as well as SPIR-V unless
>>> KHR_variable_pointers is enabled. We'll use this to avoid too much
>>> churn in drivers, passes that haven't been updated, etc. We might also
>>> want a pass to do the opposite, for converting passes where we don't
>>> want to have codepaths for both forms at once.
>>
>> btw, does it seem reasonable to assert that deref instruction src's
>> are *always* in SSA form?  That seems reasonable to me since they will
>> be mostly lowered away before the driver sees them (and I think makes
>> some of the operation on them easier), and I can't think of any way
>> for them *not* to be SSA (since they aren't real instructions).
>
> I think so... as long as you don't lower locals to regs before
> lowering everything to explicit address arithmetic. Although, with the
> physical memory model, it's just another source like any other so I'm
> not sure if there's a point.
>

I think w/ phys memory model, we could lower away the deref's before
going to regs.  That *seems* like a reasonable requirement to me.

>>
>> If so, my rough thoughts are a deref instruction chain (formed by ssa
>> links to previous deref instruction) either start w/
>> nir_deref_instr_pointer or nir_deref_instruction_var instructions at
>> the head of the list (to start, I guess you could ignore adding the
>> nir_deref_instr_pointer instruction and I could add that for
>> clover/spirv work).  Followed by N links of struct/array deref_link
>> instructions that have two ssa src's (one that is previous deref
>> instruction and one that is array or struct member offset)
>
> Why would you need a separate nir_deref_instr_pointer? Do you want to
> put information like what type of pointer it is in there? Maybe we
> could just make that part of every nir_deref_instr instead?

well, in clc you could hypotheticaly do something like:

  __global struct Foo *f = (struct Foo *)0x1234;

so you don't necessarily have a var at the start of your deref chain.

More realistic example is:

  ptr->a.b->c.d

which is really two deref chains, first starting at a var, second
starting at an ssa ptr (which I think realistically ends up needing to
be a fat pointer to deal w/ cl's multiple address spaces[1]), with an
intermediate load_global or load_shared intrinsic in between.

Anyways, don't want to derail the conversion to deref instructions too
much, but I do think we need something different for "var" vs "ptr"
(and the nice thing about deref chains is this should be easier to
add)

BR,
-R

[1] kinda a different topic.. short version is I'm leaning towards a
nir_deref_instr_pointer taking a two component vector as it's src so
it can be lowered to an if/else chain to deal with different address
spaces, and then let opt passes clean things up so driver ends up with
either load/store_global or load/store_local, etc

>
>>
>>> 3. Modify nir_lower_io to handle new-style derefs, especially for
>>> shared variables (i.e. KHR_variable_pointers for anv). We might have
>>> to modify a few other passes, too.
>>> 4. Add the required deref lowering passes to all drivers.
>>> 5. Rewrite glsl_to_nir and spirv_to_nir to emit the new-style derefs.
>>> At the very least, we should be using this to implement the shared
>>> variable bits of KHR_variable_pointers. If we add stride/offset
>>> annotations to nir_deref_instr for UBO's and SSBO's, then we might
>>> also be able to get rid of the vtn_deref stuff entirely (although I'm
>>> not sure if that should be a goal right now).
>>
>> I think I might try to prototype something where we convert vtn over
>> to new-style deref instructions, plus a pass to lower to old style
>> deref chains.  It partly comes down to how quickly I can finish a
>> couple other things, and how much I can't sleep on a long-ass flight.
>> (I guess even if throw-away, if it gives some idea of what to do or
>> what not to do it might be useful?)
>>
>> Anyways, as far as decoupling this from backend drivers, I think a
>> nir_intr_get_var(intr, n) instruction to replace open coded
>> intr->variables[0]->var could go a long way.  (In the new world this
>> would follow ssa links to previous deref instruction to find the
>> nir_deref_instruction_var.)  I'll try typing this up in a few minutes.
>>
>>> At this point, we can fix things up and move everything else over to
>>> new-style derefs at our leisure. Also, it should now be pretty
>>> straightforward to add support for shared variable pointers to radv
>>> without lowering everything to offsets up-front, which is nice.
>>>
>>> Connor
>>>
>>>
>>> On Wed, Mar 14, 2018 at 2:32 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>>> All,
>>>>
>>>> Connor and I along with several others have been discussing for a while
>>>> changing the way NIR dereferences work.  In particular, adding a new
>>>> nir_deref_instr type where the first one in the chain takes a variable and
>>>> is followed by a series of instructions which take another deref instruction
>>>> and do an array or structure dereference on it.
>>>>
>>>> Much of the motivation for this is some of the upcoming SPIR-V stuff where
>>>> we have more real pointers and deref chains don't really work anymore.  It
>>>> will also allow for things such as CSE of common derefs which could make
>>>> analysis easier.  This is similar to what LLVM does and it's working very
>>>> well for them.
>>>>
>>>> The reason for this e-mail is that this is going to be a flag-day change.
>>>> We've been talking about it for a while but this is going to be a major and
>>>> fairly painful change in the short term so no one has actually done it.
>>>> It's time we finally just suck it up and make it happen.  While we will try
>>>> to make the change as incrementally and reviewably as possible but there is
>>>> a real limit as to what is possible here.  My plan is to start cracking away
>>>> at this on Monday and hopefully have something working for i965/anv by the
>>>> end of the week or maybe some time the week after.  If anyone has something
>>>> to say in opposition, please speak up now and not after I've spent a week
>>>> straight frantically hacking on NIR.
>>>>
>>>> I would like everyone to be respectful of the fact that this will be a major
>>>> change and very painful to rebase.  If you've got outstanding NIR, GLSL, or
>>>> SPIR-V work that is likely to conflict with this, please try to land it
>>>> before Monday so that we can avoid rebase conflicts.  If you have interest
>>>> in reviewing this, please try to be responsive so that we can get it
>>>> reviewed and landed before it becomes too painful.  I'll try to send out
>>>> some preview patches as I go so that the data structures themselves can get
>>>> some review before the rest of the changes have been made.
>>>>
>>>> I'm also asking for help from Rob, Bas, and Eric if there are changes needed
>>>> in any of their drivers.  I suspect the impact on back-end drivers will be
>>>> low because most of them don't use derefs directly, but it would be good of
>>>> people were on hand to help catch bugs if nothing else.
>>>>
>>
>> I'll be at linaro conf next week.. I'll be responsive, but with some
>> latency.  But I expect I should be able to test/fix back-end issues on
>> the freedreno/ir3 side of things, if not in real time, at least fast
>> enough to not be a problem.
>>
>> (Sorry timing is slightly non-ideal, but I don't think I should be
>> holding anything up, and I really like the deref instruction approach)
>>
>> BR,
>> -R
>>
>>>> Thanks,
>>>>
>>>> --Jason Ekstrand
>>>>