[Mesa-dev] FLAG-DAY: NIR derefs

Thu Mar 15 00:44:07 UTC 2018

On Wed, Mar 14, 2018 at 6:07 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Wed, Mar 14, 2018 at 7:42 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>> On Wed, Mar 14, 2018 at 5:05 PM, Rob Clark <robdclark at gmail.com> wrote:
>>> On Wed, Mar 14, 2018 at 4:58 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>>> FWIW, the way I imagined doing this was something like:
>>>>
>>>> 1. Add nir_deref_instr and nir_deref_type_pointer. At this point, just
>>>> make everything assert if the base deref isn't a nir_deref_var. This
>>>> will be a bit of a flag-day, but also very mechanical. It'll also help
>>>> us catch cases where we don't handle new-style derefs later.
>>>> 2. Add a pass to flatten nir_deref_type_pointer into
>>>> nir_deref_type_var if possible (i.e. if there's a clear chain up to
>>>> the base variable without any phi nodes or whatever). This should
>>>> always be possible for GLSL, as well as SPIR-V unless
>>>> KHR_variable_pointers is enabled. We'll use this to avoid too much
>>>> churn in drivers, passes that haven't been updated, etc. We might also
>>>> want a pass to do the opposite, for converting passes where we don't
>>>> want to have codepaths for both forms at once.
>>>
>>> btw, does it seem reasonable to assert that deref instruction src's
>>> are *always* in SSA form?  That seems reasonable to me since they will
>>> be mostly lowered away before the driver sees them (and I think makes
>>> some of the operation on them easier), and I can't think of any way
>>> for them *not* to be SSA (since they aren't real instructions).
>>
>> I think so... as long as you don't lower locals to regs before
>> lowering everything to explicit address arithmetic. Although, with the
>> physical memory model, it's just another source like any other so I'm
>> not sure if there's a point.
>>
>
> I think w/ phys memory model, we could lower away the deref's before
> going to regs.  That *seems* like a reasonable requirement to me.
>
>>>
>>> If so, my rough thoughts are a deref instruction chain (formed by ssa
>>> links to previous deref instruction) either start w/
>>> nir_deref_instr_pointer or nir_deref_instruction_var instructions at
>>> the head of the list (to start, I guess you could ignore adding the
>>> nir_deref_instr_pointer instruction and I could add that for
>>> clover/spirv work).  Followed by N links of struct/array deref_link
>>> instructions that have two ssa src's (one that is previous deref
>>> instruction and one that is array or struct member offset)
>>
>> Why would you need a separate nir_deref_instr_pointer? Do you want to
>> put information like what type of pointer it is in there? Maybe we
>> could just make that part of every nir_deref_instr instead?
>
> well, in clc you could hypotheticaly do something like:
>
>   __global struct Foo *f = (struct Foo *)0x1234;
>
> so you don't necessarily have a var at the start of your deref chain.
>
> More realistic example is:
>
>   ptr->a.b->c.d
>
> which is really two deref chains, first starting at a var, second
> starting at an ssa ptr (which I think realistically ends up needing to
> be a fat pointer to deal w/ cl's multiple address spaces[1]), with an
> intermediate load_global or load_shared intrinsic in between.
>
> Anyways, don't want to derail the conversion to deref instructions too
> much, but I do think we need something different for "var" vs "ptr"
> (and the nice thing about deref chains is this should be easier to
> add)

My point was that you don't really need a distinction, as long as
deref instructions can accept any old pointer. In your second example,
there would be a struct deref, a load, and then a second struct deref
using the result of the load. This is similar to how it's done in
LLVM.

>
> BR,
> -R
>
> [1] kinda a different topic.. short version is I'm leaning towards a
> nir_deref_instr_pointer taking a two component vector as it's src so
> it can be lowered to an if/else chain to deal with different address
> spaces, and then let opt passes clean things up so driver ends up with
> either load/store_global or load/store_local, etc
>
>
>>
>>>
>>>> 3. Modify nir_lower_io to handle new-style derefs, especially for
>>>> shared variables (i.e. KHR_variable_pointers for anv). We might have
>>>> to modify a few other passes, too.
>>>> 4. Add the required deref lowering passes to all drivers.
>>>> 5. Rewrite glsl_to_nir and spirv_to_nir to emit the new-style derefs.
>>>> At the very least, we should be using this to implement the shared
>>>> variable bits of KHR_variable_pointers. If we add stride/offset
>>>> annotations to nir_deref_instr for UBO's and SSBO's, then we might
>>>> also be able to get rid of the vtn_deref stuff entirely (although I'm
>>>> not sure if that should be a goal right now).
>>>
>>> I think I might try to prototype something where we convert vtn over
>>> to new-style deref instructions, plus a pass to lower to old style
>>> deref chains.  It partly comes down to how quickly I can finish a
>>> couple other things, and how much I can't sleep on a long-ass flight.
>>> (I guess even if throw-away, if it gives some idea of what to do or
>>> what not to do it might be useful?)
>>>
>>> Anyways, as far as decoupling this from backend drivers, I think a
>>> nir_intr_get_var(intr, n) instruction to replace open coded
>>> intr->variables[0]->var could go a long way.  (In the new world this
>>> would follow ssa links to previous deref instruction to find the
>>> nir_deref_instruction_var.)  I'll try typing this up in a few minutes.
>>>
>>>> At this point, we can fix things up and move everything else over to
>>>> new-style derefs at our leisure. Also, it should now be pretty
>>>> straightforward to add support for shared variable pointers to radv
>>>> without lowering everything to offsets up-front, which is nice.
>>>>
>>>> Connor
>>>>
>>>>
>>>> On Wed, Mar 14, 2018 at 2:32 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>>>> All,
>>>>>
>>>>> Connor and I along with several others have been discussing for a while
>>>>> changing the way NIR dereferences work.  In particular, adding a new
>>>>> nir_deref_instr type where the first one in the chain takes a variable and
>>>>> is followed by a series of instructions which take another deref instruction
>>>>> and do an array or structure dereference on it.
>>>>>
>>>>> Much of the motivation for this is some of the upcoming SPIR-V stuff where
>>>>> we have more real pointers and deref chains don't really work anymore.  It
>>>>> will also allow for things such as CSE of common derefs which could make
>>>>> analysis easier.  This is similar to what LLVM does and it's working very
>>>>> well for them.
>>>>>
>>>>> The reason for this e-mail is that this is going to be a flag-day change.
>>>>> We've been talking about it for a while but this is going to be a major and
>>>>> fairly painful change in the short term so no one has actually done it.
>>>>> It's time we finally just suck it up and make it happen.  While we will try
>>>>> to make the change as incrementally and reviewably as possible but there is
>>>>> a real limit as to what is possible here.  My plan is to start cracking away
>>>>> at this on Monday and hopefully have something working for i965/anv by the
>>>>> end of the week or maybe some time the week after.  If anyone has something
>>>>> to say in opposition, please speak up now and not after I've spent a week
>>>>> straight frantically hacking on NIR.
>>>>>
>>>>> I would like everyone to be respectful of the fact that this will be a major
>>>>> change and very painful to rebase.  If you've got outstanding NIR, GLSL, or
>>>>> SPIR-V work that is likely to conflict with this, please try to land it
>>>>> before Monday so that we can avoid rebase conflicts.  If you have interest
>>>>> in reviewing this, please try to be responsive so that we can get it
>>>>> reviewed and landed before it becomes too painful.  I'll try to send out
>>>>> some preview patches as I go so that the data structures themselves can get
>>>>> some review before the rest of the changes have been made.
>>>>>
>>>>> I'm also asking for help from Rob, Bas, and Eric if there are changes needed
>>>>> in any of their drivers.  I suspect the impact on back-end drivers will be
>>>>> low because most of them don't use derefs directly, but it would be good of
>>>>> people were on hand to help catch bugs if nothing else.
>>>>>
>>>
>>> I'll be at linaro conf next week.. I'll be responsive, but with some
>>> latency.  But I expect I should be able to test/fix back-end issues on
>>> the freedreno/ir3 side of things, if not in real time, at least fast
>>> enough to not be a problem.
>>>
>>> (Sorry timing is slightly non-ideal, but I don't think I should be
>>> holding anything up, and I really like the deref instruction approach)
>>>
>>> BR,
>>> -R
>>>
>>>>> Thanks,
>>>>>
>>>>> --Jason Ekstrand
>>>>>