[Mesa-dev] [RFC] i965: alternative to memctx for cleaning up nir variants

Thu Dec 31 09:05:23 PST 2015

On Thu, Dec 31, 2015 at 10:16 AM, Rob Clark <robdclark at gmail.com> wrote:
> On Tue, Dec 29, 2015 at 10:32 AM, Rob Clark <robdclark at gmail.com> wrote:
>>>>> If you do this, you'll be back to always needing a mutable copy.  Most
>>>>> lowering and optimization passes die the moment they see a register.  You'll
>>>>> either have to go fix a bunch of stuff up to no-op properly or run
>>>>> vars_to_regs after doing your NIR lowering but before going into your
>>>>> backend IR.  This means that your "gold copy" still has variables and you
>>>>> always need to lower them to registers before you go into the backend.
>>>>
>>>> ugg.. but good point, thanks for pointing that out before I wasted
>>>> another afternoon on yet another dead-end for handling deref's..
>>>>
>>>> Ok, I guess I need to think of a better name than load/store_var2 for
>>>> the new intrinsics ;-)
>>>
>>> I don't think that "you should throw away registers and use your own
>>> thing" is what Jason wanted you to get out of that.
>>
>> perhaps..  I was considering switching to registers for arrays.
>> Although it would end up forcing an extra clone in the common case
>> where there would otherwise not be one... a bit of a tough pill to
>> swallow..
>
>
> I've been thinking through this a bit, and the whole load/store
> intrinsic for var access (vs. potentially being the src and/or dst of
> any other instruction, with registers) is pretty damn convenient for
> me..
>
> Not all instructions support indirect dst and/or src, and some support
> indirect in certain src positions, but not others.  I have similar
> constraints with const (uniform), fwiw.
>
> In addition to avoiding a lot of churn in nir->ir3 I think it would be
> easier to deal with these kind of constraints by always starting out
> with a move, and then let an ir3 backend pass collapse that into the
> instruction(s) that consumes the mov when possible, similar to what we
> do already with uniforms.
>
> So thinking of introducing load/store_global and load/store_local
> intrinsics, and lowering to them in lower_io.
>
> BR,
> -R

The thing about that is, starting out with a separate instruction is
much harder than splitting it out. Splitting out an indirect access
can be done easily, and in your case, on the fly as you convert from
NIR, whereas inlining accesses is a lot more painful, since you're
essentially back to not having the full SSA information. We want to
(eventually) solve that problem in NIR, and not in everyone's backend.
For that reason, you're likely going to be the only user of these new
intrinsics/derefs/whatever. Both Jason and I are confused as to why
you don't want to make the minor changes needed to adopt the thing
that is designed to do exactly what you want to do, rather than
rewrite core NIR for something that would save maybe fewer than 50 lines
of code in your backend (and really, that's what we're talking about).
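
To make the "splitting out is easy" point concrete, here is a minimal
sketch of doing it on the fly while emitting backend instructions. It is
a toy IR, not NIR or ir3: every type and function name below (toy_op,
toy_src, toy_emit_legalized, ...) is hypothetical, and the rule about
which source positions accept an indirect is an assumption made up for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy IR for illustration only; none of these names exist in Mesa. */
enum toy_op { TOY_MOV, TOY_ADD, TOY_MUL };

struct toy_src {
   int reg;        /* register number, or array base when indirect */
   bool indirect;  /* true if addressed through an address register */
};

struct toy_instr {
   enum toy_op op;
   int dst;
   unsigned num_srcs;
   struct toy_src src[2];
};

#define TOY_MAX_INSTRS 16

struct toy_block {
   struct toy_instr instrs[TOY_MAX_INSTRS];
   size_t count;
   int next_tmp;   /* next free temporary register */
};

/* Assumed hardware rule: MOV can read indirectly anywhere, ADD only in
 * src[0], MUL nowhere. */
static bool
toy_src_can_be_indirect(enum toy_op op, unsigned idx)
{
   switch (op) {
   case TOY_MOV: return true;
   case TOY_ADD: return idx == 0;
   default:      return false;
   }
}

static void
toy_emit(struct toy_block *b, struct toy_instr instr)
{
   assert(b->count < TOY_MAX_INSTRS);
   b->instrs[b->count++] = instr;
}

/* Emit instr, first splitting any indirect source the op cannot take in
 * that position out into a MOV to a fresh temporary. */
static void
toy_emit_legalized(struct toy_block *b, struct toy_instr instr)
{
   for (unsigned i = 0; i < instr.num_srcs; i++) {
      if (!instr.src[i].indirect ||
          toy_src_can_be_indirect(instr.op, i))
         continue;

      struct toy_instr mov = {
         .op = TOY_MOV, .dst = b->next_tmp++, .num_srcs = 1,
      };
      mov.src[0] = instr.src[i];
      toy_emit(b, mov);

      instr.src[i] = (struct toy_src){ .reg = mov.dst, .indirect = false };
   }
   toy_emit(b, instr);
}
```

The split is a purely local decision made per source at emit time, with
no extra analysis. Going the other way, folding a standalone MOV back
into its consumer, needs to know about every use of the MOV's result,
which is the information you no longer have easily in the backend.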