[Mesa-dev] [RFC] i965: alternative to memctx for cleaning up nir variants

Mon Dec 28 10:33:16 PST 2015

On Mon, Dec 28, 2015 at 1:20 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>
>
> On Mon, Dec 28, 2015 at 9:37 AM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>
>> On Mon, Dec 28, 2015 at 10:13 AM, Rob Clark <robdclark at gmail.com> wrote:
>> > On Tue, Dec 22, 2015 at 10:11 PM, Connor Abbott <cwabbott0 at gmail.com>
>> > wrote:
>> >> On Tue, Dec 22, 2015 at 9:55 PM, Rob Clark <robdclark at gmail.com> wrote:
>> >>> On Tue, Dec 22, 2015 at 9:47 PM, Connor Abbott <cwabbott0 at gmail.com>
>> >>> wrote:
>> >>>> On Tue, Dec 22, 2015 at 9:02 PM, Rob Clark <robdclark at gmail.com>
>> >>>> wrote:
>> >>>>> On Mon, Dec 21, 2015 at 1:48 PM, Jason Ekstrand
>> >>>>> <jason at jlekstrand.net> wrote:
>> >>>>>>
>> >>>>>> I think two different concepts of ownership are getting conflated
>> >>>>>> here:
>> >>>>>> Right/responsibility to delete and right to modify.
>> >>>>>>
>> >>>>>> The way I understand it, gallium, as it stands, gives neither to
>> >>>>>> the driver.
>> >>>>>> A back-end using NIR requires the right to modify but who deletes
>> >>>>>> it doesn't
>> >>>>>> ultimately matter.  I think it's dangerous to pass one of these
>> >>>>>> rights to
>> >>>>>> the driver and not the other but we need to think about both.
>> >>>>>
>> >>>>> yeah, uneasy about driver modifying the IR if the state tracker is
>> >>>>> still going to potentially spin off variants of the IR.. that sounds
>> >>>>> like madness.
>> >>>>>
>> >>>>> The refcnt'ing I proposed does deal w/ right to modify vs delete via
>> >>>>> nir_shader(_is)_mutable() which returns something that is guaranteed
>> >>>>> to be safe to modify (ie. has only a single reference)
>> >>>>>
>> >>>>>> What I'm trying to say is that we have two options here:
>> >>>>>>
>> >>>>>> 1) gallium passes IR to the back-end that it is free to modify and
>> >>>>>> is
>> >>>>>> required to delete when it's done.
>> >>>>>
>> >>>>> with refcnt'ing, s/delete/unref/
>> >>>>>
>> >>>>> The idea is, the st transfers ownership of the reference it passes
>> >>>>> to
>> >>>>> the driver.  If the st wants to hang on to a reference itself, it
>> >>>>> must
>> >>>>> increment the refcnt before passing to the driver (backend).
>> >>>>>
>> >>>>> Without refcnt'ing, I suppose we could (since we don't have to
>> >>>>> follow
>> >>>>> TGSI semantics), just decree that the driver always takes ownership
>> >>>>> of
>> >>>>> the copy passed in, and if the st wants to hang on to a copy too,
>> >>>>> then
>> >>>>> it must clone.  I suppose this would work well enough for
>> >>>>> freedreno/vc4, which both end up generating variants later.  It does
>> >>>>> force an extra clone for drivers that immediately translate into
>> >>>>> their
>> >>>>> own backend IR and don't need to keep the NIR around, for example.
>> >>>>> Maybe that is not worth caring about (since at this point it is
>> >>>>> hypothetical).
>> >>>>
>> >>>> While always cloning does have this disadvantage, I don't think it's
>> >>>> really relevant here. Even if the driver throws away the NIR
>> >>>> immediately after consuming it, almost invariably it's going to want
>> >>>> to modify  it. The generic NIR passed in by the state tracker (other
>> >>>> IR -> NIR + some optimizations) is almost never going to be the same
>> >>>> as the NIR after going through driver-specific lowering passes, which
>> >>>> means that drivers are never going to want a read-only version of the
>> >>>> IR. In light of that, I think making the driver own the IR passed in
>> >>>> seems like the most sensible thing.
>> >>>
>> >>> well, unless the driver is already doing it's own lowering in it's own
>> >>> native IR..
>> >>
>> >> Well, if you're not doing any lowering in NIR, then you aren't really
>> >> taking any advantage of it. I can't see a plausible scenario where all
>> >> the lowering is done in the driver's own IR -- and as soon as you do
>> >> anything in NIR, you need the driver-owns-IR semantics.
>> >
>> > When it comes to shader variants, I have a mix, with some things
>> > lowered in nir and others just handled in backend..
>> >
>> > The re-work / cleanup that I have had on a branch for a while now
>> > (since it is currently blocked on refcnt'ing) does a first round of
>> > variant-key independent NIR lowering/opt passes.  And then at draw
>> > time, if the variant key has anything that is lowered in nir, I do a
>> > second round.
>
>
> Just to be clear, your key-dependent lowering happens after all of your
> other lowering?  If this is the case, then I guarantee you that you're
> unique in this since i965 and vc4 need to at least run out-of-SSA
> afterwards.  To be honest, I completely forgot that a driver could use fully
> ssa NIR.

It is a mix.. I do texcoord saturate, clip-plane, and 2-sided color
lowering in NIR.  But flat-shading, binning-pass, and half vs full
precision color output in ir3.

I do as much lowering in NIR as I can, in an effort to do as much as
possible at compile time, vs draw time.  I do the first round of
lowering/opt w/ null shader key, which is enough for the common cases.

Pretty much independent, I suppose, of whether I came out of SSA or
not first.  Although binning-pass variant and the instruction
scheduling I do are easier in SSA.

Somewhat unrelated, but I may end up converting array access to
registers, but leave everything else in SSA, so I can benefit from
converting multi-dimensional offsets into a single offset..  this is
still one open issue w/ gallium glsl_to_nir.. right now I have a
hacked up version of nir_lower_io that converts global/local
load/store_var's into load/store_var2 which take an offset as a src
(like load_input/store_output) instead of deref chain.. not sure yet
whether this will be the permanent solution, but at least it fixes a
huge heap of variable-indexing piglits and lets me continue w/
implementing lowering passes for everything else that was previously
done in glsl->tgsi or tgsi->tgsi passes.

BR,
-R

>>
>> >>>
>> >>> Maybe it is too much of a hypothetical.. I still think refcnt'ing
>> >>> gives some nice flexibility to deal with various scenarios, and having
>> >>> to call nir_shader_unref() isn't so much of a burden.
>> >>
>> >> Still, I can't see how this flexibility is at all useful, and it seems
>> >> like overkill since the driver will always want a mutable version of
>> >> the IR anyways.
>> >
>> > Well, due to the structure I mentioned above, at draw time when I need
>> > to generate a variant with nothing lowered in NIR, I simply incr the
>> > refcnt on the IR which has already gone through first round of NIR
>> > passes, and pass that in to my back end compiler.  At the end, once
>> > the shader binary is generated, I can unconditionally unref the
>> > nir_shader without having to care.
>> >
>> > Without refcnt'ing I'd either have to generate a pointless clone or
>> > keep track that the nir_shader should not actually be free'd.  Not
>> > impossible, just a bit more ugly.
>>
>> Assuming you do all your variant management in your driver's IR, then
>> you don't need to do anything. If you do some variant management in
>> NIR, then in the function where you do the NIR-based lowering you can
>> check if you need to do any lowering based on the shader key, clone
>> first, and give the NIR->ir3 function the cloned IR and then free it.
>> It might be a "bit more ugly," but it's really not that much different
>> from the refcounting, and when the extra shader gets created/freed is
>> made explicit.
>>
>> >
>> > (The gallium glsl_to_nir stuff is also currently using refcnt'ing,
>> > although at least for freedreno/ir3 it isn't strictly needed.. I could
>> > just unconditionally clone in the state tracker.  That said, I'm still
>> > of the opinion that refcnt'ing could be useful to some other driver
>> > someday)
>>
>> "It could be useful to some driver someday" isn't a good argument for
>> adding stuff today. We've already had enough examples of things in NIR
>> that we added because we thought it was useful, but turned out not to
>> be.
>>
>> >
>> > BR,
>> > -R
>> >
>> >>>
>> >>> BR,
>> >>> -R
>> >>>
>> >>>>>
>> >>>>> (I guess nouveau is the one driver, that if it ever consumed NIR,
>> >>>>> would translate immediately into it's own backend IR?)
>> >>>>>
>> >>>>>> 2) gallium passes read-only IR to the back-end and it always makes
>> >>>>>> a copy.
>> >>>>>
>> >>>>> This is how it is w/ TGSI, but I think with NIR we are free to make
>> >>>>> a
>> >>>>> clean break.  And we *definitely* want to avoid
>> >>>>> the-driver-always-copies semantics..
>> >>>>>
>> >>>>>> It sounds like, from what Marek is saying, that gallium is
>> >>>>>> currently doing
>> >>>>>> (2) and changing it to (1) would be painful.  I think reference
>> >>>>>> counting is
>> >>>>>> more like an awkward option 1.5 than option 3.  Reference counting
>> >>>>>> would
>> >>>>>> mean that gallium passes a reference to the driver which it is
>> >>>>>> expected to
>> >>>>>> unref but may keep a second reference if it wants to keep the
>> >>>>>> driver from
>> >>>>>> modifying it.  Then the driver may or may not make a copy based on
>> >>>>>> the
>> >>>>>> number of references.  Why don't we just make it explicit and add a
>> >>>>>> read-only bit and call it a day.
>> >>>>>>
>> >>>>>> One of the reasons I don't like passing a reference is that it
>> >>>>>> effectively
>> >>>>>> puts allocation and freeing in different components of the driver.
>> >>>>>
>> >>>>> With refcnt'ing you should talk in terms of ref/unref rather than
>> >>>>> allocate/free.. imho.  Although maybe that is what you meant.  (In
>> >>>>> which case, yes, that was my idea, that passing in to driver
>> >>>>> transfers
>> >>>>> ownership of the passed reference.)
>> >>>>>
>> >>>>>> This
>> >>>>>> means that if and driver doesn't care at all about the shader that
>> >>>>>> gets
>> >>>>>> passed in, it still has to under it to avoid a memory leak.  You
>> >>>>>> can't have
>> >>>>>> the driver take the reference because then, either it comes in with
>> >>>>>> a
>> >>>>>> recount of 0 and should have been deleted, or the "can I modify
>> >>>>>> this" check
>> >>>>>> becomes "recount <= 2" which makes no sense.
>> >>>>>
>> >>>>> hmm, no, if ownership of the reference is transferred to the driver,
>> >>>>> then it becomes "refcount == 1" (and refcount == 0 should be an
>> >>>>> assert, because something has gone horribly wrong)
>> >>>>>
>> >>>>> BR,
>> >>>>> -R
>
>