[Mesa-dev] [RFC] i965: alternative to memctx for cleaning up nir variants

Mon Dec 28 11:05:05 PST 2015

On Mon, Dec 28, 2015 at 10:33 AM, Rob Clark <robdclark at gmail.com> wrote:

> On Mon, Dec 28, 2015 at 1:20 PM, Jason Ekstrand <jason at jlekstrand.net>
> wrote:
> >
> >
> > On Mon, Dec 28, 2015 at 9:37 AM, Connor Abbott <cwabbott0 at gmail.com>
> wrote:
> >>
> >> On Mon, Dec 28, 2015 at 10:13 AM, Rob Clark <robdclark at gmail.com>
> wrote:
> >> > On Tue, Dec 22, 2015 at 10:11 PM, Connor Abbott <cwabbott0 at gmail.com>
> >> > wrote:
> >> >> On Tue, Dec 22, 2015 at 9:55 PM, Rob Clark <robdclark at gmail.com>
> wrote:
> >> >>> On Tue, Dec 22, 2015 at 9:47 PM, Connor Abbott <cwabbott0 at gmail.com
> >
> >> >>> wrote:
> >> >>>> On Tue, Dec 22, 2015 at 9:02 PM, Rob Clark <robdclark at gmail.com>
> >> >>>> wrote:
> >> >>>>> On Mon, Dec 21, 2015 at 1:48 PM, Jason Ekstrand
> >> >>>>> <jason at jlekstrand.net> wrote:
> >> >>>>>>
> >> >>>>>> I think two different concepts of ownership are getting conflated
> >> >>>>>> here:
> >> >>>>>> Right/responsibility to delete and right to modify.
> >> >>>>>>
> >> >>>>>> The way I understand it, gallium, as it stands, gives neither to
> >> >>>>>> the driver.
> >> >>>>>> A back-end using NIR requires the right to modify but who deletes
> >> >>>>>> it doesn't
> >> >>>>>> ultimately matter.  I think it's dangerous to pass one of these
> >> >>>>>> rights to
> >> >>>>>> the driver and not the other but we need to think about both.
> >> >>>>>
> >> >>>>> yeah, uneasy about driver modifying the IR if the state tracker is
> >> >>>>> still going to potentially spin off variants of the IR.. that
> sounds
> >> >>>>> like madness.
> >> >>>>>
> >> >>>>> The refcnt'ing I proposed does deal w/ right to modify vs delete
> via
> >> >>>>> nir_shader(_is)_mutable() which returns something that is
> guaranteed
> >> >>>>> to be safe to modify (ie. has only a single reference)
> >> >>>>>
> >> >>>>>> What I'm trying to say is that we have two options here:
> >> >>>>>>
> >> >>>>>> 1) gallium passes IR to the back-end that it is free to modify
> and
> >> >>>>>> is
> >> >>>>>> required to delete when it's done.
> >> >>>>>
> >> >>>>> with refcnt'ing, s/delete/unref/
> >> >>>>>
> >> >>>>> The idea is, the st transfers ownership of the reference it passes
> >> >>>>> to
> >> >>>>> the driver.  If the st wants to hang on to a reference itself, it
> >> >>>>> must
> >> >>>>> increment the refcnt before passing to the driver (backend).
> >> >>>>>
> >> >>>>> Without refcnt'ing, I suppose we could (since we don't have to
> >> >>>>> follow
> >> >>>>> TGSI semantics), just decree that the driver always takes
> ownership
> >> >>>>> of
> >> >>>>> the copy passed in, and if the st wants to hang on to a copy too,
> >> >>>>> then
> >> >>>>> it must clone.  I suppose this would work well enough for
> >> >>>>> freedreno/vc4, which both end up generating variants later.  It
> does
> >> >>>>> force an extra clone for drivers that immediately translate into
> >> >>>>> their
> >> >>>>> own backend IR and don't need to keep the NIR around, for example.
> >> >>>>> Maybe that is not worth caring about (since at this point it is
> >> >>>>> hypothetical).
> >> >>>>
> >> >>>> While always cloning does have this disadvantage, I don't think
> it's
> >> >>>> really relevant here. Even if the driver throws away the NIR
> >> >>>> immediately after consuming it, almost invariably it's going to
> want
> >> >>>> to modify  it. The generic NIR passed in by the state tracker
> (other
> >> >>>> IR -> NIR + some optimizations) is almost never going to be the
> same
> >> >>>> as the NIR after going through driver-specific lowering passes,
> which
> >> >>>> means that drivers are never going to want a read-only version of
> the
> >> >>>> IR. In light of that, I think making the driver own the IR passed
> in
> >> >>>> seems like the most sensible thing.
> >> >>>
> >> >>> well, unless the driver is already doing it's own lowering in it's
> own
> >> >>> native IR..
> >> >>
> >> >> Well, if you're not doing any lowering in NIR, then you aren't really
> >> >> taking any advantage of it. I can't see a plausible scenario where
> all
> >> >> the lowering is done in the driver's own IR -- and as soon as you do
> >> >> anything in NIR, you need the driver-owns-IR semantics.
> >> >
> >> > When it comes to shader variants, I have a mix, with some things
> >> > lowered in nir and others just handled in backend..
> >> >
> >> > The re-work / cleanup that I have had on a branch for a while now
> >> > (since it is currently blocked on refcnt'ing) does a first round of
> >> > variant-key independent NIR lowering/opt passes.  And then at draw
> >> > time, if the variant key has anything that is lowered in nir, I do a
> >> > second round.
> >
> >
> > Just to be clear, your key-dependent lowering happens after all of your
> > other lowering?  If this is the case, then I guarantee you that you're
> > unique in this since i965 and vc4 need to at least run out-of-SSA
> > afterwards.  To be honest, I completely forgot that a driver could use
> fully
> > ssa NIR.
>
> It is a mix.. I do texcoord saturate, clip-plane, and 2-sided color
> lowering in NIR.  But flat-shading, binning-pass, and half vs full
> precision color output in ir3.
>
> I do as much lowering in NIR as I can, in an effort to do as much as
> possible at compile time, vs draw time.  I do the first round of
> lowering/opt w/ null shader key, which is enough for the common cases.
>
> Pretty much independent, I suppose, of whether I came out of SSA or
> not first.  Although binning-pass variant and the instruction
> scheduling I do are easier in SSA.
>
> Somewhat unrelated, but I may end up converting array access to
> registers, but leave everything else in SSA, so I can benefit from
> converting multi-dimensional offsets into a single offset..  this is
> still one open issue w/ gallium glsl_to_nir.. right now I have a
> hacked up version of nir_lower_io that converts global/local
> load/store_var's into load/store_var2 which take an offset as a src
> (like load_input/store_output) instead of deref chain.. not sure yet
> whether this will be the permanent solution, but at least it fixes a
> huge heap of variable-indexing piglits and lets me continue w/
> implementing lowering passes for everything else that was previously
> done in glsl->tgsi or tgsi->tgsi passes.
>

If you do this, you'll be back to always needing a mutable copy.  Most
lowering and optimization passes die the moment they see a register.
You'll either have to go fix a bunch of stuff up to no-op properly or run
vars_to_regs after doing your NIR lowering but before going into your
backend IR.  This means that your "gold copy" still has variables and you
always need to lower them to registers before you go into the backend.

>
> BR,
> -R
>
> >>
> >> >>>
> >> >>> Maybe it is too much of a hypothetical.. I still think refcnt'ing
> >> >>> gives some nice flexibility to deal with various scenarios, and
> having
> >> >>> to call nir_shader_unref() isn't so much of a burden.
> >> >>
> >> >> Still, I can't see how this flexibility is at all useful, and it
> seems
> >> >> like overkill since the driver will always want a mutable version of
> >> >> the IR anyways.
> >> >
> >> > Well, due to the structure I mentioned above, at draw time when I need
> >> > to generate a variant with nothing lowered in NIR, I simply incr the
> >> > refcnt on the IR which has already gone through first round of NIR
> >> > passes, and pass that in to my back end compiler.  At the end, once
> >> > the shader binary is generated, I can unconditionally unref the
> >> > nir_shader without having to care.
> >> >
> >> > Without refcnt'ing I'd either have to generate a pointless clone or
> >> > keep track that the nir_shader should not actually be free'd.  Not
> >> > impossible, just a bit more ugly.
> >>
> >> Assuming you do all your variant management in your driver's IR, then
> >> you don't need to do anything. If you do some variant management in
> >> NIR, then in the function where you do the NIR-based lowering you can
> >> check if you need to do any lowering based on the shader key, clone
> >> first, and give the NIR->ir3 function the cloned IR and then free it.
> >> It might be a "bit more ugly," but it's really not that much different
> >> from the refcounting, and when the extra shader gets created/freed is
> >> made explicit.
> >>
> >> >
> >> > (The gallium glsl_to_nir stuff is also currently using refcnt'ing,
> >> > although at least for freedreno/ir3 it isn't strictly needed.. I could
> >> > just unconditionally clone in the state tracker.  That said, I'm still
> >> > of the opinion that refcnt'ing could be useful to some other driver
> >> > someday)
> >>
> >> "It could be useful to some driver someday" isn't a good argument for
> >> adding stuff today. We've already had enough examples of things in NIR
> >> that we added because we thought it was useful, but turned out not to
> >> be.
> >>
> >> >
> >> > BR,
> >> > -R
> >> >
> >> >>>
> >> >>> BR,
> >> >>> -R
> >> >>>
> >> >>>>>
> >> >>>>> (I guess nouveau is the one driver, that if it ever consumed NIR,
> >> >>>>> would translate immediately into it's own backend IR?)
> >> >>>>>
> >> >>>>>> 2) gallium passes read-only IR to the back-end and it always
> makes
> >> >>>>>> a copy.
> >> >>>>>
> >> >>>>> This is how it is w/ TGSI, but I think with NIR we are free to
> make
> >> >>>>> a
> >> >>>>> clean break.  And we *definitely* want to avoid
> >> >>>>> the-driver-always-copies semantics..
> >> >>>>>
> >> >>>>>> It sounds like, from what Marek is saying, that gallium is
> >> >>>>>> currently doing
> >> >>>>>> (2) and changing it to (1) would be painful.  I think reference
> >> >>>>>> counting is
> >> >>>>>> more like an awkward option 1.5 than option 3.  Reference
> counting
> >> >>>>>> would
> >> >>>>>> mean that gallium passes a reference to the driver which it is
> >> >>>>>> expected to
> >> >>>>>> unref but may keep a second reference if it wants to keep the
> >> >>>>>> driver from
> >> >>>>>> modifying it.  Then the driver may or may not make a copy based
> on
> >> >>>>>> the
> >> >>>>>> number of references.  Why don't we just make it explicit and
> add a
> >> >>>>>> read-only bit and call it a day.
> >> >>>>>>
> >> >>>>>> One of the reasons I don't like passing a reference is that it
> >> >>>>>> effectively
> >> >>>>>> puts allocation and freeing in different components of the
> driver.
> >> >>>>>
> >> >>>>> With refcnt'ing you should talk in terms of ref/unref rather than
> >> >>>>> allocate/free.. imho.  Although maybe that is what you meant.  (In
> >> >>>>> which case, yes, that was my idea, that passing in to driver
> >> >>>>> transfers
> >> >>>>> ownership of the passed reference.)
> >> >>>>>
> >> >>>>>> This
> >> >>>>>> means that if and driver doesn't care at all about the shader
> that
> >> >>>>>> gets
> >> >>>>>> passed in, it still has to under it to avoid a memory leak.  You
> >> >>>>>> can't have
> >> >>>>>> the driver take the reference because then, either it comes in
> with
> >> >>>>>> a
> >> >>>>>> recount of 0 and should have been deleted, or the "can I modify
> >> >>>>>> this" check
> >> >>>>>> becomes "recount <= 2" which makes no sense.
> >> >>>>>
> >> >>>>> hmm, no, if ownership of the reference is transferred to the
> driver,
> >> >>>>> then it becomes "refcount == 1" (and refcount == 0 should be an
> >> >>>>> assert, because something has gone horribly wrong)
> >> >>>>>
> >> >>>>> BR,
> >> >>>>> -R
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151228/ec6d8e17/attachment-0001.html>