[Mesa-dev] [RFC] i965: alternative to memctx for cleaning up nir variants
Rob Clark
robdclark at gmail.com
Mon Dec 28 12:25:01 PST 2015
On Mon, Dec 28, 2015 at 2:05 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>
>
> On Mon, Dec 28, 2015 at 10:33 AM, Rob Clark <robdclark at gmail.com> wrote:
>>
>> On Mon, Dec 28, 2015 at 1:20 PM, Jason Ekstrand <jason at jlekstrand.net>
>> wrote:
>> >
>> >
>> > On Mon, Dec 28, 2015 at 9:37 AM, Connor Abbott <cwabbott0 at gmail.com>
>> > wrote:
>> >>
>> >> On Mon, Dec 28, 2015 at 10:13 AM, Rob Clark <robdclark at gmail.com>
>> >> wrote:
>> >> > On Tue, Dec 22, 2015 at 10:11 PM, Connor Abbott <cwabbott0 at gmail.com>
>> >> > wrote:
>> >> >> On Tue, Dec 22, 2015 at 9:55 PM, Rob Clark <robdclark at gmail.com>
>> >> >> wrote:
>> >> >>> On Tue, Dec 22, 2015 at 9:47 PM, Connor Abbott
>> >> >>> <cwabbott0 at gmail.com>
>> >> >>> wrote:
>> >> >>>> On Tue, Dec 22, 2015 at 9:02 PM, Rob Clark <robdclark at gmail.com>
>> >> >>>> wrote:
>> >> >>>>> On Mon, Dec 21, 2015 at 1:48 PM, Jason Ekstrand
>> >> >>>>> <jason at jlekstrand.net> wrote:
>> >> >>>>>>
>> >> >>>>>> I think two different concepts of ownership are getting
>> >> >>>>>> conflated
>> >> >>>>>> here:
>> >> >>>>>> Right/responsibility to delete and right to modify.
>> >> >>>>>>
>> >> >>>>>> The way I understand it, gallium, as it stands, gives neither to
>> >> >>>>>> the driver.
>> >> >>>>>> A back-end using NIR requires the right to modify but who
>> >> >>>>>> deletes
>> >> >>>>>> it doesn't
>> >> >>>>>> ultimately matter. I think it's dangerous to pass one of these
>> >> >>>>>> rights to
>> >> >>>>>> the driver and not the other but we need to think about both.
>> >> >>>>>
>> >> >>>>> yeah, uneasy about driver modifying the IR if the state tracker
>> >> >>>>> is
>> >> >>>>> still going to potentially spin off variants of the IR.. that
>> >> >>>>> sounds
>> >> >>>>> like madness.
>> >> >>>>>
>> >> >>>>> The refcnt'ing I proposed does deal w/ right to modify vs delete
>> >> >>>>> via
>> >> >>>>> nir_shader(_is)_mutable() which returns something that is
>> >> >>>>> guaranteed
>> >> >>>>> to be safe to modify (ie. has only a single reference)
>> >> >>>>>
>> >> >>>>>> What I'm trying to say is that we have two options here:
>> >> >>>>>>
>> >> >>>>>> 1) gallium passes IR to the back-end that it is free to modify
>> >> >>>>>> and
>> >> >>>>>> is
>> >> >>>>>> required to delete when it's done.
>> >> >>>>>
>> >> >>>>> with refcnt'ing, s/delete/unref/
>> >> >>>>>
>> >> >>>>> The idea is, the st transfers ownership of the reference it
>> >> >>>>> passes
>> >> >>>>> to
>> >> >>>>> the driver. If the st wants to hang on to a reference itself, it
>> >> >>>>> must
>> >> >>>>> increment the refcnt before passing to the driver (backend).
>> >> >>>>>
>> >> >>>>> Without refcnt'ing, I suppose we could (since we don't have to
>> >> >>>>> follow
>> >> >>>>> TGSI semantics), just decree that the driver always takes
>> >> >>>>> ownership
>> >> >>>>> of
>> >> >>>>> the copy passed in, and if the st wants to hang on to a copy too,
>> >> >>>>> then
>> >> >>>>> it must clone. I suppose this would work well enough for
>> >> >>>>> freedreno/vc4, which both end up generating variants later. It
>> >> >>>>> does
>> >> >>>>> force an extra clone for drivers that immediately translate into
>> >> >>>>> their
>> >> >>>>> own backend IR and don't need to keep the NIR around, for
>> >> >>>>> example.
>> >> >>>>> Maybe that is not worth caring about (since at this point it is
>> >> >>>>> hypothetical).
>> >> >>>>
>> >> >>>> While always cloning does have this disadvantage, I don't think
>> >> >>>> it's
>> >> >>>> really relevant here. Even if the driver throws away the NIR
>> >> >>>> immediately after consuming it, almost invariably it's going to
>> >> >>>> want
>> >> >>>> to modify it. The generic NIR passed in by the state tracker
>> >> >>>> (other
>> >> >>>> IR -> NIR + some optimizations) is almost never going to be the
>> >> >>>> same
>> >> >>>> as the NIR after going through driver-specific lowering passes,
>> >> >>>> which
>> >> >>>> means that drivers are never going to want a read-only version of
>> >> >>>> the
>> >> >>>> IR. In light of that, I think making the driver own the IR passed
>> >> >>>> in
>> >> >>>> seems like the most sensible thing.
>> >> >>>
>> >> >>> well, unless the driver is already doing it's own lowering in it's
>> >> >>> own
>> >> >>> native IR..
>> >> >>
>> >> >> Well, if you're not doing any lowering in NIR, then you aren't
>> >> >> really
>> >> >> taking any advantage of it. I can't see a plausible scenario where
>> >> >> all
>> >> >> the lowering is done in the driver's own IR -- and as soon as you do
>> >> >> anything in NIR, you need the driver-owns-IR semantics.
>> >> >
>> >> > When it comes to shader variants, I have a mix, with some things
>> >> > lowered in nir and others just handled in backend..
>> >> >
>> >> > The re-work / cleanup that I have had on a branch for a while now
>> >> > (since it is currently blocked on refcnt'ing) does a first round of
>> >> > variant-key independent NIR lowering/opt passes. And then at draw
>> >> > time, if the variant key has anything that is lowered in nir, I do a
>> >> > second round.
>> >
>> >
>> > Just to be clear, your key-dependent lowering happens after all of your
>> > other lowering? If this is the case, then I guarantee you that you're
>> > unique in this since i965 and vc4 need to at least run out-of-SSA
>> > afterwards. To be honest, I completely forgot that a driver could use
>> > fully
>> > ssa NIR.
>>
>> It is a mix.. I do texcoord saturate, clip-plane, and 2-sided color
>> lowering in NIR. But flat-shading, binning-pass, and half vs full
>> precision color output in ir3.
>>
>> I do as much lowering in NIR as I can, in an effort to do as much as
>> possible at compile time, vs draw time. I do the first round of
>> lowering/opt w/ null shader key, which is enough for the common cases.
>>
>> Pretty much independent, I suppose, of whether I came out of SSA or
>> not first. Although binning-pass variant and the instruction
>> scheduling I do are easier in SSA.
>>
>> Somewhat unrelated, but I may end up converting array access to
>> registers, but leave everything else in SSA, so I can benefit from
>> converting multi-dimensional offsets into a single offset.. this is
>> still one open issue w/ gallium glsl_to_nir.. right now I have a
>> hacked up version of nir_lower_io that converts global/local
>> load/store_var's into load/store_var2 which take an offset as a src
>> (like load_input/store_output) instead of deref chain.. not sure yet
>> whether this will be the permanent solution, but at least it fixes a
>> huge heap of variable-indexing piglits and lets me continue w/
>> implementing lowering passes for everything else that was previously
>> done in glsl->tgsi or tgsi->tgsi passes.
>
>
> If you do this, you'll be back to always needing a mutable copy. Most
> lowering and optimization passes die the moment they see a register. You'll
> either have to go fix a bunch of stuff up to no-op properly or run
> vars_to_regs after doing your NIR lowering but before going into your
> backend IR. This means that your "gold copy" still has variables and you
> always need to lower them to registers before you go into the backend.
ugg.. but good point, thanks for pointing that out before I wasted
another afternoon on yet another dead-end for handling deref's..
Ok, I guess I need to think of a better name than load/store_var2 for
the new intrinsics ;-)
PS. I would really like to keep the nir_variable ptr in the intrinsic
instruction, rather than matching things up w/
var->data.driver_location, which is rather clunky..
BR,
-R
>>
>>
>> BR,
>> -R
>>
>> >>
>> >> >>>
>> >> >>> Maybe it is too much of a hypothetical.. I still think refcnt'ing
>> >> >>> gives some nice flexibility to deal with various scenarios, and
>> >> >>> having
>> >> >>> to call nir_shader_unref() isn't so much of a burden.
>> >> >>
>> >> >> Still, I can't see how this flexibility is at all useful, and it
>> >> >> seems
>> >> >> like overkill since the driver will always want a mutable version of
>> >> >> the IR anyways.
>> >> >
>> >> > Well, due to the structure I mentioned above, at draw time when I
>> >> > need
>> >> > to generate a variant with nothing lowered in NIR, I simply incr the
>> >> > refcnt on the IR which has already gone through first round of NIR
>> >> > passes, and pass that in to my back end compiler. At the end, once
>> >> > the shader binary is generated, I can unconditionally unref the
>> >> > nir_shader without having to care.
>> >> >
>> >> > Without refcnt'ing I'd either have to generate a pointless clone or
>> >> > keep track that the nir_shader should not actually be free'd. Not
>> >> > impossible, just a bit more ugly.
>> >>
>> >> Assuming you do all your variant management in your driver's IR, then
>> >> you don't need to do anything. If you do some variant management in
>> >> NIR, then in the function where you do the NIR-based lowering you can
>> >> check if you need to do any lowering based on the shader key, clone
>> >> first, and give the NIR->ir3 function the cloned IR and then free it.
>> >> It might be a "bit more ugly," but it's really not that much different
>> >> from the refcounting, and when the extra shader gets created/freed is
>> >> made explicit.
>> >>
>> >> >
>> >> > (The gallium glsl_to_nir stuff is also currently using refcnt'ing,
>> >> > although at least for freedreno/ir3 it isn't strictly needed.. I
>> >> > could
>> >> > just unconditionally clone in the state tracker. That said, I'm
>> >> > still
>> >> > of the opinion that refcnt'ing could be useful to some other driver
>> >> > someday)
>> >>
>> >> "It could be useful to some driver someday" isn't a good argument for
>> >> adding stuff today. We've already had enough examples of things in NIR
>> >> that we added because we thought it was useful, but turned out not to
>> >> be.
>> >>
>> >> >
>> >> > BR,
>> >> > -R
>> >> >
>> >> >>>
>> >> >>> BR,
>> >> >>> -R
>> >> >>>
>> >> >>>>>
>> >> >>>>> (I guess nouveau is the one driver, that if it ever consumed NIR,
>> >> >>>>> would translate immediately into it's own backend IR?)
>> >> >>>>>
>> >> >>>>>> 2) gallium passes read-only IR to the back-end and it always
>> >> >>>>>> makes
>> >> >>>>>> a copy.
>> >> >>>>>
>> >> >>>>> This is how it is w/ TGSI, but I think with NIR we are free to
>> >> >>>>> make
>> >> >>>>> a
>> >> >>>>> clean break. And we *definitely* want to avoid
>> >> >>>>> the-driver-always-copies semantics..
>> >> >>>>>
>> >> >>>>>> It sounds like, from what Marek is saying, that gallium is
>> >> >>>>>> currently doing
>> >> >>>>>> (2) and changing it to (1) would be painful. I think reference
>> >> >>>>>> counting is
>> >> >>>>>> more like an awkward option 1.5 than option 3. Reference
>> >> >>>>>> counting
>> >> >>>>>> would
>> >> >>>>>> mean that gallium passes a reference to the driver which it is
>> >> >>>>>> expected to
>> >> >>>>>> unref but may keep a second reference if it wants to keep the
>> >> >>>>>> driver from
>> >> >>>>>> modifying it. Then the driver may or may not make a copy based
>> >> >>>>>> on
>> >> >>>>>> the
>> >> >>>>>> number of references. Why don't we just make it explicit and
>> >> >>>>>> add a
>> >> >>>>>> read-only bit and call it a day.
>> >> >>>>>>
>> >> >>>>>> One of the reasons I don't like passing a reference is that it
>> >> >>>>>> effectively
>> >> >>>>>> puts allocation and freeing in different components of the
>> >> >>>>>> driver.
>> >> >>>>>
>> >> >>>>> With refcnt'ing you should talk in terms of ref/unref rather than
>> >> >>>>> allocate/free.. imho. Although maybe that is what you meant.
>> >> >>>>> (In
>> >> >>>>> which case, yes, that was my idea, that passing in to driver
>> >> >>>>> transfers
>> >> >>>>> ownership of the passed reference.)
>> >> >>>>>
>> >> >>>>>> This
>> >> >>>>>> means that if and driver doesn't care at all about the shader
>> >> >>>>>> that
>> >> >>>>>> gets
>> >> >>>>>> passed in, it still has to under it to avoid a memory leak. You
>> >> >>>>>> can't have
>> >> >>>>>> the driver take the reference because then, either it comes in
>> >> >>>>>> with
>> >> >>>>>> a
>> >> >>>>>> recount of 0 and should have been deleted, or the "can I modify
>> >> >>>>>> this" check
>> >> >>>>>> becomes "recount <= 2" which makes no sense.
>> >> >>>>>
>> >> >>>>> hmm, no, if ownership of the reference is transferred to the
>> >> >>>>> driver,
>> >> >>>>> then it becomes "refcount == 1" (and refcount == 0 should be an
>> >> >>>>> assert, because something has gone horribly wrong)
>> >> >>>>>
>> >> >>>>> BR,
>> >> >>>>> -R
>> >
>> >
>
>
More information about the mesa-dev
mailing list