[Mesa-dev] [RFC] i965: alternative to memctx for cleaning up nir variants

Mon Dec 28 10:58:21 PST 2015

On Mon, Dec 28, 2015 at 1:35 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Mon, Dec 28, 2015 at 12:37 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>> On Mon, Dec 28, 2015 at 10:13 AM, Rob Clark <robdclark at gmail.com> wrote:
>>> On Tue, Dec 22, 2015 at 10:11 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>>> On Tue, Dec 22, 2015 at 9:55 PM, Rob Clark <robdclark at gmail.com> wrote:
>>>>> On Tue, Dec 22, 2015 at 9:47 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>>>>> On Tue, Dec 22, 2015 at 9:02 PM, Rob Clark <robdclark at gmail.com> wrote:
>>>>>>> On Mon, Dec 21, 2015 at 1:48 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>>>>>>>>
>>>>>>>> I think two different concepts of ownership are getting conflated here:
>>>>>>>> Right/responsibility to delete and right to modify.
>>>>>>>>
>>>>>>>> The way I understand it, gallium, as it stands, gives neither to the driver.
>>>>>>>> A back-end using NIR requires the right to modify but who deletes it doesn't
>>>>>>>> ultimately matter.  I think it's dangerous to pass one of these rights to
>>>>>>>> the driver and not the other but we need to think about both.
>>>>>>>
>>>>>>> yeah, uneasy about driver modifying the IR if the state tracker is
>>>>>>> still going to potentially spin off variants of the IR.. that sounds
>>>>>>> like madness.
>>>>>>>
>>>>>>> The refcnt'ing I proposed does deal w/ right to modify vs delete via
>>>>>>> nir_shader(_is)_mutable() which returns something that is guaranteed
>>>>>>> to be safe to modify (ie. has only a single reference)
>>>>>>>
>>>>>>>> What I'm trying to say is that we have two options here:
>>>>>>>>
>>>>>>>> 1) gallium passes IR to the back-end that it is free to modify and is
>>>>>>>> required to delete when it's done.
>>>>>>>
>>>>>>> with refcnt'ing, s/delete/unref/
>>>>>>>
>>>>>>> The idea is, the st transfers ownership of the reference it passes to
>>>>>>> the driver.  If the st wants to hang on to a reference itself, it must
>>>>>>> increment the refcnt before passing to the driver (backend).
>>>>>>>
>>>>>>> Without refcnt'ing, I suppose we could (since we don't have to follow
>>>>>>> TGSI semantics), just decree that the driver always takes ownership of
>>>>>>> the copy passed in, and if the st wants to hang on to a copy too, then
>>>>>>> it must clone.  I suppose this would work well enough for
>>>>>>> freedreno/vc4, which both end up generating variants later.  It does
>>>>>>> force an extra clone for drivers that immediately translate into their
>>>>>>> own backend IR and don't need to keep the NIR around, for example.
>>>>>>> Maybe that is not worth caring about (since at this point it is
>>>>>>> hypothetical).
>>>>>>
>>>>>> While always cloning does have this disadvantage, I don't think it's
>>>>>> really relevant here. Even if the driver throws away the NIR
>>>>>> immediately after consuming it, almost invariably it's going to want
>>>>>> to modify  it. The generic NIR passed in by the state tracker (other
>>>>>> IR -> NIR + some optimizations) is almost never going to be the same
>>>>>> as the NIR after going through driver-specific lowering passes, which
>>>>>> means that drivers are never going to want a read-only version of the
>>>>>> IR. In light of that, I think making the driver own the IR passed in
>>>>>> seems like the most sensible thing.
>>>>>
>>>>> well, unless the driver is already doing it's own lowering in it's own
>>>>> native IR..
>>>>
>>>> Well, if you're not doing any lowering in NIR, then you aren't really
>>>> taking any advantage of it. I can't see a plausible scenario where all
>>>> the lowering is done in the driver's own IR -- and as soon as you do
>>>> anything in NIR, you need the driver-owns-IR semantics.
>>>
>>> When it comes to shader variants, I have a mix, with some things
>>> lowered in nir and others just handled in backend..
>>>
>>> The re-work / cleanup that I have had on a branch for a while now
>>> (since it is currently blocked on refcnt'ing) does a first round of
>>> variant-key independent NIR lowering/opt passes.  And then at draw
>>> time, if the variant key has anything that is lowered in nir, I do a
>>> second round.
>>>
>>>>>
>>>>> Maybe it is too much of a hypothetical.. I still think refcnt'ing
>>>>> gives some nice flexibility to deal with various scenarios, and having
>>>>> to call nir_shader_unref() isn't so much of a burden.
>>>>
>>>> Still, I can't see how this flexibility is at all useful, and it seems
>>>> like overkill since the driver will always want a mutable version of
>>>> the IR anyways.
>>>
>>> Well, due to the structure I mentioned above, at draw time when I need
>>> to generate a variant with nothing lowered in NIR, I simply incr the
>>> refcnt on the IR which has already gone through first round of NIR
>>> passes, and pass that in to my back end compiler.  At the end, once
>>> the shader binary is generated, I can unconditionally unref the
>>> nir_shader without having to care.
>>>
>>> Without refcnt'ing I'd either have to generate a pointless clone or
>>> keep track that the nir_shader should not actually be free'd.  Not
>>> impossible, just a bit more ugly.
>>
>> Assuming you do all your variant management in your driver's IR, then
>> you don't need to do anything. If you do some variant management in
>> NIR, then in the function where you do the NIR-based lowering you can
>> check if you need to do any lowering based on the shader key, clone
>> first, and give the NIR->ir3 function the cloned IR and then free it.
>> It might be a "bit more ugly," but it's really not that much different
>> from the refcounting, and when the extra shader gets created/freed is
>> made explicit.
>
> I'd have to re-arrange things some compared to how the backend works
> now..  not impossible, but annoying.  Also, I could more easily unref
> the NIR once I've converted into ir3 rather than waiting until after
> the backend passes.

That's just a matter of how you structure things... if you do the
NIR-based lowering in the same function as converting to ir3, which
seems like a good idea anyways, then it should be possible to unref
before you run your backend passes. But I doubt the difference matters
anyways. Our experience with i965 is that holding a lot of shaders at
once can eat into memory, but memory consumption while actually doing
work hardly matters (I guess it could if it's totally out of control,
but that's not here...).

>
>>>
>>> (The gallium glsl_to_nir stuff is also currently using refcnt'ing,
>>> although at least for freedreno/ir3 it isn't strictly needed.. I could
>>> just unconditionally clone in the state tracker.  That said, I'm still
>>> of the opinion that refcnt'ing could be useful to some other driver
>>> someday)
>>
>> "It could be useful to some driver someday" isn't a good argument for
>> adding stuff today. We've already had enough examples of things in NIR
>> that we added because we thought it was useful, but turned out not to
>> be.
>
> well, it is useful to one driver today, as I explained..  so it is
> more a matter of "It could be also useful to some other driver...".
> Otherwise I would agree with you.

But even for you, this is still overkill. Reference counting is most
useful when there are potentially many different owners of a resource,
which isn't the case here. If we add reference counting now, we'd be
adding an essentially useless feature, and the fact that you used it
doesn't change the fact that it's unnecessary.

>
> BR,
> -R
>
>>>
>>> BR,
>>> -R
>>>
>>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>>>>>
>>>>>>> (I guess nouveau is the one driver, that if it ever consumed NIR,
>>>>>>> would translate immediately into it's own backend IR?)
>>>>>>>
>>>>>>>> 2) gallium passes read-only IR to the back-end and it always makes a copy.
>>>>>>>
>>>>>>> This is how it is w/ TGSI, but I think with NIR we are free to make a
>>>>>>> clean break.  And we *definitely* want to avoid
>>>>>>> the-driver-always-copies semantics..
>>>>>>>
>>>>>>>> It sounds like, from what Marek is saying, that gallium is currently doing
>>>>>>>> (2) and changing it to (1) would be painful.  I think reference counting is
>>>>>>>> more like an awkward option 1.5 than option 3.  Reference counting would
>>>>>>>> mean that gallium passes a reference to the driver which it is expected to
>>>>>>>> unref but may keep a second reference if it wants to keep the driver from
>>>>>>>> modifying it.  Then the driver may or may not make a copy based on the
>>>>>>>> number of references.  Why don't we just make it explicit and add a
>>>>>>>> read-only bit and call it a day.
>>>>>>>>
>>>>>>>> One of the reasons I don't like passing a reference is that it effectively
>>>>>>>> puts allocation and freeing in different components of the driver.
>>>>>>>
>>>>>>> With refcnt'ing you should talk in terms of ref/unref rather than
>>>>>>> allocate/free.. imho.  Although maybe that is what you meant.  (In
>>>>>>> which case, yes, that was my idea, that passing in to driver transfers
>>>>>>> ownership of the passed reference.)
>>>>>>>
>>>>>>>> This
>>>>>>>> means that if and driver doesn't care at all about the shader that gets
>>>>>>>> passed in, it still has to under it to avoid a memory leak.  You can't have
>>>>>>>> the driver take the reference because then, either it comes in with a
>>>>>>>> recount of 0 and should have been deleted, or the "can I modify this" check
>>>>>>>> becomes "recount <= 2" which makes no sense.
>>>>>>>
>>>>>>> hmm, no, if ownership of the reference is transferred to the driver,
>>>>>>> then it becomes "refcount == 1" (and refcount == 0 should be an
>>>>>>> assert, because something has gone horribly wrong)
>>>>>>>
>>>>>>> BR,
>>>>>>> -R