[Intel-gfx] [PATCH 1/6] dma-buf: add dynamic DMA-buf handling v10
Daniel Vetter
daniel at ffwll.ch
Fri Jun 21 10:32:07 UTC 2019
On Fri, Jun 21, 2019 at 11:55 AM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> Am 21.06.19 um 11:20 schrieb Daniel Vetter:
> > On Tue, Jun 18, 2019 at 01:54:50PM +0200, Christian König wrote:
> >> On the exporter side we add optional explicit pinning callbacks. If those
> >> callbacks are implemented the framework no longer caches sg tables and the
> >> map/unmap callbacks are always called with the lock of the reservation object
> >> held.
> >>
> >> On the importer side we add an optional invalidate callback. This callback is
> >> used by the exporter to inform the importers that their mappings should be
> >> destroyed as soon as possible.
> >>
> >> This allows the exporter to provide the mappings without the need to pin
> >> the backing store.
> >>
> >> v2: don't try to invalidate mappings when the callback is NULL,
> >> lock the reservation obj while using the attachments,
> >> add helper to set the callback
> >> v3: move flag for invalidation support into the DMA-buf,
> >> use new attach_info structure to set the callback
> >> v4: use importer_priv field instead of mangling exporter priv.
> >> v5: drop invalidation_supported flag
> >> v6: squash together with pin/unpin changes
> >> v7: pin/unpin takes an attachment now
> >> v8: nuke dma_buf_attachment_(map|unmap)_locked,
> >> everything is now handled backward compatible
> >> v9: always cache when export/importer don't agree on dynamic handling
> >> v10: minimal style cleanup
> >>
> >> Signed-off-by: Christian König <christian.koenig at amd.com>
> >> ---
> >> drivers/dma-buf/dma-buf.c | 188 ++++++++++++++++++++++++++++++++++++--
> >> include/linux/dma-buf.h | 109 ++++++++++++++++++++--
> >> 2 files changed, 283 insertions(+), 14 deletions(-)
> >>
> >> [SNIP]
> >> + if (dma_buf_attachment_is_dynamic(attach)) {
> >> + reservation_object_assert_held(attach->dmabuf->resv);
> >> +
> >> + /*
> >> + * Mapping a DMA-buf can trigger its invalidation, prevent
> >> + * sending this event to the caller by temporarily removing
> >> + * this attachment from the list.
> >> + */
> >> + list_del(&attach->node);
> > I'm still hung up about this, that still feels like leaking random ttm
> > implementation details into the dma-buf interfaces. And it's asymmetric:
> >
> > - When acquiring a buffer mapping (whether p2p or system memory sg or
> > whatever) we always have to wait for pending fences before we can access
> > the buffer. At least for full dynamic dma-buf access.
> >
> > - Same is true when dropping a mapping: We could drop the mapping
> > immediately, but only actually release it when that fence has signalled.
> > Then this hack here wouldn't be necessary.
> >
> > It feels a bit like this is just an artifact of how ttm currently does bo
> > moves with the shadow bo. There's other ways to fix that, you could just
> > have a memory manager reservation of a given range or whatever and a
> > release fence from when it's actually good to use.
>
> No, that is for handling a completely different case :)
>
> >
> > Imo the below semantics would be much cleaner:
> >
> > - invalidate may add new fences
> > - invalidate _must_ unmap its mappings
> > - an unmap must wait for current fences before the mapping can be
> > released.
> >
> > Imo there's no reason why unmap is special, and the only thing where we
> > don't use fences to gate access to resources/memory when it's in the
> > process of getting moved around.
>
> Well in general I want to avoid waiting for fences as much as possible.
> But the key point here is that this actually won't help with the problem
> I'm trying to solve.
The point of using fences is not to wait on them. I mean if you have
the shadow ttm bo on the lru you also don't wait for that fence to
retire before you insert the shadow bo onto the lru. You don't even
wait when you try to use that memory again, you just pipeline more
stuff on top.

In the end it will be the exact same amount of fences and waiting in
both solutions. One just leaks less implementation details (at least
in my opinion) across the dma-buf border.
> > btw this is like the 2nd or 3rd time I'm typing this, haven't seen your
> > thoughts on this yet.
>
> Yeah, and I'm responding for the 3rd time now that you are
> misunderstanding why we need this here :)
>
> Maybe I can make that clear with an example:
>
> 1. You got a sharing between device A (exporter) and B (importer) which
> uses P2P.
>
> 2. Now device C (importer) comes along and wants to use the DMA-buf
> object as well.
>
> 3. The handling now figures out that we can't do P2P between device A
> and device C (for whatever reason).
>
> 4. The map_attachment implementation in device driver A doesn't want to
> fail with -EBUSY and migrates the DMA-buf somewhere where both device A
> and device C can access it.
>
> 5. This migration will result in sending an invalidation event around.
> And here it doesn't make sense to send this invalidation event to device
> C, because we know that device C is actually causing this event and
> doesn't have a valid mapping.
Hm I thought the last time around there was a different scenario, with
just one importer:
- importer has a mapping, gets an ->invalidate call.
- importer arranges for the mappings/usage to get torn down, maybe
  updating fences, all from ->invalidate. But the mapping itself won't
  disappear.
- exporter moves buffer to new places (for whatever reasons it felt
  that was the thing to do).
- importer does another execbuf, the exporter needs to move the buffer
  back. Again it calls ->invalidate, but on a mapping it already has
  called ->invalidate on, and to prevent that silliness we take the
  importer temporarily off the list.
Your scenario here is new, and iirc my suggestion back then was to
count the number of pending mappings so you don't go around calling
->invalidate on mappings that don't exist.
But even if you fix your scenario here there's still the issue that we
can receive invalidates on a mapping we've already torn down and which
is in the process of disappearing. That's kinda the part I don't think
is great semantics.
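To make the counting idea concrete, here is a minimal userspace sketch, not kernel code. All names (model_attachment, model_map, model_unmap, model_invalidate_all) are hypothetical and don't exist in dma-buf.c. The only point is that an attachment without a live mapping gets skipped when the exporter moves the buffer, no matter who triggered the move.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in dma-buf.c. */
struct model_attachment {
	unsigned int map_count;     /* live mappings held through this attachment */
	unsigned int invalidations; /* how often ->invalidate was delivered */
};

/* Importer created a mapping. */
void model_map(struct model_attachment *a)
{
	a->map_count++;
}

/* Importer released a mapping. */
void model_unmap(struct model_attachment *a)
{
	assert(a->map_count > 0);
	a->map_count--;
}

/* Exporter moves the buffer: only attachments that hold a mapping are told. */
void model_invalidate_all(struct model_attachment *list, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (list[i].map_count > 0)
			list[i].invalidations++;
}
```

With this, device C in the example above never sees the invalidation caused by its own first map, and neither does any other attachment that hasn't mapped anything yet.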
> One alternative would be to completely disallow buffer migration which
> can cause invalidation in the drivers map_attachment call. But with
> dynamic handling you definitely need to be able to migrate in the
> map_attachment call for swapping evicted things back into a place where
> they are accessible. So that would make it harder for drivers to get it
> right.
Nah, that defeats the point. Also, it's not the problem of getting an
->invalidate for something that's already been ->invalidated. It could
also be some 3rd party which causes another buffer move, and then
again importers would get an ->invalidate for something that they've
cleaned up already.
> Another alternative (and that's what I implemented initially) is to make
> sure the driver calling map_attachment can handle invalidation events
> re-entering itself while doing so. But then you add another tricky thing
> for drivers to handle which could be done in the general code.
Nah that sounds even worse :-)
> The reason I don't have that on unmap is that I think migrating things
> on unmap doesn't make sense. If you think otherwise it certainly does
> make sense to add that there as well.
The problem isn't the recursion, but the book-keeping. There's imo two cases:
- your scenario, where you call ->invalidate on an attachment which
doesn't have a mapping. I'll call that very lazy accounting, feels
like a bug :-) It's also very easy to fix by keeping track who
actually has a mapping, and then you fix it everywhere, not just for
the specific case of a recursion into the same caller.
- calling invalidate multiple times. That's my scenario (or your older
one), where you call invalidate again on something that's already
invalidated. Just keeping track of who actually has a mapping won't fix
that, imo the proper fix is to pipeline the unmapping using fences.
But I guess there's other fixes too possible.
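As an illustration of one of those other fixes for the second case, below is a userspace model of the bookkeeping. The struct and the tracked_* helpers are made up for this sketch, they are not part of the patch or of dma-buf.c. The assumption is that an invalidated mapping stays around for a while because its teardown is pipelined, so the core remembers that the notification already went out and doesn't repeat it until the importer maps again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of per-attachment state, not the real struct. */
struct tracked_attachment {
	bool mapped;        /* importer currently holds a mapping */
	bool invalidated;   /* ->invalidate already delivered for that mapping */
	unsigned int calls; /* number of ->invalidate calls delivered */
};

/* A fresh mapping is eligible for exactly one future notification. */
void tracked_map(struct tracked_attachment *a)
{
	a->mapped = true;
	a->invalidated = false;
}

/* The importer finally dropped the (possibly invalidated) mapping. */
void tracked_unmap(struct tracked_attachment *a)
{
	assert(a->mapped);
	a->mapped = false;
}

/* Buffer move: skip attachments with no mapping or an already notified one. */
void tracked_invalidate(struct tracked_attachment *list, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct tracked_attachment *a = &list[i];

		if (!a->mapped || a->invalidated)
			continue;
		a->invalidated = true;
		a->calls++;
	}
}
```

Two buffer moves in a row then produce a single callback per mapping, regardless of whether the second move was caused by the same importer or by some 3rd party.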
Either way none of this is about recursion, I think the recursive case
is simply the one where you've hit this already. Drivers will have to
handle all these additional ->invalidates no matter what with your
current proposal. After all the point here is that the exporter can
move the buffers around whenever it feels like, for whatever reasons.
For solutions I think there's roughly three:
- importers need to deal. You don't like that, I agree
- exporters need to deal, probably not much better, but I think
stricter contract is better in itself at least.
- dma-buf.c keeps better track of mappings and which have been
invalidated already
We could also combine the last two with some helpers, e.g. if your
exporter really expects importers to delay the unmap until it's no
longer in use, then we could do a small helper which puts all these
unmaps onto a list with a worker. But I think you want to integrate
that into your exporter's lru management directly.
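Very roughly, such a helper could look like the userspace model below. Everything in it (model_fence, unmap_queue, queue_unmap, unmap_worker, the fixed-size array) is invented for illustration. A real version would presumably sit on top of the kernel's fence and workqueue infrastructure and hold the reservation lock. The sketch only shows the ordering: the importer drops the mapping right away, but the release happens once the fence has signalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Stand-in for a fence; only the signalled state matters here. */
struct model_fence {
	bool signaled;
};

/* One unmap that has been requested but not yet carried out. */
struct pending_unmap {
	const struct model_fence *fence; /* must signal before the release */
	bool released;
};

struct unmap_queue {
	struct pending_unmap items[MAX_PENDING];
	size_t count;
};

/* Queue an unmap instead of releasing immediately; false if the queue is full. */
bool queue_unmap(struct unmap_queue *q, const struct model_fence *fence)
{
	assert(fence != NULL);
	if (q->count == MAX_PENDING)
		return false;
	q->items[q->count].fence = fence;
	q->items[q->count].released = false;
	q->count++;
	return true;
}

/* Worker pass: release every queued mapping whose fence has signalled. */
size_t unmap_worker(struct unmap_queue *q)
{
	size_t released = 0;

	for (size_t i = 0; i < q->count; i++) {
		struct pending_unmap *p = &q->items[i];

		if (!p->released && p->fence->signaled) {
			p->released = true;
			released++;
		}
	}
	return released; /* number of mappings released in this pass */
}
```

Nobody waits on a fence in this model; the worker simply skips entries that aren't ready and picks them up on a later pass.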
> So this is just the most defensive thing I was able to come up with,
> which leaves the least possibility for drivers to do something stupid.
Maybe we're still talking past each other, but I feel like the big
issues are all still there. Problem identified, yes, solved, no.
Thanks, Daniel
>
> Thanks,
> Christian.
>
> >
> > Thanks, Daniel
> >
> >> +
> >> + } else if (dma_buf_is_dynamic(attach->dmabuf)) {
> >> + reservation_object_lock(attach->dmabuf->resv, NULL);
> >> + r = dma_buf_pin(attach);
> >> + if (r) {
> >> + reservation_object_unlock(attach->dmabuf->resv);
> >> + return ERR_PTR(r);
> >> + }
> >> + }
> >> +
> >> sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
> >> if (!sg_table)
> >> sg_table = ERR_PTR(-ENOMEM);
> >>
> >> + if (dma_buf_attachment_is_dynamic(attach)) {
> >> + list_add(&attach->node, &attach->dmabuf->attachments);
> >> +
> >> + } else if (dma_buf_is_dynamic(attach->dmabuf)) {
> >> + if (IS_ERR(sg_table))
> >> + dma_buf_unpin(attach);
> >> + reservation_object_unlock(attach->dmabuf->resv);
> >> + }
> >> +
> >> if (!IS_ERR(sg_table) && attach->dmabuf->ops->cache_sgt_mapping) {
> >> attach->sgt = sg_table;
> >> attach->dir = direction;
> >> @@ -802,10 +945,41 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
> >> if (attach->sgt == sg_table)
> >> return;
> >>
> >> + if (dma_buf_attachment_is_dynamic(attach))
> >> + reservation_object_assert_held(attach->dmabuf->resv);
> >> + else if (dma_buf_is_dynamic(attach->dmabuf))
> >> + reservation_object_lock(attach->dmabuf->resv, NULL);
> >> +
> >> attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
> >> +
> >> + if (dma_buf_is_dynamic(attach->dmabuf) &&
> >> + !dma_buf_attachment_is_dynamic(attach)) {
> >> + dma_buf_unpin(attach);
> >> + reservation_object_unlock(attach->dmabuf->resv);
> >> + }
> >> }
> >> EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
> >>
> >> +/**
> >> + * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf
> >> + *
> >> + * @dmabuf: [in] buffer which mappings should be invalidated
> >> + *
> >> + * Informs all attachments that they need to destroy and recreate all their
> >> + * mappings.
> >> + */
> >> +void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
> >> +{
> >> + struct dma_buf_attachment *attach;
> >> +
> >> + reservation_object_assert_held(dmabuf->resv);
> >> +
> >> + list_for_each_entry(attach, &dmabuf->attachments, node)
> >> + if (attach->importer_ops && attach->importer_ops->invalidate)
> >> + attach->importer_ops->invalidate(attach);
> >> +}
> >> +EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings);
> >> +
> >> /**
> >> * DOC: cpu access
> >> *
> >> @@ -1225,10 +1399,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
> >> seq_puts(s, "\tAttached Devices:\n");
> >> attach_count = 0;
> >>
> >> + reservation_object_lock(buf_obj->resv, NULL);
> >> list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
> >> seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
> >> attach_count++;
> >> }
> >> + reservation_object_unlock(buf_obj->resv);
> >>
> >> seq_printf(s, "Total %d devices attached\n\n",
> >> attach_count);
> >> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> >> index 01ad5b942a6f..f9c96bf56bc8 100644
> >> --- a/include/linux/dma-buf.h
> >> +++ b/include/linux/dma-buf.h
> >> @@ -92,14 +92,40 @@ struct dma_buf_ops {
> >> */
> >> void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
> >>
> >> + /**
> >> + * @pin:
> >> + *
> >> + * This is called by dma_buf_pin and lets the exporter know that the
> >> + * DMA-buf can't be moved any more.
> >> + *
> >> + * This is called with the dmabuf->resv object locked.
> >> + *
> >> + * This callback is optional.
> >> + *
> >> + * Returns:
> >> + *
> >> + * 0 on success, negative error code on failure.
> >> + */
> >> + int (*pin)(struct dma_buf_attachment *attach);
> >> +
> >> + /**
> >> + * @unpin:
> >> + *
> >> + * This is called by dma_buf_unpin and lets the exporter know that the
> >> + * DMA-buf can be moved again.
> >> + *
> >> + * This is called with the dmabuf->resv object locked.
> >> + *
> >> + * This callback is optional.
> >> + */
> >> + void (*unpin)(struct dma_buf_attachment *attach);
> >> +
> >> /**
> >> * @map_dma_buf:
> >> *
> >> * This is called by dma_buf_map_attachment() and is used to map a
> >> * shared &dma_buf into device address space, and it is mandatory. It
> >> - * can only be called if @attach has been called successfully. This
> >> - * essentially pins the DMA buffer into place, and it cannot be moved
> >> - * any more
> >> + * can only be called if @attach has been called successfully.
> >> *
> >> * This call may sleep, e.g. when the backing storage first needs to be
> >> * allocated, or moved to a location suitable for all currently attached
> >> @@ -120,6 +146,9 @@ struct dma_buf_ops {
> >> * any other kind of sharing that the exporter might wish to make
> >> * available to buffer-users.
> >> *
> >> + * This is always called with the dmabuf->resv object locked when
> >> + * the pin/unpin callbacks are implemented.
> >> + *
> >> * Returns:
> >> *
> >> * A &sg_table scatter list of or the backing storage of the DMA buffer,
> >> @@ -137,9 +166,6 @@ struct dma_buf_ops {
> >> *
> >> * This is called by dma_buf_unmap_attachment() and should unmap and
> >> * release the &sg_table allocated in @map_dma_buf, and it is mandatory.
> >> - * It should also unpin the backing storage if this is the last mapping
> >> - * of the DMA buffer, it the exporter supports backing storage
> >> - * migration.
> >> */
> >> void (*unmap_dma_buf)(struct dma_buf_attachment *,
> >> struct sg_table *,
> >> @@ -330,6 +356,35 @@ struct dma_buf {
> >> } cb_excl, cb_shared;
> >> };
> >>
> >> +/**
> >> + * struct dma_buf_attach_ops - importer operations for an attachment
> >> + * @invalidate: [optional] invalidate all mappings made using this attachment.
> >> + *
> >> + * Attachment operations implemented by the importer.
> >> + */
> >> +struct dma_buf_attach_ops {
> >> + /**
> >> + * @invalidate:
> >> + *
> >> + * If the invalidate callback is provided the framework can avoid
> >> + * pinning the backing store while mappings exist.
> >> + *
> >> + * This callback is called with the lock of the reservation object
> >> + * associated with the dma_buf held and the mapping function must be
> >> + * called with this lock held as well. This makes sure that no mapping
> >> + * is created concurrently with an ongoing invalidation.
> >> + *
> >> + * After the callback all existing mappings are still valid until all
> >> + * fences in the dma_bufs reservation object are signaled. After getting an
> >> + * invalidation callback all mappings should be destroyed by the importer using
> >> + * the normal dma_buf_unmap_attachment() function as soon as possible.
> >> + *
> >> + * New mappings can be created immediately, but can't be used before the
> >> + * exclusive fence in the dma_bufs reservation object is signaled.
> >> + */
> >> + void (*invalidate)(struct dma_buf_attachment *attach);
> >> +};
> >> +
> >> /**
> >> * struct dma_buf_attachment - holds device-buffer attachment data
> >> * @dmabuf: buffer for this attachment.
> >> @@ -338,6 +393,8 @@ struct dma_buf {
> >> * @sgt: cached mapping.
> >> * @dir: direction of cached mapping.
> >> * @priv: exporter specific attachment data.
> >> + * @importer_ops: importer operations for this attachment.
> >> + * @importer_priv: importer specific attachment data.
> >> *
> >> * This structure holds the attachment information between the dma_buf buffer
> >> * and its user device(s). The list contains one attachment struct per device
> >> @@ -355,6 +412,9 @@ struct dma_buf_attachment {
> >> struct sg_table *sgt;
> >> enum dma_data_direction dir;
> >> void *priv;
> >> +
> >> + const struct dma_buf_attach_ops *importer_ops;
> >> + void *importer_priv;
> >> };
> >>
> >> /**
> >> @@ -405,10 +465,42 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
> >> get_file(dmabuf->file);
> >> }
> >>
> >> +/**
> >> + * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
> >> + * @dmabuf: the DMA-buf to check
> >> + *
> >> + * Returns true if a DMA-buf exporter wants to create dynamic sg table mappings
> >> + * for each attachment. False if only a single static sg table should be used.
> >> + */
> >> +static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
> >> +{
> >> + return !!dmabuf->ops->pin;
> >> +}
> >> +
> >> +/**
> >> + * dma_buf_attachment_is_dynamic - check if a DMA-buf attachment uses dynamic
> >> + * mappings
> >> + * @attach: the DMA-buf attachment to check
> >> + *
> >> + * Returns true if a DMA-buf importer wants to use dynamic sg table mappings and
> >> + * calls the map/unmap functions with the reservation object locked.
> >> + */
> >> +static inline bool
> >> +dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
> >> +{
> >> + return attach->importer_ops && attach->importer_ops->invalidate;
> >> +}
> >> +
> >> struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> >> - struct device *dev);
> >> + struct device *dev);
> >> +struct dma_buf_attachment *
> >> +dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
> >> + const struct dma_buf_attach_ops *importer_ops,
> >> + void *importer_priv);
> >> void dma_buf_detach(struct dma_buf *dmabuf,
> >> - struct dma_buf_attachment *dmabuf_attach);
> >> + struct dma_buf_attachment *attach);
> >> +int dma_buf_pin(struct dma_buf_attachment *attach);
> >> +void dma_buf_unpin(struct dma_buf_attachment *attach);
> >>
> >> struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info);
> >>
> >> @@ -420,6 +512,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
> >> enum dma_data_direction);
> >> void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
> >> enum dma_data_direction);
> >> +void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
> >> int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
> >> enum dma_data_direction dir);
> >> int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
> >> --
> >> 2.17.1
> >>
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch