[PATCH] dma-buf: fix and rework dma_buf_poll v3

Daniel Vetter daniel.vetter at ffwll.ch
Wed Jun 23 11:30:22 UTC 2021


On Wed, Jun 23, 2021 at 1:17 PM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> Am 22.06.21 um 19:02 schrieb Daniel Vetter:
> > On Tue, Jun 22, 2021 at 3:07 PM Christian König
> > <ckoenig.leichtzumerken at gmail.com> wrote:
> >> Crap, hit enter too early before adding a cover letter.
> >>
> >> This is the same patch as before, but as requested I'm keeping the
> >> exclusive fence handling as it is for now.
> >>
> >> Daniel can you double check this and/or make sure that it is tested?
> >>
> >> I only smoke-tested it, and the code is so complicated that I'm not sure
> >> I caught all the side effects.
> > So I've thought about this some more, and we actually have docs for
> > how this is supposed to work:
> >
> > https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#implicit-fence-poll-support
> >
> > The docs are pretty clear that EPOLLOUT should wait for both read and
> > write access, i.e. both the exclusive and the shared fences. So I guess
> > back to your actual thing, but maybe we should get some acks from
> > userspace people for it (Michel, Pekka and Simon are probably the usual
> > suspects).
>
> Ok, sounds good to me. Going to send out a patch rebased to
> drm-misc-fixes today.
>
> >
> > The other thing I brought up, and haven't seen you reply to (maybe you
> > missed it), is whether we shouldn't just use dma_resv_get_fences(). We
> > need to do the refcounting anyway, and it saves us from open-coding
> > this very nasty logic. Finally, and imo most importantly, this is also
> > what drm_gem_fence_array_add_implicit() uses, which many drivers rely
> > on to handle their implicit in-fences. So it would be nice to have
> > exactly matching code between that and what dma-buf poll does for
> > "can I read" and "can I write".
> >
> > Thoughts?
>
> Yeah, I've seen that. I just didn't have time to answer.
>
> That goes into the same direction as my thinking that we need to
> centralize the RCU and synchronization handling in general.
>
> What I don't like about the approach is that dma_resv_get_fences() needs
> to allocate memory. For a lot of use cases, including dma_buf_poll, that
> is rather overkill.
>
> To unify the handling I think we should use the iterator I've created a
> prototype of. That way we only have a "for_write" parameter, and in
> return the iterator gives you all the fences you need.

Yeah, I think in general I agree; especially in the CS code a bunch of
temporary allocations isn't great. But dma_buf_poll isn't a place where
anyone cares about that, so I think we can just use dma_resv_get_fences()
here and forget about all the trickiness. The gem helper otoh wants an
iterator (without retry even, since it's holding the dma-resv lock).
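Just to make sure we're talking about the same shape, here's a rough
userspace mock of how I read the iterator idea. All of the types and
names below are made up for illustration; nothing here is the actual
dma_resv API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Purely illustrative mock; not the real dma_resv types. */
struct mock_fence {
	bool is_writer;	/* exclusive/write fence vs. shared/read fence */
};

struct mock_resv {
	struct mock_fence *fences;
	size_t count;
};

struct mock_resv_iter {
	const struct mock_resv *resv;
	size_t index;
	bool for_write;
};

static void mock_resv_iter_begin(struct mock_resv_iter *it,
				 const struct mock_resv *resv,
				 bool for_write)
{
	it->resv = resv;
	it->index = 0;
	it->for_write = for_write;
}

/*
 * Return the next fence the caller has to wait on, or NULL when done.
 * Writers always matter; readers only matter when the caller itself
 * wants to write.
 */
static struct mock_fence *mock_resv_iter_next(struct mock_resv_iter *it)
{
	while (it->index < it->resv->count) {
		struct mock_fence *f = &it->resv->fences[it->index++];

		if (f->is_writer || it->for_write)
			return f;
	}
	return NULL;
}
```

A dma_buf_poll() built on top of something like this would just walk the
iterator with for_write = (events & EPOLLOUT) and install its callback on
the first fence that accepts one.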

> When we then extend that approach, we could say we have an enum
> describing the use case. Something like:
> 1. For explicit sync, just give me all the must sync fences from memory
> management.
> 2. For read, give me all the writers and memory management fences.
> 3. For write, give me all the readers and writers and memory management
> fences.
> 4. For memory management, give me everything including things like PTE
> updates/TLB flushes.
>
> So instead of asking for some specific type of fence(s), the driver
> tells the dma_resv object about its use case and in return gets the
> fences it needs to wait for.
>
> This essentially means that we move the decision what to wait for from
> the drivers into the dma_resv object, which I think would be a massive
> improvement.
>
> Functions like dma_resv_get_list(), dma_resv_get_excl(),
> dma_resv_get_excl_rcu() etc would then essentially be deprecated and
> instead you use dma_resv_get_fences(), dma_resv_for_each_fences(),
> dma_resv_wait_timeout(), dma_resv_test_signaled() with a proper use case.
>
> What do you think?

Yeah, I think in general the direction makes sense, at least long term.
For now I think it's better to go with the simplest solution first, until
we have everyone aligned on one set of rules and have those rules
properly documented.
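For concreteness, here's a rough userspace mock of how I read the
use-case enum idea. Every name below is invented for illustration; this
is not an existing API, just a way to make the filtering rule explicit:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical sketch of the proposed use-case enum. Each fence is
 * tagged with the class it belongs to, ordered from "always relevant"
 * down to "only relevant for memory management".
 */
enum mock_resv_usage {
	MOCK_USAGE_KERNEL,	/* memory management work */
	MOCK_USAGE_WRITE,	/* implicit-sync writers */
	MOCK_USAGE_READ,	/* implicit-sync readers */
	MOCK_USAGE_BOOKKEEP,	/* PTE updates, TLB flushes, ... */
};

/*
 * A request for a given use case gets every fence tagged with that
 * class or a lower one: requesting MOCK_USAGE_WRITE ("I want to read")
 * yields kernel fences plus writers, requesting MOCK_USAGE_READ ("I
 * want to write") additionally yields readers, and so on, matching
 * cases 1-4 in the list above.
 */
static bool mock_usage_matches(enum mock_resv_usage fence_usage,
			       enum mock_resv_usage request)
{
	return fence_usage <= request;
}
```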
-Daniel

> Christian.
>
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >> Am 22.06.21 um 15:04 schrieb Christian König:
> >>> Daniel pointed me towards this function and there are multiple obvious problems
> >>> in the implementation.
> >>>
> >>> First of all, the retry loop is not working as intended. In general the
> >>> retry only makes sense if you grab the reference first and then check the
> >>> sequence values.
> >>>
> >>> It's also good practice to keep the reference around when installing callbacks
> >>> to fences you don't own.
> >>>
> >>> And lastly, the whole implementation was unnecessarily complex and rather
> >>> hard to understand, which could lead to unexpected behavior of the IOCTL.
> >>>
> >>> Fix all this by reworking the implementation from scratch.
> >>>
> >>> Only mildly tested and needs a thoughtful review of the code.
> >>>
> >>> v2: fix the reference counting as well
> >>> v3: keep the excl fence handling as is for stable
> >>>
> >>> Signed-off-by: Christian König <christian.koenig at amd.com>
> >>> CC: stable at vger.kernel.org
> >>> ---
> >>>    drivers/dma-buf/dma-buf.c | 133 ++++++++++++++++----------------------
> >>>    include/linux/dma-buf.h   |   2 +-
> >>>    2 files changed, 55 insertions(+), 80 deletions(-)
> >>>
> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >>> index eadd1eaa2fb5..e97c3a9d98d5 100644
> >>> --- a/drivers/dma-buf/dma-buf.c
> >>> +++ b/drivers/dma-buf/dma-buf.c
> >>> @@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry)
> >>>         * If you hit this BUG() it means someone dropped their ref to the
> >>>         * dma-buf while still having pending operation to the buffer.
> >>>         */
> >>> -     BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
> >>> +     BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
> >>>
> >>>        dmabuf->ops->release(dmabuf);
> >>>
> >>> @@ -202,16 +202,20 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
> >>>        wake_up_locked_poll(dcb->poll, dcb->active);
> >>>        dcb->active = 0;
> >>>        spin_unlock_irqrestore(&dcb->poll->lock, flags);
> >>> +     dma_fence_put(fence);
> >>>    }
> >>>
> >>>    static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
> >>>    {
> >>> +     struct dma_buf_poll_cb_t *dcb;
> >>>        struct dma_buf *dmabuf;
> >>>        struct dma_resv *resv;
> >>>        struct dma_resv_list *fobj;
> >>>        struct dma_fence *fence_excl;
> >>> -     __poll_t events;
> >>>        unsigned shared_count, seq;
> >>> +     struct dma_fence *fence;
> >>> +     __poll_t events;
> >>> +     int r, i;
> >>>
> >>>        dmabuf = file->private_data;
> >>>        if (!dmabuf || !dmabuf->resv)
> >>> @@ -225,99 +229,70 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
> >>>        if (!events)
> >>>                return 0;
> >>>
> >>> +     dcb = events & EPOLLOUT ? &dmabuf->cb_out : &dmabuf->cb_in;
> >>> +
> >>> +     /* Only queue a new one if we are not still waiting for the old one */
> >>> +     spin_lock_irq(&dmabuf->poll.lock);
> >>> +     if (dcb->active)
> >>> +             events = 0;
> >>> +     else
> >>> +             dcb->active = events;
> >>> +     spin_unlock_irq(&dmabuf->poll.lock);
> >>> +     if (!events)
> >>> +             return 0;
> >>> +
> >>>    retry:
> >>>        seq = read_seqcount_begin(&resv->seq);
> >>>        rcu_read_lock();
> >>>
> >>>        fobj = rcu_dereference(resv->fence);
> >>> -     if (fobj)
> >>> +     if (fobj && events & EPOLLOUT)
> >>>                shared_count = fobj->shared_count;
> >>>        else
> >>>                shared_count = 0;
> >>> -     fence_excl = rcu_dereference(resv->fence_excl);
> >>> -     if (read_seqcount_retry(&resv->seq, seq)) {
> >>> -             rcu_read_unlock();
> >>> -             goto retry;
> >>> -     }
> >>>
> >>> -     if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) {
> >>> -             struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
> >>> -             __poll_t pevents = EPOLLIN;
> >>> -
> >>> -             if (shared_count == 0)
> >>> -                     pevents |= EPOLLOUT;
> >>> -
> >>> -             spin_lock_irq(&dmabuf->poll.lock);
> >>> -             if (dcb->active) {
> >>> -                     dcb->active |= pevents;
> >>> -                     events &= ~pevents;
> >>> -             } else
> >>> -                     dcb->active = pevents;
> >>> -             spin_unlock_irq(&dmabuf->poll.lock);
> >>> -
> >>> -             if (events & pevents) {
> >>> -                     if (!dma_fence_get_rcu(fence_excl)) {
> >>> -                             /* force a recheck */
> >>> -                             events &= ~pevents;
> >>> -                             dma_buf_poll_cb(NULL, &dcb->cb);
> >>> -                     } else if (!dma_fence_add_callback(fence_excl, &dcb->cb,
> >>> -                                                        dma_buf_poll_cb)) {
> >>> -                             events &= ~pevents;
> >>> -                             dma_fence_put(fence_excl);
> >>> -                     } else {
> >>> -                             /*
> >>> -                              * No callback queued, wake up any additional
> >>> -                              * waiters.
> >>> -                              */
> >>> -                             dma_fence_put(fence_excl);
> >>> -                             dma_buf_poll_cb(NULL, &dcb->cb);
> >>> -                     }
> >>> +     for (i = 0; i < shared_count; ++i) {
> >>> +             fence = rcu_dereference(fobj->shared[i]);
> >>> +             fence = dma_fence_get_rcu(fence);
> >>> +             if (!fence || read_seqcount_retry(&resv->seq, seq)) {
> >>> +                     /* Concurrent modify detected, force re-check */
> >>> +                     dma_fence_put(fence);
> >>> +                     rcu_read_unlock();
> >>> +                     goto retry;
> >>>                }
> >>> -     }
> >>>
> >>> -     if ((events & EPOLLOUT) && shared_count > 0) {
> >>> -             struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
> >>> -             int i;
> >>> -
> >>> -             /* Only queue a new callback if no event has fired yet */
> >>> -             spin_lock_irq(&dmabuf->poll.lock);
> >>> -             if (dcb->active)
> >>> -                     events &= ~EPOLLOUT;
> >>> -             else
> >>> -                     dcb->active = EPOLLOUT;
> >>> -             spin_unlock_irq(&dmabuf->poll.lock);
> >>> -
> >>> -             if (!(events & EPOLLOUT))
> >>> +             r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
> >>> +             if (!r) {
> >>> +                     /* Callback queued */
> >>> +                     events = 0;
> >>>                        goto out;
> >>> +             }
> >>> +             dma_fence_put(fence);
> >>> +     }
> >>>
> >>> -             for (i = 0; i < shared_count; ++i) {
> >>> -                     struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
> >>> -
> >>> -                     if (!dma_fence_get_rcu(fence)) {
> >>> -                             /*
> >>> -                              * fence refcount dropped to zero, this means
> >>> -                              * that fobj has been freed
> >>> -                              *
> >>> -                              * call dma_buf_poll_cb and force a recheck!
> >>> -                              */
> >>> -                             events &= ~EPOLLOUT;
> >>> -                             dma_buf_poll_cb(NULL, &dcb->cb);
> >>> -                             break;
> >>> -                     }
> >>> -                     if (!dma_fence_add_callback(fence, &dcb->cb,
> >>> -                                                 dma_buf_poll_cb)) {
> >>> -                             dma_fence_put(fence);
> >>> -                             events &= ~EPOLLOUT;
> >>> -                             break;
> >>> -                     }
> >>> +     fence = dma_resv_get_excl(resv);
> >>> +     if (fence && shared_count == 0) {
> >>> +             fence = dma_fence_get_rcu(fence);
> >>> +             if (!fence || read_seqcount_retry(&resv->seq, seq)) {
> >>> +                     /* Concurrent modify detected, force re-check */
> >>>                        dma_fence_put(fence);
> >>> +                     rcu_read_unlock();
> >>> +                     goto retry;
> >>> +
> >>>                }
> >>>
> >>> -             /* No callback queued, wake up any additional waiters. */
> >>> -             if (i == shared_count)
> >>> -                     dma_buf_poll_cb(NULL, &dcb->cb);
> >>> +             r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
> >>> +             if (!r) {
> >>> +                     /* Callback queued */
> >>> +                     events = 0;
> >>> +                     goto out;
> >>> +             }
> >>> +             dma_fence_put(fence);
> >>>        }
> >>>
> >>> +     /* No callback queued, wake up any additional waiters. */
> >>> +     dma_buf_poll_cb(NULL, &dcb->cb);
> >>> +
> >>>    out:
> >>>        rcu_read_unlock();
> >>>        return events;
> >>> @@ -562,8 +537,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
> >>>        dmabuf->owner = exp_info->owner;
> >>>        spin_lock_init(&dmabuf->name_lock);
> >>>        init_waitqueue_head(&dmabuf->poll);
> >>> -     dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
> >>> -     dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
> >>> +     dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll;
> >>> +     dmabuf->cb_in.active = dmabuf->cb_out.active = 0;
> >>>
> >>>        if (!resv) {
> >>>                resv = (struct dma_resv *)&dmabuf[1];
> >>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> >>> index efdc56b9d95f..7e747ad54c81 100644
> >>> --- a/include/linux/dma-buf.h
> >>> +++ b/include/linux/dma-buf.h
> >>> @@ -329,7 +329,7 @@ struct dma_buf {
> >>>                wait_queue_head_t *poll;
> >>>
> >>>                __poll_t active;
> >>> -     } cb_excl, cb_shared;
> >>> +     } cb_in, cb_out;
> >>>    };
> >>>
> >>>    /**
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

