[PATCH] dma-buf: fix and rework dma_buf_poll v3
Christian König
ckoenig.leichtzumerken at gmail.com
Wed Jun 23 12:42:13 UTC 2021
Am 23.06.21 um 13:30 schrieb Daniel Vetter:
> On Wed, Jun 23, 2021 at 1:17 PM Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
>> Am 22.06.21 um 19:02 schrieb Daniel Vetter:
>>> On Tue, Jun 22, 2021 at 3:07 PM Christian König
>>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>> Crap, hit enter to early before adding a cover letter.
>>>>
>>>> This is the same patch as before, but as requested I'm keeping the
>>>> exclusive fence handling as it is for now.
>>>>
>>>> Daniel can you double check this and/or make sure that it is tested?
>>>>
>>>> I only smoke tested it and the code is so complicated that I'm not sure
>>>> I catched all side effects.
>>> So I've thought about this some more, and we actually have docs for
>>> how this is supposed to work:
>>>
>>> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#implicit-fence-poll-support
>>>
>>> Docs are pretty clear that we want both read and write for EPOLLOUT or
>>> well both exclusive and shared fences. So I guess back to your actual
>>> thing, but maybe we should get some acks from userspace people for it
>>> (Michel, Pekka, Simon probably usual suspects).
>> Ok, sounds good to me. Going to send out a patch rebased to
>> drm-misc-fixes today.
>>
>>> The other thing I brought up and I haven't seen you reply to (maybe
>>> missed it) is whether we shouldn't just use dma_resv_get_fences(). We
>>> need to do the refcounting anyway, and this avoids us having to
>>> open-code this very nasty code. Finally, and imo most important, this
>>> is what's also used in drm_gem_fence_array_add_implicit(), which many
>>> drivers use to handle their implicit in-fences. So would be nice to
>>> have exactly matching code between that and what dma-buf poll does for
>>> "can I read" and "can I write".
>>>
>>> Thoughts?
>> Yeah, I've seen that. Just didn't had time to answer.
>>
>> That goes into the same direction as my thinking that we need to
>> centralize the RCU and synchronization handling in general.
>>
>> What I don't like about the approach is that dma_resv_get_fences() needs
>> to allocate memory. For a lot of use cases including dma_buf_poll that
>> is rather overkill.
>>
>> To unify the handling I think we should use the iterator I've create the
>> prototype of. This way we only have an "for_write" parameter and the
>> iterator in return gives you all the fences you need.
> Yeah I think in general I agree, especially in the CS code a bunch of
> temporary allocations aren't great. But in dma_buf_poll I don't think
> it's a place where anyone cares. That's why I think we can just use
> that here and forget about all the trickiness. The gem helper otoh
> wants an iterator (without retry even, since it's holding dma-resv
> lock).
Well then I would rather say we make nails with heads and grab the
reservation lock in dma_buf_poll.
That has at least less overhead than allocating memory, cause
essentially what dma_buf_poll needs is a
dma_resv_get_next_unsignaled_or_null_fence().
>> When you then extend that approach we could say we have an enum
>> describing the use case. Something like:
>> 1. For explicit sync, just give me all the must sync fences from memory
>> management.
>> 2. For read, give me all the writers and memory management fences.
>> 3. For write, give me all the readers and writers and memory management
>> fences.
>> 4. For memory management, give me everything including things like PTE
>> updates/TLB flushes.
>>
>> So instead of asking for some specific type of fence(s) the drivers
>> tells the dma_resv object about their use case and in return get the
>> fences they need to wait for.
>>
>> This essentially means that we move the decision what to wait for from
>> the drivers into the dma_resv object, which I think would be a massive
>> improvement.
>>
>> Functions like dma_resv_get_list(), dma_resv_get_excl(),
>> dma_resv_get_excl_rcu() etc would then essentially be deprecated and
>> instead you use dma_resv_get_fences(), dma_resv_for_each_fences(),
>> dma_resv_wait_timeout(), dma_resv_test_signaled() with a proper use case.
>>
>> What do you think?
> Yeah I think in general the direction makes sense, at least long term.
> I think for now it's better to go with simplest solutions first until
> we have everyone aligned on one set of rules, and have these rules
> properly documented.
I think that looks rather good now, doesn't it?
Christian.
> -Daniel
>
>> Christian.
>>
>>> -Daniel
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 22.06.21 um 15:04 schrieb Christian König:
>>>>> Daniel pointed me towards this function and there are multiple obvious problems
>>>>> in the implementation.
>>>>>
>>>>> First of all the retry loop is not working as intended. In general the retry
>>>>> makes only sense if you grab the reference first and then check the sequence
>>>>> values.
>>>>>
>>>>> It's also good practice to keep the reference around when installing callbacks
>>>>> to fences you don't own.
>>>>>
>>>>> And last the whole implementation was unnecessary complex and rather hard to
>>>>> understand which could lead to probably unexpected behavior of the IOCTL.
>>>>>
>>>>> Fix all this by reworking the implementation from scratch.
>>>>>
>>>>> Only mildly tested and needs a thoughtful review of the code.
>>>>>
>>>>> v2: fix the reference counting as well
>>>>> v3: keep the excl fence handling as is for stable
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>> CC: stable at vger.kernel.org
>>>>> ---
>>>>> drivers/dma-buf/dma-buf.c | 133 ++++++++++++++++----------------------
>>>>> include/linux/dma-buf.h | 2 +-
>>>>> 2 files changed, 55 insertions(+), 80 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>>>> index eadd1eaa2fb5..e97c3a9d98d5 100644
>>>>> --- a/drivers/dma-buf/dma-buf.c
>>>>> +++ b/drivers/dma-buf/dma-buf.c
>>>>> @@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry)
>>>>> * If you hit this BUG() it means someone dropped their ref to the
>>>>> * dma-buf while still having pending operation to the buffer.
>>>>> */
>>>>> - BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
>>>>> + BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
>>>>>
>>>>> dmabuf->ops->release(dmabuf);
>>>>>
>>>>> @@ -202,16 +202,20 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
>>>>> wake_up_locked_poll(dcb->poll, dcb->active);
>>>>> dcb->active = 0;
>>>>> spin_unlock_irqrestore(&dcb->poll->lock, flags);
>>>>> + dma_fence_put(fence);
>>>>> }
>>>>>
>>>>> static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>>>>> {
>>>>> + struct dma_buf_poll_cb_t *dcb;
>>>>> struct dma_buf *dmabuf;
>>>>> struct dma_resv *resv;
>>>>> struct dma_resv_list *fobj;
>>>>> struct dma_fence *fence_excl;
>>>>> - __poll_t events;
>>>>> unsigned shared_count, seq;
>>>>> + struct dma_fence *fence;
>>>>> + __poll_t events;
>>>>> + int r, i;
>>>>>
>>>>> dmabuf = file->private_data;
>>>>> if (!dmabuf || !dmabuf->resv)
>>>>> @@ -225,99 +229,70 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>>>>> if (!events)
>>>>> return 0;
>>>>>
>>>>> + dcb = events & EPOLLOUT ? &dmabuf->cb_out : &dmabuf->cb_in;
>>>>> +
>>>>> + /* Only queue a new one if we are not still waiting for the old one */
>>>>> + spin_lock_irq(&dmabuf->poll.lock);
>>>>> + if (dcb->active)
>>>>> + events = 0;
>>>>> + else
>>>>> + dcb->active = events;
>>>>> + spin_unlock_irq(&dmabuf->poll.lock);
>>>>> + if (!events)
>>>>> + return 0;
>>>>> +
>>>>> retry:
>>>>> seq = read_seqcount_begin(&resv->seq);
>>>>> rcu_read_lock();
>>>>>
>>>>> fobj = rcu_dereference(resv->fence);
>>>>> - if (fobj)
>>>>> + if (fobj && events & EPOLLOUT)
>>>>> shared_count = fobj->shared_count;
>>>>> else
>>>>> shared_count = 0;
>>>>> - fence_excl = rcu_dereference(resv->fence_excl);
>>>>> - if (read_seqcount_retry(&resv->seq, seq)) {
>>>>> - rcu_read_unlock();
>>>>> - goto retry;
>>>>> - }
>>>>>
>>>>> - if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) {
>>>>> - struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
>>>>> - __poll_t pevents = EPOLLIN;
>>>>> -
>>>>> - if (shared_count == 0)
>>>>> - pevents |= EPOLLOUT;
>>>>> -
>>>>> - spin_lock_irq(&dmabuf->poll.lock);
>>>>> - if (dcb->active) {
>>>>> - dcb->active |= pevents;
>>>>> - events &= ~pevents;
>>>>> - } else
>>>>> - dcb->active = pevents;
>>>>> - spin_unlock_irq(&dmabuf->poll.lock);
>>>>> -
>>>>> - if (events & pevents) {
>>>>> - if (!dma_fence_get_rcu(fence_excl)) {
>>>>> - /* force a recheck */
>>>>> - events &= ~pevents;
>>>>> - dma_buf_poll_cb(NULL, &dcb->cb);
>>>>> - } else if (!dma_fence_add_callback(fence_excl, &dcb->cb,
>>>>> - dma_buf_poll_cb)) {
>>>>> - events &= ~pevents;
>>>>> - dma_fence_put(fence_excl);
>>>>> - } else {
>>>>> - /*
>>>>> - * No callback queued, wake up any additional
>>>>> - * waiters.
>>>>> - */
>>>>> - dma_fence_put(fence_excl);
>>>>> - dma_buf_poll_cb(NULL, &dcb->cb);
>>>>> - }
>>>>> + for (i = 0; i < shared_count; ++i) {
>>>>> + fence = rcu_dereference(fobj->shared[i]);
>>>>> + fence = dma_fence_get_rcu(fence);
>>>>> + if (!fence || read_seqcount_retry(&resv->seq, seq)) {
>>>>> + /* Concurrent modify detected, force re-check */
>>>>> + dma_fence_put(fence);
>>>>> + rcu_read_unlock();
>>>>> + goto retry;
>>>>> }
>>>>> - }
>>>>>
>>>>> - if ((events & EPOLLOUT) && shared_count > 0) {
>>>>> - struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
>>>>> - int i;
>>>>> -
>>>>> - /* Only queue a new callback if no event has fired yet */
>>>>> - spin_lock_irq(&dmabuf->poll.lock);
>>>>> - if (dcb->active)
>>>>> - events &= ~EPOLLOUT;
>>>>> - else
>>>>> - dcb->active = EPOLLOUT;
>>>>> - spin_unlock_irq(&dmabuf->poll.lock);
>>>>> -
>>>>> - if (!(events & EPOLLOUT))
>>>>> + r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
>>>>> + if (!r) {
>>>>> + /* Callback queued */
>>>>> + events = 0;
>>>>> goto out;
>>>>> + }
>>>>> + dma_fence_put(fence);
>>>>> + }
>>>>>
>>>>> - for (i = 0; i < shared_count; ++i) {
>>>>> - struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
>>>>> -
>>>>> - if (!dma_fence_get_rcu(fence)) {
>>>>> - /*
>>>>> - * fence refcount dropped to zero, this means
>>>>> - * that fobj has been freed
>>>>> - *
>>>>> - * call dma_buf_poll_cb and force a recheck!
>>>>> - */
>>>>> - events &= ~EPOLLOUT;
>>>>> - dma_buf_poll_cb(NULL, &dcb->cb);
>>>>> - break;
>>>>> - }
>>>>> - if (!dma_fence_add_callback(fence, &dcb->cb,
>>>>> - dma_buf_poll_cb)) {
>>>>> - dma_fence_put(fence);
>>>>> - events &= ~EPOLLOUT;
>>>>> - break;
>>>>> - }
>>>>> + fence = dma_resv_get_excl(resv);
>>>>> + if (fence && shared_count == 0) {
>>>>> + fence = dma_fence_get_rcu(fence);
>>>>> + if (!fence || read_seqcount_retry(&resv->seq, seq)) {
>>>>> + /* Concurrent modify detected, force re-check */
>>>>> dma_fence_put(fence);
>>>>> + rcu_read_unlock();
>>>>> + goto retry;
>>>>> +
>>>>> }
>>>>>
>>>>> - /* No callback queued, wake up any additional waiters. */
>>>>> - if (i == shared_count)
>>>>> - dma_buf_poll_cb(NULL, &dcb->cb);
>>>>> + r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
>>>>> + if (!r) {
>>>>> + /* Callback queued */
>>>>> + events = 0;
>>>>> + goto out;
>>>>> + }
>>>>> + dma_fence_put(fence_excl);
>>>>> }
>>>>>
>>>>> + /* No callback queued, wake up any additional waiters. */
>>>>> + dma_buf_poll_cb(NULL, &dcb->cb);
>>>>> +
>>>>> out:
>>>>> rcu_read_unlock();
>>>>> return events;
>>>>> @@ -562,8 +537,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>>>>> dmabuf->owner = exp_info->owner;
>>>>> spin_lock_init(&dmabuf->name_lock);
>>>>> init_waitqueue_head(&dmabuf->poll);
>>>>> - dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
>>>>> - dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
>>>>> + dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll;
>>>>> + dmabuf->cb_in.active = dmabuf->cb_out.active = 0;
>>>>>
>>>>> if (!resv) {
>>>>> resv = (struct dma_resv *)&dmabuf[1];
>>>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>>>> index efdc56b9d95f..7e747ad54c81 100644
>>>>> --- a/include/linux/dma-buf.h
>>>>> +++ b/include/linux/dma-buf.h
>>>>> @@ -329,7 +329,7 @@ struct dma_buf {
>>>>> wait_queue_head_t *poll;
>>>>>
>>>>> __poll_t active;
>>>>> - } cb_excl, cb_shared;
>>>>> + } cb_in, cb_out;
>>>>> };
>>>>>
>>>>> /**
>
More information about the dri-devel
mailing list