[PATCH 5/5] [RFC]drm: add syncobj timeline support v3

Chunming Zhou zhoucm1 at amd.com
Mon Sep 3 04:13:09 UTC 2018



在 2018/8/30 19:32, Christian König 写道:
> [SNIP]
>>>>>
>>>>>> +
>>>>>> +struct drm_syncobj_wait_pt {
>>>>>> +    struct drm_syncobj_stub_fence base;
>>>>>> +    u64    value;
>>>>>> +    struct rb_node   node;
>>>>>> +};
>>>>>> +struct drm_syncobj_signal_pt {
>>>>>> +    struct drm_syncobj_stub_fence base;
>>>>>> +    struct dma_fence *signal_fence;
>>>>>> +    struct dma_fence *pre_pt_base;
>>>>>> +    struct dma_fence_cb signal_cb;
>>>>>> +    struct dma_fence_cb pre_pt_cb;
>>>>>> +    struct drm_syncobj *syncobj;
>>>>>> +    u64    value;
>>>>>> +    struct list_head list;
>>>>>> +};
>>>>>
>>>>> What are those two structures good for
>>>> They are used to record wait ops points and signal ops points.
>>>> For timeline, they are connected by timeline value, works like:
>>>>     a. signal pt increase timeline value to signal_pt value when 
>>>> signal_pt is signaled. signal_pt is depending on previous pt fence 
>>>> and itself signal fence from signal ops.
>>>>     b. wait pt tree checks if timeline value over itself value.
>>>>
>>>> For normal, works like:
>>>>     a. signal pt increase 1 for syncobj_timeline->signal_point 
>>>> every time when signal ops is performed.
>>>>     b. when wait ops is comming, wait pt will fetch above last 
>>>> signal pt value as its wait point. wait pt will be only signaled by 
>>>> equal point value signal_pt.
>>>>
>>>>
>>>>> and why is the stub fence their base?
>>>> Good question, I tried to kzalloc for them as well when I debug 
>>>> them, I encountered a problem:
>>>> I lookup/find wait_pt or signal_pt successfully, but when I tried 
>>>> to use them, they are freed sometimes, and results in NULL point.
>>>> and generally, when lookup them, we often need their stub fence as 
>>>> well, and in the meantime,  their lifecycle are same.
>>>> Above reason, I make stub fence as their base.
>>>
>>> That sounds like you only did this because you messed up the lifecycle.
>>>
>>> Additional to that I don't think you correctly considered the 
>>> lifecycle of the waits and the sync object itself. E.g. blocking in 
>>> drm_syncobj_timeline_fini() until all waits are done is not a good 
>>> idea.
>>>
>>> What you should do instead is to create a fence_array object with 
>>> all the fence we need to wait for when a wait point is requested.
>> Yeah, this is our initial discussion result, but when I tried to do 
>> that, I found that cannot meet the advance signal requirement:
>>     a. We need to consider the wait and signal pt value are not 
>> one-to-one match, it's difficult to find dependent point, at least, 
>> there is some overhead.
>
> As far as I can see that is independent of using a fence array here. 
> See you can either use a ring buffer or an rb-tree, but when you want 
> to wait for a specific point we need to condense the not yet signaled 
> fences into an array.
again, need to find the range of where the specific point in, we should 
close to timeline semantics, I also refered the sw_sync.c timeline, 
usally wait_pt is signalled by timeline point. And I agree we can 
implement it with several methods, but I don't think there are basical 
differences.

>
>>     b. because we allowed "wait-before-signal", if 
>> "wait-before-signal" happens, there isn't signal fence which can be 
>> used to create fence array.
>
> Well, again we DON'T support wait-before-signal here. I will certainly 
> NAK any implementation which tries to do this until we haven't figured 
> out all the resource management constraints and I still don't see how 
> we can do this.
yes, we don't support real wait-before-signal in patch now, just a fake 
wait-before-signal, which still wait on CS submission until signal 
operation coming through wait_event, which is the conclusion we 
disscussed before.

>
>> So timeline value is good to resolve that.
>>
>>>
>>> Otherwise somebody can easily construct a situation where timeline 
>>> sync obj A waits on B and B waits on A.
>> Same situation also can happens with fence-array, we only can see 
>> this is a programming bug with incorrect use.
>
> No, fence-array is initialized only once with a static list of fences. 
> This way it is impossible to add the fence-array to itself for example.
>
> E.g. you can't build circle dependencies with that.
we already wait for signal operation event, so never build circle 
dependencies with that. The theory is same.


>
>> Better use rbtree_postorder_for_each_entry_safe() for this.
>>>> From the comments, seems we cannot use it here, we need erase node 
>>>> here.
>>>> Comments:
>>>>  * Note, however, that it cannot handle other modifications that 
>>>> re-order the
>>>>  * rbtree it is iterating over. This includes calling rb_erase() on 
>>>> @pos, as
>>>>  * rb_erase() may rebalance the tree, causing us to miss some nodes.
>>>>  */
>>>
>>> Yeah, but your current implementation has the same problem :)
>>>
>>> You need to use something like "while (node = rb_first(...))" 
>>> instead if you want to destruct the whole tree.
OK, got it, I will change it in v4.

Thanks,
David Zhou
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Thanks,
>>>> David Zhou
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> +        wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
>>>>>> +        node = rb_next(node);
>> I already get the next node before erasing this node, so the "for 
>> (..." sentence is equal with "while (...)"
>
> That still doesn't work. The problem is the because of an erase the 
> next node might skip some nodes when rebalancing.
>
> What you need to do is to either not erase the nodes at all (because 
> we re-initialize the tree anyway) or always use rb_first().
>
> Regards,
> Christian.
>
>>
>> Thanks,
>> David Zhou
>>
>>>>>> + rb_erase(&wait_pt->node,
>>>>>> +             &syncobj_timeline->wait_pt_tree);
>>>>>> +        RB_CLEAR_NODE(&wait_pt->node);
>>>>>> +        spin_unlock(&syncobj->lock);
>>>>>> +        dma_fence_wait(&wait_pt->base.base, true);
>>>>>> +        spin_lock(&syncobj->lock);
>>>>>> +        /* kfree(wait_pt) is excuted by fence put */
>>>>>> +        dma_fence_put(&wait_pt->base.base);
>>>>>> +    }
>>>>>> +    list_for_each_entry_safe(signal_pt, tmp,
>>>>>> + &syncobj_timeline->signal_pt_list, list) {
>>>>>> +        list_del(&signal_pt->list);
>>>>>> +        if (signal_pt->signal_fence) {
>>>>>> + dma_fence_remove_callback(signal_pt->signal_fence,
>>>>>> + &signal_pt->signal_cb);
>>>>>> +            dma_fence_put(signal_pt->signal_fence);
>>>>>> +        }
>>>>>> +        if (signal_pt->pre_pt_base) {
>>>>>> + dma_fence_remove_callback(signal_pt->pre_pt_base,
>>>>>> + &signal_pt->pre_pt_cb);
>>>>>> +            dma_fence_put(signal_pt->pre_pt_base);
>>>>>> +        }
>>>>>> +        dma_fence_put(&signal_pt->base.base);
>>>>>> +    }
>>>>>> +    spin_unlock(&syncobj->lock);
>>>>>> +}
>>>>>> +
>>>>>>   -    *fence = drm_syncobj_fence_get(syncobj);
>>>>>> -    if (*fence)
>>>>>> -        return 1;
>>>>>> +static bool drm_syncobj_normal_signal_wait_pts(struct 
>>>>>> drm_syncobj *syncobj,
>>>>>> +                           u64 signal_pt)
>>>>>> +{
>>>>>> +    struct rb_node *node = NULL;
>>>>>> +    struct drm_syncobj_wait_pt *wait_pt = NULL;
>>>>>>         spin_lock(&syncobj->lock);
>>>>>> -    /* We've already tried once to get a fence and failed. Now 
>>>>>> that we
>>>>>> -     * have the lock, try one more time just to be sure we don't 
>>>>>> add a
>>>>>> -     * callback when a fence has already been set.
>>>>>> -     */
>>>>>> -    if (syncobj->fence) {
>>>>>> -        *fence = 
>>>>>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>>>>> - lockdep_is_held(&syncobj->lock)));
>>>>>> -        ret = 1;
>>>>>> -    } else {
>>>>>> -        *fence = NULL;
>>>>>> -        drm_syncobj_add_callback_locked(syncobj, cb, func);
>>>>>> -        ret = 0;
>>>>>> +    for(node = rb_first(&syncobj->syncobj_timeline.wait_pt_tree);
>>>>>> +        node != NULL; ) {
>>>>>> +        wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
>>>>>> +        node = rb_next(node);
>>>>>> +        /* for normal syncobj */
>>>>>> +        if (wait_pt->value == signal_pt) {
>>>>>> + dma_fence_signal(&wait_pt->base.base);
>>>>>> +            rb_erase(&wait_pt->node,
>>>>>> + &syncobj->syncobj_timeline.wait_pt_tree);
>>>>>> +            RB_CLEAR_NODE(&wait_pt->node);
>>>>>> +            /* kfree(wait_pt) is excuted by fence put */
>>>>>> +            dma_fence_put(&wait_pt->base.base);
>>>>>> +            spin_unlock(&syncobj->lock);
>>>>>> +            return true;
>>>>>> +        }
>>>>>>       }
>>>>>>       spin_unlock(&syncobj->lock);
>>>>>> +    return false;
>>>>>> +}
>>>>>>   -    return ret;
>>>>>> +static void drm_syncobj_timeline_signal_wait_pts(struct 
>>>>>> drm_syncobj *syncobj)
>>>>>> +{
>>>>>> +    struct rb_node *node = NULL;
>>>>>> +    struct drm_syncobj_wait_pt *wait_pt = NULL;
>>>>>> +
>>>>>> +    spin_lock(&syncobj->lock);
>>>>>> +    for(node = rb_first(&syncobj->syncobj_timeline.wait_pt_tree);
>>>>>> +        node != NULL; ) {
>>>>>> +        wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
>>>>>> +        node = rb_next(node);
>>>>>> +        if (wait_pt->value <= syncobj->syncobj_timeline.timeline) {
>>>>>> + dma_fence_signal(&wait_pt->base.base);
>>>>>> +            rb_erase(&wait_pt->node,
>>>>>> + &syncobj->syncobj_timeline.wait_pt_tree);
>>>>>> +            RB_CLEAR_NODE(&wait_pt->node);
>>>>>> +            /* kfree(wait_pt) is excuted by fence put */
>>>>>> +            dma_fence_put(&wait_pt->base.base);
>>>>>> +        } else {
>>>>>> +            /* the loop is from left to right, the later entry 
>>>>>> value is
>>>>>> +             * bigger, so don't need to check any more */
>>>>>> +            break;
>>>>>> +        }
>>>>>> +    }
>>>>>> +    spin_unlock(&syncobj->lock);
>>>>>>   }
>>>>>>   -void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
>>>>>> -                  struct drm_syncobj_cb *cb,
>>>>>> -                  drm_syncobj_func_t func)
>>>>>> +
>>>>>> +
>>>>>> +static void pt_fence_cb(struct drm_syncobj_signal_pt *signal_pt)
>>>>>>   {
>>>>>> +    struct dma_fence *fence = NULL;
>>>>>> +    struct drm_syncobj *syncobj;
>>>>>> +    struct drm_syncobj_signal_pt *tail_pt;
>>>>>> +    u64 pt_value = signal_pt->value;
>>>>>> +
>>>>>> +    dma_fence_signal(&signal_pt->base.base);
>>>>>> +    fence = signal_pt->signal_fence;
>>>>>> +    signal_pt->signal_fence = NULL;
>>>>>> +    dma_fence_put(fence);
>>>>>> +    fence = signal_pt->pre_pt_base;
>>>>>> +    signal_pt->pre_pt_base = NULL;
>>>>>> +    dma_fence_put(fence);
>>>>>> +
>>>>>> +    syncobj = signal_pt->syncobj;
>>>>>>       spin_lock(&syncobj->lock);
>>>>>> -    drm_syncobj_add_callback_locked(syncobj, cb, func);
>>>>>> +    syncobj->syncobj_timeline.timeline = pt_value;
>>>>>> +    tail_pt = 
>>>>>> list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
>>>>>> +                  struct drm_syncobj_signal_pt, list);
>>>>>> +    if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL && signal_pt 
>>>>>> != tail_pt)
>>>>>> +        || syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>>> +        list_del(&signal_pt->list);
>>>>>> +        /* kfree(signal_pt) will be  executed by below fence put */
>>>>>> +        dma_fence_put(&signal_pt->base.base);
>>>>>> +    }
>>>>>>       spin_unlock(&syncobj->lock);
>>>>>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL)
>>>>>> +        drm_syncobj_normal_signal_wait_pts(syncobj, pt_value);
>>>>>> +    else
>>>>>> +        drm_syncobj_timeline_signal_wait_pts(syncobj);
>>>>>>   }
>>>>>> +static void pt_signal_fence_func(struct dma_fence *fence,
>>>>>> +                 struct dma_fence_cb *cb)
>>>>>> +{
>>>>>> +    struct drm_syncobj_signal_pt *signal_pt =
>>>>>> +        container_of(cb, struct drm_syncobj_signal_pt, signal_cb);
>>>>>>   -void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
>>>>>> -                 struct drm_syncobj_cb *cb)
>>>>>> +    if (signal_pt->pre_pt_base &&
>>>>>> + !dma_fence_is_signaled(signal_pt->pre_pt_base))
>>>>>> +        return;
>>>>>> +
>>>>>> +    pt_fence_cb(signal_pt);
>>>>>> +}
>>>>>> +static void pt_pre_fence_func(struct dma_fence *fence,
>>>>>> +                 struct dma_fence_cb *cb)
>>>>>> +{
>>>>>> +    struct drm_syncobj_signal_pt *signal_pt =
>>>>>> +        container_of(cb, struct drm_syncobj_signal_pt, pre_pt_cb);
>>>>>> +
>>>>>> +    if (signal_pt->signal_fence &&
>>>>>> + !dma_fence_is_signaled(signal_pt->pre_pt_base))
>>>>>> +        return;
>>>>>> +
>>>>>> +    pt_fence_cb(signal_pt);
>>>>>> +}
>>>>>> +
>>>>>> +static int drm_syncobj_timeline_create_signal_pt(struct 
>>>>>> drm_syncobj *syncobj,
>>>>>> +                         struct dma_fence *fence,
>>>>>> +                         u64 point)
>>>>>>   {
>>>>>> +    struct drm_syncobj_signal_pt *signal_pt =
>>>>>> +        kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
>>>>>> +    struct drm_syncobj_stub_fence *base;
>>>>>> +    struct drm_syncobj_signal_pt *tail_pt;
>>>>>> +    struct dma_fence *tail_pt_fence = NULL;
>>>>>> +    int ret = 0;
>>>>>> +
>>>>>> +    if (!signal_pt)
>>>>>> +        return -ENOMEM;
>>>>>> +    if (syncobj->syncobj_timeline.signal_point >= point) {
>>>>>> +        DRM_WARN("A later signal is ready!");
>>>>>> +        goto out;
>>>>>> +    }
>>>>>> +    if (!fence)
>>>>>> +        goto out;
>>>>>> +    dma_fence_get(fence);
>>>>>>       spin_lock(&syncobj->lock);
>>>>>> -    list_del_init(&cb->node);
>>>>>> +    base = &signal_pt->base;
>>>>>> +    spin_lock_init(&base->lock);
>>>>>> +    dma_fence_init(&base->base,
>>>>>> +               &drm_syncobj_stub_fence_ops,
>>>>>> +               &base->lock,
>>>>>> + syncobj->syncobj_timeline.timeline_context, point);
>>>>>> +    signal_pt->signal_fence = fence;
>>>>>> +    /* timeline syncobj must take this dependency */
>>>>>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>>> +        if 
>>>>>> (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
>>>>>> +            tail_pt = 
>>>>>> list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
>>>>>> +                          struct drm_syncobj_signal_pt, list);
>>>>>> +            tail_pt_fence = &tail_pt->base.base;
>>>>>> +            if (dma_fence_is_signaled(tail_pt_fence))
>>>>>> +                tail_pt_fence = NULL;
>>>>>> +            else
>>>>>> +                signal_pt->pre_pt_base =
>>>>>> +                    dma_fence_get(tail_pt_fence);
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    signal_pt->value = point;
>>>>>> +    syncobj->syncobj_timeline.signal_point = point;
>>>>>> +    signal_pt->syncobj = syncobj;
>>>>>> +    INIT_LIST_HEAD(&signal_pt->list);
>>>>>> +    list_add_tail(&signal_pt->list, 
>>>>>> &syncobj->syncobj_timeline.signal_pt_list);
>>>>>>       spin_unlock(&syncobj->lock);
>>>>>> +    wake_up_all(&syncobj->syncobj_timeline.wq);
>>>>>> +    /**
>>>>>> +     * Every pt is depending on signal fence and previous pt 
>>>>>> fence, add
>>>>>> +     * callbacks to them
>>>>>> +     */
>>>>>> +    ret = dma_fence_add_callback(signal_pt->signal_fence,
>>>>>> +                     &signal_pt->signal_cb,
>>>>>> +                     pt_signal_fence_func);
>>>>>> +
>>>>>> +    if (signal_pt->pre_pt_base) {
>>>>>> +        ret = dma_fence_add_callback(signal_pt->pre_pt_base,
>>>>>> +                         &signal_pt->pre_pt_cb,
>>>>>> +                         pt_pre_fence_func);
>>>>>> +        if (ret == -ENOENT)
>>>>>> + pt_pre_fence_func(signal_pt->pre_pt_base,
>>>>>> +                      &signal_pt->pre_pt_cb);
>>>>>> +        else if (ret)
>>>>>> +            goto out;
>>>>>> +    } else if (ret == -ENOENT) {
>>>>>> + pt_signal_fence_func(signal_pt->signal_fence,
>>>>>> +                     &signal_pt->signal_cb);
>>>>>> +    } else if (ret) {
>>>>>> +            goto out;
>>>>>> +    }
>>>>>> +
>>>>>> +    return 0;
>>>>>> +out:
>>>>>> +    dma_fence_put(&signal_pt->base.base);
>>>>>> +    return ret;
>>>>>>   }
>>>>>>     /**
>>>>>> @@ -149,53 +381,30 @@ void drm_syncobj_replace_fence(struct 
>>>>>> drm_syncobj *syncobj,
>>>>>>                      u64 point,
>>>>>>                      struct dma_fence *fence)
>>>>>>   {
>>>>>> -    struct dma_fence *old_fence;
>>>>>> -    struct drm_syncobj_cb *cur, *tmp;
>>>>>> -
>>>>>> -    if (fence)
>>>>>> -        dma_fence_get(fence);
>>>>>> -
>>>>>> -    spin_lock(&syncobj->lock);
>>>>>> -
>>>>>> -    old_fence = rcu_dereference_protected(syncobj->fence,
>>>>>> - lockdep_is_held(&syncobj->lock));
>>>>>> -    rcu_assign_pointer(syncobj->fence, fence);
>>>>>> -
>>>>>> -    if (fence != old_fence) {
>>>>>> -        list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, 
>>>>>> node) {
>>>>>> -            list_del_init(&cur->node);
>>>>>> -            cur->func(syncobj, cur);
>>>>>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>>> +        if (fence)
>>>>>> + drm_syncobj_timeline_create_signal_pt(syncobj, fence,
>>>>>> +                                  point);
>>>>>> +        return;
>>>>>> +    } else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>>>>> +        u64 pt_value;
>>>>>> +
>>>>>> +        if (!fence) {
>>>>>> +            drm_syncobj_timeline_fini(syncobj,
>>>>>> + &syncobj->syncobj_timeline);
>>>>>> + drm_syncobj_timeline_init(&syncobj->syncobj_timeline);
>>>>>> +            return;
>>>>>>           }
>>>>>> +        pt_value = syncobj->syncobj_timeline.signal_point +
>>>>>> +            DRM_SYNCOBJ_NORMAL_POINT;
>>>>>> +        drm_syncobj_timeline_create_signal_pt(syncobj, fence, 
>>>>>> pt_value);
>>>>>> +        return;
>>>>>> +    } else {
>>>>>> +        DRM_ERROR("the syncobj type isn't support\n");
>>>>>>       }
>>>>>> -
>>>>>> -    spin_unlock(&syncobj->lock);
>>>>>> -
>>>>>> -    dma_fence_put(old_fence);
>>>>>>   }
>>>>>>   EXPORT_SYMBOL(drm_syncobj_replace_fence);
>>>>>>   -struct drm_syncobj_stub_fence {
>>>>>> -    struct dma_fence base;
>>>>>> -    spinlock_t lock;
>>>>>> -};
>>>>>> -
>>>>>> -static const char *drm_syncobj_stub_fence_get_name(struct 
>>>>>> dma_fence *fence)
>>>>>> -{
>>>>>> -        return "syncobjstub";
>>>>>> -}
>>>>>> -
>>>>>> -static bool drm_syncobj_stub_fence_enable_signaling(struct 
>>>>>> dma_fence *fence)
>>>>>> -{
>>>>>> -    return !dma_fence_is_signaled(fence);
>>>>>> -}
>>>>>> -
>>>>>> -static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
>>>>>> -    .get_driver_name = drm_syncobj_stub_fence_get_name,
>>>>>> -    .get_timeline_name = drm_syncobj_stub_fence_get_name,
>>>>>> -    .enable_signaling = drm_syncobj_stub_fence_enable_signaling,
>>>>>> -    .release = NULL,
>>>>>> -};
>>>>>> -
>>>>>>   static int drm_syncobj_assign_null_handle(struct drm_syncobj 
>>>>>> *syncobj)
>>>>>>   {
>>>>>>       struct drm_syncobj_stub_fence *fence;
>>>>>> @@ -215,6 +424,143 @@ static int 
>>>>>> drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>>>>>>       return 0;
>>>>>>   }
>>>>>>   +static struct drm_syncobj_wait_pt *
>>>>>> +drm_syncobj_timeline_lookup_wait_pt_fence(struct drm_syncobj 
>>>>>> *syncobj,
>>>>>> +                      u64 point,
>>>>>> +                      struct dma_fence **fence)
>>>>>> +{
>>>>>> +    struct rb_node *node = 
>>>>>> syncobj->syncobj_timeline.wait_pt_tree.rb_node;
>>>>>> +    struct drm_syncobj_wait_pt *wait_pt = NULL;
>>>>>> +
>>>>>> +
>>>>>> +    spin_lock(&syncobj->lock);
>>>>>> +    while(node) {
>>>>>> +        int result;
>>>>>> +
>>>>>> +        wait_pt = rb_entry(node, struct drm_syncobj_wait_pt, node);
>>>>>> +        result = point - wait_pt->value;
>>>>>> +        if (result < 0) {
>>>>>> +            node = node->rb_left;
>>>>>> +        } else if (result > 0) {
>>>>>> +            node = node->rb_right;
>>>>>> +        } else {
>>>>>> +            if (fence)
>>>>>> +            *fence = dma_fence_get(&wait_pt->base.base);
>>>>>> +            break;
>>>>>> +        }
>>>>>> +    }
>>>>>> +    spin_unlock(&syncobj->lock);
>>>>>> +
>>>>>> +    return wait_pt;
>>>>>> +}
>>>>>> +
>>>>>> +static struct drm_syncobj_wait_pt *
>>>>>> +drm_syncobj_timeline_create_wait_pt_and_return_fence(struct 
>>>>>> drm_syncobj *syncobj,
>>>>>> +                             u64 point,
>>>>>> +                             struct dma_fence **fence)
>>>>>> +{
>>>>>> +    struct drm_syncobj_wait_pt *wait_pt = kzalloc(sizeof(*wait_pt),
>>>>>> +                              GFP_KERNEL);
>>>>>> +    struct drm_syncobj_stub_fence *base;
>>>>>> +    struct rb_node **new = 
>>>>>> &(syncobj->syncobj_timeline.wait_pt_tree.rb_node), *parent = NULL;
>>>>>> +    struct drm_syncobj_signal_pt *tail_signal_pt =
>>>>>> + list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
>>>>>> +                struct drm_syncobj_signal_pt, list);
>>>>>> +
>>>>>> +    if (!wait_pt)
>>>>>> +        return NULL;
>>>>>> +    base = &wait_pt->base;
>>>>>> +    spin_lock_init(&base->lock);
>>>>>> +    dma_fence_init(&base->base,
>>>>>> +               &drm_syncobj_stub_fence_ops,
>>>>>> +               &base->lock,
>>>>>> + syncobj->syncobj_timeline.timeline_context, point);
>>>>>> +    wait_pt->value = point;
>>>>>> +
>>>>>> +    /* wait pt must be in an order, so that we can easily lookup 
>>>>>> and signal
>>>>>> +     * it */
>>>>>> +    spin_lock(&syncobj->lock);
>>>>>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE &&
>>>>>> +        point <= syncobj->syncobj_timeline.timeline)
>>>>>> +        dma_fence_signal(&wait_pt->base.base);
>>>>>> +    if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
>>>>>> +        (point == tail_signal_pt->value) &&
>>>>>> + dma_fence_is_signaled(&tail_signal_pt->base.base))
>>>>>> +        dma_fence_signal(&wait_pt->base.base);
>>>>>> +    while(*new) {
>>>>>> +        struct drm_syncobj_wait_pt *this =
>>>>>> +            rb_entry(*new, struct drm_syncobj_wait_pt, node);
>>>>>> +        int result = wait_pt->value - this->value;
>>>>>> +
>>>>>> +        parent = *new;
>>>>>> +        if (result < 0)
>>>>>> +            new = &((*new)->rb_left);
>>>>>> +        else if (result > 0)
>>>>>> +            new = &((*new)->rb_right);
>>>>>> +        else
>>>>>> +            goto exist;
>>>>>> +    }
>>>>>> +    if (fence)
>>>>>> +        *fence = dma_fence_get(&wait_pt->base.base);
>>>>>> +    rb_link_node(&wait_pt->node, parent, new);
>>>>>> +    rb_insert_color(&wait_pt->node, 
>>>>>> &syncobj->syncobj_timeline.wait_pt_tree);
>>>>>> +    spin_unlock(&syncobj->lock);
>>>>>> +    return wait_pt;
>>>>>> +exist:
>>>>>> +    spin_unlock(&syncobj->lock);
>>>>>> +    dma_fence_put(&wait_pt->base.base);
>>>>>> +    wait_pt = drm_syncobj_timeline_lookup_wait_pt_fence(syncobj, 
>>>>>> point,
>>>>>> +                                fence);
>>>>>> +    return wait_pt;
>>>>>> +}
>>>>>> +
>>>>>> +static struct dma_fence *
>>>>>> +drm_syncobj_timeline_point_get(struct drm_syncobj *syncobj, u64 
>>>>>> point, u64 flags)
>>>>>> +{
>>>>>> +    struct drm_syncobj_wait_pt *wait_pt;
>>>>>> +    struct dma_fence *fence = NULL;
>>>>>> +
>>>>>> +    /* already signaled, simply return a signaled stub fence */
>>>>>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE &&
>>>>>> +        point <= syncobj->syncobj_timeline.timeline) {
>>>>>> +        struct drm_syncobj_stub_fence *fence;
>>>>>> +
>>>>>> +        fence = kzalloc(sizeof(*fence), GFP_KERNEL);
>>>>>> +        if (fence == NULL)
>>>>>> +            return NULL;
>>>>>> +
>>>>>> +        spin_lock_init(&fence->lock);
>>>>>> +        dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops,
>>>>>> +                   &fence->lock, 0, 0);
>>>>>> +        dma_fence_signal(&fence->base);
>>>>>> +        return &fence->base;
>>>>>> +    }
>>>>>> +
>>>>>> +    /* check if the wait pt exists */
>>>>>> +    wait_pt = drm_syncobj_timeline_lookup_wait_pt_fence(syncobj, 
>>>>>> point,
>>>>>> +                                &fence);
>>>>>> +    if (!fence) {
>>>>>> +        /* This is a new wait pt, so create it */
>>>>>> +        wait_pt = 
>>>>>> drm_syncobj_timeline_create_wait_pt_and_return_fence(syncobj, point,
>>>>>> +                                  &fence);
>>>>>> +        if (!fence)
>>>>>> +            return NULL;
>>>>>> +    }
>>>>>> +    if (wait_pt) {
>>>>>> +        int ret = 0;
>>>>>> +
>>>>>> +        if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>>>> +            ret = 
>>>>>> wait_event_interruptible_timeout(syncobj->syncobj_timeline.wq,
>>>>>> +                                   wait_pt->value <= 
>>>>>> syncobj->syncobj_timeline.signal_point,
>>>>>> + msecs_to_jiffies(10000)); /* wait 10s */
>>>>>> +            if (ret <= 0)
>>>>>> +                return NULL;
>>>>>> +        }
>>>>>> +        return fence;
>>>>>> +    }
>>>>>> +    return NULL;
>>>>>> +}
>>>>>> +
>>>>>>   /**
>>>>>>    * drm_syncobj_find_fence - lookup and reference the fence in a 
>>>>>> sync object
>>>>>>    * @file_private: drm file private pointer
>>>>>> @@ -229,20 +575,45 @@ static int 
>>>>>> drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>>>>>>    * contains a reference to the fence, which must be released by 
>>>>>> calling
>>>>>>    * dma_fence_put().
>>>>>>    */
>>>>>> -int drm_syncobj_find_fence(struct drm_file *file_private,
>>>>>> -               u32 handle, u64 point,
>>>>>> -               struct dma_fence **fence)
>>>>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 
>>>>>> point,
>>>>>> +                 u64 flags, struct dma_fence **fence)
>>>>>>   {
>>>>>> -    struct drm_syncobj *syncobj = drm_syncobj_find(file_private, 
>>>>>> handle);
>>>>>>       int ret = 0;
>>>>>>         if (!syncobj)
>>>>>>           return -ENOENT;
>>>>>>   -    *fence = drm_syncobj_fence_get(syncobj);
>>>>>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>>>>> +        /*NORMAL syncobj always wait on last pt */
>>>>>> +        u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
>>>>>> +
>>>>>> +        if (tail_pt_value == 0)
>>>>>> +            tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
>>>>>> +        /* NORMAL syncobj doesn't care point value */
>>>>>> +        WARN_ON(point != 0);
>>>>>> +        *fence = drm_syncobj_timeline_point_get(syncobj, 
>>>>>> tail_pt_value,
>>>>>> +                            flags);
>>>>>> +    } else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>>> +        *fence = drm_syncobj_timeline_point_get(syncobj, point,
>>>>>> +                            flags);
>>>>>> +    } else {
>>>>>> +        DRM_ERROR("Don't support this type syncobj\n");
>>>>>> +        *fence = NULL;
>>>>>> +    }
>>>>>>       if (!*fence) {
>>>>>>           ret = -EINVAL;
>>>>>>       }
>>>>>> +    return ret;
>>>>>> +}
>>>>>> +EXPORT_SYMBOL(drm_syncobj_search_fence);
>>>>>> +int drm_syncobj_find_fence(struct drm_file *file_private,
>>>>>> +               u32 handle, u64 point,
>>>>>> +               struct dma_fence **fence) {
>>>>>> +    struct drm_syncobj *syncobj = drm_syncobj_find(file_private, 
>>>>>> handle);
>>>>>> +
>>>>>> +    int ret = drm_syncobj_search_fence(syncobj, point,
>>>>>> + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>>>>> +                    fence);
>>>>>>       drm_syncobj_put(syncobj);
>>>>>>       return ret;
>>>>>>   }
>>>>>> @@ -259,7 +630,7 @@ void drm_syncobj_free(struct kref *kref)
>>>>>>       struct drm_syncobj *syncobj = container_of(kref,
>>>>>>                              struct drm_syncobj,
>>>>>>                              refcount);
>>>>>> -    drm_syncobj_replace_fence(syncobj, 0, NULL);
>>>>>> +    drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
>>>>>>       kfree(syncobj);
>>>>>>   }
>>>>>>   EXPORT_SYMBOL(drm_syncobj_free);
>>>>>> @@ -287,8 +658,12 @@ int drm_syncobj_create(struct drm_syncobj 
>>>>>> **out_syncobj, uint32_t flags,
>>>>>>           return -ENOMEM;
>>>>>>         kref_init(&syncobj->refcount);
>>>>>> -    INIT_LIST_HEAD(&syncobj->cb_list);
>>>>>>       spin_lock_init(&syncobj->lock);
>>>>>> +    if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
>>>>>> +        syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
>>>>>> +    else
>>>>>> +        syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
>>>>>> + drm_syncobj_timeline_init(&syncobj->syncobj_timeline);
>>>>>>         if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
>>>>>>           ret = drm_syncobj_assign_null_handle(syncobj);
>>>>>> @@ -646,7 +1021,6 @@ struct syncobj_wait_entry {
>>>>>>       struct task_struct *task;
>>>>>>       struct dma_fence *fence;
>>>>>>       struct dma_fence_cb fence_cb;
>>>>>> -    struct drm_syncobj_cb syncobj_cb;
>>>>>>   };
>>>>>>     static void syncobj_wait_fence_func(struct dma_fence *fence,
>>>>>> @@ -658,18 +1032,6 @@ static void syncobj_wait_fence_func(struct 
>>>>>> dma_fence *fence,
>>>>>>       wake_up_process(wait->task);
>>>>>>   }
>>>>>>   -static void syncobj_wait_syncobj_func(struct drm_syncobj 
>>>>>> *syncobj,
>>>>>> -                      struct drm_syncobj_cb *cb)
>>>>>> -{
>>>>>> -    struct syncobj_wait_entry *wait =
>>>>>> -        container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>>>>>> -
>>>>>> -    /* This happens inside the syncobj lock */
>>>>>> -    wait->fence = 
>>>>>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>>>>> - lockdep_is_held(&syncobj->lock)));
>>>>>> -    wake_up_process(wait->task);
>>>>>> -}
>>>>>> -
>>>>>>   static signed long drm_syncobj_array_wait_timeout(struct 
>>>>>> drm_syncobj **syncobjs,
>>>>>>                             uint32_t count,
>>>>>>                             uint32_t flags,
>>>>>> @@ -693,14 +1055,11 @@ static signed long 
>>>>>> drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>>>>>>       signaled_count = 0;
>>>>>>       for (i = 0; i < count; ++i) {
>>>>>>           entries[i].task = current;
>>>>>> -        entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
>>>>>> +        ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
>>>>>> +                           &entries[i].fence);
>>>>>>           if (!entries[i].fence) {
>>>>>> -            if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>>>> -                continue;
>>>>>> -            } else {
>>>>>> -                ret = -EINVAL;
>>>>>> -                goto cleanup_entries;
>>>>>> -            }
>>>>>> +            ret = -EINVAL;
>>>>>> +            goto cleanup_entries;
>>>>>>           }
>>>>>>             if (dma_fence_is_signaled(entries[i].fence)) {
>>>>>> @@ -728,15 +1087,6 @@ static signed long 
>>>>>> drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>>>>>>        * fallthough and try a 0 timeout wait!
>>>>>>        */
>>>>>>   -    if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>>>> -        for (i = 0; i < count; ++i) {
>>>>>> - drm_syncobj_fence_get_or_add_callback(syncobjs[i],
>>>>>> - &entries[i].fence,
>>>>>> - &entries[i].syncobj_cb,
>>>>>> - syncobj_wait_syncobj_func);
>>>>>> -        }
>>>>>> -    }
>>>>>> -
>>>>>>       do {
>>>>>>           set_current_state(TASK_INTERRUPTIBLE);
>>>>>>   @@ -784,13 +1134,10 @@ static signed long 
>>>>>> drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>>>>>>     cleanup_entries:
>>>>>>       for (i = 0; i < count; ++i) {
>>>>>> -        if (entries[i].syncobj_cb.func)
>>>>>> -            drm_syncobj_remove_callback(syncobjs[i],
>>>>>> - &entries[i].syncobj_cb);
>>>>>> +        dma_fence_put(entries[i].fence);
>>>>>>           if (entries[i].fence_cb.func)
>>>>>> dma_fence_remove_callback(entries[i].fence,
>>>>>>                             &entries[i].fence_cb);
>>>>>> -        dma_fence_put(entries[i].fence);
>>>>>>       }
>>>>>>       kfree(entries);
>>>>>>   @@ -965,12 +1312,20 @@ drm_syncobj_reset_ioctl(struct 
>>>>>> drm_device *dev, void *data,
>>>>>>       if (ret < 0)
>>>>>>           return ret;
>>>>>>   -    for (i = 0; i < args->count_handles; i++)
>>>>>> -        drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
>>>>>> -
>>>>>> +    for (i = 0; i < args->count_handles; i++) {
>>>>>> +        if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>>> +            DRM_ERROR("timeline syncobj cannot reset!\n");
>>>>>> +            ret = -EINVAL;
>>>>>> +            goto out;
>>>>>> +        }
>>>>>> +        drm_syncobj_timeline_fini(syncobjs[i],
>>>>>> + &syncobjs[i]->syncobj_timeline);
>>>>>> + drm_syncobj_timeline_init(&syncobjs[i]->syncobj_timeline);
>>>>>> +    }
>>>>>> +out:
>>>>>>       drm_syncobj_array_free(syncobjs, args->count_handles);
>>>>>>   -    return 0;
>>>>>> +    return ret;
>>>>>>   }
>>>>>>     int
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
>>>>>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>>>> index 7209dd832d39..bb20d318c9d6 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>>>> @@ -2182,7 +2182,9 @@ await_fence_array(struct i915_execbuffer *eb,
>>>>>>           if (!(flags & I915_EXEC_FENCE_WAIT))
>>>>>>               continue;
>>>>>>   -        fence = drm_syncobj_fence_get(syncobj);
>>>>>> +        drm_syncobj_search_fence(syncobj, 0,
>>>>>> + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>>>>> +                     &fence);
>>>>>>           if (!fence)
>>>>>>               return -EINVAL;
>>>>>>   diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
>>>>>> index 425432b85a87..657c97dc25ec 100644
>>>>>> --- a/include/drm/drm_syncobj.h
>>>>>> +++ b/include/drm/drm_syncobj.h
>>>>>> @@ -30,6 +30,25 @@
>>>>>>     struct drm_syncobj_cb;
>>>>>>   +enum drm_syncobj_type {
>>>>>> +    DRM_SYNCOBJ_TYPE_NORMAL,
>>>>>> +    DRM_SYNCOBJ_TYPE_TIMELINE
>>>>>> +};
>>>>>> +
>>>>>> +struct drm_syncobj_timeline {
>>>>>> +    wait_queue_head_t    wq;
>>>>>> +    u64 timeline_context;
>>>>>> +    /**
>>>>>> +     * @timeline: syncobj timeline
>>>>>> +     */
>>>>>> +    u64 timeline;
>>>>>> +    u64 signal_point;
>>>>>> +
>>>>>> +
>>>>>> +    struct rb_root wait_pt_tree;
>>>>>> +    struct list_head signal_pt_list;
>>>>>> +};
>>>>>> +
>>>>>>   /**
>>>>>>    * struct drm_syncobj - sync object.
>>>>>>    *
>>>>>> @@ -41,19 +60,16 @@ struct drm_syncobj {
>>>>>>        */
>>>>>>       struct kref refcount;
>>>>>>       /**
>>>>>> -     * @fence:
>>>>>> -     * NULL or a pointer to the fence bound to this object.
>>>>>> -     *
>>>>>> -     * This field should not be used directly. Use 
>>>>>> drm_syncobj_fence_get()
>>>>>> -     * and drm_syncobj_replace_fence() instead.
>>>>>> +     * @type: indicate syncobj type
>>>>>>        */
>>>>>> -    struct dma_fence __rcu *fence;
>>>>>> +    enum drm_syncobj_type type;
>>>>>>       /**
>>>>>> -     * @cb_list: List of callbacks to call when the &fence gets 
>>>>>> replaced.
>>>>>> +     * @syncobj_timeline: timeline
>>>>>>        */
>>>>>> -    struct list_head cb_list;
>>>>>> +    struct drm_syncobj_timeline syncobj_timeline;
>>>>>> +
>>>>>>       /**
>>>>>> -     * @lock: Protects &cb_list and write-locks &fence.
>>>>>> +     * @lock: Protects timeline list and write-locks &fence.
>>>>>>        */
>>>>>>       spinlock_t lock;
>>>>>>       /**
>>>>>> @@ -62,25 +78,6 @@ struct drm_syncobj {
>>>>>>       struct file *file;
>>>>>>   };
>>>>>>   -typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
>>>>>> -                   struct drm_syncobj_cb *cb);
>>>>>> -
>>>>>> -/**
>>>>>> - * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
>>>>>> - * @node: used by drm_syncob_add_callback to append this struct to
>>>>>> - *      &drm_syncobj.cb_list
>>>>>> - * @func: drm_syncobj_func_t to call
>>>>>> - *
>>>>>> - * This struct will be initialized by drm_syncobj_add_callback, 
>>>>>> additional
>>>>>> - * data can be passed along by embedding drm_syncobj_cb in 
>>>>>> another struct.
>>>>>> - * The callback will get called the next time 
>>>>>> drm_syncobj_replace_fence is
>>>>>> - * called.
>>>>>> - */
>>>>>> -struct drm_syncobj_cb {
>>>>>> -    struct list_head node;
>>>>>> -    drm_syncobj_func_t func;
>>>>>> -};
>>>>>> -
>>>>>>   void drm_syncobj_free(struct kref *kref);
>>>>>>     /**
>>>>>> @@ -106,29 +103,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
>>>>>>       kref_put(&obj->refcount, drm_syncobj_free);
>>>>>>   }
>>>>>>   -/**
>>>>>> - * drm_syncobj_fence_get - get a reference to a fence in a sync 
>>>>>> object
>>>>>> - * @syncobj: sync object.
>>>>>> - *
>>>>>> - * This acquires additional reference to &drm_syncobj.fence 
>>>>>> contained in @obj,
>>>>>> - * if not NULL. It is illegal to call this without already 
>>>>>> holding a reference.
>>>>>> - * No locks required.
>>>>>> - *
>>>>>> - * Returns:
>>>>>> - * Either the fence of @obj or NULL if there's none.
>>>>>> - */
>>>>>> -static inline struct dma_fence *
>>>>>> -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
>>>>>> -{
>>>>>> -    struct dma_fence *fence;
>>>>>> -
>>>>>> -    rcu_read_lock();
>>>>>> -    fence = dma_fence_get_rcu_safe(&syncobj->fence);
>>>>>> -    rcu_read_unlock();
>>>>>> -
>>>>>> -    return fence;
>>>>>> -}
>>>>>> -
>>>>>>   struct drm_syncobj *drm_syncobj_find(struct drm_file 
>>>>>> *file_private,
>>>>>>                        u32 handle);
>>>>>>   void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 
>>>>>> point,
>>>>>> @@ -142,5 +116,7 @@ int drm_syncobj_create(struct drm_syncobj 
>>>>>> **out_syncobj, uint32_t flags,
>>>>>>   int drm_syncobj_get_handle(struct drm_file *file_private,
>>>>>>                  struct drm_syncobj *syncobj, u32 *handle);
>>>>>>   int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
>>>>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 
>>>>>> point,
>>>>>> +                 u64 flags, struct dma_fence **fence);
>>>>>>     #endif
>>>>>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>>>>>> index 300f336633f2..cebdb2541eb7 100644
>>>>>> --- a/include/uapi/drm/drm.h
>>>>>> +++ b/include/uapi/drm/drm.h
>>>>>> @@ -717,6 +717,7 @@ struct drm_prime_handle {
>>>>>>   struct drm_syncobj_create {
>>>>>>       __u32 handle;
>>>>>>   #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
>>>>>> +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
>>>>>>       __u32 flags;
>>>>>>   };
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx



More information about the amd-gfx mailing list