[PATCH 5/5] [RFC]drm: add syncobj timeline support v3
Christian König
christian.koenig at amd.com
Mon Sep 3 08:50:14 UTC 2018
Am 03.09.2018 um 06:13 schrieb Chunming Zhou:
>
>
> 在 2018/8/30 19:32, Christian König 写道:
>> [SNIP]
>>>>>>
>>>>>>> +
>>>>>>> +struct drm_syncobj_wait_pt {
>>>>>>> + struct drm_syncobj_stub_fence base;
>>>>>>> + u64 value;
>>>>>>> + struct rb_node node;
>>>>>>> +};
>>>>>>> +struct drm_syncobj_signal_pt {
>>>>>>> + struct drm_syncobj_stub_fence base;
>>>>>>> + struct dma_fence *signal_fence;
>>>>>>> + struct dma_fence *pre_pt_base;
>>>>>>> + struct dma_fence_cb signal_cb;
>>>>>>> + struct dma_fence_cb pre_pt_cb;
>>>>>>> + struct drm_syncobj *syncobj;
>>>>>>> + u64 value;
>>>>>>> + struct list_head list;
>>>>>>> +};
>>>>>>
>>>>>> What are those two structures good for
>>>>> They are used to record wait ops points and signal ops points.
>>>>> For timeline, they are connected by timeline value, works like:
>>>>> a. signal pt increase timeline value to signal_pt value when
>>>>> signal_pt is signaled. signal_pt is depending on previous pt fence
>>>>> and itself signal fence from signal ops.
>>>>> b. wait pt tree checks if timeline value over itself value.
>>>>>
>>>>> For normal, works like:
>>>>> a. signal pt increase 1 for syncobj_timeline->signal_point
>>>>> every time when signal ops is performed.
>>>>> b. when wait ops is comming, wait pt will fetch above last
>>>>> signal pt value as its wait point. wait pt will be only signaled
>>>>> by equal point value signal_pt.
>>>>>
>>>>>
>>>>>> and why is the stub fence their base?
>>>>> Good question, I tried to kzalloc for them as well when I debug
>>>>> them, I encountered a problem:
>>>>> I lookup/find wait_pt or signal_pt successfully, but when I tried
>>>>> to use them, they are freed sometimes, and results in NULL point.
>>>>> and generally, when lookup them, we often need their stub fence as
>>>>> well, and in the meantime, their lifecycle are same.
>>>>> Above reason, I make stub fence as their base.
>>>>
>>>> That sounds like you only did this because you messed up the
>>>> lifecycle.
>>>>
>>>> Additional to that I don't think you correctly considered the
>>>> lifecycle of the waits and the sync object itself. E.g. blocking in
>>>> drm_syncobj_timeline_fini() until all waits are done is not a good
>>>> idea.
>>>>
>>>> What you should do instead is to create a fence_array object with
>>>> all the fence we need to wait for when a wait point is requested.
>>> Yeah, this is our initial discussion result, but when I tried to do
>>> that, I found that cannot meet the advance signal requirement:
>>> a. We need to consider the wait and signal pt value are not
>>> one-to-one match, it's difficult to find dependent point, at least,
>>> there is some overhead.
>>
>> As far as I can see that is independent of using a fence array here.
>> See you can either use a ring buffer or an rb-tree, but when you want
>> to wait for a specific point we need to condense the not yet signaled
>> fences into an array.
> again, need to find the range of where the specific point in, we
> should close to timeline semantics, I also refered the sw_sync.c
> timeline, usally wait_pt is signalled by timeline point. And I agree
> we can implement it with several methods, but I don't think there are
> basical differences.
The problem is that with your current approach you need the sync_obj
alive for the synchronization to work. That is most likely not a good idea.
Additional to that you enable signaling without a need from the waiting
side. That is rather bad for implementations which need that optimization.
I suggest to either condense all the fences you need to wait for in an
array during the wait operation, or reference count the sync_obj and
only enable the signaling on the fences when requested by a wait.
>
>>
>>> b. because we allowed "wait-before-signal", if
>>> "wait-before-signal" happens, there isn't signal fence which can be
>>> used to create fence array.
>>
>> Well, again we DON'T support wait-before-signal here. I will
>> certainly NAK any implementation which tries to do this until we
>> haven't figured out all the resource management constraints and I
>> still don't see how we can do this.
> yes, we don't support real wait-before-signal in patch now, just a
> fake wait-before-signal, which still wait on CS submission until
> signal operation coming through wait_event, which is the conclusion we
> disscussed before.
Well in this case we should call it wait-for-signal and not
wait-before-signal :)
>
>>
>>> So timeline value is good to resolve that.
>>>
>>>>
>>>> Otherwise somebody can easily construct a situation where timeline
>>>> sync obj A waits on B and B waits on A.
>>> Same situation also can happens with fence-array, we only can see
>>> this is a programming bug with incorrect use.
>>
>> No, fence-array is initialized only once with a static list of
>> fences. This way it is impossible to add the fence-array to itself
>> for example.
>>
>> E.g. you can't build circle dependencies with that.
> we already wait for signal operation event, so never build circle
> dependencies with that. The theory is same.
Yeah, ok that is certainly true.
Regards,
Christian.
>
>
>>
>>> Better use rbtree_postorder_for_each_entry_safe() for this.
>>>>> From the comments, seems we cannot use it here, we need erase node
>>>>> here.
>>>>> Comments:
>>>>> * Note, however, that it cannot handle other modifications that
>>>>> re-order the
>>>>> * rbtree it is iterating over. This includes calling rb_erase()
>>>>> on @pos, as
>>>>> * rb_erase() may rebalance the tree, causing us to miss some nodes.
>>>>> */
>>>>
>>>> Yeah, but your current implementation has the same problem :)
>>>>
>>>> You need to use something like "while (node = rb_first(...))"
>>>> instead if you want to destruct the whole tree.
> OK, got it, I will change it in v4.
>
> Thanks,
> David Zhou
More information about the amd-gfx
mailing list