[PATCH 1/5] dma-buf: fix dma_resv_test_signaled test_all handling

Christian König ckoenig.leichtzumerken at gmail.com
Mon Jun 14 17:15:44 UTC 2021


Am 11.06.21 um 16:55 schrieb Daniel Vetter:
> On Fri, Jun 11, 2021 at 04:53:11PM +0200, Christian König wrote:
>>
>> Am 11.06.21 um 16:47 schrieb Daniel Vetter:
>>> On Fri, Jun 11, 2021 at 02:02:57PM +0200, Christian König wrote:
>>>> As the name implies, if testing all fences is requested we
>>>> should indeed test all fences, not skip the exclusive
>>>> one just because we see shared ones.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>> Hm I thought we've had the rule that when both fences exist, then
>>> collectively the shared ones must signal no earlier than the exclusive
>>> one.
>>>
>>> That's at least the contract we've implemented in dma_resv.h. But I've
>>> also found a bunch of drivers who are a lot more yolo on this.
>>>
>>> I think there's a solid case here to just always take all the fences if we
>>> ask for all the shared ones, but if we go that way then I'd say
>>> - clear kerneldoc patch to really hammer this in (currently we're not good
>>>     at all in this regard)
>>> - going through drivers a bit to check for this (I have some of that done
>>>     already in my earlier series, need to respin it and send it out)
>>>
>>> But I'm kinda not seeing why this needs to be in this patch series here.
>> You mentioned that this is a problem in the last patch and if you ask me
>> that's just a bug or at least very inconsistent.
>>
>> See dma_resv_wait_timeout() always waits for all fences, including the
>> exclusive one even if shared ones are present. But dma_resv_test_signaled()
>> ignores the exclusive one if shared ones are present.
> Hm the only one I thought I've mentioned is that dma_buf_poll doesn't use
> dma_fence_get_rcu_safe where I think it should. Different problem. I think
> this is one you spotted.
>
>> The only other driver I could find trying to make use of this is nouveau and
>> I already provided a fix for this as well.
> i915 also does this, and I think I've found a few more.
>
>> I just think that this is the more defensive approach to fixing this, and
>> it at least makes the core functions consistent in their handling.
> Oh fully agree, it's just current dma_resv docs aren't the greatest, and
> hacking on semantics without updating the docs isn't great. Especially
> when it's ad-hoc.

Well, when the requirement that shared fences must always signal after 
the exclusive fence is not documented anywhere, then I would say that it 
is naturally allowed to add any fence to the list of shared fences, 
and any code assuming otherwise is simply broken and needs fixing.
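To make the inconsistency concrete, here is a standalone sketch (not kernel code; the struct and helpers are hypothetical stand-ins for the real dma_resv machinery, without the RCU/seqcount retry logic) contrasting the old behavior, where the exclusive fence is skipped as soon as shared fences are present, with the behavior the patch argues for, where test_all really tests all fences:

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for struct dma_fence / struct dma_resv. */
struct model_fence { bool signaled; };

struct model_resv {
	struct model_fence *excl;	/* exclusive fence, may be NULL */
	struct model_fence **shared;	/* shared fences */
	unsigned int shared_count;
};

/* Old behavior: the exclusive fence is ignored once shared fences exist. */
static bool test_signaled_old(const struct model_resv *obj, bool test_all)
{
	unsigned int shared_count = test_all ? obj->shared_count : 0;
	unsigned int i;

	for (i = 0; i < shared_count; ++i)
		if (!obj->shared[i]->signaled)
			return false;

	/* Exclusive fence only checked when no shared fences were seen. */
	if (!shared_count && obj->excl)
		return obj->excl->signaled;
	return true;
}

/* Fixed behavior: test_all checks the exclusive fence unconditionally. */
static bool test_signaled_new(const struct model_resv *obj, bool test_all)
{
	unsigned int i;

	if (test_all)
		for (i = 0; i < obj->shared_count; ++i)
			if (!obj->shared[i]->signaled)
				return false;

	if (obj->excl && !obj->excl->signaled)
		return false;
	return true;
}
```

With a signaled shared fence but an unsignaled exclusive fence, the old variant reports the object as signaled while the new one does not, which matches how dma_resv_wait_timeout() already treats the same situation.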

Christian.

> -Daniel
>
>> Christian.
>>
>>> -Daniel
>>>
>>>> ---
>>>>    drivers/dma-buf/dma-resv.c | 33 ++++++++++++---------------------
>>>>    1 file changed, 12 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
>>>> index f26c71747d43..c66bfdde9454 100644
>>>> --- a/drivers/dma-buf/dma-resv.c
>>>> +++ b/drivers/dma-buf/dma-resv.c
>>>> @@ -615,25 +615,21 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
>>>>     */
>>>>    bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
>>>>    {
>>>> -	unsigned int seq, shared_count;
>>>> +	struct dma_fence *fence;
>>>> +	unsigned int seq;
>>>>    	int ret;
>>>>    	rcu_read_lock();
>>>>    retry:
>>>>    	ret = true;
>>>> -	shared_count = 0;
>>>>    	seq = read_seqcount_begin(&obj->seq);
>>>>    	if (test_all) {
>>>>    		struct dma_resv_list *fobj = dma_resv_shared_list(obj);
>>>> -		unsigned int i;
>>>> -
>>>> -		if (fobj)
>>>> -			shared_count = fobj->shared_count;
>>>> +		unsigned int i, shared_count;
>>>> +		shared_count = fobj ? fobj->shared_count : 0;
>>>>    		for (i = 0; i < shared_count; ++i) {
>>>> -			struct dma_fence *fence;
>>>> -
>>>>    			fence = rcu_dereference(fobj->shared[i]);
>>>>    			ret = dma_resv_test_signaled_single(fence);
>>>>    			if (ret < 0)
>>>> @@ -641,24 +637,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
>>>>    			else if (!ret)
>>>>    				break;
>>>>    		}
>>>> -
>>>> -		if (read_seqcount_retry(&obj->seq, seq))
>>>> -			goto retry;
>>>>    	}
>>>> -	if (!shared_count) {
>>>> -		struct dma_fence *fence_excl = dma_resv_excl_fence(obj);
>>>> -
>>>> -		if (fence_excl) {
>>>> -			ret = dma_resv_test_signaled_single(fence_excl);
>>>> -			if (ret < 0)
>>>> -				goto retry;
>>>> +	fence = dma_resv_excl_fence(obj);
>>>> +	if (fence) {
>>>> +		ret = dma_resv_test_signaled_single(fence);
>>>> +		if (ret < 0)
>>>> +			goto retry;
>>>> -			if (read_seqcount_retry(&obj->seq, seq))
>>>> -				goto retry;
>>>> -		}
>>>>    	}
>>>> +	if (read_seqcount_retry(&obj->seq, seq))
>>>> +		goto retry;
>>>> +
>>>>    	rcu_read_unlock();
>>>>    	return ret;
>>>>    }
>>>> -- 
>>>> 2.25.1
>>>>
