[PATCH 10/13] drm/radeon: return -ENOENT in fence_wait_*

Christian König deathsimple at vodafone.de
Fri Apr 20 03:24:53 PDT 2012


On 20.04.2012 11:15, Michel Dänzer wrote:
> On Fre, 2012-04-20 at 10:49 +0200, Christian König wrote:
>> On 20.04.2012 09:20, Michel Dänzer wrote:
>>> On Fre, 2012-04-20 at 00:39 +0200, Christian König wrote:
>>>> Signed-off-by: Christian König <deathsimple at vodafone.de>
>>>> ---
>>>>    drivers/gpu/drm/radeon/radeon_fence.c |    4 ++--
>>>>    1 files changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
>>>> index 1a9765a..764ab7e 100644
>>>> --- a/drivers/gpu/drm/radeon/radeon_fence.c
>>>> +++ b/drivers/gpu/drm/radeon/radeon_fence.c
>>>> @@ -286,7 +286,7 @@ int radeon_fence_wait_next(struct radeon_device *rdev, int ring)
>>>>    	}
>>>>    	if (list_empty(&rdev->fence_drv[ring].emitted)) {
>>>>    		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
>>>> -		return 0;
>>>> +		return -ENOENT;
>>>>    	}
>>>>    	fence = list_entry(rdev->fence_drv[ring].emitted.next,
>>>>    			   struct radeon_fence, list);
>>>> @@ -310,7 +310,7 @@ int radeon_fence_wait_last(struct radeon_device *rdev, int ring)
>>>>    	}
>>>>    	if (list_empty(&rdev->fence_drv[ring].emitted)) {
>>>>    		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
>>>> -		return 0;
>>>> +		return -ENOENT;
>>>>    	}
>>>>    	fence = list_entry(rdev->fence_drv[ring].emitted.prev,
>>>>    			   struct radeon_fence, list);
>>> It seems weird to declare a fence wait as failed when there are no
>>> outstanding fences in the first place. If there are callers which
>>> require outstanding fences, they should probably handle that themselves.
>> Why does that sound so weird? OK, maybe for radeon_fence_wait_last
>> that's questionable,
> Indeed. It happens not to break radeon_suspend_kms because it doesn't
> check the return value, but otherwise it would fail spuriously.
>
>
>> but for radeon_fence_wait_next it's quite clear to me that
>> we should signal the caller that there is no fence to wait for.
>>
>> The problem I wanted to fix with that is the usage of
>> radeon_fence_wait_next in radeon_ring_alloc (for example):
>>> int radeon_ring_alloc(struct radeon_device *rdev, struct radeon_ring *ring,
>>>                       unsigned ndw)
>>> {
>>>          int r;
>>>
>>>          /* Align requested size with padding so unlock_commit can
>>>           * pad safely */
>>>          ndw = (ndw + ring->align_mask) & ~ring->align_mask;
>>>          while (ndw > (ring->ring_free_dw - 1)) {
>>>                  radeon_ring_free_size(rdev, ring);
>>>                  if (ndw < ring->ring_free_dw) {
>>>                          break;
>>>                  }
>>>                  r = radeon_fence_wait_next(rdev,
>>>                                             radeon_ring_index(rdev, ring));
>>>                  if (r)
>>>                          return r;
>>>          }
>>>          ring->count_dw = ndw;
>>>          ring->wptr_old = ring->wptr;
>>>          return 0;
>>> }
>> If the ring is full but actually has no more fences in it (which in my
>> case was caused by my own stupidity and shouldn't happen otherwise),
>> this loop will just busy-wait, with a critical mutex locked, for
>> something that never happens.
> My suggestion was to explicitly check for that in radeon_ring_alloc. But
> I guess right now it doesn't really matter, as it's the only caller. :)
Yeah, but when we check that explicitly we need to call into the fence 
code twice, without holding the lock in between, so the result of the 
first call could already be stale before the second call happens... well, 
that's just crap.
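
Just to make the race concrete, the explicit check would have to look 
something like this (rough sketch only; radeon_fence_emitted_empty() is 
invented here for illustration and doesn't exist in the driver):

while (ndw > (ring->ring_free_dw - 1)) {
        radeon_ring_free_size(rdev, ring);
        if (ndw < ring->ring_free_dw) {
                break;
        }
        /* check #1: peek at the emitted list under fence_lock */
        if (radeon_fence_emitted_empty(rdev, radeon_ring_index(rdev, ring)))
                return -ENOENT;
        /* fence_lock is dropped again at this point, so the emitted
         * list can change before the wait below runs, and the result
         * of check #1 is already stale */
        r = radeon_fence_wait_next(rdev, radeon_ring_index(rdev, ring));
        if (r)
                return r;
}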

So what do you think of this: just add the -ENOENT to fence_wait_next, 
and rename fence_wait_last to fence_wait_empty instead?
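
To be clear about what I mean: fence_wait_empty would keep the current 
wait_last behaviour of treating an empty list as success. Rough sketch 
from the current code (untested, and leaving out the ring->ready check 
for brevity):

/* Wait for all fences on the ring to be signaled; an already empty
 * emitted list means the ring is idle, which is success here. */
int radeon_fence_wait_empty(struct radeon_device *rdev, int ring)
{
        unsigned long irq_flags;
        struct radeon_fence *fence;
        int r;

        write_lock_irqsave(&rdev->fence_lock, irq_flags);
        if (list_empty(&rdev->fence_drv[ring].emitted)) {
                write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
                return 0;       /* already idle */
        }
        fence = list_entry(rdev->fence_drv[ring].emitted.prev,
                           struct radeon_fence, list);
        radeon_fence_ref(fence);
        write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
        r = radeon_fence_wait(fence, false);
        radeon_fence_unref(&fence);
        return r;
}

fence_wait_next on the other hand keeps the -ENOENT from this patch, 
since its callers want to make room on the ring and must not spin when 
there is nothing to wait for.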

Well, I'm just trying to program defensively here, making it as robust as 
possible against both technical and human errors.

Christian.