[PATCH 3/6] drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held
Maarten Lankhorst
maarten.lankhorst at canonical.com
Wed Nov 28 14:26:27 PST 2012
On 28-11-12 20:21, Thomas Hellstrom wrote:
> On 11/28/2012 07:32 PM, Maarten Lankhorst wrote:
>> On 28-11-12 16:10, Thomas Hellstrom wrote:
>>> On 11/28/2012 03:46 PM, Maarten Lankhorst wrote:
>>>> On 28-11-12 15:23, Thomas Hellstrom wrote:
>>>>> On 11/28/2012 02:55 PM, Maarten Lankhorst wrote:
>>>>>> On 28-11-12 14:21, Thomas Hellstrom wrote:
>>>>>>> On 11/28/2012 01:15 PM, Maarten Lankhorst wrote:
>>>>>>>> On 28-11-12 12:54, Thomas Hellstrom wrote:
>>>>>>>>> On 11/28/2012 12:25 PM, Maarten Lankhorst wrote:
>>>>>>>>>> By removing the unlocking of lru and retaking it immediately, a race is
>>>>>>>>>> removed where the bo is taken off the swap list or the lru list between
>>>>>>>>>> the unlock and relock. As such the cleanup_refs code can be simplified:
>>>>>>>>>> it will attempt to call ttm_bo_wait non-blockingly, and if it fails
>>>>>>>>>> it will drop the locks and perform a blocking wait, or return an error
>>>>>>>>>> if no_wait_gpu was set.
>>>>>>>>>>
>>>>>>>>>> The need for looping is also eliminated: since swapout and evict_mem_first
>>>>>>>>>> will always follow the destruction path, no new fence is allowed
>>>>>>>>>> to be attached. As far as I can see this may already have been the case,
>>>>>>>>>> but the unlocking / relocking required a complicated loop to deal with
>>>>>>>>>> re-reservation.
>>>>>>>>>>
>>>>>>>>>> The downside is that ttm_bo_cleanup_memtype_use is no longer called with
>>>>>>>>>> reservation held, so drivers must be aware that move_notify with a null
>>>>>>>>>> parameter doesn't require a reservation.
>>>>>>>>> Why can't we unreserve *after* ttm_bo_cleanup_memtype_use? That's not
>>>>>>>>> immediately clear from this patch.
>>>>>>>> Because we would hold the reservation while waiting, with the object still
>>>>>>>> on the swap and lru lists; that would defeat the whole purpose of keeping
>>>>>>>> the object on multiple lists, plus break current code that assumes bo's on
>>>>>>>> those lists can always be reserved.
>>>>>>>>
>>>>>>>> The if (ret && !no_wait_gpu) path has to drop the reservation and lru lock, and
>>>>>>>> isn't guaranteed to be able to retake it. Maybe it could be guaranteed now, but
>>>>>>>> I'm sure that would not be the case if the reservations were shared across
>>>>>>>> devices.
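
(To spell out the invariant referenced above: the lru/swap list walkers pick a
bo off a list and then reserve it, so a bo on those lists that could never be
reserved would wedge them. A minimal sketch of that pattern, loosely condensed
from ttm_bo_swapout() and untested:)

	struct ttm_buffer_object *bo;
	int ret = -EBUSY;

	spin_lock(&glob->lru_lock);
	list_for_each_entry(bo, &glob->swap_lru, swap) {
		/* every bo on the list is assumed to become reservable */
		ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
		if (!ret)
			break;
	}
	spin_unlock(&glob->lru_lock);
	/* ... swap out the reserved bo ... */
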
>>>>>>> The evict path removes the BO from the LRU lists, drops the LRU lock but hangs on to the reservation,
>>>>>>> and in case the wait goes wrong, re-adds the bo to the LRU lists and returns an error.
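
(In code, that evict pattern looks roughly like this; condensed, hypothetical
and untested:)

	put_count = ttm_bo_del_from_lru(bo);	/* off the lists... */
	spin_unlock(&glob->lru_lock);		/* ...drop the lru lock... */
	ttm_bo_list_ref_sub(bo, put_count, true);

	/* ...but keep the reservation across the wait */
	ret = ttm_bo_wait(bo, false, interruptible, no_wait_gpu);
	if (unlikely(ret)) {
		/* wait went wrong: re-add to the lru lists, return error */
		spin_lock(&glob->lru_lock);
		ttm_bo_add_to_lru(bo);
		spin_unlock(&glob->lru_lock);
		return ret;
	}
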
>>>>>> If you really want to, we could hang on to the reservation in the !no_wait_gpu path; the wait
>>>>>> shouldn't ever fail there, so I suppose leaving the bo off the lru lists and not re-adding it to any
>>>>>> list in case the wait fails is fine. It's still on the ddestroy list in that case, so not adding it
>>>>>> back to the other lists is harmless.
>>>>>>
>>>>> Well, I'm a bit afraid that, theoretically, other callers may have a bo reserved while cleanup_refs_and_unlock
>>>>> more or less runs the whole destroy path on that buffer. Sure, we have control over those other reservers,
>>>>> but it may come back and bite us.
>>>> That's why I initially moved all the destruction to ttm_bo_release_list, to have all destruction in
>>>> only one place. But even now it's serialized by the lru lock; while the destruction may not happen
>>>> right away, it still happens before the last list ref to the bo is dropped.
>>>>
>>>> But it's your call, just choose the approach you want and I'll resubmit this. :-)
>>>>
>>>>> Also the wait might fail if a signal is hit, so it's definitely possible, and even likely in the case of the X server process.
>>>>>
>>>>> Anyway, I'd prefer it if we could try to keep the reservation across the ttm_cleanup_memtype_use function, and as far
>>>>> as I can tell, the only thing preventing that is the reservation release in the (!no_wait_gpu) path. So if we alter that to
>>>>> do the same as the evict path, I think, without looking too deeply into the consequences, that we should be safe.
>>>> I think it would be better to return success early, without removing the bo from the
>>>> ddestroy list, if re-reserving with the lru lock held fails.
>>>>
>>>> We completed the wait and attempted to reserve the bo, which failed. With the lru
>>>> lock atomicity in place, this can't happen, since the only places that would do it
>>>> call this with the lru lock held.
>>>>
>>>> With the atomicity removal, the only place that could do this is ttm_bo_delayed_delete
>>>> with remove_all set to true. And even if that happened, the destruction code would run
>>>> *anyway*, since we already completed the waiting part; it would just not necessarily
>>>> run from this thread, but that guarantee didn't exist anyway.
>>>>> Then we should be able to skip patch 2 as well.
>>>> If my tryreserve approach sounds sane, the second patch should still be skippable. :-)
>>> Sure, let's go for that approach.
>> Ok, updated patch below; you only need to check whether you like the approach or not,
>> since I haven't tested it beyond compiling yet. The rest of the series (minus patch 2)
>> should still apply without modification.
>>
>> drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held, v2
>> By removing the unlocking of lru and retaking it immediately, a race is
>> removed where the bo is taken off the swap list or the lru list between
>> the unlock and relock. As such the cleanup_refs code can be simplified:
>> it will attempt to call ttm_bo_wait non-blockingly, and if it fails
>> it will drop the locks and perform a blocking wait, or return an error
>> if no_wait_gpu was set.
>> The need for looping is also eliminated: since swapout and evict_mem_first
>> will always follow the destruction path, no new fence is allowed
>> to be attached. As far as I can see this may already have been the case,
>> but the unlocking / relocking required a complicated loop to deal with
>> re-reservation.
>> Changes since v1:
>> - Simplify no_wait_gpu case by folding it in with empty ddestroy.
>> - Hold a reservation while calling ttm_bo_cleanup_memtype_use again.
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 202fc20..e9f01fe 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -488,12 +488,16 @@ static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
>>  	ttm_bo_mem_put(bo, &bo->mem);
>>  	atomic_set(&bo->reserved, 0);
>> +	wake_up_all(&bo->event_queue);
>>  	/*
>> -	 * Make processes trying to reserve really pick it up.
>> +	 * Since the final reference to this bo may not be dropped by
>> +	 * the current task we have to put a memory barrier here to make
>> +	 * sure the changes done in this function are always visible.
>> +	 *
>> +	 * This function only needs protection against the final kref_put.
>>  	 */
>> -	smp_mb__after_atomic_dec();
>> -	wake_up_all(&bo->event_queue);
>> +	smp_mb__before_atomic_dec();
>>  }
>>  static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
>> @@ -543,68 +547,95 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
>>  }
>>  /**
>> - * function ttm_bo_cleanup_refs
>> + * function ttm_bo_cleanup_refs_and_unlock
>>   * If bo idle, remove from delayed- and lru lists, and unref.
>>   * If not idle, do nothing.
>>   *
>> + * Must be called with lru_lock and reservation held, this function
>> + * will drop both before returning.
>> + *
>>   * @interruptible         Any sleeps should occur interruptibly.
>> - * @no_wait_reserve       Never wait for reserve. Return -EBUSY instead.
>>   * @no_wait_gpu           Never wait for gpu. Return -EBUSY instead.
>>   */
>> -static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>> -			       bool interruptible,
>> -			       bool no_wait_reserve,
>> -			       bool no_wait_gpu)
>> +static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
>> +					  bool interruptible,
>> +					  bool no_wait_gpu)
>>  {
>>  	struct ttm_bo_device *bdev = bo->bdev;
>> +	struct ttm_bo_driver *driver = bdev->driver;
>>  	struct ttm_bo_global *glob = bo->glob;
>>  	int put_count;
>> -	int ret = 0;
>> +	int ret;
>> -retry:
>>  	spin_lock(&bdev->fence_lock);
>> -	ret = ttm_bo_wait(bo, false, interruptible, no_wait_gpu);
>> -	spin_unlock(&bdev->fence_lock);
>> +	ret = ttm_bo_wait(bo, false, false, true);
>> -	if (unlikely(ret != 0))
>> -		return ret;
>> +	if (ret && !no_wait_gpu) {
>> +		void *sync_obj;
>> -retry_reserve:
>> -	spin_lock(&glob->lru_lock);
>> +		/*
>> +		 * Take a reference to the fence and unreserve,
>> +		 * at this point the buffer should be dead, so
>> +		 * no new sync objects can be attached.
>> +		 */
>> +		sync_obj = driver->sync_obj_ref(&bo->sync_obj);
>> +		spin_unlock(&bdev->fence_lock);
>> -	if (unlikely(list_empty(&bo->ddestroy))) {
>> +		put_count = ttm_bo_del_from_lru(bo);
>> +
>> +		atomic_set(&bo->reserved, 0);
>> +		wake_up_all(&bo->event_queue);
>>  		spin_unlock(&glob->lru_lock);
>> -		return 0;
>> -	}
>> -	ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
>> +		ttm_bo_list_ref_sub(bo, put_count, true);
>> -	if (unlikely(ret == -EBUSY)) {
>> -		spin_unlock(&glob->lru_lock);
>> -		if (likely(!no_wait_reserve))
>> -			ret = ttm_bo_wait_unreserved(bo, interruptible);
>> -		if (unlikely(ret != 0))
>> +		ret = driver->sync_obj_wait(sync_obj, false, interruptible);
>> +		driver->sync_obj_unref(&sync_obj);
>> +		if (ret) {
>> +			/*
>> +			 * Either the wait returned -ERESTARTSYS, or -EDEADLK
>> +			 * (radeon lockup) here. No effort is made to re-add
>> +			 * this bo to any lru list. Instead the bo only
>> +			 * appears on the delayed destroy list.
>> +			 */
>>  			return ret;
>> +		}
> Actually, we *must* re-add the bo to LRU lists here, because otherwise when a driver needs
> to evict a memory type completely, there's a large chance it will miss this bo.
>
> So I think either we need to keep the reservation, or keep the bo on the LRU lists.
The second option is what v1 did, except I never bothered to re-take the reservation. ;-)
It shouldn't cause trouble to leave it on the lru lists if we drop the reservation;
we can keep handling re-reservation failure in the same way as in v2.
In that case v3 would be the same as v2 of this patch, except with these two lines from the
ret && !no_wait_gpu branch removed:
	put_count = ttm_bo_del_from_lru(bo);
	ttm_bo_list_ref_sub(bo, put_count, true);
And of course the comment after sync_obj_wait failure would no longer apply.
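
I.e. the branch would then read roughly as follows (untested sketch of v3,
derived from the v2 hunk above):

	if (ret && !no_wait_gpu) {
		void *sync_obj;

		/*
		 * Take a reference to the fence and unreserve; unlike v2
		 * the bo stays on the lru lists, so eviction can still
		 * find it.
		 */
		sync_obj = driver->sync_obj_ref(&bo->sync_obj);
		spin_unlock(&bdev->fence_lock);

		atomic_set(&bo->reserved, 0);
		wake_up_all(&bo->event_queue);
		spin_unlock(&glob->lru_lock);

		ret = driver->sync_obj_wait(sync_obj, false, interruptible);
		driver->sync_obj_unref(&sync_obj);
		if (ret)
			return ret;
		/* ... retake the lru lock and re-reserve, as discussed ... */
	}
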
~Maarten