[PATCH 3/6] drm/ttm: call ttm_bo_cleanup_refs with reservation and lru lock held

Thomas Hellstrom thellstrom at vmware.com
Wed Nov 28 05:21:00 PST 2012


On 11/28/2012 01:15 PM, Maarten Lankhorst wrote:
> Op 28-11-12 12:54, Thomas Hellstrom schreef:
>> On 11/28/2012 12:25 PM, Maarten Lankhorst wrote:
>>> By removing the unlocking of lru and retaking it immediately, a race is
>>> removed where the bo is taken off the swap list or the lru list between
>>> the unlock and relock. As such the cleanup_refs code can be simplified,
>>> it will attempt a non-blocking ttm_bo_wait; if that fails, it will
>>> drop the locks and perform a blocking wait, or return an error
>>> if no_wait_gpu was set.
>>>
>>> The need for looping is also eliminated: since swapout and evict_mem_first
>>> always follow the destruction path, no new fence can be attached. As far
>>> as I can see this may already have been the case, but the unlocking /
>>> relocking required a complicated loop to deal with re-reservation.
>>>
>>> The downside is that ttm_bo_cleanup_memtype_use is no longer called with
>>> the reservation held, so drivers must be aware that move_notify with a
>>> NULL parameter doesn't require a reservation.
>> Why can't we unreserve *after* ttm_bo_cleanup_memtype_use? That's not
>> immediately clear from this patch.
> Because we would hold the reservation while waiting, with the object still
> on the swap and LRU lists; that would defeat the whole purpose of keeping
> the object on multiple lists, and break current code that assumes BOs on
> those lists can always be reserved.
>
> The if (ret && !no_wait_gpu) path has to drop the reservation and the LRU
> lock, and isn't guaranteed to be able to retake them. Maybe that could be
> guaranteed now, but I'm sure it would not be the case if reservations were
> shared across devices.
The evict path removes the BO from the LRU lists, drops the LRU lock but
hangs on to the reservation, and in case the wait goes wrong, re-adds the
BO to the LRU lists and returns an error.

Isn't it possible to do the same in the !no_wait_gpu case?
/Thomas

