[PATCH] ttm: wait mem space if user allow while gpu busy

Koenig, Christian Christian.Koenig at amd.com
Wed Apr 24 08:06:07 UTC 2019


On 24.04.19 at 10:01, Daniel Vetter wrote:
> On Tue, Apr 23, 2019 at 4:42 PM Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
>> Well that's not so easy off hand.
>>
>> The basic problem here is that when you busy wait at this place you can easily run into situations where application A busy waits for B while B busy waits for A -> deadlock.
>>
>> So what we need here is the deadlock detection logic of the ww_mutex. To use this we at least need to do the following steps (a rough sketch follows after the list):
>>
>> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
>>
>> 2. If we then run into this EBUSY condition in TTM check if the BO we need memory for (or rather the ww_mutex of its reservation object) has a ticket assigned.
>>
>> 3. If we have a ticket we grab a reference to the first BO on the LRU, drop the LRU lock and try to grab the reservation lock with the ticket.
>>
>> 4. If getting the reservation lock with the ticket succeeded we check if the BO is still the first one on the LRU in question (the BO could have moved).
> I don't think you actually need this check. Once you're in this slow
> reclaim mode all hope for performance is pretty much lost (you're
> thrashing VRAM terribly), forward progress matters. Also, less code :-)

This step is not for performance, but for correctness.

When you drop the LRU lock and grab the ww_mutex in the slow path you 
need to double check that this BO wasn't evicted by somebody else in the 
meantime.

>> 5. If the BO is still the first one on the LRU in question we try to evict it as we would evict any other BO.
>>
>> 6. If any of the "If's" above fail we just back off and return -EBUSY.
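
For reference, a rough sketch of steps 2-6, in the spirit of the TTM code of
that time (it would notionally live in drivers/gpu/drm/ttm/ttm_bo.c). The
helper first_bo_on_lru() and the function name ttm_mem_evict_first_slow()
are made-up placeholders, "ticket" is the ww_acquire_ctx found in step 2,
and details like bdev->glob or bo->resv differ between kernel versions;
treat this as an illustration of the idea, not a tested implementation:

/* Slow path, only entered when the caller has a ww_acquire_ctx ("ticket"). */
static int ttm_mem_evict_first_slow(struct ttm_bo_device *bdev,
                                    struct ttm_mem_type_manager *man,
                                    struct ww_acquire_ctx *ticket,
                                    struct ttm_operation_ctx *ctx)
{
        struct ttm_buffer_object *bo;
        int ret;

        /* Step 3: grab a reference to the first BO on the LRU ... */
        spin_lock(&bdev->glob->lru_lock);
        bo = first_bo_on_lru(man);              /* placeholder */
        if (!bo) {
                spin_unlock(&bdev->glob->lru_lock);
                return -EBUSY;
        }
        ttm_bo_get(bo);
        /* ... drop the LRU lock and grab the reservation with the ticket. */
        spin_unlock(&bdev->glob->lru_lock);

        ret = ww_mutex_lock_interruptible(&bo->resv->lock, ticket);
        if (ret) {
                /* Step 6: -EDEADLK means the ww_mutex machinery wants us to back off. */
                ttm_bo_put(bo);
                return ret == -EDEADLK ? -EBUSY : ret;
        }

        /* Step 4: the BO could have moved or been evicted while we slept. */
        spin_lock(&bdev->glob->lru_lock);
        if (bo != first_bo_on_lru(man)) {       /* placeholder */
                spin_unlock(&bdev->glob->lru_lock);
                ww_mutex_unlock(&bo->resv->lock);
                ttm_bo_put(bo);
                return -EBUSY;                  /* step 6: back off */
        }
        spin_unlock(&bdev->glob->lru_lock);

        /* Step 5: evict it as we would evict any other BO. */
        ret = ttm_bo_evict(bo, ctx);

        ww_mutex_unlock(&bo->resv->lock);
        ttm_bo_put(bo);
        return ret;
}
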
> Another idea I pondered (but never implemented) is a slow reclaim lru
> lock. Essentially there'd be two ways to walk the lru and evict bo:
>
> - fast path: spinlock + trylock, like today
>
> - slow path: ww_mutex lru lock, plus every bo is reserved (nested
> within that ww_mutex lru lock) with a full ww_mutex_lock. Guaranteed
> forward progress.

Off hand I don't see any advantage to this. So what is the benefit?

Regards,
Christian.

>
> The transition would be that as soon as someone hits an EBUSY, they set
> the slow reclaim flag (while briefly holding the fast reclaim spinlock,
> which will drain anyone still stuck in the fast reclaim path).
> Every time fast reclaim acquires the spinlock it needs to check for the
> slow reclaim flag, and if that's set, fall back to slow reclaim.
>
> Transitioning out of slow reclaim would only happen once the thread
> (with its ww ticket) that hit the EBUSY has completed whatever it was
> trying to do (either successfully, or unsuccessfully because even evicting
> everyone else didn't give you enough vram). The tricky part here is making
> sure threads still in slow reclaim don't blow up if we switch back.
> Since only ever one thread can actually be doing slow reclaim
> (everyone is serialized on the single ww_mutex lru lock), this should be
> doable by checking for the slow reclaim condition once you have the
> lru ww_mutex and, if the slow reclaim condition has been lifted,
> switching back to fast reclaim.
>
> The slow reclaim condition might also need to be a full reference
> count, to handle multiple threads hitting EBUSY/slow reclaim without
> the book-keeping getting all confused.
>
> The upshot of this is that it guarantees forward progress, but the perf
> cliff might be too steep if this happens too often. You might need to
> round it off with 1-2 retries when you hit EBUSY before forcing slow
> reclaim.
> -Daniel
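
To make the dual-mode idea above a bit more concrete, here is a very rough
sketch of the book-keeping. None of these names (struct ttm_reclaim,
ttm_reclaim_fast/slow, slow_count) exist in TTM; the LRU walks themselves
are elided and the transition logic is simplified, so take this purely as a
sketch of the scheme being discussed:

/* Hypothetical book-keeping for dual-mode reclaim; not actual TTM code. */
struct ttm_reclaim {
        spinlock_t      fast_lock;   /* fast path: spinlock + trylock, as today */
        struct ww_mutex slow_lock;   /* slow path LRU lock; would share
                                      * reservation_ww_class so bo reservations
                                      * can nest under it with one ticket */
        unsigned int    slow_count;  /* EBUSY hitters, protected by fast_lock */
};

static int ttm_reclaim_slow(struct ttm_reclaim *r, struct ww_acquire_ctx *ticket);

/* Fast path: check for the slow reclaim condition every time we take the lock. */
static int ttm_reclaim_fast(struct ttm_reclaim *r, struct ww_acquire_ctx *ticket)
{
        spin_lock(&r->fast_lock);
        if (r->slow_count) {
                spin_unlock(&r->fast_lock);
                return ttm_reclaim_slow(r, ticket);     /* fall back */
        }
        /* ... walk the LRU with ww_mutex_trylock() on each bo, as today ... */
        spin_unlock(&r->fast_lock);
        return 0;
}

/* Entered once the fast path hit -EBUSY. */
static int ttm_reclaim_slow(struct ttm_reclaim *r, struct ww_acquire_ctx *ticket)
{
        int ret;

        spin_lock(&r->fast_lock);
        r->slow_count++;             /* a refcount, so concurrent EBUSY hitters
                                      * don't confuse the book-keeping */
        spin_unlock(&r->fast_lock);  /* cycling the spinlock drains the fast path */

        ret = ww_mutex_lock_interruptible(&r->slow_lock, ticket);
        if (!ret) {
                /*
                 * ... walk the LRU, reserving every bo with a full
                 * ww_mutex_lock() nested under slow_lock: guaranteed
                 * forward progress, at the cost of a steep perf cliff ...
                 */
                ww_mutex_unlock(&r->slow_lock);
        }

        spin_lock(&r->fast_lock);
        r->slow_count--;             /* last one out re-enables fast reclaim */
        spin_unlock(&r->fast_lock);
        return ret;
}
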
>
>> Steps 2-5 are certainly not trivial, but doable as far as I can see.
>>
>> Have fun :)
>> Christian.
>>
>> On 23.04.19 at 15:19, Zhou, David(ChunMing) wrote:
>>
>> How about adding an additional ctx->resv condition inline to address your concern? Together with not waiting on BOs from the same user, that shouldn't lead to deadlock.
>>
>> Otherwise, any other idea?
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
>> From: Christian König
>> To: "Liang, Prike", "Zhou, David(ChunMing)", dri-devel at lists.freedesktop.org
>> CC:
>>
>> Well that is certainly a NAK because it can lead to deadlock in the
>> memory management.
>>
>> You can't just busy wait with all those locks held.
>>
>> Regards,
>> Christian.
>>
>> On 23.04.19 at 03:45, Liang, Prike wrote:
>>> Acked-by: Prike Liang <Prike.Liang at amd.com>
>>>
>>> Thanks,
>>> Prike
>>> -----Original Message-----
>>> From: Chunming Zhou <david1.zhou at amd.com>
>>> Sent: Monday, April 22, 2019 6:39 PM
>>> To: dri-devel at lists.freedesktop.org
>>> Cc: Liang, Prike <Prike.Liang at amd.com>; Zhou, David(ChunMing) <David1.Zhou at amd.com>
>>> Subject: [PATCH] ttm: wait mem space if user allow while gpu busy
>>>
>>> A heavy GPU job can occupy memory for a long time, which can lead to other users failing to get memory.
>>>
>>> Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11
>>> Signed-off-by: Chunming Zhou <david1.zhou at amd.com>
>>> ---
>>>    drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++--
>>>    1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 7c484729f9b2..6c596cc24bec 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>>>                 if (mem->mm_node)
>>>                         break;
>>>                 ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
>>> -             if (unlikely(ret != 0))
>>> -                     return ret;
>>> +             if (unlikely(ret != 0)) {
>>> +                     if (!ctx || ctx->no_wait_gpu || ret != -EBUSY)
>>> +                             return ret;
>>> +             }
>>>         } while (1);
>>>         mem->mem_type = mem_type;
>>>         return ttm_bo_add_move_fence(bo, man, mem);
>>> --
>>> 2.17.1
>>>