[PATCH] ttm: wait mem space if user allow while gpu busy

Christian König ckoenig.leichtzumerken at gmail.com
Wed Apr 24 07:12:20 UTC 2019


> how about just adding a wrapper for pin function as below?
I considered this as well and don't think it will work reliably.

We could use it as a band-aid for this specific problem, but in general
we need to improve the handling in TTM to resolve these kinds of resource
conflicts.

Regards,
Christian.

On 23.04.19 at 17:09, Zhou, David(ChunMing) wrote:
> >3. If we have a ticket we grab a reference to the first BO on the 
> LRU, drop the LRU lock and try to grab the reservation lock with the 
> ticket.
>
> The BO on the LRU is already locked by the CS user; can it be dropped
> here by the DC user? And once the DC user grabs its lock with the
> ticket, how does CS grab it again?
>
> If you think waiting in ttm has this risk, how about just adding a 
> wrapper for pin function as below?
> amdgpu_get_pin_bo_timeout()
> {
>         do {
>                 amdgpu_bo_reserve();
>                 r = amdgpu_bo_pin();
>                 if (!r)
>                         break;
>                 amdgpu_bo_unreserve();
>                 timeout--;
>         } while (timeout > 0);
> }
>
> -------- Original Message --------
> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
> From: Christian König
> To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Liang, Prike" 
> ,dri-devel at lists.freedesktop.org
> CC:
>
> Well, that's not so easy offhand.
>
> The basic problem here is that when you busy wait at this place you 
> can easily run into situations where application A busy waits for B 
> while B busy waits for A -> deadlock.
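
The circular wait described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct app`, `try_progress()` and `run_busy_wait()` are hypothetical names invented for the illustration and have nothing to do with the TTM code, and the two "applications" are stepped in turn rather than running concurrently.

```c
#include <stdbool.h>

/* Hypothetical model: each app holds one memory region and wants another. */
struct app {
    int holds;   /* region this app currently occupies */
    int wants;   /* region this app is busy waiting for */
    bool done;   /* set once the app got what it wanted */
};

/* One busy-wait iteration: progress only if the other app no longer
 * occupies the region we want. */
static bool try_progress(struct app *self, const struct app *other)
{
    if (!other->done && other->holds == self->wants)
        return false;            /* still occupied, keep spinning */
    self->done = true;           /* a finished app frees its region */
    return true;
}

/* Returns true if both apps finish within max_rounds of busy waiting. */
static bool run_busy_wait(struct app *a, struct app *b, int max_rounds)
{
    for (int i = 0; i < max_rounds && !(a->done && b->done); i++) {
        if (!a->done)
            try_progress(a, b);
        if (!b->done)
            try_progress(b, a);
    }
    return a->done && b->done;
}
```

When A holds what B wants and B holds what A wants, no number of rounds helps, because neither side ever backs off. That missing back-off is what the ww_mutex ticket is meant to provide.
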
>
> So what we need here is the deadlock detection logic of the ww_mutex. 
> To use this we at least need to do the following steps:
>
> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
>
> 2. If we then run into this EBUSY condition in TTM check if the BO we 
> need memory for (or rather the ww_mutex of its reservation object) has 
> a ticket assigned.
>
> 3. If we have a ticket we grab a reference to the first BO on the LRU, 
> drop the LRU lock and try to grab the reservation lock with the ticket.
>
> 4. If getting the reservation lock with the ticket succeeded we check 
> if the BO is still the first one on the LRU in question (the BO could 
> have moved).
>
> 5. If the BO is still the first one on the LRU in question we try to 
> evict it as we would evict any other BO.
>
> 6. If any of the "If's" above fail we just back off and return -EBUSY.
>
> Steps 2-5 are certainly not trivial, but doable as far as I can see.
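
Steps 2-6 above can be sketched as a single-threaded userspace model. It is not kernel code: `struct ticket`, `struct bo`, `struct lru`, `resv_lock_ticket()` and `evict_first_with_ticket()` are hypothetical stand-ins invented for this illustration. It also assumes that a contended reservation always makes the caller back off, whereas a real ww_mutex may sleep or return -EDEADLK depending on ticket age.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these types are the kernel's. */
struct ticket { unsigned long stamp; };  /* stands in for a ww_acquire_ctx */

struct bo {
    const struct ticket *resv_owner;     /* reservation holder, NULL if free */
    int refcount;
    bool evicted;
};

struct lru { struct bo *first; };        /* only the LRU head matters here */

/* Models taking the reservation with a ticket; the model cannot sleep. */
static int resv_lock_ticket(struct bo *bo, const struct ticket *t)
{
    if (!bo->resv_owner) {
        bo->resv_owner = t;
        return 0;
    }
    if (bo->resv_owner == t)
        return -EALREADY;                /* we already hold this one */
    return -EDEADLK;                     /* assumption: contended means back off */
}

static void resv_unlock(struct bo *bo) { bo->resv_owner = NULL; }

static void bo_put(struct bo *bo)
{
    assert(bo->refcount > 0);
    bo->refcount--;
}

/* Returns 0 if the LRU head was evicted, -EBUSY on any back-off (step 6). */
static int evict_first_with_ticket(struct lru *lru, const struct ticket *t)
{
    struct bo *bo;

    if (!t)                              /* step 2: no ticket assigned */
        return -EBUSY;

    bo = lru->first;                     /* step 3: reference the LRU head */
    if (!bo)
        return -EBUSY;
    bo->refcount++;
    /* the LRU lock would be dropped here before taking the reservation */
    if (resv_lock_ticket(bo, t)) {
        bo_put(bo);
        return -EBUSY;
    }

    if (lru->first != bo) {              /* step 4: the BO could have moved */
        resv_unlock(bo);
        bo_put(bo);
        return -EBUSY;
    }

    bo->evicted = true;                  /* step 5: evict like any other BO */
    lru->first = NULL;
    resv_unlock(bo);
    bo_put(bo);
    return 0;
}
```

Because the model is single-threaded, the check in step 4 can never fail here. It is kept only to show where the re-validation would sit once the LRU lock has been dropped and retaken.
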
>
> Have fun :)
> Christian.
>
> On 23.04.19 at 15:19, Zhou, David(ChunMing) wrote:
>> How about adding a further condition on ctx->resv inline to address
>> your concern? Combined with not waiting on the same user, that
>> shouldn't lead to deadlock.
>>
>> Otherwise, any other idea?
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
>> From: Christian König
>> To: "Liang, Prike" ,"Zhou, David(ChunMing)" 
>> ,dri-devel at lists.freedesktop.org
>> CC:
>>
>> Well that is certainly a NAK because it can lead to deadlock in the
>> memory management.
>>
>> You can't just busy wait with all those locks held.
>>
>> Regards,
>> Christian.
>>
>> On 23.04.19 at 03:45, Liang, Prike wrote:
>> > Acked-by: Prike Liang <Prike.Liang at amd.com>
>> >
>> > Thanks,
>> > Prike
>> > -----Original Message-----
>> > From: Chunming Zhou <david1.zhou at amd.com>
>> > Sent: Monday, April 22, 2019 6:39 PM
>> > To: dri-devel at lists.freedesktop.org
>> > Cc: Liang, Prike <Prike.Liang at amd.com>; Zhou, David(ChunMing) <David1.Zhou at amd.com>
>> > Subject: [PATCH] ttm: wait mem space if user allow while gpu busy
>> >
>> > A heavy GPU job can occupy memory for a long time, which can cause
>> > other users to fail to get memory.
>> >
>> > Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11
>> > Signed-off-by: Chunming Zhou <david1.zhou at amd.com>
>> > ---
>> >   drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++--
>> >   1 file changed, 4 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> > index 7c484729f9b2..6c596cc24bec 100644
>> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> > @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>> >                if (mem->mm_node)
>> >                        break;
>> >                ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
>> > -             if (unlikely(ret != 0))
>> > -                     return ret;
>> > +             if (unlikely(ret != 0)) {
>> > +                     if (!ctx || ctx->no_wait_gpu || ret != -EBUSY)
>> > +                             return ret;
>> > +             }
>> >        } while (1);
>> >        mem->mem_type = mem_type;
>> >        return ttm_bo_add_move_fence(bo, man, mem);
>> > --
>> > 2.17.1
>> >
>> > _______________________________________________
>> > dri-devel mailing list
>> > dri-devel at lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
