[PATCH] ttm: wait mem space if user allow while gpu busy

zhoucm1 zhoucm1 at amd.com
Wed Apr 24 07:17:51 UTC 2019


I have attached a patch.

I'm not sure how to implement your proposal. Could you write a patch 
for it if mine isn't enough?

-David

On 2019-04-24 15:12, Christian König wrote:
>> how about just adding a wrapper for pin function as below?
> I considered this as well and don't think it will work reliably.
>
> We could use it as a band-aid for this specific problem, but in 
> general we need to improve the handling in TTM to resolve these kinds 
> of resource conflicts.
>
> Regards,
> Christian.
>
> On 23.04.19 at 17:09, Zhou, David(ChunMing) wrote:
>> >3. If we have a ticket we grab a reference to the first BO on the 
>> LRU, drop the LRU lock and try to grab the reservation lock with the 
>> ticket.
>>
>> The BO on the LRU is already locked by the CS user; can it be dropped 
>> here by the DC user? And if the DC user then grabs its lock with a 
>> ticket, how does the CS user grab it again?
>>
>> If you think waiting in TTM carries this risk, how about just adding a 
>> wrapper for the pin function, as below?
>> amdgpu_get_pin_bo_timeout()
>> {
>>         do {
>>                 amdgpu_bo_reserve();
>>                 r = amdgpu_bo_pin();
>>
>>                 if (!r)
>>                         break;
>>                 /* pin failed: drop the reservation before retrying */
>>                 amdgpu_bo_unreserve();
>>                 timeout--;
>>
>>         } while (timeout > 0);
>>
>> }
>>
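
The wrapper proposed above can be modelled in plain userspace C to make its
retry behaviour concrete. This is only a sketch under stated assumptions:
struct model_bo and the model_* helpers are hypothetical stand-ins, not the
amdgpu functions, and unlike the pseudocode the model also drops the
reservation on success so that nothing stays held between attempts. Note
that the reply at the top of this thread considers such a wrapper at most a
band-aid.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object; not the amdgpu structure. */
struct model_bo {
	bool reserved;      /* models the reservation lock being held */
	int  busy_attempts; /* pin fails with -EBUSY this many more times */
	bool pinned;
};

static int model_reserve(struct model_bo *bo)
{
	if (bo->reserved)
		return -EBUSY;
	bo->reserved = true;
	return 0;
}

static void model_unreserve(struct model_bo *bo)
{
	bo->reserved = false;
}

static int model_pin(struct model_bo *bo)
{
	if (bo->busy_attempts > 0) {
		bo->busy_attempts--;
		return -EBUSY;
	}
	bo->pinned = true;
	return 0;
}

/*
 * Try to pin up to max_tries times. The reservation is dropped after every
 * attempt, so no lock is held while waiting for memory to become free.
 */
static int model_pin_timeout(struct model_bo *bo, int max_tries)
{
	int r = -EBUSY;

	while (max_tries-- > 0) {
		r = model_reserve(bo);
		if (r)
			continue; /* could not reserve; counts as one try */
		r = model_pin(bo);
		model_unreserve(bo);
		if (r != -EBUSY)
			break;    /* success, or an error that retrying won't fix */
	}
	return r;
}
```

In this model a BO that stays busy longer than max_tries attempts still
fails with -EBUSY, which matches the concern that the wrapper does not
resolve the underlying resource conflict.
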
>> -------- Original Message --------
>> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
>> From: Christian König
>> To: "Zhou, David(ChunMing)", "Koenig, Christian", "Liang, Prike",
>> dri-devel at lists.freedesktop.org
>> CC:
>>
>> Well that's not so easy offhand.
>>
>> The basic problem here is that when you busy wait at this place you 
>> can easily run into situations where application A busy waits for B 
>> while B busy waits for A -> deadlock.
>>
>> So what we need here is the deadlock detection logic of the ww_mutex. 
>> To use this we at least need to do the following steps:
>>
>> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
>>
>> 2. If we then run into this EBUSY condition in TTM check if the BO we 
>> need memory for (or rather the ww_mutex of its reservation object) 
>> has a ticket assigned.
>>
>> 3. If we have a ticket we grab a reference to the first BO on the 
>> LRU, drop the LRU lock and try to grab the reservation lock with the 
>> ticket.
>>
>> 4. If getting the reservation lock with the ticket succeeded we check 
>> if the BO is still the first one on the LRU in question (the BO could 
>> have moved).
>>
>> 5. If the BO is still the first one on the LRU in question we try to 
>> evict it as we would evict any other BO.
>>
>> 6. If any of the "ifs" above fails we just back off and return -EBUSY.
>>
>> Steps 2-5 are certainly not trivial, but doable as far as I can see.
>>
>> Have fun :)
>> Christian.
>>
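
Steps 2 to 6 above can be walked through with a small single-threaded
userspace model. This is a sketch under assumptions, not TTM code: struct
model_bo, struct model_lru and every model_* function are hypothetical, a
ticket is reduced to a non-zero number, and the LRU lock is not modelled at
all. A real ww_mutex may sleep or return -EDEADLK on contention depending on
ticket age; the model cannot sleep, so it treats any held lock as a reason
to back off. The race hook stands in for other threads running while the
LRU lock is dropped.

```c
#include <errno.h>
#include <stddef.h>

#define NO_TICKET 0UL

/* Hypothetical BO model: resv_owner is the ticket holding its reservation. */
struct model_bo {
	unsigned long resv_owner; /* NO_TICKET when unlocked */
	int refcount;
};

struct model_lru {
	struct model_bo **entries; /* entries[0] is the first BO on the LRU */
	size_t count;
};

/* Example race hook: another thread moves the first BO to the LRU tail. */
static void model_move_first_to_tail(struct model_lru *lru)
{
	struct model_bo *first;
	size_t i;

	if (lru->count < 2)
		return;
	first = lru->entries[0];
	for (i = 1; i < lru->count; i++)
		lru->entries[i - 1] = lru->entries[i];
	lru->entries[lru->count - 1] = first;
}

static int model_evict_first(struct model_lru *lru, unsigned long ticket,
			     void (*race)(struct model_lru *))
{
	struct model_bo *bo;
	int ret = -EBUSY;
	size_t i;

	/* Step 2: without a ticket there is no deadlock detection. */
	if (ticket == NO_TICKET || lru->count == 0)
		return -EBUSY;

	/* Step 3: take a reference, then the LRU lock would be dropped. */
	bo = lru->entries[0];
	bo->refcount++;
	if (race)
		race(lru);

	/* Step 3, continued: try to take the reservation with the ticket. */
	if (bo->resv_owner != NO_TICKET)
		goto out_put; /* contended: back off (step 6) */
	bo->resv_owner = ticket;

	/* Step 4: the BO may have moved while the LRU lock was dropped. */
	if (lru->count == 0 || lru->entries[0] != bo)
		goto out_unlock;

	/* Step 5: evict it, modelled here as removing it from the LRU. */
	for (i = 1; i < lru->count; i++)
		lru->entries[i - 1] = lru->entries[i];
	lru->count--;
	ret = 0;

out_unlock:
	bo->resv_owner = NO_TICKET;
out_put:
	bo->refcount--;
	return ret;
}
```

Every failure path releases what it acquired and returns -EBUSY, which is
the "back off" of step 6; the caller keeps no lock on the LRU entry
afterwards.
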
>> On 23.04.19 at 15:19, Zhou, David(ChunMing) wrote:
>>> How about adding a further condition on ctx->resv inline to address 
>>> your concern? If we also don't wait on the same user, it shouldn't 
>>> lead to deadlock.
>>>
>>> Otherwise, any other idea?
>>>
>>> -------- Original Message --------
>>> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
>>> From: Christian König
>>> To: "Liang, Prike", "Zhou, David(ChunMing)",
>>> dri-devel at lists.freedesktop.org
>>> CC:
>>>
>>> Well that is certainly a NAK because it can lead to deadlock in the
>>> memory management.
>>>
>>> You can't just busy wait with all those locks held.
>>>
>>> Regards,
>>> Christian.
>>>
>>> On 23.04.19 at 03:45, Liang, Prike wrote:
>>> > Acked-by: Prike Liang <Prike.Liang at amd.com>
>>> >
>>> > Thanks,
>>> > Prike
>>> > -----Original Message-----
>>> > From: Chunming Zhou <david1.zhou at amd.com>
>>> > Sent: Monday, April 22, 2019 6:39 PM
>>> > To: dri-devel at lists.freedesktop.org
>>> > Cc: Liang, Prike <Prike.Liang at amd.com>; Zhou, David(ChunMing) 
>>> <David1.Zhou at amd.com>
>>> > Subject: [PATCH] ttm: wait mem space if user allow while gpu busy
>>> >
>>> > A heavy GPU job can occupy memory for a long time, which can cause 
>>> other users to fail to get memory.
>>> >
>>> > Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11
>>> > Signed-off-by: Chunming Zhou <david1.zhou at amd.com>
>>> > ---
>>> >   drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++--
>>> >   1 file changed, 4 insertions(+), 2 deletions(-)
>>> >
>>> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> > index 7c484729f9b2..6c596cc24bec 100644
>>> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> > @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>>> >                if (mem->mm_node)
>>> >                        break;
>>> >                ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
>>> > -             if (unlikely(ret != 0))
>>> > -                     return ret;
>>> > +             if (unlikely(ret != 0)) {
>>> > +                     if (!ctx || ctx->no_wait_gpu || ret != -EBUSY)
>>> > +                             return ret;
>>> > +             }
>>> >        } while (1);
>>> >        mem->mem_type = mem_type;
>>> >        return ttm_bo_add_move_fence(bo, man, mem);
>>> > --
>>> > 2.17.1
>>> >
>>> > _______________________________________________
>>> > dri-devel mailing list
>>> > dri-devel at lists.freedesktop.org
>>> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>
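
The condition added by the patch above can be isolated in a few lines of
userspace C to show when the loop returns and when it keeps retrying. This
is a sketch, not the TTM function: struct model_ctx is a hypothetical
stand-in carrying only the no_wait_gpu field the patch reads, and
model_retries assumes every eviction attempt returns the same value, which
is the worst case the NAK is concerned with. Only the error path is
modelled; the successful allocation that ends the real loop is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the operation context used by the patch. */
struct model_ctx {
	bool no_wait_gpu;
};

/* Mirrors the patched check: true means "return ret to the caller". */
static bool patched_loop_returns(const struct model_ctx *ctx, int ret)
{
	if (ret == 0)
		return false; /* eviction worked; the loop tries again */
	return !ctx || ctx->no_wait_gpu || ret != -EBUSY;
}

/*
 * Count how often the loop would go around if every eviction attempt
 * returned ret, giving up after cap iterations so the model terminates.
 */
static unsigned int model_retries(const struct model_ctx *ctx, int ret,
				  unsigned int cap)
{
	unsigned int n = 0;

	while (n < cap && !patched_loop_returns(ctx, ret))
		n++;
	return n;
}
```

With a context that allows waiting, a persistent -EBUSY is never returned,
so the model only stops at the artificial cap. The loop in the patch has no
such cap and runs with the caller's locks held.
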
>>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ttm-return-EBUSY-to-user.patch
Type: text/x-patch
Size: 944 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20190424/d1b59cdd/attachment-0001.bin>