[PATCH] ttm: wait mem space if user allow while gpu busy

zhoucm1 zhoucm1 at amd.com
Wed Apr 24 07:59:33 UTC 2019


How about the new attached patch?


-David


On 2019-04-24 15:30, Christian König wrote:
> That would change the semantics of ttm_bo_mem_space() and so could 
> change the return code in an IOCTL as well. Not a good idea, because 
> that could have a lot of side effects.
>
> Instead in the calling DC code you could check if you get an -ENOMEM 
> and then call schedule().
>
> If after the schedule() we see that there are now BOs on the LRU, we 
> can try again and see if pinning the frame buffer now succeeds.
>
> Christian.
>
> On 24.04.19 at 09:17, zhoucm1 wrote:
>>
>> I made a patch as attached.
>>
>> I'm not sure how to make a patch following your proposal. Could you 
>> make a patch for that if mine isn't enough?
>>
>> -David
>>
>> On 2019-04-24 15:12, Christian König wrote:
>>>> how about just adding a wrapper for pin function as below?
>>> I considered this as well and don't think it will work reliably.
>>>
>>> We could use it as a band-aid for this specific problem, but in 
>>> general we need to improve the handling in TTM to resolve those kinds 
>>> of resource conflicts.
>>>
>>> Regards,
>>> Christian.
>>>
>>> On 23.04.19 at 17:09, Zhou, David(ChunMing) wrote:
>>>> >3. If we have a ticket we grab a reference to the first BO on the 
>>>> LRU, drop the LRU lock and try to grab the reservation lock with 
>>>> the ticket.
>>>>
>>>> The BO on the LRU is already locked by the CS user; can it be 
>>>> dropped here by the DC user? And after the DC user grabs its lock 
>>>> with the ticket, how does CS grab it again?
>>>>
>>>> If you think waiting in TTM has this risk, how about just adding a 
>>>> wrapper for the pin function as below?
>>>> amdgpu_get_pin_bo_timeout()
>>>> {
>>>>         do {
>>>>                 amdgpu_bo_reserve();
>>>>                 r = amdgpu_bo_pin();
>>>>
>>>>                 if (!r)
>>>>                         break;
>>>>                 amdgpu_bo_unreserve();
>>>>                 timeout--;
>>>>         } while (timeout > 0);
>>>> }
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
>>>> From: Christian König
>>>> To: "Zhou, David(ChunMing)" ,"Koenig, Christian" ,"Liang, Prike" 
>>>> ,dri-devel at lists.freedesktop.org
>>>> CC:
>>>>
>>>> Well, that's not so easy offhand.
>>>>
>>>> The basic problem here is that when you busy wait at this place you 
>>>> can easily run into situations where application A busy waits for B 
>>>> while B busy waits for A -> deadlock.
>>>>
>>>> So what we need here is the deadlock detection logic of the 
>>>> ww_mutex. To use this we at least need to do the following steps:
>>>>
>>>> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
>>>>
>>>> 2. If we then run into this EBUSY condition in TTM check if the BO 
>>>> we need memory for (or rather the ww_mutex of its reservation 
>>>> object) has a ticket assigned.
>>>>
>>>> 3. If we have a ticket we grab a reference to the first BO on the 
>>>> LRU, drop the LRU lock and try to grab the reservation lock with 
>>>> the ticket.
>>>>
>>>> 4. If getting the reservation lock with the ticket succeeded we 
>>>> check if the BO is still the first one on the LRU in question (the 
>>>> BO could have moved).
>>>>
>>>> 5. If the BO is still the first one on the LRU in question we try 
>>>> to evict it as we would evict any other BO.
>>>>
>>>> 6. If any of the "If's" above fail we just back off and return -EBUSY.
>>>>
>>>> Steps 2-5 are certainly not trivial, but doable as far as I can see.
>>>>
>>>> Have fun :)
>>>> Christian.
>>>>
>>>> On 23.04.19 at 15:19, Zhou, David(ChunMing) wrote:
>>>>> How about adding an additional ctx->resv condition inline to 
>>>>> address your concern? Together with not waiting on BOs from the 
>>>>> same user, that shouldn't lead to a deadlock.
>>>>>
>>>>> Otherwise, any other idea?
>>>>>
>>>>> -------- Original Message --------
>>>>> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
>>>>> From: Christian König
>>>>> To: "Liang, Prike" ,"Zhou, David(ChunMing)" 
>>>>> ,dri-devel at lists.freedesktop.org
>>>>> CC:
>>>>>
>>>>> Well that is certainly a NAK because it can lead to deadlock in the
>>>>> memory management.
>>>>>
>>>>> You can't just busy wait with all those locks held.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> On 23.04.19 at 03:45, Liang, Prike wrote:
>>>>> > Acked-by: Prike Liang <Prike.Liang at amd.com>
>>>>> >
>>>>> > Thanks,
>>>>> > Prike
>>>>> > -----Original Message-----
>>>>> > From: Chunming Zhou <david1.zhou at amd.com>
>>>>> > Sent: Monday, April 22, 2019 6:39 PM
>>>>> > To: dri-devel at lists.freedesktop.org
>>>>> > Cc: Liang, Prike <Prike.Liang at amd.com>; Zhou, David(ChunMing) 
>>>>> <David1.Zhou at amd.com>
>>>>> > Subject: [PATCH] ttm: wait mem space if user allow while gpu busy
>>>>> >
>>>>> > A heavy GPU job could occupy memory for a long time, which could 
>>>>> lead to other users failing to get memory.
>>>>> >
>>>>> > Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11
>>>>> > Signed-off-by: Chunming Zhou <david1.zhou at amd.com>
>>>>> > ---
>>>>> >   drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++--
>>>>> >   1 file changed, 4 insertions(+), 2 deletions(-)
>>>>> >
>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> > index 7c484729f9b2..6c596cc24bec 100644
>>>>> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> > @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>>>>> >  		if (mem->mm_node)
>>>>> >  			break;
>>>>> >  		ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
>>>>> > -		if (unlikely(ret != 0))
>>>>> > -			return ret;
>>>>> > +		if (unlikely(ret != 0)) {
>>>>> > +			if (!ctx || ctx->no_wait_gpu || ret != -EBUSY)
>>>>> > +				return ret;
>>>>> > +		}
>>>>> >  	} while (1);
>>>>> >  	mem->mem_type = mem_type;
>>>>> >  	return ttm_bo_add_move_fence(bo, man, mem);
>>>>> > -- 
>>>>> > 2.17.1
>>>>> >
>>>>> > _______________________________________________
>>>>> > dri-devel mailing list
>>>>> > dri-devel at lists.freedesktop.org
>>>>> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amdgpu-wait-memory-idle-when-fb-is-pinned-fail.patch
Type: text/x-patch
Size: 2292 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20190424/e40cf1f9/attachment-0001.bin>

