[PATCH] ttm: wait mem space if user allow while gpu busy
Christian König
ckoenig.leichtzumerken at gmail.com
Wed Apr 24 07:12:20 UTC 2019
> how about just adding a wrapper for pin function as below?
I considered this as well and don't think it will work reliably.
We could use it as a band-aid for this specific problem, but in general
we need to improve the handling in TTM to resolve these kinds of resource
conflicts.
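
To make the wrapper idea concrete, here is a minimal userspace sketch of such a bounded retry loop. It is only an illustration: every `demo_` name is hypothetical and merely stands in for the amdgpu reserve/pin/unreserve calls, and `busy_rounds` simulates memory that stays busy for a few attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object; not the amdgpu structure. */
struct demo_bo {
	bool reserved;
	bool pinned;
	int busy_rounds;	/* pin attempts that will still fail with -EBUSY */
};

static int demo_bo_reserve(struct demo_bo *bo)
{
	if (bo->reserved)
		return -EBUSY;
	bo->reserved = true;
	return 0;
}

static void demo_bo_unreserve(struct demo_bo *bo)
{
	assert(bo->reserved);
	bo->reserved = false;
}

static int demo_bo_pin(struct demo_bo *bo)
{
	assert(bo->reserved);	/* pinning requires the reservation */
	if (bo->busy_rounds > 0) {
		bo->busy_rounds--;
		return -EBUSY;
	}
	bo->pinned = true;
	return 0;
}

/*
 * Bounded retry: returns 0 with the BO pinned and still reserved, or the
 * last error with the BO unreserved once all attempts are used up.
 */
int demo_pin_bo_timeout(struct demo_bo *bo, int timeout)
{
	int r = -ETIMEDOUT;

	while (timeout-- > 0) {
		r = demo_bo_reserve(bo);
		if (r)
			continue;
		r = demo_bo_pin(bo);
		if (!r)
			return 0;
		/* Drop the reservation between attempts so others can progress. */
		demo_bo_unreserve(bo);
	}
	return r;
}
```

The loop can only retry blindly. It has no way to tell a buffer that will become free soon from one that never will, which is one reason to treat it as a stopgap at best.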
Regards,
Christian.
On 23.04.19 at 17:09, Zhou, David(ChunMing) wrote:
> >3. If we have a ticket we grab a reference to the first BO on the
> LRU, drop the LRU lock and try to grab the reservation lock with the
> ticket.
>
> The BO on the LRU is already locked by the CS user. Can it be dropped
> here by the DC user? And if the DC user then grabs its lock with the
> ticket, how does CS grab it again?
>
> If you think waiting in TTM has this risk, how about just adding a
> wrapper for the pin function as below?
> amdgpu_get_pin_bo_timeout()
> {
> 	do {
> 		amdgpu_bo_reserve();
> 		r = amdgpu_bo_pin();
>
> 		/* pinned: leave the loop with the BO still reserved */
> 		if (!r)
> 			break;
> 		/* drop the reservation before the next attempt */
> 		amdgpu_bo_unreserve();
> 		timeout--;
>
> 	} while (timeout > 0);
>
> }
>
> -------- Original Message --------
> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
> From: Christian König
> To: "Zhou, David(ChunMing)", "Koenig, Christian", "Liang, Prike",
> dri-devel at lists.freedesktop.org
> CC:
>
> Well, that's not so easy offhand.
>
> The basic problem here is that when you busy wait at this place you
> can easily run into situations where application A busy waits for B
> while B busy waits for A -> deadlock.
>
> So what we need here is the deadlock detection logic of the ww_mutex.
> To use this we at least need to do the following steps:
>
> 1. Reserve the BO in DC using a ww_mutex ticket (trivial).
>
> 2. If we then run into this EBUSY condition in TTM check if the BO we
> need memory for (or rather the ww_mutex of its reservation object) has
> a ticket assigned.
>
> 3. If we have a ticket we grab a reference to the first BO on the LRU,
> drop the LRU lock and try to grab the reservation lock with the ticket.
>
> 4. If getting the reservation lock with the ticket succeeded we check
> if the BO is still the first one on the LRU in question (the BO could
> have moved).
>
> 5. If the BO is still the first one on the LRU in question we try to
> evict it as we would evict any other BO.
>
> 6. If any of the "ifs" above fails we just back off and return -EBUSY.
>
> Steps 2-5 are certainly not trivial, but doable as far as I can see.
>
> Have fun :)
> Christian.
>
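
The six steps listed in the quoted message above can be sketched as a single-threaded userspace model. This is only a rough illustration under stated assumptions, not TTM code: every `toy_` name is hypothetical, the lock is a simplified wait-die style stand-in for a ww_mutex (a smaller stamp is an older ticket), and where a real ww_mutex would sleep the sketch returns -EAGAIN because it cannot block.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a ww_mutex: stores the holder's ticket stamp, 0 = unlocked. */
struct toy_resv {
	unsigned long holder;
};

/* Toy buffer object on a singly linked LRU list. */
struct toy_bo {
	struct toy_resv resv;
	struct toy_bo *next;
	int refcount;
	bool evicted;
};

struct toy_lru {
	struct toy_bo *first;
	bool locked;
};

static void toy_lru_lock(struct toy_lru *lru)
{
	assert(!lru->locked);
	lru->locked = true;
}

static void toy_lru_unlock(struct toy_lru *lru)
{
	assert(lru->locked);
	lru->locked = false;
}

/* Wait-die style attempt: a younger requester backs off with -EDEADLK. */
static int toy_resv_lock(struct toy_resv *resv, unsigned long ticket)
{
	if (resv->holder == 0) {
		resv->holder = ticket;
		return 0;
	}
	if (resv->holder == ticket)
		return -EALREADY;
	/* An older requester would sleep in a real ww_mutex; we cannot. */
	return resv->holder < ticket ? -EDEADLK : -EAGAIN;
}

static void toy_resv_unlock(struct toy_resv *resv)
{
	resv->holder = 0;
}

/* Steps 2-6. Ticket 0 means the caller reserved its BO without a ticket. */
int toy_evict_first_busy(struct toy_lru *lru, unsigned long ticket)
{
	struct toy_bo *bo;
	int ret = -EBUSY;

	if (ticket == 0)			/* step 2: no ticket, back off */
		return -EBUSY;

	toy_lru_lock(lru);			/* step 3: reference the first BO */
	bo = lru->first;
	if (!bo) {
		toy_lru_unlock(lru);
		return -EBUSY;
	}
	bo->refcount++;
	toy_lru_unlock(lru);			/* then drop the LRU lock */

	if (toy_resv_lock(&bo->resv, ticket))	/* step 6: could not lock */
		goto out_put;

	toy_lru_lock(lru);
	if (lru->first == bo) {			/* step 4: still first on the LRU? */
		lru->first = bo->next;		/* step 5: evict it */
		bo->next = NULL;
		bo->evicted = true;
		ret = 0;
	}
	toy_lru_unlock(lru);
	toy_resv_unlock(&bo->resv);
out_put:
	bo->refcount--;
	return ret;
}
```

Every failure path collapses to -EBUSY, matching step 6, so the caller never spins while holding its own locks.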
> On 23.04.19 at 15:19, Zhou, David(ChunMing) wrote:
>> How about adding a further condition on ctx->resv inline to address
>> your concern? Together with not waiting on the same user, that
>> shouldn't lead to deadlock.
>>
>> Otherwise, any other idea?
>>
>> -------- Original Message --------
>> Subject: Re: [PATCH] ttm: wait mem space if user allow while gpu busy
>> From: Christian König
>> To: "Liang, Prike", "Zhou, David(ChunMing)",
>> dri-devel at lists.freedesktop.org
>> CC:
>>
>> Well that is certainly a NAK because it can lead to deadlock in the
>> memory management.
>>
>> You can't just busy wait with all those locks held.
>>
>> Regards,
>> Christian.
>>
>> On 23.04.19 at 03:45, Liang, Prike wrote:
>> > Acked-by: Prike Liang <Prike.Liang at amd.com>
>> >
>> > Thanks,
>> > Prike
>> > -----Original Message-----
>> > From: Chunming Zhou <david1.zhou at amd.com>
>> > Sent: Monday, April 22, 2019 6:39 PM
>> > To: dri-devel at lists.freedesktop.org
>> > Cc: Liang, Prike <Prike.Liang at amd.com>; Zhou, David(ChunMing) <David1.Zhou at amd.com>
>> > Subject: [PATCH] ttm: wait mem space if user allow while gpu busy
>> >
>> > A heavy GPU job could occupy memory for a long time, which could
>> > cause other users to fail to get memory.
>> >
>> > Change-Id: I0b322d98cd76e5ac32b00462bbae8008d76c5e11
>> > Signed-off-by: Chunming Zhou <david1.zhou at amd.com>
>> > ---
>> > drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++--
>> > 1 file changed, 4 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> > index 7c484729f9b2..6c596cc24bec 100644
>> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> > @@ -830,8 +830,10 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>> >  		if (mem->mm_node)
>> >  			break;
>> >  		ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
>> > -		if (unlikely(ret != 0))
>> > -			return ret;
>> > +		if (unlikely(ret != 0)) {
>> > +			if (!ctx || ctx->no_wait_gpu || ret != -EBUSY)
>> > +				return ret;
>> > +		}
>> >  	} while (1);
>> >  	mem->mem_type = mem_type;
>> >  	return ttm_bo_add_move_fence(bo, man, mem);
>> > --
>> > 2.17.1
>> >
>> > _______________________________________________
>> > dri-devel mailing list
>> > dri-devel at lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/dri-devel