[PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
Pan, Xinhui
Xinhui.Pan at amd.com
Fri Apr 10 01:21:19 UTC 2020
> On Apr 9, 2020, at 22:59, Koenig, Christian <Christian.Koenig at amd.com> wrote:
>
>> Why do we break out of the loop when there are pending BOs to be released?
>
> We do this anyway if we can't acquire the necessary locks. Freeing already deleted BOs is just very lazy background work.
That is true. Eviction will reclaim the BO resources too.
>
>> So this patch does not break anything, I think.
>
> Oh, the patch will certainly work. I'm just not sure if it's the ideal behavior.
>
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>>
>> This is another example of the use of cond_resched().
>
> Yes, and that is also a good example of what I mean here:
>
>>     if (!mutex_trylock(&slab_mutex))
>>         /* Give up. Setup the next iteration. */
>>         goto out;
>
> If the function can't acquire the lock immediately, it gives up and waits for the next iteration.
>
> I think it would be better if we did this in TTM as well when we spend too much time cleaning up old BOs.
fair enough.
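Something like this then, on top of the current loop (a rough sketch only; the 10 ms budget and the "deadline" local are made up for illustration):

    static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
    {
        /* Hypothetical time budget; 10 ms is an arbitrary pick. */
        unsigned long deadline = jiffies + msecs_to_jiffies(10);

        /* ... list and lock setup as today ... */
        while (!list_empty(&bdev->ddestroy) &&
               (remove_all || time_before(jiffies, deadline))) {
            /* existing per-BO cleanup body, unchanged */
        }
        /* anything left over is picked up by the next worker cycle */
        /* ... */
    }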
>
> On the other hand you are right that cond_resched() has the advantage that we could spend more time on cleaning up old BOs if there is nothing else for the CPU to do.
>
> Regards,
> Christian.
>
> Am 09.04.20 um 16:24 schrieb Pan, Xinhui:
>> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>>
>> This is another example of the use of cond_resched().
>> From: Pan, Xinhui <Xinhui.Pan at amd.com>
>> Sent: Thursday, April 9, 2020 10:11:08 PM
>> To: Lucas Stach <l.stach at pengutronix.de>; amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>; Koenig, Christian <Christian.Koenig at amd.com>
>> Cc: dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> I think it doesn't matter if the work item schedules out. Even if we did not schedule out, the workqueue itself will schedule out later.
>> So this patch does not break anything, I think.
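>>
>> For reference, process_one_work() in kernel/workqueue.c does roughly the
>> following (paraphrased; the exact code differs between kernel versions):
>>
>>     static void process_one_work(struct worker *worker,
>>                                  struct work_struct *work)
>>     {
>>         /* ... bookkeeping and lockdep annotations ... */
>>
>>         worker->current_func(work);  /* run the work item callback */
>>
>>         /*
>>          * Keep a kworker from hogging the CPU on non-preemptible
>>          * kernels; this also reports an RCU quiescent state.
>>          */
>>         cond_resched();
>>
>>         /* ... more bookkeeping ... */
>>     }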
>> From: Pan, Xinhui <Xinhui.Pan at amd.com>
>> Sent: Thursday, April 9, 2020 10:07:09 PM
>> To: Lucas Stach <l.stach at pengutronix.de>; amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>; Koenig, Christian <Christian.Koenig at amd.com>
>> Cc: dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> Why do we break out of the loop when there are pending BOs to be released?
>>
>> And I just checked process_one_work(). Right after the work item callback is called, the workqueue itself will call cond_resched(). So I think
>> From: Koenig, Christian <Christian.Koenig at amd.com>
>> Sent: Thursday, April 9, 2020 9:38:24 PM
>> To: Lucas Stach <l.stach at pengutronix.de>; Pan, Xinhui <Xinhui.Pan at amd.com>; amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
>> Cc: dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>
>> Subject: Re: [PATCH] drm/ttm: Schedule out if possible in bo delayed delete worker
>>
>> Am 09.04.20 um 15:25 schrieb Lucas Stach:
>> > Am Donnerstag, den 09.04.2020, 14:35 +0200 schrieb Christian König:
>> >> Am 09.04.20 um 03:31 schrieb xinhui pan:
>> >>> The delayed delete list is per device and might be very large. In a
>> >>> heavy workload test, the list may never be empty. That can trigger
>> >>> RCU stall warnings or soft lockups in non-preemptible kernels.
>> >>> Let's schedule out if possible in that case.
>> >> Mhm, I'm not sure if that is actually allowed. This is called from a
>> >> work item and those are not really supposed to be scheduled away.
>> > Huh? Work items can schedule out just fine, otherwise they would be
>> > horribly broken when it comes to sleeping locks.
>>
>> Let me refine the sentence: Work items are not really supposed to be
>> scheduled away on purpose. E.g. you shouldn't call schedule() or
>> cond_resched() as in the case here.
>>
>> Getting scheduled away because we wait for a lock is of course perfectly
>> fine.
>>
>> > The workqueue code
>> > even has measures to keep the workqueues at the expected concurrency
>> > level by starting other work items when one of them goes to sleep.
>>
>> Yeah, and exactly that's what I would say we should avoid here :)
>>
>> In other words work items can be scheduled away, but they should not be
>> unless really necessary (e.g. waiting for a lock).
>>
>> Otherwise, as you said, new threads for work item processing are started
>> up, and I don't think we want that.
>>
>> Just returning from the work item and waiting for the next cycle is most
>> likely the better option.
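>>
>> The delayed work already re-arms itself while the list is non-empty,
>> roughly like this (paraphrased from ttm_bo.c; details may vary by
>> kernel version):
>>
>>     static void ttm_bo_delayed_workqueue(struct work_struct *work)
>>     {
>>         struct ttm_bo_device *bdev =
>>             container_of(work, struct ttm_bo_device, wq.work);
>>
>>         /* not everything freed yet: try again in ~10 ms */
>>         if (!ttm_bo_delayed_delete(bdev, false))
>>             schedule_delayed_work(&bdev->wq,
>>                                   ((HZ / 100) < 1) ? 1 : HZ / 100);
>>     }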
>>
>> Regards,
>> Christian.
>>
>> >
>> > Regards,
>> > Lucas
>> >
>> >> Maybe rather change the while into while (!list_empty(&bdev->ddestroy)
>> >> && !should_reschedule(0)).
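>> >>
>> >> In code, something like this (an untested sketch; "should_reschedule"
>> >> above standing in for the kernel's need_resched() here):
>> >>
>> >>     while (!list_empty(&bdev->ddestroy) && !need_resched()) {
>> >>         /* existing per-BO cleanup, unchanged */
>> >>     }
>> >>     /* leftover BOs stay on ddestroy for the next cycle */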
>> >>
>> >> Christian.
>> >>
>> >>> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
>> >>> ---
>> >>> drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>> >>> 1 file changed, 1 insertion(+)
>> >>>
>> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> >>> index 9e07c3f75156..b8d853cab33b 100644
>> >>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> >>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> >>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
>> >>>                 }
>> >>>
>> >>>                 ttm_bo_put(bo);
>> >>> +               cond_resched();
>> >>>                 spin_lock(&glob->lru_lock);
>> >>>         }
>> >>>         list_splice_tail(&removed, &bdev->ddestroy);
>> >> _______________________________________________
>> >> dri-devel mailing list
>> >> dri-devel at lists.freedesktop.org
>> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>