[PATCH] drm/ttm: Schedule out if possibe in bo delayed delete worker

Christian König christian.koenig at amd.com
Thu Apr 9 14:59:23 UTC 2020


> Why we break out the loops when there are pending bos to be released?

We do this anyway if we can't acquire the necessary locks. Freeing 
already deleted BOs is just a very lazy background work.

> So it did not break anything with this patch I think.

Oh, the patch will certainly work. I'm just not sure if it's the ideal 
behavior.

> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>
> This is another example of the usage of  cond_sched.

Yes, and that is also a good example of what I mean here:

> 	if  (!mutex_trylock 
> <https://elixir.bootlin.com/linux/latest/ident/mutex_trylock>(&slab_mutex 
> <https://elixir.bootlin.com/linux/latest/ident/slab_mutex>))
> 		/* Give up. Setup the next iteration. */
> 		goto  out;

If the function can't acquire the lock immediately it gives up and waits 
for the next iteration.

I think it would be better if we do this in TTM as well if we spend to 
much time cleaning up old BOs.

On the other hand you are right that cond_resched() has the advantage 
that we could spend more time on cleaning up old BOs if there is nothing 
else for the CPU TODO.

Regards,
Christian.

Am 09.04.20 um 16:24 schrieb Pan, Xinhui:
> https://elixir.bootlin.com/linux/latest/source/mm/slab.c#L4026
>
> This is another example of the usage of  cond_sched.
> ------------------------------------------------------------------------
> *From:* Pan, Xinhui <Xinhui.Pan at amd.com>
> *Sent:* Thursday, April 9, 2020 10:11:08 PM
> *To:* Lucas Stach <l.stach at pengutronix.de>; 
> amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>; Koenig, 
> Christian <Christian.Koenig at amd.com>
> *Cc:* dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed 
> delete worker
> I think it doesn't matter if workitem schedule out. Even we did not 
> schedule out, the workqueue itself will schedule out later.
> So it did not break anything with this patch I think.
> ------------------------------------------------------------------------
> *From:* Pan, Xinhui <Xinhui.Pan at amd.com>
> *Sent:* Thursday, April 9, 2020 10:07:09 PM
> *To:* Lucas Stach <l.stach at pengutronix.de>; 
> amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>; Koenig, 
> Christian <Christian.Koenig at amd.com>
> *Cc:* dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed 
> delete worker
> Why we break out the loops when there are pending bos to be released?
>
> And I just checked the process_one_work. Right after the work item 
> callback is called,  the workqueue itself will call cond_resched. So I 
> think
> ------------------------------------------------------------------------
> *From:* Koenig, Christian <Christian.Koenig at amd.com>
> *Sent:* Thursday, April 9, 2020 9:38:24 PM
> *To:* Lucas Stach <l.stach at pengutronix.de>; Pan, Xinhui 
> <Xinhui.Pan at amd.com>; amd-gfx at lists.freedesktop.org 
> <amd-gfx at lists.freedesktop.org>
> *Cc:* dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: Schedule out if possibe in bo delayed 
> delete worker
> Am 09.04.20 um 15:25 schrieb Lucas Stach:
> > Am Donnerstag, den 09.04.2020, 14:35 +0200 schrieb Christian König:
> >> Am 09.04.20 um 03:31 schrieb xinhui pan:
> >>> The delayed delete list is per device which might be very huge. And in
> >>> a heavy workload test, the list might always not be empty. That will
> >>> trigger any RCU stall warnings or softlockups in non-preemptible 
> kernels
> >>> Lets do schedule out if possible in that case.
> >> Mhm, I'm not sure if that is actually allowed. This is called from a
> >> work item and those are not really supposed to be scheduled away.
> > Huh? Workitems can schedule out just fine, otherwise they would be
> > horribly broken when it comes to sleeping locks.
>
> Let me refine the sentence: Work items are not really supposed to be
> scheduled purposely. E.g. you shouldn't call schedule() or
> cond_resched() like in the case here.
>
> Getting scheduled away because we wait for a lock is of course perfectly
> fine.
>
> >   The workqueue code
> > even has measures to keep the workqueues at the expected concurrency
> > level by starting other workitems when one of them goes to sleep.
>
> Yeah, and exactly that's what I would say we should avoid here :)
>
> In other words work items can be scheduled away, but they should not if
> not really necessary (e.g. waiting for a lock).
>
> Otherwise as you said new threads for work item processing are started
> up and I don't think we want that.
>
> Just returning from the work item and waiting for the next cycle is most
> likely the better option.
>
> Regards,
> Christian.
>
> >
> > Regards,
> > Lucas
> >
> >> Maybe rather change the while into while (!list_empty(&bdev->ddestroy)
> >> && !should_reschedule(0)).
> >>
> >> Christian.
> >>
> >>> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
> >>> ---
> >>>    drivers/gpu/drm/ttm/ttm_bo.c | 1 +
> >>>    1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c 
> b/drivers/gpu/drm/ttm/ttm_bo.c
> >>> index 9e07c3f75156..b8d853cab33b 100644
> >>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> >>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> >>> @@ -541,6 +541,7 @@ static bool ttm_bo_delayed_delete(struct 
> ttm_bo_device *bdev, bool remove_all)
> >>>              }
> >>>
> >>>              ttm_bo_put(bo);
> >>> +           cond_resched();
> >>> spin_lock(&glob->lru_lock);
> >>>      }
> >>>      list_splice_tail(&removed, &bdev->ddestroy);
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel at lists.freedesktop.org
> >> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&data=02%7C01%7Cchristian.koenig%40amd.com%7C0a47486676a74702f05408d7dc89839c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220355504145868&sdata=wbRkYBPI6mYuZjKBtQN3AGLDOwqJlWY3XUtwwSiUQHg%3D&reserved=0
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20200409/5580c857/attachment.htm>


More information about the amd-gfx mailing list