Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

Thu Sep 14 13:59:01 UTC 2023

Am 14.09.23 um 15:37 schrieb Felix Kuehling:
>
> Userptr and SVM restore work is scheduled to the system WQ with 
> schedule_delayed_work. See amdgpu_amdkfd_evict_userptr and 
> svm_range_evict. This would need to use queue_delayed_work with the 
> system_freezable_wq.
>
> BO restoration is scheduled with queue_delayed_work on our own 
> kfd_restore_wq that was allocated with alloc_ordered_workqueue. This 
> would need to add the WQ_FREEZABLE flag when we create the wq in 
> kfd_process_create_wq.
>
> There is also evict_process_worker scheduled with 
> schedule_delayed_work, which handles stopping of user mode queues, 
> signaling of eviction fences and scheduling of restore work when BOs 
> are evicted. I think that should not be freezable because it's needed 
> to signal the eviction fences to allow suspend to evict BOs.
>
> To make sure I'm not misunderstanding, I assume that freezing a 
> freezable workqueue flushes work items in progress and prevents 
> execution of more work until it is unfrozen. I assume work items are 
> not frozen in the middle of execution, because that would not solve 
> the problem.
>

I was wondering the exact same thing and to be honest I don't know that 
detail either and of hand can't find any documentation about it.

My suspicion is that a work item can freeze when it calls schedule(), 
e.g. when taking a look or similar.

That would then indeed not work at all and we would need to make sure 
that the work is completed manually somehow.

Regards,
Christian.

> Regards,
>   Felix
>
>
> On 2023-09-14 2:23, Christian König wrote:
>> [putting Harry on BCC, sorry for the noise]
>>
>> Yeah, that is clearly a bug in the KFD.
>>
>> During the second eviction the hw should already be disabled, so we 
>> don't have any SDMA or similar to evict BOs any more and can only 
>> copy them with the CPU.
>>
>> @Felix what workqueue do you guys use for the restore work? I've just 
>> double checked and on the system workqueues you explicitly need to 
>> specify that stuff is freezable. E.g. use system_freezable_wq instead 
>> of system_wq.
>>
>> Alternatively as Xinhui mentioned it might be necessary to flush all 
>> restore work before the first eviction phase or we have the chance 
>> that BOs are moved back into VRAM again.
>>
>> Regards,
>> Christian.
>>
>> Am 14.09.23 um 03:54 schrieb Pan, Xinhui:
>>>
>>> [AMD Official Use Only - General]
>>>
>>>
>>> I just make one debug patch to show busy BO’s alloc-trace when the 
>>> eviction fails in suspend.
>>>
>>> And dmesg log attached.
>>>
>>> Looks like they are just kfd user Bos and locked by evict/restore work.
>>>
>>> So in kfd suspend callback, it really need to flush the 
>>> evict/restore work before HW fini as it do now.
>>>
>>> That is why the first very early eviction fails and the second 
>>> eviction succeed.
>>>
>>> Thanks
>>>
>>> xinhui
>>>
>>> *From:* Pan, Xinhui
>>> *Sent:* Thursday, September 14, 2023 8:02 AM
>>> *To:* Koenig, Christian <Christian.Koenig at amd.com>; Kuehling, Felix 
>>> <Felix.Kuehling at amd.com>; Christian König 
>>> <ckoenig.leichtzumerken at gmail.com>; amd-gfx at lists.freedesktop.org; 
>>> Wentland, Harry <Harry.Wentland at amd.com>
>>> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Fan, Shikang 
>>> <Shikang.Fan at amd.com>
>>> *Subject:* RE: 回复: [PATCH] drm/amdgpu: Ignore first evction failure 
>>> during suspend
>>>
>>> Chris,
>>>
>>> I can dump these busy BOs with their alloc/free stack later today.
>>>
>>> BTW, the two evictions and the kfd suspend are all called before 
>>> hw_fini. IOW, between phase 1 and phase 2. SDMA is turned only in 
>>> phase2. So current code works fine maybe.
>>>
>>> *From:* Koenig, Christian <Christian.Koenig at amd.com>
>>> *Sent:* Wednesday, September 13, 2023 10:29 PM
>>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>; Christian König 
>>> <ckoenig.leichtzumerken at gmail.com>; Pan, Xinhui 
>>> <Xinhui.Pan at amd.com>; amd-gfx at lists.freedesktop.org; Wentland, Harry 
>>> <Harry.Wentland at amd.com>
>>> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Fan, Shikang 
>>> <Shikang.Fan at amd.com>
>>> *Subject:* Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure 
>>> during suspend
>>>
>>> [+Harry]
>>>
>>> Am 13.09.23 um 15:54 schrieb Felix Kuehling:
>>>
>>>     On 2023-09-13 4:07, Christian König wrote:
>>>
>>>         [+Fleix]
>>>
>>>         Well that looks like quite a serious bug.
>>>
>>>         If I'm not completely mistaken the KFD work item tries to
>>>         restore the process by moving BOs into memory even after the
>>>         suspend freeze. Normally work items are frozen together with
>>>         the user space processes unless explicitly marked as not
>>>         freezable.
>>>
>>>         That this causes problem during the first eviction phase is
>>>         just the tip of the iceberg here. If a BO is moved into
>>>         invisible memory during this we wouldn't be able to get it
>>>         out of that in the second phase because SDMA and hw is
>>>         already turned off.
>>>
>>>         @Felix any idea how that can happen? Have you guys marked a
>>>         work item / work queue as not freezable?
>>>
>>>     We don't set anything to non-freezable in KFD.
>>>
>>>     Regards,
>>>       Felix
>>>
>>>         Or maybe the display guys?
>>>
>>>
>>> Do you guys in the display do any delayed update in a work item 
>>> which is marked as not-freezable?
>>>
>>> Otherwise I have absolutely no idea what's going on here.
>>>
>>> Thanks,
>>> Christian.
>>>
>>>
>>>         @Xinhui please investigate what work item that is and where
>>>         that is coming from. Something like "if (adev->in_suspend)
>>>         dump_stack();" in the right place should probably do it.
>>>
>>>         Thanks,
>>>         Christian.
>>>
>>>         Am 13.09.23 um 07:13 schrieb Pan, Xinhui:
>>>
>>>             [AMD Official Use Only - General]
>>>
>>>             I notice that only user space process are frozen on my
>>>             side.  kthread and workqueue keeps running. Maybe some
>>>             kernel configs are not enabled.
>>>
>>>             I made one module which just prints something like i++
>>>             with mutex lock both in workqueue and kthread. I paste
>>>             some logs below.
>>>
>>>             [438619.696196] XH: 14 from workqueue
>>>
>>>             [438619.700193] XH: 15 from kthread
>>>
>>>             [438620.394335] PM: suspend entry (deep)
>>>
>>>             [438620.399619] Filesystems sync: 0.001 seconds
>>>
>>>             [438620.403887] PM: Preparing system for sleep (deep)
>>>
>>>             [438620.409299] Freezing user space processes
>>>
>>>             [438620.414862] Freezing user space processes completed
>>>             (elapsed 0.001 seconds)
>>>
>>>             [438620.421881] OOM killer disabled.
>>>
>>>             [438620.425197] Freezing remaining freezable tasks
>>>
>>>             [438620.430890] Freezing remaining freezable tasks
>>>             completed (elapsed 0.001 seconds)
>>>
>>>             [438620.438348] PM: Suspending system (deep)
>>>
>>>             .....
>>>
>>>             [438623.746038] PM: suspend of devices complete after
>>>             3303.137 msecs
>>>
>>>             [438623.752125] PM: start suspend of devices complete
>>>             after 3309.713 msecs
>>>
>>>             [438623.758722] PM: suspend debug: Waiting for 5 second(s).
>>>
>>>             [438623.792166] XH: 22 from kthread
>>>
>>>             [438623.824140] XH: 23 from workqueue
>>>
>>>             So BOs definitely can be in use during suspend.
>>>
>>>             Even if kthread or workqueue can be stopped with one
>>>             special kernel config. I think suspend can only stop the
>>>             workqueue with its callback finish.
>>>
>>>             otherwise something like below makes things crazy.
>>>
>>>             LOCK BO
>>>
>>>             do something
>>>
>>>             -> schedule or wait, anycode might sleep.  Stopped by
>>>             suspend now? no, i think.
>>>
>>>             UNLOCK BO
>>>
>>>             I do tests  with  cmds below.
>>>
>>>             echo devices  > /sys/power/pm_test
>>>
>>>             echo 0  > /sys/power/pm_async
>>>
>>>             echo 1  > /sys/power/pm_print_times
>>>
>>>             echo 1 > /sys/power/pm_debug_messages
>>>
>>>             echo 1 > /sys/module/amdgpu/parameters/debug_evictions
>>>
>>>             ./kfd.sh --gtest_filter=KFDEvictTest.BasicTest
>>>
>>>             pm-suspend
>>>
>>>             thanks
>>>
>>>             xinhui
>>>
>>>             ------------------------------------------------------------------------
>>>
>>>             *发件人:*Christian König <ckoenig.leichtzumerken at gmail.com>
>>>             <mailto:ckoenig.leichtzumerken at gmail.com>
>>>             *发送时间:*2023年9月12日17:01
>>>             *收件人:*Pan, Xinhui <Xinhui.Pan at amd.com>
>>>             <mailto:Xinhui.Pan at amd.com>;
>>>             amd-gfx at lists.freedesktop.org
>>>             <amd-gfx at lists.freedesktop.org>
>>>             <mailto:amd-gfx at lists.freedesktop.org>
>>>             *抄送:*Deucher, Alexander <Alexander.Deucher at amd.com>
>>>             <mailto:Alexander.Deucher at amd.com>; Koenig, Christian
>>>             <Christian.Koenig at amd.com>
>>>             <mailto:Christian.Koenig at amd.com>; Fan, Shikang
>>>             <Shikang.Fan at amd.com> <mailto:Shikang.Fan at amd.com>
>>>             *主题:*Re: [PATCH] drm/amdgpu: Ignore first evction
>>>             failure during suspend
>>>
>>>             When amdgpu_device_suspend() is called processes should
>>>             be frozen
>>>             already. In other words KFD queues etc... should already
>>>             be idle.
>>>
>>>             So when the eviction fails here we missed something
>>>             previously and that
>>>             in turn can cause tons amount of problems.
>>>
>>>             So ignoring those errors is most likely not a good idea
>>>             at all.
>>>
>>>             Regards,
>>>             Christian.
>>>
>>>             Am 12.09.23 um 02:21 schrieb Pan, Xinhui:
>>>             > [AMD Official Use Only - General]
>>>             >
>>>             > Oh yep, Pinned BO is moved to other LRU list, So
>>>             eviction fails because of other reason.
>>>             > I will change the comments in the patch.
>>>             > The problem is eviction fails as many reasons, say, BO
>>>             is locked.
>>>             > ASAIK, kfd will stop the queues and flush some
>>>             evict/restore work in its suspend callback. SO the first
>>>             eviction before kfd callback likely fails.
>>>             >
>>>             > -----Original Message-----
>>>             > From: Christian König
>>>             <ckoenig.leichtzumerken at gmail.com>
>>>             <mailto:ckoenig.leichtzumerken at gmail.com>
>>>             > Sent: Friday, September 8, 2023 2:49 PM
>>>             > To: Pan, Xinhui <Xinhui.Pan at amd.com>
>>>             <mailto:Xinhui.Pan at amd.com>; amd-gfx at lists.freedesktop.org
>>>             > Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
>>>             <mailto:Alexander.Deucher at amd.com>; Koenig, Christian
>>>             <Christian.Koenig at amd.com>
>>>             <mailto:Christian.Koenig at amd.com>; Fan, Shikang
>>>             <Shikang.Fan at amd.com> <mailto:Shikang.Fan at amd.com>
>>>             > Subject: Re: [PATCH] drm/amdgpu: Ignore first evction
>>>             failure during suspend
>>>             >
>>>             > Am 08.09.23 um 05:39 schrieb xinhui pan:
>>>             >> Some BOs might be pinned. So the first eviction's
>>>             failure will abort
>>>             >> the suspend sequence. These pinned BOs will be
>>>             unpined afterwards
>>>             >> during suspend.
>>>             > That doesn't make much sense since pinned BOs don't
>>>             cause eviction failure here.
>>>             >
>>>             > What exactly is the error code you see?
>>>             >
>>>             > Christian.
>>>             >
>>>             >> Actaully it has evicted most BOs, so that should stil
>>>             work fine in
>>>             >> sriov full access mode.
>>>             >>
>>>             >> Fixes: 47ea20762bb7 ("drm/amdgpu: Add an extra
>>>             evict_resource call
>>>             >> during device_suspend.")
>>>             >> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
>>>             <mailto:xinhui.pan at amd.com>
>>>             >> ---
>>>             >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++----
>>>             >>    1 file changed, 5 insertions(+), 4 deletions(-)
>>>             >>
>>>             >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>             >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>             >> index 5c0e2b766026..39af526cdbbe 100644
>>>             >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>             >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>             >> @@ -4148,10 +4148,11 @@ int
>>>             amdgpu_device_suspend(struct drm_device
>>>             >> *dev, bool fbcon)
>>>             >>
>>>             >>        adev->in_suspend = true;
>>>             >>
>>>             >> -     /* Evict the majority of BOs before grabbing
>>>             the full access */
>>>             >> -     r = amdgpu_device_evict_resources(adev);
>>>             >> -     if (r)
>>>             >> -             return r;
>>>             >> +     /* Try to evict the majority of BOs before
>>>             grabbing the full access
>>>             >> +      * Ignore the ret val at first place as we will
>>>             unpin some BOs if any
>>>             >> +      * afterwards.
>>>             >> +      */
>>>             >> + (void)amdgpu_device_evict_resources(adev);
>>>             >>
>>>             >>        if (amdgpu_sriov_vf(adev)) {
>>>             >> amdgpu_virt_fini_data_exchange(adev);
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20230914/b5485b06/attachment-0001.htm>