Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure during suspend

Thu Sep 14 13:37:54 UTC 2023

Userptr and SVM restore work is scheduled to the system WQ with 
schedule_delayed_work. See amdgpu_amdkfd_evict_userptr and 
svm_range_evict. This would need to use queue_delayed_work with the 
system_freezable_wq.

BO restoration is scheduled with queue_delayed_work on our own 
kfd_restore_wq that was allocated with alloc_ordered_workqueue. This 
would need to add the WQ_FREEZABLE flag when we create the wq in 
kfd_process_create_wq.

There is also evict_process_worker scheduled with schedule_delayed_work, 
which handles stopping of user mode queues, signaling of eviction fences 
and scheduling of restore work when BOs are evicted. I think that should 
not be freezable because it's needed to signal the eviction fences to 
allow suspend to evict BOs.

To make sure I'm not misunderstanding, I assume that freezing a 
freezable workqueue flushes work items in progress and prevents 
execution of more work until it is unfrozen. I assume work items are not 
frozen in the middle of execution, because that would not solve the problem.

Regards,
   Felix

On 2023-09-14 2:23, Christian König wrote:
> [putting Harry on BCC, sorry for the noise]
>
> Yeah, that is clearly a bug in the KFD.
>
> During the second eviction the hw should already be disabled, so we 
> don't have any SDMA or similar to evict BOs any more and can only copy 
> them with the CPU.
>
> @Felix what workqueue do you guys use for the restore work? I've just 
> double checked and on the system workqueues you explicitly need to 
> specify that stuff is freezable. E.g. use system_freezable_wq instead 
> of system_wq.
>
> Alternatively as Xinhui mentioned it might be necessary to flush all 
> restore work before the first eviction phase or we have the chance 
> that BOs are moved back into VRAM again.
>
> Regards,
> Christian.
>
> Am 14.09.23 um 03:54 schrieb Pan, Xinhui:
>>
>> [AMD Official Use Only - General]
>>
>>
>> I just make one debug patch to show busy BO’s alloc-trace when the 
>> eviction fails in suspend.
>>
>> And dmesg log attached.
>>
>> Looks like they are just kfd user Bos and locked by evict/restore work.
>>
>> So in kfd suspend callback, it really need to flush the evict/restore 
>> work before HW fini as it do now.
>>
>> That is why the first very early eviction fails and the second 
>> eviction succeed.
>>
>> Thanks
>>
>> xinhui
>>
>> *From:* Pan, Xinhui
>> *Sent:* Thursday, September 14, 2023 8:02 AM
>> *To:* Koenig, Christian <Christian.Koenig at amd.com>; Kuehling, Felix 
>> <Felix.Kuehling at amd.com>; Christian König 
>> <ckoenig.leichtzumerken at gmail.com>; amd-gfx at lists.freedesktop.org; 
>> Wentland, Harry <Harry.Wentland at amd.com>
>> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Fan, Shikang 
>> <Shikang.Fan at amd.com>
>> *Subject:* RE: 回复: [PATCH] drm/amdgpu: Ignore first evction failure 
>> during suspend
>>
>> Chris,
>>
>> I can dump these busy BOs with their alloc/free stack later today.
>>
>> BTW, the two evictions and the kfd suspend are all called before 
>> hw_fini. IOW, between phase 1 and phase 2. SDMA is turned only in 
>> phase2. So current code works fine maybe.
>>
>> *From:* Koenig, Christian <Christian.Koenig at amd.com>
>> *Sent:* Wednesday, September 13, 2023 10:29 PM
>> *To:* Kuehling, Felix <Felix.Kuehling at amd.com>; Christian König 
>> <ckoenig.leichtzumerken at gmail.com>; Pan, Xinhui <Xinhui.Pan at amd.com>; 
>> amd-gfx at lists.freedesktop.org; Wentland, Harry <Harry.Wentland at amd.com>
>> *Cc:* Deucher, Alexander <Alexander.Deucher at amd.com>; Fan, Shikang 
>> <Shikang.Fan at amd.com>
>> *Subject:* Re: 回复: [PATCH] drm/amdgpu: Ignore first evction failure 
>> during suspend
>>
>> [+Harry]
>>
>> Am 13.09.23 um 15:54 schrieb Felix Kuehling:
>>
>>     On 2023-09-13 4:07, Christian König wrote:
>>
>>         [+Fleix]
>>
>>         Well that looks like quite a serious bug.
>>
>>         If I'm not completely mistaken the KFD work item tries to
>>         restore the process by moving BOs into memory even after the
>>         suspend freeze. Normally work items are frozen together with
>>         the user space processes unless explicitly marked as not
>>         freezable.
>>
>>         That this causes problem during the first eviction phase is
>>         just the tip of the iceberg here. If a BO is moved into
>>         invisible memory during this we wouldn't be able to get it
>>         out of that in the second phase because SDMA and hw is
>>         already turned off.
>>
>>         @Felix any idea how that can happen? Have you guys marked a
>>         work item / work queue as not freezable?
>>
>>     We don't set anything to non-freezable in KFD.
>>
>>     Regards,
>>       Felix
>>
>>         Or maybe the display guys?
>>
>>
>> Do you guys in the display do any delayed update in a work item which 
>> is marked as not-freezable?
>>
>> Otherwise I have absolutely no idea what's going on here.
>>
>> Thanks,
>> Christian.
>>
>>
>>         @Xinhui please investigate what work item that is and where
>>         that is coming from. Something like "if (adev->in_suspend)
>>         dump_stack();" in the right place should probably do it.
>>
>>         Thanks,
>>         Christian.
>>
>>         Am 13.09.23 um 07:13 schrieb Pan, Xinhui:
>>
>>             [AMD Official Use Only - General]
>>
>>             I notice that only user space process are frozen on my
>>             side.  kthread and workqueue  keeps running. Maybe some
>>             kernel configs are not enabled.
>>
>>             I made one module which just prints something like i++
>>             with mutex lock both in workqueue and kthread. I paste
>>             some logs below.
>>
>>             [438619.696196] XH: 14 from workqueue
>>
>>             [438619.700193] XH: 15 from kthread
>>
>>             [438620.394335] PM: suspend entry (deep)
>>
>>             [438620.399619] Filesystems sync: 0.001 seconds
>>
>>             [438620.403887] PM: Preparing system for sleep (deep)
>>
>>             [438620.409299] Freezing user space processes
>>
>>             [438620.414862] Freezing user space processes completed
>>             (elapsed 0.001 seconds)
>>
>>             [438620.421881] OOM killer disabled.
>>
>>             [438620.425197] Freezing remaining freezable tasks
>>
>>             [438620.430890] Freezing remaining freezable tasks
>>             completed (elapsed 0.001 seconds)
>>
>>             [438620.438348] PM: Suspending system (deep)
>>
>>             .....
>>
>>             [438623.746038] PM: suspend of devices complete after
>>             3303.137 msecs
>>
>>             [438623.752125] PM: start suspend of devices complete
>>             after 3309.713 msecs
>>
>>             [438623.758722] PM: suspend debug: Waiting for 5 second(s).
>>
>>             [438623.792166] XH: 22 from kthread
>>
>>             [438623.824140] XH: 23 from workqueue
>>
>>             So BOs definitely can be in use during suspend.
>>
>>             Even if kthread or workqueue can be stopped with one
>>             special kernel config. I think suspend can only stop the
>>             workqueue with its callback finish.
>>
>>             otherwise something like below makes things crazy.
>>
>>             LOCK BO
>>
>>             do something
>>
>>             -> schedule or wait, anycode might sleep. Stopped by
>>             suspend now? no, i think.
>>
>>             UNLOCK BO
>>
>>             I do tests  with  cmds below.
>>
>>             echo devices  > /sys/power/pm_test
>>
>>             echo 0  > /sys/power/pm_async
>>
>>             echo 1  > /sys/power/pm_print_times
>>
>>             echo 1 > /sys/power/pm_debug_messages
>>
>>             echo 1 > /sys/module/amdgpu/parameters/debug_evictions
>>
>>             ./kfd.sh --gtest_filter=KFDEvictTest.BasicTest
>>
>>             pm-suspend
>>
>>             thanks
>>
>>             xinhui
>>
>>             ------------------------------------------------------------------------
>>
>>             *发件人:*Christian König <ckoenig.leichtzumerken at gmail.com>
>>             <mailto:ckoenig.leichtzumerken at gmail.com>
>>             *发送时间:*2023年9月12日17:01
>>             *收件人:*Pan, Xinhui <Xinhui.Pan at amd.com>
>>             <mailto:Xinhui.Pan at amd.com>;
>>             amd-gfx at lists.freedesktop.org
>>             <amd-gfx at lists.freedesktop.org>
>>             <mailto:amd-gfx at lists.freedesktop.org>
>>             *抄送:*Deucher, Alexander <Alexander.Deucher at amd.com>
>>             <mailto:Alexander.Deucher at amd.com>; Koenig, Christian
>>             <Christian.Koenig at amd.com>
>>             <mailto:Christian.Koenig at amd.com>; Fan, Shikang
>>             <Shikang.Fan at amd.com> <mailto:Shikang.Fan at amd.com>
>>             *主题:*Re: [PATCH] drm/amdgpu: Ignore first evction failure
>>             during suspend
>>
>>             When amdgpu_device_suspend() is called processes should
>>             be frozen
>>             already. In other words KFD queues etc... should already
>>             be idle.
>>
>>             So when the eviction fails here we missed something
>>             previously and that
>>             in turn can cause tons amount of problems.
>>
>>             So ignoring those errors is most likely not a good idea
>>             at all.
>>
>>             Regards,
>>             Christian.
>>
>>             Am 12.09.23 um 02:21 schrieb Pan, Xinhui:
>>             > [AMD Official Use Only - General]
>>             >
>>             > Oh yep, Pinned BO is moved to other LRU list, So
>>             eviction fails because of other reason.
>>             > I will change the comments in the patch.
>>             > The problem is eviction fails as many reasons, say, BO
>>             is locked.
>>             > ASAIK, kfd will stop the queues and flush some
>>             evict/restore work in its suspend callback. SO the first
>>             eviction before kfd callback likely fails.
>>             >
>>             > -----Original Message-----
>>             > From: Christian König
>>             <ckoenig.leichtzumerken at gmail.com>
>>             <mailto:ckoenig.leichtzumerken at gmail.com>
>>             > Sent: Friday, September 8, 2023 2:49 PM
>>             > To: Pan, Xinhui <Xinhui.Pan at amd.com>
>>             <mailto:Xinhui.Pan at amd.com>; amd-gfx at lists.freedesktop.org
>>             > Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
>>             <mailto:Alexander.Deucher at amd.com>; Koenig, Christian
>>             <Christian.Koenig at amd.com>
>>             <mailto:Christian.Koenig at amd.com>; Fan, Shikang
>>             <Shikang.Fan at amd.com> <mailto:Shikang.Fan at amd.com>
>>             > Subject: Re: [PATCH] drm/amdgpu: Ignore first evction
>>             failure during suspend
>>             >
>>             > Am 08.09.23 um 05:39 schrieb xinhui pan:
>>             >> Some BOs might be pinned. So the first eviction's
>>             failure will abort
>>             >> the suspend sequence. These pinned BOs will be unpined
>>             afterwards
>>             >> during suspend.
>>             > That doesn't make much sense since pinned BOs don't
>>             cause eviction failure here.
>>             >
>>             > What exactly is the error code you see?
>>             >
>>             > Christian.
>>             >
>>             >> Actaully it has evicted most BOs, so that should stil
>>             work fine in
>>             >> sriov full access mode.
>>             >>
>>             >> Fixes: 47ea20762bb7 ("drm/amdgpu: Add an extra
>>             evict_resource call
>>             >> during device_suspend.")
>>             >> Signed-off-by: xinhui pan <xinhui.pan at amd.com>
>>             <mailto:xinhui.pan at amd.com>
>>             >> ---
>>             >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 9 +++++----
>>             >>    1 file changed, 5 insertions(+), 4 deletions(-)
>>             >>
>>             >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>             >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>             >> index 5c0e2b766026..39af526cdbbe 100644
>>             >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>             >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>             >> @@ -4148,10 +4148,11 @@ int
>>             amdgpu_device_suspend(struct drm_device
>>             >> *dev, bool fbcon)
>>             >>
>>             >>        adev->in_suspend = true;
>>             >>
>>             >> -     /* Evict the majority of BOs before grabbing the
>>             full access */
>>             >> -     r = amdgpu_device_evict_resources(adev);
>>             >> -     if (r)
>>             >> -             return r;
>>             >> +     /* Try to evict the majority of BOs before
>>             grabbing the full access
>>             >> +      * Ignore the ret val at first place as we will
>>             unpin some BOs if any
>>             >> +      * afterwards.
>>             >> +      */
>>             >> + (void)amdgpu_device_evict_resources(adev);
>>             >>
>>             >>        if (amdgpu_sriov_vf(adev)) {
>>             >> amdgpu_virt_fini_data_exchange(adev);
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20230914/82f176cc/attachment-0001.htm>