[PATCH] drm/amdkfd: improve performance with XNACK enable
James Zhu
jamesz at amd.com
Fri May 9 12:45:01 UTC 2025
On 2025-05-09 02:00, Christian König wrote:
> On 5/8/25 19:25, James Zhu wrote:
>> On 2025-05-08 11:20, James Zhu wrote:
>>> On 2025-05-08 10:50, Christian König wrote:
>>>> On 5/8/25 16:46, James Zhu wrote:
>>>>> When XNACK on, hang or low performance is observed with some test cases.
>>>>> The restoring page process has unexpected stuck during evicting/restoring
>>>>> if some bo's flag has KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED setting:
>>>>> 1. when xnack on, retry pagefault will invoke restoring pages process
>>>>> 2. A. if there is enough VRAM space, simply migrating pages from ram to vram
>>>>> B. if there is no enough VRAM space left, searching resource LRU list, and
>>>>> scheduling a new eviction work queue to evict LRU bo from vram to ram
>>>>> first, then resume restoring pages process, or waiting for eviction
>>>>> timeout and try to schedule evicting next LRU bo
>>>>> 3. for case 2B, if bo has KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED setting,
>>>>> queue eviction will be triggered.So restoring work queue will be scheduled.
>>>>> 4. step 1, restoring pages process will hold one mm->mmap_lock's read until
>>>>> restoring pages is completed
>>>>> step 2B, evictiion work queue process will hold one mm->mmap_lock's read
>>>>> until evicting bo is completed
>>>>> step 3, restoring work queue process is trying to acquire one mm->mmap_lock's
>>>>> write after the above two mm->mmap_lock's read are released, and in the
>>>>> meantime which will block all following mm->mmap_lock's read request.
>>>>> 5. in step 2, if the first eviction bo's size is big enough for step 1
>>>>> restoring pages request, everything is fine. if not, which means that the
>>>>> mm->mmap_lock's read step 1 won't be release right the way. In step 3, first
>>>>> eviction bo's restoring work queue will compete for mm->mmap_lock's write,
>>>>> the second and following LRU bo's evictiion work queue will be blocked by
>>>>> tring to acquire mm->mmap_lock's read until timeout. All restoring pages
>>>>> process will be stuck here.
>>>>> Using down_write_trylock to replace mmap_write_lock will help not block the
>>>>> second and following evictiion work queue process.
>>>>>
>>>>> Signed-off-by: James Zhu<James.Zhu at amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 6 +++++-
>>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> index 72be6e152e88..5f6ed70559b7 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> @@ -1794,7 +1794,11 @@ svm_range_list_lock_and_flush_work(struct svm_range_list *svms,
>>>>> {
>>>>> retry_flush_work:
>>>>> flush_work(&svms->deferred_list_work);
>>>>> - mmap_write_lock(mm);
>>>>> + while (true) {
>>>>> + if (down_write_trylock(&(mm->mmap_lock)))
>>>>> + break;
>>>>> + schedule();
>>>>> + }
>>>> Oh, stuff like that is usually an absolutely clear NAK from upstream.
>>>>
>>>> As far as I know that is not something we can do so easily.
>>>>
>>>> Would it be possible to wait for progress instead of calling schedule() here?
>>> [JZ] At 1st beginning, I am thinking adding sync with restoring pages done.
>>>
>>> but the original restoring work design philosophy is blindly scheduled after certain delay.
>>>
>>> the changes with sync may take more time and risk. I would like Felix and Philip give comments
>>>
>>> if there is efficient and safe way to fix it. Thanks!
>> [JZ] BTW, in worse case, mmap_write_lock will fall into rwsem_down_write_slowpath(), schedule_preempt_disabled() and schedule();
> Yeah, but drivers are not allowed to re-implement or even bypass that logic.
[JZ] here can take as a new version mmap_write_lock without blocking
the following read request and competing for write
lock which means read has higher priority under this case. Logically
sync is better way to replace it. In practice, under
current scenery, sync will increase other burdens an risk.
>
> Regards,
> Christian.
>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> if (list_empty(&svms->deferred_range_list))
>>>>> return;
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20250509/e22b4e8c/attachment.htm>
More information about the amd-gfx
mailing list