[PATCH 2/2] drm/amdkfd: change svm range evict
Felix Kuehling
felix.kuehling at amd.com
Thu Jun 30 16:09:00 UTC 2022
On 2022-06-30 11:19, Eric Huang wrote:
>
> On 2022-06-29 19:29, Felix Kuehling wrote:
>> On 2022-06-29 18:53, Eric Huang wrote:
>>>
>>>
>>> On 2022-06-29 18:20, Felix Kuehling wrote:
>>>> On 2022-06-28 17:43, Eric Huang wrote:
>>>>> Two changes:
>>>>> 1. Reduce unnecessary evict/unmap when the range is not mapped to the GPU.
>>>>> 2. Always evict when the always_mapped flag is set.
>>>>>
>>>>> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 ++++++++--
>>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> index 4bf2f75f853b..76e817687ef9 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>> @@ -1767,12 +1767,16 @@ svm_range_evict(struct svm_range *prange, struct mm_struct *mm,
>>>>> struct kfd_process *p;
>>>>> int r = 0;
>>>>> + if (!prange->mapped_to_gpu)
>>>>> + return 0;
>>>>
>>>> This feels like an unrelated optimization that should be in a
>>>> separate patch.
>>>>
>>>> But I'm not sure this is correct, because it doesn't consider child
>>>> ranges. svm_range_unmap_from_gpus already contains this check, so
>>>> ranges should not be unmapped unnecessarily either way. Is there
>>>> any other benefit to this change that I'm missing?
>>> I will send another patch separately that considers child ranges.
>>
>> I think this should only be done in the XNACK-off case. For XNACK-on
>> it's already handled in svm_range_unmap_from_gpus.
> Yes, and it is also done when KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED is set.
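Roughly what I had in mind, as a sketch only (the svm_range_is_mapped helper
is made up here, it is not existing code): keep the early return on the paths
that actually preempt queues, and let it look at child ranges too:

	/* Sketch: skip the queue eviction only when neither the parent range
	 * nor any child range is mapped to a GPU. */
	static bool svm_range_is_mapped(struct svm_range *prange)
	{
		struct svm_range *pchild;

		if (prange->mapped_to_gpu)
			return true;
		list_for_each_entry(pchild, &prange->child_list, child_list)
			if (pchild->mapped_to_gpu)
				return true;
		return false;
	}

	/* in svm_range_evict(), inside the XNACK-off/always-mapped branch: */
	if (!p->xnack_enabled ||
	    (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) {
		if (!svm_range_is_mapped(prange))
			return 0;
		/* ... evict queues as before ... */
	}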
>>
>>
>>> The benefit is that it reduces unnecessary queue evictions when
>>> allocating context save memory, which is not mapped to the GPU.
>>
>> That sounds wrong. The context save area should never be unmapped
>> from GPU. That's the whole point of setting the
>> KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag. I guess this is happening
>> while migrating the context save area to VRAM for the first time,
>> even before it's mapped to GPU?
> Yes. It happens the first time, when registering the svm range and
> migrating it to VRAM are done together; at that moment the range is not
> yet mapped to the GPU.
>>
>> I think there may be another eviction, when the CWSR header is
>> initialized by the CPU. That would also migrate it back to system
>> memory. To avoid that, you should probably register the context save
>> area only after the header has been initialized.
> Yes. I am using that approach. Please look at patch 4/4.
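Right. Just to spell out the ordering I meant, as a sketch with made-up
helper names (your patch 4/4 has the real change):

	/* Sketch only: write the CWSR header with the CPU first, then
	 * register the range, so the later registration/migration does not
	 * trigger an extra eviction to pull the page back to system memory. */
	init_cwsr_header(ctx_save_cpu_va);		/* hypothetical helper */
	register_svm_range(ctx_save_cpu_va, ctx_save_size,
			   KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED);	/* hypothetical helper */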
>>
>> I think avoiding an eviction when memory is migrated at first
>> registration is worthwhile, not just for CWSR.
>>
>>
>>> It is for efficiency reasons. On the other hand, without this
>>> optimization KFDCWSRTest.InterruptRestore fails with a queue
>>> preemption error.
>>
>> What do you mean by "queue preemption error"? Does HWS hang?
> HWS doesn't hang immediately, so there is no fence-timeout error ("The
> cp might be in an unrecoverable state due to an unsuccessful queues
> preemption"). The error is "HIQ MQD's queue_doorbell_id0 is not 0, Queue
> preemption time out" after checking the mqd manager, which means HWS
> abandoned the unmap-queue request without returning a timeout error to
> the driver. After this error, the following test fails at queue creation
> because HWS hangs.
OK, that sounds like the kind of bug the InterruptRestore test is meant
to catch. I think you just created a better test by causing more
preemptions. ;)
So we should do two things:
* Avoid unnecessary preemptions in KFD
* Improve the test to reproduce this hang even without unnecessary
preemptions in KFD, so we can investigate the issue
Regards,
Felix
>>
>>
>>> I think the reason is that the extra queue evictions make HWS too busy
>>> to preempt existing queues. There is one unmap_queue packet sent to HWS
>>> in the current code, and there will be three unmap_queue packets with
>>> unified memory allocation.
>>
>> When queues of a process are already evicted, they should not get
>> evicted again. That's handled by the qpd->evicted counter. There
>> should never be multiple unmap_queues packets in flight at the same
>> time. If you're seeing three unmap_queues, you should also see queues
>> restored three times.
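To spell out the counting I mean, a rough sketch (not the exact KFD code;
the helper names here are made up):

	/* Sketch: only the 0 -> 1 transition sends an unmap_queues packet and
	 * only the 1 -> 0 transition restores, so nested evictions never put
	 * a second unmap_queues in flight. */
	static int process_evict_queues(struct qcm_process_device *qpd)
	{
		if (qpd->evicted++ > 0)		/* already evicted, just count */
			return 0;
		return send_unmap_queues(qpd);	/* hypothetical helper */
	}

	static int process_restore_queues(struct qcm_process_device *qpd)
	{
		if (--qpd->evicted > 0)		/* still evicted by another caller */
			return 0;
		return map_queues(qpd);		/* hypothetical helper */
	}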
>>
>> HWS should never be too busy to evict queues. If you're seeing
>> preemptions fail, you may have found a bug.
> The restore delay worker behaves differently in terms of timing. It
> could restore queues before the next unmap_queues, so the situation is
> too complicated to debug in a multiple-queue evict/restore environment.
> The error definitely means there is a bug; from the driver's point of
> view nothing is wrong even with the extra queue eviction, so I try to
> avoid the extra eviction and keep the behavior as before. The bottom
> line is to make sure a unified svm range for the context save area
> doesn't cause any failure in kfdtest, so I can theoretically assume the
> extra queue eviction/restoring causes the HWS failure.
>
> Regards,
> Eric
>>
>> Regards,
>> Felix
>>
>>
>>> So this optimization will keep only one unmap_queue as before.
>>>
>>> Regards,
>>> Eric
>>>>
>>>> Regards,
>>>> Felix
>>>>
>>>>
>>>>> +
>>>>> p = container_of(svms, struct kfd_process, svms);
>>>>> pr_debug("invalidate svms 0x%p prange [0x%lx 0x%lx] [0x%lx
>>>>> 0x%lx]\n",
>>>>> svms, prange->start, prange->last, start, last);
>>>>> - if (!p->xnack_enabled) {
>>>>> + if (!p->xnack_enabled ||
>>>>> + (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) {
>>>>> int evicted_ranges;
>>>>> list_for_each_entry(pchild, &prange->child_list, child_list) {
>>>>> @@ -3321,7 +3325,9 @@ svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
>>>>> if (r)
>>>>> goto out_unlock_range;
>>>>> - if (migrated && !p->xnack_enabled) {
>>>>> + if (migrated && (!p->xnack_enabled ||
>>>>> + (prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)) &&
>>>>> + prange->mapped_to_gpu) {
>>>>> pr_debug("restore_work will update mappings of
>>>>> GPUs\n");
>>>>> mutex_unlock(&prange->migrate_mutex);
>>>>> continue;
>>>
>