[PATCH] drm/amdgpu: Unmap CSA under SR-IOV in KFD path

Christian König ckoenig.leichtzumerken at gmail.com
Tue Apr 30 10:31:33 UTC 2019


Am 30.04.19 um 12:28 schrieb Huang, Trigger:
> Hi Christian,
>
> Thanks for the quick response.
>
> One thing I want to confirm: per my understanding, the CSA is only used by CP preemption on the KCQ and KGQ paths, so can we delete the CSA on the KFD user mode queue path?
> If yes, then maybe we can split this topic into two patches:
> 1, the original one, which unmaps the CSA in the KFD path, is still needed

I won't go down that route anyway; we should probably move the CSA 
mapping into the CTX handling completely.

> 2, Add a new patch that uses the new method for VM checking.

That should actually be the first patch, because this check is 
certainly incorrect.

Christian.

>
> Thanks & Best Wishes,
> Trigger Huang
>
> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig at amd.com>
> Sent: Tuesday, April 30, 2019 5:59 PM
> To: Huang, Trigger <Trigger.Huang at amd.com>; Kuehling, Felix <Felix.Kuehling at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: Unmap CSA under SR-IOV in KFD path
>
> Am 30.04.19 um 11:53 schrieb Huang, Trigger:
>> Hi Christian & Felix,
>>
>> Thanks for the information.
>>
> But actually, the CSA is currently mapped only in amdgpu_driver_open_kms under SR-IOV.
> So would you give more information about 'Well that is exactly what we already do here'? Where can I find its implementation?
> Well, the plan was to move this to the point when the actual execution context is created. But Rex left the company, so it could be that the patches for this were never committed.
>
>> On the other hand, I will try the method 'Instead of checking if some page tables are already filled we check if some mapping is already made'
> Yeah, that should certainly work as well. Just check all entries of the
> root PD in a for loop to see if any subsequent PDs are allocated, and
> abort if you find one.
>
> Christian.
>
>> Thanks & Best Wishes,
>> Trigger Huang
>>
>> -----Original Message-----
>> From: Christian König <ckoenig.leichtzumerken at gmail.com>
>> Sent: Tuesday, April 30, 2019 5:23 PM
>> To: Kuehling, Felix <Felix.Kuehling at amd.com>; Huang, Trigger <Trigger.Huang at amd.com>; amd-gfx at lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amdgpu: Unmap CSA under SR-IOV in KFD path
>>
>> [CAUTION: External Email]
>>
>> Well, that is exactly what we already do here. The only problem is that we do the wrong check in amdgpu_vm_make_compute().
>>
>> Instead of checking if some page tables are already filled we check if some mapping is already made.
>>
>> Regards,
>> Christian.
>>
>> Am 30.04.19 um 01:34 schrieb Kuehling, Felix:
>>> I remember a past discussion to change the CSA allocation/mapping
>>> scheme to avoid this issue in the first place. Can adding the CSA to
>>> the VM be delayed a little to a point after the VM gets converted to a compute VM?
>>> Maybe the first command submission?
>>>
>>> Regards,
>>>       Felix
>>>
>>> On 2019-04-28 6:25 a.m., Trigger Huang wrote:
>>>> In the amdgpu open path, the CSA will be mapped in the VM, so when
>>>> opening KFD, calling amdgpu_vm_make_compute will fail because it finds
>>>> that this VM is not a clean VM (it already has mappings); as a result,
>>>> creating the process VM object fails.
>>>>
>>>> The fix is to unmap the CSA; the CSA is actually not needed for the
>>>> compute VF world switch.
>>>>
>>>> Signed-off-by: Trigger Huang <Trigger.Huang at amd.com>
>>>> ---
>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 ++++++++++
>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c          |  2 +-
>>>>      2 files changed, 11 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>> index 697b8ef..e0bc457 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>>> @@ -956,6 +956,16 @@ int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct kgd_dev *kgd,
>>>>        if (avm->process_info)
>>>>                return -EINVAL;
>>>>
>>>> +    /* Delete the CSA mapping to make sure this VM is a clean VM
>>>> +     * before converting it to a compute VM
>>>> +     */
>>>> +    if (amdgpu_sriov_vf(adev) && drv_priv->csa_va) {
>>>> +            amdgpu_bo_reserve(adev->virt.csa_obj, true);
>>>> +            amdgpu_vm_bo_rmv(adev, drv_priv->csa_va);
>>>> +            drv_priv->csa_va = NULL;
>>>> +            amdgpu_bo_unreserve(adev->virt.csa_obj);
>>>> +    }
>>>> +
>>>>        /* Convert VM into a compute VM */
>>>>        ret = amdgpu_vm_make_compute(adev, avm, pasid);
>>>>        if (ret)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>> index da7b4fe..361c2e5 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>> @@ -1069,7 +1069,7 @@ void amdgpu_driver_postclose_kms(struct
>>>> drm_device *dev,
>>>>
>>>>        amdgpu_vm_bo_rmv(adev, fpriv->prt_va);
>>>>
>>>> -    if (amdgpu_sriov_vf(adev)) {
>>>> +    if (amdgpu_sriov_vf(adev) && fpriv->csa_va) {
>>>>                /* TODO: how to handle reserve failure */
>>>>                BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, true));
>>>>                amdgpu_vm_bo_rmv(adev, fpriv->csa_va);
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


