[Patch v1 5/5] drm/amdkfd: refactor runtime pm for baco

Thu Jan 30 19:01:53 UTC 2020

Hello Felix,

Thanks for your time to review and for your feedback.

On 1/29/2020 5:52 PM, Felix Kuehling wrote:
> HI Rajneesh,
>
> See comments inline ...
>
> And a general question: Why do you need to set the autosuspend_delay 
> in so many places? Amdgpu only has a single call to this function 
> during initialization.

We don't. I have called this out in cover letter since i too feel its 
not the best way to do what we want to do. I have seen weird issues on 
dual GPUs with KFDTest that sometimes results in system hang and gpu 
hang etc.

Even if i try with running kfdtest on just one node in a multi gpu 
system, the baco exit seems to wake up all gpus and then one goes to 
auto suspend again while other performs the desired kfdtest operation. I 
have never seen any issue with a single gpu card. So in current approch, 
i am making sure that the GPUs are not quickly auto-suspended while the 
kfd operations are ongoing but once the kfdtest is finished, the runtime 
suspend call with ensure to reset the timeout back to 5 seconds.

I would like to avoid this and appreciate some pointers where i can put 
get_sync calls while unbinding. I have tried a number of places but saw 
some issues. So any help will be highly appreciated, to identify the 
proper logical inverse of kfd_bind_process_to_device.

>
> On 2020-01-27 20:29, Rajneesh Bhardwaj wrote:
>> So far the kfd driver implemented same routines for runtime and system
>> wide suspend and resume (s2idle or mem). During system wide suspend the
>> kfd aquires an atomic lock that prevents any more user processes to
>> create queues and interact with kfd driver and amd gpu. This mechanism
>> created problem when amdgpu device is runtime suspended with BACO
>> enabled. Any application that relies on kfd driver fails to load because
>> the driver reports a locked kfd device since gpu is runtime suspended.
>>
>> However, in an ideal case, when gpu is runtime  suspended the kfd driver
>> should be able to:
>>
>>   - auto resume amdgpu driver whenever a client requests compute service
>>   - prevent runtime suspend for amdgpu  while kfd is in use
>>
>> This change refactors the amdgpu and amdkfd drivers to support BACO and
>> runtime runtime power management.
>>
>> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 12 ++++-----
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |  8 +++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 +--
>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c    | 31 +++++++++++++---------
>>   drivers/gpu/drm/amd/amdkfd/kfd_iommu.c     |  5 +++-
>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c   | 19 ++++++++++---
>>   6 files changed, 51 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 8609287620ea..314c4a2a0354 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -178,18 +178,18 @@ void amdgpu_amdkfd_interrupt(struct 
>> amdgpu_device *adev,
>>           kgd2kfd_interrupt(adev->kfd.dev, ih_ring_entry);
>>   }
>>   -void amdgpu_amdkfd_suspend(struct amdgpu_device *adev)
>> +void amdgpu_amdkfd_suspend(struct amdgpu_device *adev, bool run_pm)
>>   {
>>       if (adev->kfd.dev)
>> -        kgd2kfd_suspend(adev->kfd.dev);
>> +        kgd2kfd_suspend(adev->kfd.dev, run_pm);
>>   }
>>   -int amdgpu_amdkfd_resume(struct amdgpu_device *adev)
>> +int amdgpu_amdkfd_resume(struct amdgpu_device *adev, bool run_pm)
>>   {
>>       int r = 0;
>>         if (adev->kfd.dev)
>> -        r = kgd2kfd_resume(adev->kfd.dev);
>> +        r = kgd2kfd_resume(adev->kfd.dev, run_pm);
>>         return r;
>>   }
>> @@ -713,11 +713,11 @@ void kgd2kfd_exit(void)
>>   {
>>   }
>>   -void kgd2kfd_suspend(struct kfd_dev *kfd)
>> +void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm)
>>   {
>>   }
>>   -int kgd2kfd_resume(struct kfd_dev *kfd)
>> +int kgd2kfd_resume(struct kfd_dev *kfd, bool run_pm)
>>   {
>>       return 0;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> index 0fa898d30207..3dce4a51e522 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> @@ -123,8 +123,8 @@ struct amdkfd_process_info {
>>   int amdgpu_amdkfd_init(void);
>>   void amdgpu_amdkfd_fini(void);
>>   -void amdgpu_amdkfd_suspend(struct amdgpu_device *adev);
>> -int amdgpu_amdkfd_resume(struct amdgpu_device *adev);
>> +void amdgpu_amdkfd_suspend(struct amdgpu_device *adev, bool run_pm);
>> +int amdgpu_amdkfd_resume(struct amdgpu_device *adev, bool run_pm);
>>   void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
>>               const void *ih_ring_entry);
>>   void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
>> @@ -250,8 +250,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>                struct drm_device *ddev,
>>                const struct kgd2kfd_shared_resources *gpu_resources);
>>   void kgd2kfd_device_exit(struct kfd_dev *kfd);
>> -void kgd2kfd_suspend(struct kfd_dev *kfd);
>> -int kgd2kfd_resume(struct kfd_dev *kfd);
>> +void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm);
>> +int kgd2kfd_resume(struct kfd_dev *kfd, bool run_pm);
>>   int kgd2kfd_pre_reset(struct kfd_dev *kfd);
>>   int kgd2kfd_post_reset(struct kfd_dev *kfd);
>>   void kgd2kfd_interrupt(struct kfd_dev *kfd, const void 
>> *ih_ring_entry);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index d9e5eac182d3..787d49e9f4de 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3314,7 +3314,7 @@ int amdgpu_device_suspend(struct drm_device 
>> *dev, bool fbcon)
>>           }
>>       }
>>   -    amdgpu_amdkfd_suspend(adev);
>> +    amdgpu_amdkfd_suspend(adev, fbcon);
>
> The logic seems inverted here. There are four different pmops 
> callbacks that use different values for this parameter:
>
> * amdgpu_pmops_suspend: true
> * amdgpu_pmops_freeze: true
> * amdgpu_pmops_poweroff: true
> * amdgpu_pmops_runtime_suspend: false
>
> It looks like runtime_suspend uses false, so you should pass the 
> run_pm parameter as !fbcon.

Ok. will fix in v2.

>
>
>>         amdgpu_ras_suspend(adev);
>>   @@ -3398,7 +3398,7 @@ int amdgpu_device_resume(struct drm_device 
>> *dev, bool fbcon)
>>               }
>>           }
>>       }
>> -    r = amdgpu_amdkfd_resume(adev);
>> +    r = amdgpu_amdkfd_resume(adev, fbcon);
>>       if (r)
>>           return r;
>>   diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 2a9e40131735..e9f00c3a067a 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -23,6 +23,7 @@
>>   #include <linux/bsearch.h>
>>   #include <linux/pci.h>
>>   #include <linux/slab.h>
>> +#include <linux/pm_runtime.h>
>>   #include "kfd_priv.h"
>>   #include "kfd_device_queue_manager.h"
>>   #include "kfd_pm4_headers_vi.h"
>> @@ -710,7 +711,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>   void kgd2kfd_device_exit(struct kfd_dev *kfd)
>>   {
>>       if (kfd->init_complete) {
>> -        kgd2kfd_suspend(kfd);
>> +        kgd2kfd_suspend(kfd, true);
>>           device_queue_manager_uninit(kfd->dqm);
>>           kfd_interrupt_exit(kfd);
>>           kfd_topology_remove_device(kfd);
>> @@ -731,7 +732,7 @@ int kgd2kfd_pre_reset(struct kfd_dev *kfd)
>>         kfd->dqm->ops.pre_reset(kfd->dqm);
>>   -    kgd2kfd_suspend(kfd);
>> +    kgd2kfd_suspend(kfd, true);
>
> This should use false. This is not runtime PM.
>
>
>>         kfd_signal_reset_event(kfd);
>>       return 0;
>> @@ -765,21 +766,24 @@ bool kfd_is_locked(void)
>>       return  (atomic_read(&kfd_locked) > 0);
>>   }
>>   -void kgd2kfd_suspend(struct kfd_dev *kfd)
>> +void kgd2kfd_suspend(struct kfd_dev *kfd, bool run_pm)
>>   {
>>       if (!kfd->init_complete)
>>           return;
>>   -    /* For first KFD device suspend all the KFD processes */
>> -    if (atomic_inc_return(&kfd_locked) == 1)
>> -        kfd_suspend_all_processes();
>> +    /* for runtime suspend, skip locking kfd */
>> +    if (run_pm) {
>
> This should be done for the non-runtime PM case: if (!run_pm).
>
>
>> +        /* For first KFD device suspend all the KFD processes */
>> +        if (atomic_inc_return(&kfd_locked) == 1)
>> +            kfd_suspend_all_processes();
>> +    }
>>   +    pm_runtime_set_autosuspend_delay(kfd->ddev->dev, 5000);
>>       kfd->dqm->ops.stop(kfd->dqm);
>> -
>>       kfd_iommu_suspend(kfd);
>>   }
>>   -int kgd2kfd_resume(struct kfd_dev *kfd)
>> +int kgd2kfd_resume(struct kfd_dev *kfd, bool run_pm)
>>   {
>>       int ret, count;
>>   @@ -790,10 +794,13 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>       if (ret)
>>           return ret;
>>   -    count = atomic_dec_return(&kfd_locked);
>> -    WARN_ONCE(count < 0, "KFD suspend / resume ref. error");
>> -    if (count == 0)
>> -        ret = kfd_resume_all_processes();
>> +    /* for runtime resume, skip unlocking kfd */
>> +    if (run_pm) {
>
> Same as above.
>
>
>> +        count = atomic_dec_return(&kfd_locked);
>> +        WARN_ONCE(count < 0, "KFD suspend / resume ref. error");
>> +        if (count == 0)
>> +            ret = kfd_resume_all_processes();
>> +    }
>>         return ret;
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
>> index 8d871514671e..6301d77ed3d6 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/slab.h>
>>   #include <linux/pci.h>
>>   #include <linux/amd-iommu.h>
>> +#include <linux/pm_runtime.h>
>>   #include "kfd_priv.h"
>>   #include "kfd_dbgmgr.h"
>>   #include "kfd_topology.h"
>> @@ -134,8 +135,10 @@ void kfd_iommu_unbind_process(struct kfd_process 
>> *p)
>>       struct kfd_process_device *pdd;
>>         list_for_each_entry(pdd, &p->per_device_data, per_device_list)
>> -        if (pdd->bound == PDD_BOUND)
>> +        if (pdd->bound == PDD_BOUND) {
>>               amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>> + pm_runtime_set_autosuspend_delay(pdd->dev->ddev->dev, 5000);
>
> I don't think this is the right place. This code should only run on 
> APUs with IOMMUv2 enabled. Probably missing a check at the start of 
> the function. On kernels with IOMMU disabled, this source file is not 
> compiled at all.
>
> I think this code should go into kfd_process_destroy_pdds.

Ok. will fix in v2. So can we consider this as logical inverse of 
bind_process_to_device?

>
>
>> +        }
>>   }
>>     /* Callback for process shutdown invoked by the IOMMU driver */
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index 25b90f70aecd..d19d5e97405c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -31,6 +31,7 @@
>>   #include <linux/compat.h>
>>   #include <linux/mman.h>
>>   #include <linux/file.h>
>> +#include <linux/pm_runtime.h>
>>   #include "amdgpu_amdkfd.h"
>>   #include "amdgpu.h"
>>   @@ -843,15 +844,27 @@ struct kfd_process_device 
>> *kfd_bind_process_to_device(struct kfd_dev *dev,
>>           return ERR_PTR(-ENOMEM);
>>       }
>>   +    err = pm_runtime_get_sync(dev->ddev->dev);
>> +    pm_runtime_set_autosuspend_delay(dev->ddev->dev, 60000);
>
> Why are you using a different hard-coded delay (60s?) here?

It needs to be fixed. Explained above the reason for such deviation.

>
> Regards,
>   Felix
>

Thanks
Rajneesh

>> +    if (err < 0)
>> +        return ERR_PTR(err);
>> +
>>       err = kfd_iommu_bind_process_to_device(pdd);
>>       if (err)
>> -        return ERR_PTR(err);
>> +        goto out;
>>         err = kfd_process_device_init_vm(pdd, NULL);
>>       if (err)
>> -        return ERR_PTR(err);
>> +        goto out;
>>   -    return pdd;
>> +out:
>> +    pm_runtime_mark_last_busy(dev->ddev->dev);
>> +    pm_runtime_put_autosuspend(dev->ddev->dev);
>> +
>> +    if (!err)
>> +        return pdd;
>> +    else
>> +        return ERR_PTR(err);
>>   }
>>     struct kfd_process_device *kfd_get_first_process_device_data(