[PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute

Tue Sep 27 14:48:47 UTC 2022

On 2022-09-27 at 02:12, Christian König wrote:
> On 26.09.22 at 23:40, Shashank Sharma wrote:
>> This patch switches the GPU workload mode to/from
>> compute mode, while submitting compute workload.
>>
>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>> Signed-off-by: Shashank Sharma <shashank.sharma at amd.com>
>
> Feel free to add my acked-by, but Felix should probably take a look as 
> well.

This looks OK purely from a compute perspective. But I'm concerned about 
the interaction of compute with graphics or multiple graphics contexts 
submitting work concurrently. They would constantly override or disable 
each other's workload hints.

For example, you have an amdgpu_ctx with 
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD 
process that also wants the compute profile. Those could be different 
processes belonging to different users. Say, KFD enables the compute 
profile first. Then the graphics context submits a job. At the start of 
the job, the compute profile is enabled. That's a no-op because KFD 
already enabled the compute profile. When the job finishes, it disables 
the compute profile for everyone, including KFD. That's unexpected.
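
To make that sequence concrete, here is a toy model of it. This is only a
sketch and not the amdgpu code: the struct and the toy_* functions are
hypothetical names, and the single on/off flag is an assumption about how a
set/clear pair without ownership tracking would behave.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one on/off flag per device, with no record of who set it. */
struct toy_device {
	bool compute_profile_enabled;
};

static void toy_set_compute_profile(struct toy_device *dev)
{
	dev->compute_profile_enabled = true;
}

static void toy_clear_compute_profile(struct toy_device *dev)
{
	dev->compute_profile_enabled = false;
}

/*
 * Replays the KFD-then-graphics sequence and returns whether the compute
 * profile is still enabled at the end, while KFD is still active.
 */
static bool toy_replay_kfd_vs_gfx(void)
{
	struct toy_device dev = { .compute_profile_enabled = false };

	toy_set_compute_profile(&dev);   /* KFD process becomes active */
	toy_set_compute_profile(&dev);   /* graphics job starts: a no-op */
	assert(dev.compute_profile_enabled);

	toy_clear_compute_profile(&dev); /* graphics job finishes */

	/* KFD never asked for the profile to be dropped, but it is gone. */
	return dev.compute_profile_enabled;
}
```

In this model the replay returns false: the last clear wins regardless of who
else still needs the profile.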

Or you have multiple VCN contexts. When context1 finishes a job, it 
disables the VIDEO profile. But context2 still has a job on the other 
VCN engine and wants the VIDEO profile to still be enabled.
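
One possible way to avoid this, sketched under assumptions: count the users of
each profile and only drop it when the last user is done. Everything below is
hypothetical (the enum, the struct and the toy_profile_* helpers are not
existing amdgpu interfaces), and a real implementation would also need locking
and a policy for several different profiles being requested at once.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical profile IDs, for illustration only. */
enum toy_profile {
	TOY_PROFILE_COMPUTE,
	TOY_PROFILE_VIDEO,
	TOY_PROFILE_COUNT
};

/* One user count per profile instead of a single on/off flag. */
struct toy_profile_state {
	unsigned int users[TOY_PROFILE_COUNT];
};

/* Called when a job that wants profile p starts. */
static void toy_profile_get(struct toy_profile_state *s, enum toy_profile p)
{
	s->users[p]++;
}

/* Called when that job finishes; must be paired with an earlier get. */
static void toy_profile_put(struct toy_profile_state *s, enum toy_profile p)
{
	assert(s->users[p] > 0);
	s->users[p]--;
}

/* The profile stays enabled as long as at least one user remains. */
static bool toy_profile_active(const struct toy_profile_state *s,
			       enum toy_profile p)
{
	return s->users[p] > 0;
}

/* Replays the two-VCN-context case from the paragraph above. */
static bool toy_replay_two_vcn_contexts(void)
{
	struct toy_profile_state s = { { 0 } };

	toy_profile_get(&s, TOY_PROFILE_VIDEO); /* context1 job starts */
	toy_profile_get(&s, TOY_PROFILE_VIDEO); /* context2 job starts */
	toy_profile_put(&s, TOY_PROFILE_VIDEO); /* context1 job finishes */

	/* context2 is still running, so VIDEO should remain active. */
	return toy_profile_active(&s, TOY_PROFILE_VIDEO);
}
```

With the counts in place, context1 finishing leaves the VIDEO profile active
for context2, and the same scheme would cover the KFD case above.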

Regards,
   Felix

>
> Christian.
>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++---
>>   1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 5e53a5293935..1caed319a448 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -34,6 +34,7 @@
>>   #include "amdgpu_ras.h"
>>   #include "amdgpu_umc.h"
>>   #include "amdgpu_reset.h"
>> +#include "amdgpu_ctx_workload.h"
>>     /* Total memory size in system memory and all GPU VRAM. Used to
>>    * estimate worst case amount of memory to reserve for page tables
>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev,
>>     void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
>>   {
>> -    amdgpu_dpm_switch_power_profile(adev,
>> -                    PP_SMC_POWER_PROFILE_COMPUTE,
>> -                    !idle);
>> +    int ret;
>> +
>> +    if (idle)
>> +        ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
>> +    else
>> +        ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE);
>> +
>> +    if (ret)
>> +        drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n",
>> +             idle ? "reset" : "set");
>>   }
>>     bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
>