[PATCH] drm/amd/amdgpu: vm entities should have kernel priority

Mon Jul 19 08:24:53 UTC 2021

On 19.07.21 at 07:57, Jingwen Chen wrote:
> [Why]
> Current vm_pte entities have NORMAL priority. In the SRIOV multi-VF
> use case, the VF FLR happens first and the job timeout is found
> afterwards. Several jobs can time out within a very small time slice.
> If the timeout of an innocent sdma job is found before that of the
> real bad job, the innocent sdma job is marked guilty because it only
> has NORMAL priority. This leads to a page fault after the job is
> resubmitted.
>
> [How]
> sdma should always have KERNEL priority. The kernel job will always
> be resubmitted.

I'm not sure that is a good idea. We intentionally didn't give the
page table updates kernel priority, to avoid clashing with the move jobs.

Christian.
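
For readers following the quoted [Why] section: as far as I understand
the DRM scheduler's timeout handling, a job from a KERNEL priority
entity does not have its karma raised, while any other job does and can
end up marked guilty once it exceeds the hang limit. The sketch below is
only a toy model of that rule, not the kernel code. The names
toy_priority, toy_job and toy_mark_timed_out are hypothetical and exist
purely for illustration.

```c
#include <stdbool.h>

/* Toy stand-ins for illustration; these are NOT the kernel's types. */
enum toy_priority {
	TOY_PRIORITY_NORMAL,
	TOY_PRIORITY_KERNEL,
};

struct toy_job {
	enum toy_priority priority;
	int karma;	/* number of timeouts blamed on this job */
};

/*
 * Model of the blame rule on timeout (hypothetical helper):
 * KERNEL priority jobs are never blamed, every other job gets its
 * karma bumped and counts as guilty once karma exceeds hang_limit.
 * Returns true if the job is considered guilty afterwards.
 */
static bool toy_mark_timed_out(struct toy_job *job, int hang_limit)
{
	if (job->priority == TOY_PRIORITY_KERNEL)
		return false;

	job->karma++;
	return job->karma > hang_limit;
}
```

With a hang limit of 0, a single timeout is enough to blame a NORMAL
priority job in this model, which matches the failure described in the
patch. It says nothing about the scheduling side effect raised in the
reply above, namely that KERNEL priority page table updates would then
compete with the move jobs.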

>
> Signed-off-by: Jingwen Chen <Jingwen.Chen2 at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 358316d6a38c..f7526b67cc5d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2923,13 +2923,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>   	INIT_LIST_HEAD(&vm->done);
>   
>   	/* create scheduler entities for page table updates */
> -	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_NORMAL,
> +	r = drm_sched_entity_init(&vm->immediate, DRM_SCHED_PRIORITY_KERNEL,
>   				  adev->vm_manager.vm_pte_scheds,
>   				  adev->vm_manager.vm_pte_num_scheds, NULL);
>   	if (r)
>   		return r;
>   
> -	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_NORMAL,
> +	r = drm_sched_entity_init(&vm->delayed, DRM_SCHED_PRIORITY_KERNEL,
>   				  adev->vm_manager.vm_pte_scheds,
>   				  adev->vm_manager.vm_pte_num_scheds, NULL);
>   	if (r)