[PATCH] drm/amdkfd: Fix CRIU restore op due to doorbell offset

Felix Kuehling felix.kuehling at amd.com
Wed Sep 7 19:01:43 UTC 2022


On 2022-09-07 14:36, Rajneesh Bhardwaj wrote:
> Recently introduced change to allocate doorbells only when the first
> queue is created or mapped for CPU / GPU access, did not consider
> Checkpoint Restore scenario completely. This fix allows the CRIU restore
> operation by extedning the doorbell optimization to CRIU restore
> scenario.

Typo: "extedning" should be "extending".

A few more nit-picks inline.


>
> Fixes: 'commit 15bcfbc55b57 ("drm/amdkfd: Allocate doorbells only when needed")'
>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj at amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c               | 8 ++++++++
>   drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c              | 4 +++-
>   drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 9 +++++++++
>   3 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 84da1a9ce37c..c476993e3927 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -2153,6 +2153,13 @@ static int criu_restore_devices(struct kfd_process *p,
>   			ret = PTR_ERR(pdd);
>   			goto exit;
>   		}
> +
> +		if (!pdd->doorbell_index &&
> +			kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
> +			pr_err("Failed to allocate process doorbells\n");
> +			ret = -ENOMEM;
> +			goto err_alloc_doorbells;
> +	}

Incorrect indentation. The closing brace should be indented one more 
tab. And the if condition would be more readable if the second line were 
aligned under the !pdd->doorbell_index.

You don't need a new err_alloc_doorbells label. Just goto exit.
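
To illustrate the shape I have in mind, here is a minimal userspace sketch, 
not the actual driver code. The mock_* names, the test knob and the malloc'd 
buffer are hypothetical stand-ins for the kernel structures; only the 
control flow (aligned condition, single exit label) is the point.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the per-device data; not the real KFD struct. */
struct mock_pdd {
	unsigned int doorbell_index;	/* 0 means "not allocated yet" */
};

static int mock_alloc_should_fail;	/* test knob, purely illustrative */

/* Stand-in for kfd_alloc_process_doorbells(): returns < 0 on failure. */
static int mock_alloc_process_doorbells(unsigned int *doorbell_index)
{
	if (mock_alloc_should_fail)
		return -ENOMEM;
	*doorbell_index = 1;
	return 0;
}

/* Restore path with the review comments applied. */
static int mock_restore_device(struct mock_pdd *pdd)
{
	int ret = 0;
	char *device_buckets = malloc(16);	/* resource released at the shared label */

	if (!device_buckets)
		return -ENOMEM;

	/* Second line of the condition aligned under !pdd->doorbell_index */
	if (!pdd->doorbell_index &&
	    mock_alloc_process_doorbells(&pdd->doorbell_index) < 0) {
		ret = -ENOMEM;
		goto exit;	/* reuse the existing label, no err_alloc_doorbells */
	}

	/* ... remaining restore work would go here ... */

exit:
	free(device_buckets);
	return ret;
}
```

Since both labels would point at the same cleanup code, a second label adds 
nothing.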


>   	}
>   
>   	/*
> @@ -2161,6 +2168,7 @@ static int criu_restore_devices(struct kfd_process *p,
>   	 */
>   	*priv_offset += args->num_devices * sizeof(*device_privs);
>   
> +err_alloc_doorbells:
>   exit:
>   	kfree(device_buckets);
>   	return ret;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index b33798f89ef0..7690514c4eb3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -157,8 +157,10 @@ int kfd_doorbell_mmap(struct kfd_dev *dev, struct kfd_process *process,
>   
>   	/* Calculate physical address of doorbell */
>   	address = kfd_get_process_doorbells(pdd);
> -	if (!address)
> +	if (!address) {
> +		pr_err("Failed to  get physical address of process doorbell\n");

Please use the same error message as above for consistency. Or better 
yet, move the error printing into kfd_alloc_process_doorbells, so you 
don't have to duplicate the same error message in multiple places.
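
Roughly what I mean, as a userspace sketch under assumptions: the mock_* 
functions and the message counter are hypothetical, and fprintf stands in 
for pr_err. The allocator reports its own failure, so the callers only 
propagate the error code.

```c
#include <errno.h>
#include <stdio.h>

static int mock_alloc_should_fail;	/* test knob, purely illustrative */
static int mock_error_messages;		/* counts printed messages, for illustration */

/* Stand-in for kfd_alloc_process_doorbells() that logs its own failure. */
static int mock_alloc_process_doorbells(unsigned int *doorbell_index)
{
	if (mock_alloc_should_fail) {
		/* Single place for the message; pr_err in the kernel */
		fprintf(stderr, "Failed to allocate process doorbells\n");
		mock_error_messages++;
		return -ENOMEM;
	}
	*doorbell_index = 1;
	return 0;
}

/* First hypothetical caller: no error string of its own. */
static int mock_restore_devices(unsigned int *doorbell_index)
{
	if (!*doorbell_index &&
	    mock_alloc_process_doorbells(doorbell_index) < 0)
		return -ENOMEM;
	return 0;
}

/* Second hypothetical caller: same pattern, same message for free. */
static int mock_restore_queue(unsigned int *doorbell_index)
{
	if (!*doorbell_index &&
	    mock_alloc_process_doorbells(doorbell_index) < 0)
		return -ENOMEM;
	return 0;
}
```

With this layout the wording cannot drift between call sites.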


>   		return -ENOMEM;
> +	}
>   	vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE |
>   				VM_DONTDUMP | VM_PFNMAP;
>   
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 6e3e7f54381b..9f05f64c5af8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -857,6 +857,14 @@ int kfd_criu_restore_queue(struct kfd_process *p,
>   		ret = -EINVAL;
>   		goto exit;
>   	}
> +
> +	if (!pdd->doorbell_index &&
> +	    kfd_alloc_process_doorbells(pdd->dev, &pdd->doorbell_index) < 0) {
> +		pr_err("Failed to alloc process doorbells\n");

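Same as above: please use a consistent error message, or move the error 
printing into kfd_alloc_process_doorbells.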


> +		ret = -ENOMEM;
> +		goto err_alloc_doorbells;

You don't need a new err_alloc_doorbells label. Just goto exit.


> +	}
> +
>   	/* data stored in this order: mqd, ctl_stack */
>   	mqd = q_extra_data;
>   	ctl_stack = mqd + q_data->mqd_size;
> @@ -876,6 +884,7 @@ int kfd_criu_restore_queue(struct kfd_process *p,
>   	if (q_data->gws)
>   		ret = pqm_set_gws(&p->pqm, q_data->q_id, pdd->dev->gws);
>   
> +err_alloc_doorbells:
>   exit:
>   	if (ret)
>   		pr_err("Failed to restore queue (%d)\n", ret);

