[PATCH 3/3] drm/amdgpu: add AMDGPURESET uevent on AMD GPU reset

Christian König christian.koenig at amd.com
Tue Jan 11 11:26:01 UTC 2022


Am 11.01.22 um 08:12 schrieb Somalapuram Amaranath:
> AMDGPURESET uevent added to notify userspace,
> collect dump_stack and amdgpu_reset_reg_dumps
>
> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/nv.c | 31 +++++++++++++++++++++++++++++++
>   1 file changed, 31 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
> index 2ec1ffb36b1f..41a2c37e825f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> @@ -529,10 +529,41 @@ nv_asic_reset_method(struct amdgpu_device *adev)
>   	}
>   }
>   
> +/**
> + * drm_sysfs_reset_event - generate a DRM uevent
> + * @dev: DRM device
> + *
> + * Send a uevent for the DRM device specified by @dev.  Currently we only
> + * set AMDGPURESET=1 in the uevent environment, but this could be expanded to
> + * deal with other types of events.
> + *
> + * Any new uapi should be using the drm_sysfs_connector_status_event()
> + * for uevents on connector status change.
> + */
> +void drm_sysfs_reset_event(struct drm_device *dev)
> +{
> +	char *event_string = "AMDGPURESET=1";
> +	char *envp[2] = { event_string, NULL };
> +
> +	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);

That won't work like this.

kobject_uevent_env() needs to allocate memory to send the event to 
userspace and that is not allowed while we do an reset. The Intel guys 
felt into the same trap already.

What we could maybe do is to teach kobject_uevent_env() gfp flags and 
make all allocations from the atomic pool.

Regards,
Christian.

> +}
> +
> +void amdgpu_reset_dumps(struct amdgpu_device *adev)
> +{
> +	struct drm_device *ddev = adev_to_drm(adev);
> +	/* original raven doesn't have full asic reset */
> +	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
> +		!(adev->apu_flags & AMD_APU_IS_RAVEN2))
> +		return;
> +	drm_sysfs_reset_event(ddev);
> +	dump_stack();
> +}
> +
>   static int nv_asic_reset(struct amdgpu_device *adev)
>   {
>   	int ret = 0;
>   
> +	amdgpu_reset_dumps(adev);
>   	switch (nv_asic_reset_method(adev)) {
>   	case AMD_RESET_METHOD_PCI:
>   		dev_info(adev->dev, "PCI reset\n");



More information about the amd-gfx mailing list