[PATCH 2/2] drm/amdgpu: add AMDGPURESET uevent on AMD GPU reset
Lazar, Lijo
lijo.lazar at amd.com
Mon Jan 17 11:54:32 UTC 2022
On 1/17/2022 12:03 PM, Somalapuram Amaranath wrote:
> AMDGPURESET uevent added to notify userspace, collect dump_stack and trace
>
> Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/nv.c | 45 +++++++++++++++++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
> index 2ec1ffb36b1f..b73147ae41fb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> @@ -529,10 +529,55 @@ nv_asic_reset_method(struct amdgpu_device *adev)
> }
> }
>
> +/**
> + * drm_sysfs_reset_event - generate a DRM uevent
> + * @dev: DRM device
> + *
> + * Send a uevent for the DRM device specified by @dev. Currently we only
> + * set AMDGPURESET=1 in the uevent environment, but this could be expanded to
> + * deal with other types of events.
> + *
> + * Any new uapi should be using the drm_sysfs_connector_status_event()
> + * for uevents on connector status change.
> + */
> +void drm_sysfs_reset_event(struct drm_device *dev)
> +{
> +	char *event_string = "AMDGPURESET=1";
> +	char *envp[2] = { event_string, NULL };
> +
> +	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> +}
> +
> +void amdgpu_reset_dumps(struct amdgpu_device *adev)
> +{
> +	struct drm_device *ddev = adev_to_drm(adev);
> +	int r = 0, i;
> +
> +	/* original raven doesn't have full asic reset */
> +	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
> +	    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
> +		return;
> +	for (i = 0; i < adev->num_ip_blocks; i++) {
> +		if (!adev->ip_blocks[i].status.valid)
> +			continue;
> +		if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
> +			continue;
> +		r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
> +
> +		if (r)
> +			DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
> +				  adev->ip_blocks[i].version->funcs->name, r);
> +	}
> +
> +	drm_sysfs_reset_event(ddev);
> +	dump_stack();
> +}
> +
> static int nv_asic_reset(struct amdgpu_device *adev)
> {
> int ret = 0;
>
> + amdgpu_reset_dumps(adev);
Alex recently added a patch that resets the GPU on suspend. Sending an
event doesn't make sense in that case; I guess the original intention
was to cover only GPU recovery related cases.
Thanks,
Lijo
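
To illustrate the point above, one possible approach is to skip the
dump and the uevent when the reset is issued as part of suspend. The
following is only a userspace model of that idea, not driver code:
struct mock_adev, its in_suspend and events_sent fields, and the
mock_* functions are all hypothetical stand-ins invented for this
sketch, and the real driver may need a different condition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for driver state; this is NOT struct amdgpu_device. */
struct mock_adev {
	bool in_suspend;	/* assumed flag: reset was issued from the suspend path */
	int events_sent;	/* counts the uevents that would have been emitted */
};

/* Stand-in for drm_sysfs_reset_event(): only records that an event fired. */
static void mock_reset_event(struct mock_adev *adev)
{
	adev->events_sent++;
}

/* Models amdgpu_reset_dumps(), gated so suspend resets stay silent. */
static void mock_reset_dumps(struct mock_adev *adev)
{
	if (adev->in_suspend)
		return;
	mock_reset_event(adev);
}

/* Models nv_asic_reset(): the reset itself proceeds in both cases. */
static int mock_asic_reset(struct mock_adev *adev)
{
	mock_reset_dumps(adev);
	return 0;
}
```

With this shape a suspend-time reset still returns normally but leaves
events_sent untouched, while a recovery reset bumps it by one.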
> switch (nv_asic_reset_method(adev)) {
> case AMD_RESET_METHOD_PCI:
> dev_info(adev->dev, "PCI reset\n");
>
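
Since the patch's stated purpose is to notify userspace, here is a
rough sketch of how a listener might recognise the event. It assumes
the payload has already been read (for example from a kobject uevent
netlink socket) into a buffer of back-to-back NUL-terminated strings,
which is the layout the kernel uses for uevent environments. The
helper name uevent_has_entry is made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the uevent payload contains the exact entry `want`
 * (for example "AMDGPURESET=1"). `buf` holds `len` bytes made up of
 * NUL-terminated strings laid out back to back.
 */
static bool uevent_has_entry(const char *buf, size_t len, const char *want)
{
	size_t want_len = strlen(want);
	size_t pos = 0;

	while (pos < len) {
		/* Bound the scan so a missing final NUL cannot overrun. */
		const char *end = memchr(buf + pos, '\0', len - pos);
		size_t n = end ? (size_t)(end - (buf + pos)) : len - pos;

		if (n == want_len && memcmp(buf + pos, want, n) == 0)
			return true;
		pos += n + 1;	/* skip the entry and its terminator */
	}
	return false;
}
```

A listener would call uevent_has_entry(buf, len, "AMDGPURESET=1") on
each received message and trigger its dump or trace collection only on
a match.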