[PATCH] drm/amdgpu: Add recovery_lock to save bad pages function
Zhou1, Tao
Tao.Zhou1 at amd.com
Tue Nov 16 08:26:57 UTC 2021
[AMD Official Use Only]
> -----Original Message-----
> From: Li, Candice <Candice.Li at amd.com>
> Sent: Tuesday, November 16, 2021 4:02 PM
> To: amd-gfx at lists.freedesktop.org
> Cc: Clements, John <John.Clements at amd.com>; Zhou1, Tao
> <Tao.Zhou1 at amd.com>; Li, Candice <Candice.Li at amd.com>
> Subject: [PATCH] drm/amdgpu: Add recovery_lock to save bad pages function
>
> Fix race condition failure during UMC UE injection.
>
> Signed-off-by: Candice Li <candice.li at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 08133de21fdd63..711b5fb26d47d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -1931,10 +1931,12 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
> struct ras_err_handler_data *data;
> struct amdgpu_ras_eeprom_control *control;
> int save_count;
> + int ret = 0;
>
> if (!con || !con->eh_data)
> return 0;
>
> + mutex_lock(&con->recovery_lock);
> control = &con->eeprom_control;
> data = con->eh_data;
> save_count = data->count - control->ras_num_recs;

[Tao] Since recovery_lock is dedicated to protecting eh_data, can we unlock it here?

> @@ -1944,13 +1946,16 @@ int amdgpu_ras_save_bad_pages(struct amdgpu_device *adev)
> &data->bps[control->ras_num_recs],
> save_count)) {
> dev_err(adev->dev, "Failed to save EEPROM table data!");
> - return -EIO;
> + ret = -EIO;
> + goto out;
> }
>
> dev_info(adev->dev, "Saved %d pages to EEPROM table.\n",
> save_count);
> }
>
> - return 0;
> +out:
> + mutex_unlock(&con->recovery_lock);
> + return ret;
> }
>
> /*
> --
> 2.17.1
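
To illustrate the narrower lock scope raised in the inline comment, here is a minimal user-space sketch using pthreads. It is not the driver code: struct bad_page_list, struct eeprom_state, eeprom_append() and save_bad_pages() are hypothetical stand-ins for eh_data, eeprom_control and the EEPROM append path. One point it makes visible is that the append still reads the bad-page array, so unlocking early is only safe if the pending records are copied out while the lock is held. The sketch also assumes that only one thread ever updates the EEPROM state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_RECS 16

/* Hypothetical stand-in for the driver's eh_data. */
struct bad_page_list {
	pthread_mutex_t lock;	/* plays the role of recovery_lock */
	int bps[MAX_RECS];	/* plays the role of eh_data->bps */
	int count;		/* plays the role of eh_data->count */
};

/* Hypothetical stand-in for eeprom_control; assumed single-writer. */
struct eeprom_state {
	int recs[MAX_RECS];
	int num_recs;		/* plays the role of ras_num_recs */
};

/* Hypothetical slow write; returns 0 on success, -1 on failure. */
static int eeprom_append(struct eeprom_state *ee, const int *recs, int n)
{
	if (n < 0 || ee->num_recs + n > MAX_RECS)
		return -1;
	memcpy(&ee->recs[ee->num_recs], recs, (size_t)n * sizeof(*recs));
	ee->num_recs += n;
	return 0;
}

static int save_bad_pages(struct bad_page_list *list, struct eeprom_state *ee)
{
	int pending[MAX_RECS];
	int save_count;

	/* Hold the lock only while reading the shared list. */
	pthread_mutex_lock(&list->lock);
	assert(list->count <= MAX_RECS);
	save_count = list->count - ee->num_recs;
	if (save_count > 0)
		memcpy(pending, &list->bps[ee->num_recs],
		       (size_t)save_count * sizeof(*pending));
	pthread_mutex_unlock(&list->lock);

	if (save_count <= 0)
		return 0;

	/* The slow write uses the private copy, so no lock is held here. */
	if (eeprom_append(ee, pending, save_count)) {
		fprintf(stderr, "Failed to save EEPROM table data!\n");
		return -1;
	}

	printf("Saved %d pages to EEPROM table.\n", save_count);
	return 0;
}
```

The trade-off is an extra copy of the pending records in exchange for not holding the mutex across the slow write. The patch as posted takes the other route and keeps the lock held until the single exit label.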