[Intel-xe] [PATCH v1 06/12] drm/xe: Rename XE_RESET_FAILED_UEVENT to XE_RESET_REQUIRED_UEVENT.

Welty, Brian brian.welty at intel.com
Fri Oct 20 18:46:57 UTC 2023



On 10/19/2023 9:55 PM, Himal Prasad Ghimiray wrote:
> DEVICE_STATUS=NEEDS_RESET will be used for other reasons apart from
> gt reset failure. Hence use more generic uevent name and provide reason
> for reset along with the uevent.

Looks good to me.
Reviewed-by: Brian Welty <brian.welty at intel.com>

But as I mentioned in the other email, this looks unsafe if two GT resets
happen (and fail) concurrently: the uevent could be overwritten or corrupted.
That can be fixed separately from this patch.
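
For reference, here is a minimal sketch of how a userspace consumer might
interpret the environment this patch emits. It assumes the consumer already
has the uevent as a NULL-terminated array of KEY=VALUE strings. The struct
and function names are hypothetical and not part of the patch; only the
keys and values (DEVICE_STATUS=NEEDS_RESET, REASON=GT_RESET_FAILED, TILE_ID,
GT_ID) come from the diff below.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed view of one uevent environment (not from the patch). */
struct xe_reset_info {
	int needs_reset;     /* 1 if DEVICE_STATUS=NEEDS_RESET was seen */
	const char *reason;  /* value of REASON=, or NULL if absent */
	int tile_id;         /* value of TILE_ID=, or -1 if absent */
	int gt_id;           /* value of GT_ID=, or -1 if absent */
};

/* Return the value part of "key=value" if entry starts with key, else NULL. */
static const char *env_value(const char *entry, const char *key)
{
	size_t n = strlen(key);

	if (strncmp(entry, key, n) == 0 && entry[n] == '=')
		return entry + n + 1;
	return NULL;
}

/* Walk a NULL-terminated KEY=VALUE array and collect the reset fields. */
static struct xe_reset_info xe_parse_reset_uevent(const char *const envp[])
{
	struct xe_reset_info info = { 0, NULL, -1, -1 };
	const char *v;
	size_t i;

	for (i = 0; envp[i]; i++) {
		if ((v = env_value(envp[i], "DEVICE_STATUS")))
			info.needs_reset = strcmp(v, "NEEDS_RESET") == 0;
		else if ((v = env_value(envp[i], "REASON")))
			info.reason = v;
		else if ((v = env_value(envp[i], "TILE_ID")))
			sscanf(v, "%d", &info.tile_id);
		else if ((v = env_value(envp[i], "GT_ID")))
			sscanf(v, "%d", &info.gt_id);
	}
	return info;
}
```

A consumer would check needs_reset first and then branch on reason, since
per the new doc comment the extra fields depend on the reason given.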

> 
> Cc: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
> Cc: Brian Welty <brian.welty at intel.com>
> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> ---
>   drivers/gpu/drm/xe/xe_gt.c | 13 +++++++------
>   include/uapi/drm/xe_drm.h  | 17 ++++++++++++-----
>   2 files changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> index 74e1f47bd401..91e0a9a7f1cd 100644
> --- a/drivers/gpu/drm/xe/xe_gt.c
> +++ b/drivers/gpu/drm/xe/xe_gt.c
> @@ -545,16 +545,17 @@ static int do_gt_restart(struct xe_gt *gt)
>   
>   static void xe_uevent_gt_reset_failure(struct pci_dev *pdev, u8 tile_id, u8 gt_id)
>   {
> -	char *reset_event[4];
> +	char *reset_event[5];
>   
> -	reset_event[0] = XE_RESET_FAILED_UEVENT "=NEEDS_RESET";
> -	reset_event[1] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
> -	reset_event[2] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
> -	reset_event[3] = NULL;
> +	reset_event[0] = XE_RESET_REQUIRED_UEVENT;
> +	reset_event[1] = XE_RESET_REQUIRED_UEVENT_REASON_GT;
> +	reset_event[2] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
> +	reset_event[3] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
> +	reset_event[4] = NULL;
>   	kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, reset_event);
>   
> -	kfree(reset_event[1]);
>   	kfree(reset_event[2]);
> +	kfree(reset_event[3]);
>   }
>   
>   static int gt_reset(struct xe_gt *gt)
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index 24bf8f0f52e8..ae1b1c7528d5 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -19,12 +19,19 @@ extern "C" {
>   /**
>    * DOC: uevent generated by xe on it's pci node.
>    *
> - * XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset gt
> - * fails. The value supplied with the event is always "NEEDS_RESET".
> - * Additional information supplied is tile id and gt id of the gt unit for
> - * which reset has failed.
> + * XE_RESET_REQUIRED_UEVENT - Event is generated when device needs reset.
> + * The REASON is provided along with the event for which reset is required.
> + * On the basis of REASONS, additional information might be supplied.
>    */
> -#define XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
> +#define XE_RESET_REQUIRED_UEVENT        "DEVICE_STATUS=NEEDS_RESET"
> +
> +/**
> + * XE_RESET_REQUIRED_UEVENT_REASON_GT - Reason provided to XE_RESET_REQUIRED_UEVENT
> + * in case of gt reset failure. The additional information supplied is tile id and
> + * gt id of the gt unit for which reset has failed.
> + */
> +#define XE_RESET_REQUIRED_UEVENT_REASON_GT    "REASON=GT_RESET_FAILED"
> +
>   
>   /**
>    * struct xe_user_extension - Base class for defining a chain of extensions

