[Intel-xe] [PATCH v1 06/12] drm/xe: Rename XE_RESET_FAILED_UEVENT to XE_RESET_REQUIRED_UEVENT.
Francois Dugast
francois.dugast at intel.com
Thu Dec 14 22:07:04 UTC 2023
On Sat, Oct 21, 2023 at 11:22:31AM +0000, Ghimiray, Himal Prasad wrote:
>
>
> ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
> From: Welty, Brian <brian.welty at intel.com>
> Sent: Saturday, October 21, 2023 12:17:01 am
> To: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>;
> intel-xe at lists.freedesktop.org <intel-xe at lists.freedesktop.org>
> Cc: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
> Subject: Re: [PATCH v1 06/12] drm/xe: Rename XE_RESET_FAILED_UEVENT to
> XE_RESET_REQUIRED_UEVENT.
>
>
>
> On 10/19/2023 9:55 PM, Himal Prasad Ghimiray wrote:
> > DEVICE_STATUS=NEEDS_RESET will be used for other reasons apart from
> > gt reset failure. Hence use more generic uevent name and provide reason
> > for reset along with the uevent.
>
> Looks good to me.
> Reviewed-by: Brian Welty <brian.welty at intel.com>
>
> But as I mention in other email, this looks unsafe if 2 GT resets are
> happening (and fail) concurrently. uevent will be overwritten/corrupted.
> But can fix separate from this patch.
>
> Hi Brian,
>
> Thanks for the review and input.
> Will work on the input in separate patch.
>
> BR
> Himal Ghimiray
Hi,
This patch modifies the uAPI, which we are trying to clean up and finalize. What is the plan
for this series? Would it be possible to extract a minimal patch with the uAPI change only?
Thanks,
Francois
>
> >
> > Cc: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
> > Cc: Brian Welty <brian.welty at intel.com>
> > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt.c | 13 +++++++------
> > include/uapi/drm/xe_drm.h | 17 ++++++++++++-----
> > 2 files changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > index 74e1f47bd401..91e0a9a7f1cd 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.c
> > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > @@ -545,16 +545,17 @@ static int do_gt_restart(struct xe_gt *gt)
> >
> > static void xe_uevent_gt_reset_failure(struct pci_dev *pdev, u8 tile_id, u8 gt_id)
> > {
> > - char *reset_event[4];
> > + char *reset_event[5];
> >
> > - reset_event[0] = XE_RESET_FAILED_UEVENT "=NEEDS_RESET";
> > - reset_event[1] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
> > - reset_event[2] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
> > - reset_event[3] = NULL;
> > + reset_event[0] = XE_RESET_REQUIRED_UEVENT;
> > + reset_event[1] = XE_RESET_REQUIRED_UEVENT_REASON_GT;
> > + reset_event[2] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
> > + reset_event[3] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
> > + reset_event[4] = NULL;
> > kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, reset_event);
> >
> > - kfree(reset_event[1]);
> > kfree(reset_event[2]);
> > + kfree(reset_event[3]);
> > }
> >
> > static int gt_reset(struct xe_gt *gt)
> > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > index 24bf8f0f52e8..ae1b1c7528d5 100644
> > --- a/include/uapi/drm/xe_drm.h
> > +++ b/include/uapi/drm/xe_drm.h
> > @@ -19,12 +19,19 @@ extern "C" {
> > /**
> > * DOC: uevent generated by xe on it's pci node.
> > *
> > - * XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset gt
> > - * fails. The value supplied with the event is always "NEEDS_RESET".
> > - * Additional information supplied is tile id and gt id of the gt unit for
> > - * which reset has failed.
> > + * XE_RESET_REQUIRED_UEVENT - Event is generated when device needs reset.
> > + * The REASON is provided along with the event for which reset is required.
> > + * On the basis of REASONS, additional information might be supplied.
> > */
> > -#define XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
> > +#define XE_RESET_REQUIRED_UEVENT "DEVICE_STATUS=NEEDS_RESET"
> > +
> > +/**
> > + * XE_RESET_REQUIRED_UEVENT_REASON_GT - Reason provided to XE_RESET_REQUIRED_UEVENT
> > + * incase of gt reset failure. The additional information supplied is tile id and
> > + * gt id of the gt unit for which reset has failed.
> > + */
> > +#define XE_RESET_REQUIRED_UEVENT_REASON_GT "REASON=GT_RESET_FAILED"
> > +
> >
> > /**
> > * struct xe_user_extension - Base class for defining a chain of extensions
>
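
For reference while reviewing the uAPI change, here is a rough userspace
sketch of how a consumer might match the strings proposed above. It assumes
the uevent payload arrives as consecutive NUL-terminated "KEY=VALUE" strings
(as on a kobject uevent netlink socket) and skips anything it does not
recognize. The struct xe_reset_event and xe_parse_reset_uevent() names are
hypothetical and not part of the patch; the two macros are copied from the
proposed xe_drm.h so the sketch builds on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Copied from the proposed include/uapi/drm/xe_drm.h in this patch. */
#define XE_RESET_REQUIRED_UEVENT "DEVICE_STATUS=NEEDS_RESET"
#define XE_RESET_REQUIRED_UEVENT_REASON_GT "REASON=GT_RESET_FAILED"

/* Hypothetical container for the fields carried by the event. */
struct xe_reset_event {
	bool needs_reset;      /* DEVICE_STATUS=NEEDS_RESET was present */
	bool gt_reset_failed;  /* REASON=GT_RESET_FAILED was present */
	int tile_id;           /* -1 when TILE_ID was not supplied */
	int gt_id;             /* -1 when GT_ID was not supplied */
};

static bool has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Walk a buffer of NUL-terminated strings and fill *out.
 * Returns true only if the payload announces that a reset is required.
 */
static bool xe_parse_reset_uevent(const char *buf, size_t len,
				  struct xe_reset_event *out)
{
	size_t pos = 0;

	assert(buf != NULL && out != NULL);

	out->needs_reset = false;
	out->gt_reset_failed = false;
	out->tile_id = -1;
	out->gt_id = -1;

	while (pos < len) {
		const char *s = buf + pos;
		const char *end = memchr(s, '\0', len - pos);

		if (end == NULL)
			break; /* truncated final string: ignore it */

		if (strcmp(s, XE_RESET_REQUIRED_UEVENT) == 0)
			out->needs_reset = true;
		else if (strcmp(s, XE_RESET_REQUIRED_UEVENT_REASON_GT) == 0)
			out->gt_reset_failed = true;
		else if (has_prefix(s, "TILE_ID="))
			out->tile_id = (int)strtol(s + strlen("TILE_ID="), NULL, 10);
		else if (has_prefix(s, "GT_ID="))
			out->gt_id = (int)strtol(s + strlen("GT_ID="), NULL, 10);

		pos += (size_t)(end - s) + 1;
	}

	return out->needs_reset;
}
```

Because the reason is a separate REASON= string rather than part of the
DEVICE_STATUS value, a consumer written this way should keep working if
further reasons are added later; it would simply see needs_reset set with
gt_reset_failed left false.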