[Intel-xe] [PATCH v1 06/12] drm/xe: Rename XE_RESET_FAILED_UEVENT to XE_RESET_REQUIRED_UEVENT.

Francois Dugast francois.dugast at intel.com
Thu Dec 14 22:07:04 UTC 2023


On Sat, Oct 21, 2023 at 11:22:31AM +0000, Ghimiray, Himal Prasad wrote:
> 
> 
> ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
> From: Welty, Brian <brian.welty at intel.com>
> Sent: Saturday, October 21, 2023 12:17:01 am
> To: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>;
> intel-xe at lists.freedesktop.org <intel-xe at lists.freedesktop.org>
> Cc: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
> Subject: Re: [PATCH v1 06/12] drm/xe: Rename XE_RESET_FAILED_UEVENT to
> XE_RESET_REQUIRED_UEVENT.
> 
> 
> 
> On 10/19/2023 9:55 PM, Himal Prasad Ghimiray wrote:
> > DEVICE_STATUS=NEEDS_RESET will be used for other reasons apart from
> > gt reset failure. Hence use more generic uevent name and provide reason
> > for reset along with the uevent.
> 
> Looks good to me.
> Reviewed-by: Brian Welty <brian.welty at intel.com>
> 
> But as I mention in other email, this looks unsafe if 2 GT resets are
> happening (and fail) concurrently.  uevent will be overwritten/corrupted.
> But can fix separate from this patch.
> 
> Hi Brian,
> 
> Thanks for the review and input. 
> Will work on the input in separate patch.
> 
> BR
> Himal Ghimiray 

Hi,

This patch modifies the uAPI, which we are trying to clean up and finalize.
What is the plan for this series? Would it be possible to extract a minimal
patch containing only the uAPI change?
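
To illustrate what the uAPI change means for a consumer, here is a rough
sketch of how userspace might interpret the new event. It assumes the event
environment has already been received as a NULL-terminated array of KEY=VALUE
strings (for example from a udev monitor). The struct and function names are
hypothetical and not part of the xe uAPI; only the key and value strings are
taken from the patch quoted below.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace-side view of the event; not part of the xe uAPI. */
struct reset_event {
	int needs_reset;      /* 1 if DEVICE_STATUS=NEEDS_RESET was seen */
	int gt_reset_failed;  /* 1 if REASON=GT_RESET_FAILED was seen */
	int tile_id;          /* -1 if TILE_ID was absent */
	int gt_id;            /* -1 if GT_ID was absent */
};

/* Return the value part if entry is "key=...", otherwise NULL. */
static const char *env_value(const char *entry, const char *key)
{
	size_t len = strlen(key);

	if (strncmp(entry, key, len) == 0 && entry[len] == '=')
		return entry + len + 1;
	return NULL;
}

/* Walk a NULL-terminated KEY=VALUE array and fill in *ev. */
static void parse_reset_event(const char *const *env, struct reset_event *ev)
{
	const char *val;

	ev->needs_reset = 0;
	ev->gt_reset_failed = 0;
	ev->tile_id = -1;
	ev->gt_id = -1;

	for (; *env; env++) {
		if ((val = env_value(*env, "DEVICE_STATUS")))
			ev->needs_reset = strcmp(val, "NEEDS_RESET") == 0;
		else if ((val = env_value(*env, "REASON")))
			ev->gt_reset_failed = strcmp(val, "GT_RESET_FAILED") == 0;
		else if ((val = env_value(*env, "TILE_ID")))
			ev->tile_id = atoi(val);
		else if ((val = env_value(*env, "GT_ID")))
			ev->gt_id = atoi(val);
	}
}
```

With this shape, a consumer that only cares about DEVICE_STATUS keeps working
when new REASON values are added later, and the tile and gt ids are treated as
optional details that depend on the reason.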

Thanks,
Francois

> 
> >
> > Cc: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
> > Cc: Brian Welty <brian.welty at intel.com>
> > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > ---
> >   drivers/gpu/drm/xe/xe_gt.c | 13 +++++++------
> >   include/uapi/drm/xe_drm.h  | 17 ++++++++++++-----
> >   2 files changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > index 74e1f47bd401..91e0a9a7f1cd 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.c
> > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > @@ -545,16 +545,17 @@ static int do_gt_restart(struct xe_gt *gt)
> >  
> >   static void xe_uevent_gt_reset_failure(struct pci_dev *pdev, u8 tile_id, u8 gt_id)
> >   {
> > -     char *reset_event[4];
> > +     char *reset_event[5];
> >  
> > -     reset_event[0] = XE_RESET_FAILED_UEVENT "=NEEDS_RESET";
> > -     reset_event[1] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
> > -     reset_event[2] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
> > -     reset_event[3] = NULL;
> > +     reset_event[0] = XE_RESET_REQUIRED_UEVENT;
> > +     reset_event[1] = XE_RESET_REQUIRED_UEVENT_REASON_GT;
> > +     reset_event[2] = kasprintf(GFP_KERNEL, "TILE_ID=%d", tile_id);
> > +     reset_event[3] = kasprintf(GFP_KERNEL, "GT_ID=%d", gt_id);
> > +     reset_event[4] = NULL;
> >        kobject_uevent_env(&pdev->dev.kobj, KOBJ_CHANGE, reset_event);
> >  
> > -     kfree(reset_event[1]);
> >        kfree(reset_event[2]);
> > +     kfree(reset_event[3]);
> >   }
> >  
> >   static int gt_reset(struct xe_gt *gt)
> > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > index 24bf8f0f52e8..ae1b1c7528d5 100644
> > --- a/include/uapi/drm/xe_drm.h
> > +++ b/include/uapi/drm/xe_drm.h
> > @@ -19,12 +19,19 @@ extern "C" {
> >   /**
> >    * DOC: uevent generated by xe on it's pci node.
> >    *
> > - * XE_RESET_FAILED_UEVENT - Event is generated when attempt to reset gt
> > - * fails. The value supplied with the event is always "NEEDS_RESET".
> > - * Additional information supplied is tile id and gt id of the gt unit for
> > - * which reset has failed.
> > + * XE_RESET_REQUIRED_UEVENT - Event is generated when device needs reset.
> > + * The REASON is provided along with the event for which reset is required.
> > + * On the basis of REASONS, additional information might be supplied.
> >    */
> > -#define XE_RESET_FAILED_UEVENT "DEVICE_STATUS"
> > +#define XE_RESET_REQUIRED_UEVENT        "DEVICE_STATUS=NEEDS_RESET"
> > +
> > +/**
> > + * XE_RESET_REQUIRED_UEVENT_REASON_GT - Reason provided to XE_RESET_REQUIRED_UEVENT
> > + * incase of gt reset failure. The additional information supplied is tile id and
> > + * gt id of the gt unit for which reset has failed.
> > + */
> > +#define XE_RESET_REQUIRED_UEVENT_REASON_GT    "REASON=GT_RESET_FAILED"
> > +
> >  
> >   /**
> >    * struct xe_user_extension - Base class for defining a chain of extensions
> 