[Intel-xe] [PATCH 0/4] RFC: drm/xe/ras: Supporting RAS on XE.

Rodrigo Vivi rodrigo.vivi at kernel.org
Tue May 2 19:58:45 UTC 2023


On Tue, May 02, 2023 at 12:38:21PM +0300, Jani Nikula wrote:
> On Wed, 26 Apr 2023, "Ghimiray, Himal Prasad" <himal.prasad.ghimiray at intel.com> wrote:
> > Hi Jani,
> >
> > Is the recommendation to create a new .h file for error-related registers?
> > Can I go ahead and add a file xe_gt_error_regs.h (GT, SOC, GSC) that explicitly holds the registers related to error handling?
> 
> I don't know what the best grouping for this stuff would be. Maybe I'd
> go for grouping by hardware blocks rather than functionality like
> errors. Cc: Lucas, Matt, Rodrigo, just to pick a few names who might
> have a better idea.

I believe the right way is to group by the IP block and/or reset domain,
rather than by functionality.

But Lucas is probably the best one to guide us here. He has some ideas
for tools that generate the registers we use from the specs, and those
could affect how the register files are organized.

> 
> Just don't dump register macros to a single file that will bloat to
> become unmanageable.
> 
> BR,
> Jani.
> 
> 
> PS. Please also don't top-post on mailing lists.
> 
> 
> 
> >
> > BR
> > Himal Ghimiray
> >
> >
> >> -----Original Message-----
> >> From: Jani Nikula <jani.nikula at linux.intel.com>
> >> Sent: 06 April 2023 17:56
> >> To: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>;
> >> intel-xe at lists.freedesktop.org
> >> Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>
> >> Subject: Re: [Intel-xe] [PATCH 0/4] RFC: drm/xe/ras: Supporting RAS on XE.
> >> 
> >> On Thu, 06 Apr 2023, Himal Prasad Ghimiray
> >> <himal.prasad.ghimiray at intel.com> wrote:
> >> > This series adds Reliability, Availability and Serviceability (RAS)
> >> > support to xe.
> >> > The patches provide the infrastructure for counting and logging
> >> > various hardware errors. These error counters will be exposed to
> >> > userspace in subsequent patches.
> >> > The current patches:
> >> > 1) Add support for handling the new interrupt bits.
> >> > 2) Count GT errors.
> >> > 3) Count SOC/SGUNIT errors.
> >> > 4) Count CSC HW and FW errors and send a uevent.
> >> >
> >> > Akeem G Abodunrin (1):
> >> >   drm/xe/ras: Add support for reporting CSC HW and FW errors.
> >> >
> >> > Aravind Iddamsetty (2):
> >> >   drm/xe/ras: Log the GT hw errors.
> >> >   drm/xe/ras: Count SOC and SGUNIT errors
> >> >
> >> > Himal Prasad Ghimiray (1):
> >> >   drm/xe: Handle GRF/IC ECC error irq
> >> >
> >> >  drivers/gpu/drm/xe/regs/xe_regs.h    | 244 ++++++++
> >> 
> >> Please don't recreate i915_reg.h in xe. Please add separate regs files like
> >> we've been doing in i915. It's a pain to split a monster register file later.
> >> 
> >> BR,
> >> Jani.
> >> 
> >> 
> >> >  drivers/gpu/drm/xe/xe_device.c       |   6 +
> >> >  drivers/gpu/drm/xe/xe_device_types.h |   4 +
> >> >  drivers/gpu/drm/xe/xe_gt.c           |  30 +
> >> >  drivers/gpu/drm/xe/xe_gt_types.h     | 105 ++++
> >> >  drivers/gpu/drm/xe/xe_irq.c          | 824 +++++++++++++++++++++++++++
> >> >  drivers/gpu/drm/xe/xe_pci.c          |   6 +
> >> >  7 files changed, 1219 insertions(+)
> >> 
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

