[Intel-xe] [RFC v2 5/5] drm/xe/RAS: send multicast event on occurrence of an error

Ruhl, Michael J michael.j.ruhl at intel.com
Fri Oct 20 20:40:23 UTC 2023


>-----Original Message-----
>From: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
>Sent: Friday, October 20, 2023 11:59 AM
>To: intel-xe at lists.freedesktop.org; dri-devel at lists.freedesktop.org;
>alexander.deucher at amd.com; airlied at gmail.com; daniel at ffwll.ch;
>joonas.lahtinen at linux.intel.com; ogabbay at kernel.org; Tayar, Tomer (Habana)
><ttayar at habana.ai>; Hawking.Zhang at amd.com;
>Harish.Kasiviswanathan at amd.com; Felix.Kuehling at amd.com;
>Luben.Tuikov at amd.com; Ruhl, Michael J <michael.j.ruhl at intel.com>
>Subject: [RFC v2 5/5] drm/xe/RAS: send multicast event on occurrence of an error
>
>Whenever a correctable or an uncorrectable error occurs, an event is sent
>to the listeners of the corresponding multicast group.
>
>v2: Rebase

Hi Aravind,

This looks reasonable to me.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl at intel.com>

M

>Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
>---
> drivers/gpu/drm/xe/xe_hw_error.c | 33 ++++++++++++++++++++++++++++++++
> 1 file changed, 33 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_hw_error.c b/drivers/gpu/drm/xe/xe_hw_error.c
>index bab6d4cf0b69..b0befb5e01cb 100644
>--- a/drivers/gpu/drm/xe/xe_hw_error.c
>+++ b/drivers/gpu/drm/xe/xe_hw_error.c
>@@ -786,6 +786,37 @@ xe_soc_hw_error_handler(struct xe_tile *tile, const enum hardware_error hw_err)
> 				(HARDWARE_ERROR_MAX << 1) + 1);
> }
>
>+static void
>+generate_netlink_event(struct xe_device *xe, const enum hardware_error hw_err)
>+{
>+	struct sk_buff *msg;
>+	void *hdr;
>+
>+	if (!xe->drm.drm_genl_family.module)
>+		return;
>+
>+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
>+	if (!msg) {
>+		drm_dbg_driver(&xe->drm, "couldn't allocate memory for error multicast event\n");
>+		return;
>+	}
>+
>+	hdr = genlmsg_put(msg, 0, 0, &xe->drm.drm_genl_family, 0, DRM_RAS_CMD_ERROR_EVENT);
>+	if (!hdr) {
>+		drm_dbg_driver(&xe->drm, "multicast msg buffer is small\n");
>+		nlmsg_free(msg);
>+		return;
>+	}
>+
>+	genlmsg_end(msg, hdr);
>+
>+	genlmsg_multicast(&xe->drm.drm_genl_family, msg, 0,
>+			  hw_err ?
>+			  DRM_GENL_MCAST_UNCORR_ERR
>+			  : DRM_GENL_MCAST_CORR_ERR,
>+			  GFP_ATOMIC);
>+}
>+
> static void
> xe_hw_error_source_handler(struct xe_tile *tile, const enum hardware_error hw_err)
> {
>@@ -849,6 +880,8 @@ xe_hw_error_source_handler(struct xe_tile *tile, const enum hardware_error hw_er
> 	}
>
> 	xe_mmio_write32(gt, DEV_ERR_STAT_REG(hw_err), errsrc);
>+
>+	generate_netlink_event(tile_to_xe(tile), hw_err);
> unlock:
> 	spin_unlock_irqrestore(&tile_to_xe(tile)->irq.lock, flags);
> }
>--
>2.25.1
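
A note on the group selection in generate_netlink_event(): the ternary on
hw_err sends every non-zero severity to the uncorrectable group. The sketch
below reproduces only that mapping in plain userspace C. It assumes that
enum hardware_error places the correctable value at 0, which this patch does
not show; the enum values and the mcast_group names are illustrative
stand-ins, not the driver's definitions.

```c
/*
 * Assumed layout of the driver's severity enum (not shown in this patch):
 * correctable is 0, every other severity is non-zero.
 */
enum hardware_error {
	HARDWARE_ERROR_CORRECTABLE = 0,
	HARDWARE_ERROR_NONFATAL = 1,
	HARDWARE_ERROR_FATAL = 2,
};

/*
 * Hypothetical stand-ins for DRM_GENL_MCAST_CORR_ERR and
 * DRM_GENL_MCAST_UNCORR_ERR; the real values live in the DRM netlink uAPI.
 */
enum mcast_group {
	MCAST_CORR_ERR = 0,
	MCAST_UNCORR_ERR = 1,
};

/* Same test as the patch: zero means correctable, anything else does not. */
enum mcast_group mcast_group_for(enum hardware_error hw_err)
{
	return hw_err ? MCAST_UNCORR_ERR : MCAST_CORR_ERR;
}
```

Under that assumption, non-fatal and fatal errors land in the same multicast
group, so a listener that needs to tell them apart would have to get the
severity from somewhere other than the group it subscribed to.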


