[RFC v2 5/5] drm/xe/RAS: send multicast event on occurrence of an error

Tomer Tayar ttayar at habana.ai
Fri Nov 10 12:27:46 UTC 2023


On 20/10/2023 18:58, Aravind Iddamsetty wrote:
> Whenever a correctable or an uncorrectable error happens an event is sent
> to the corresponding listeners of these groups.
>
> v2: Rebase
>
> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty at linux.intel.com>
> ---
>   drivers/gpu/drm/xe/xe_hw_error.c | 33 ++++++++++++++++++++++++++++++++
>   1 file changed, 33 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_hw_error.c b/drivers/gpu/drm/xe/xe_hw_error.c
> index bab6d4cf0b69..b0befb5e01cb 100644
> --- a/drivers/gpu/drm/xe/xe_hw_error.c
> +++ b/drivers/gpu/drm/xe/xe_hw_error.c
> @@ -786,6 +786,37 @@ xe_soc_hw_error_handler(struct xe_tile *tile, const enum hardware_error hw_err)
>   				(HARDWARE_ERROR_MAX << 1) + 1);
>   }
>   
> +static void
> +generate_netlink_event(struct xe_device *xe, const enum hardware_error hw_err)
> +{
> +	struct sk_buff *msg;
> +	void *hdr;
> +
> +	if (!xe->drm.drm_genl_family.module)
> +		return;
> +
> +	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
> +	if (!msg) {
> +		drm_dbg_driver(&xe->drm, "couldn't allocate memory for error multicast event\n");
> +		return;
> +	}
> +
> +	hdr = genlmsg_put(msg, 0, 0, &xe->drm.drm_genl_family, 0, DRM_RAS_CMD_ERROR_EVENT);
> +	if (!hdr) {
> +		drm_dbg_driver(&xe->drm, "multicast msg buffer is small\n");
> +		nlmsg_free(msg);
> +		return;
> +	}
> +
> +	genlmsg_end(msg, hdr);
> +
> +	genlmsg_multicast(&xe->drm.drm_genl_family, msg, 0,
> +			  hw_err ?
> +			  DRM_GENL_MCAST_UNCORR_ERR
> +			  : DRM_GENL_MCAST_CORR_ERR,
> +			  GFP_ATOMIC);

I agree that hiding/wrapping every netlink/genetlink API/macro with a DRM
helper would sometimes be redundant,
and that in some cases the specific DRM driver would have to "get its
hands dirty" and deal with netlink directly (e.g. fill_error_details() in patch #3).
However, maybe a DRM helper would be useful here, so that we won't see
a copy of this sequence in other DRM drivers?
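
A rough sketch of what such a helper might look like, reusing the same
calls as generate_netlink_event() above. The name drm_genl_send_error_event()
and its signature are hypothetical and not part of this series; it assumes
the drm_genl_family member and DRM_RAS_CMD_ERROR_EVENT introduced by the
earlier patches:

	#include <net/genetlink.h>
	#include <drm/drm_device.h>

	/*
	 * Hypothetical DRM-level helper: send an empty error event to the
	 * given multicast group. Returns 0 or a negative errno so that the
	 * driver can decide whether and how to log a failure.
	 */
	int drm_genl_send_error_event(struct drm_device *drm, unsigned int group)
	{
		struct sk_buff *msg;
		void *hdr;

		/* The genl family was never registered for this device */
		if (!drm->drm_genl_family.module)
			return -ENODEV;

		/* May be called from IRQ context, hence GFP_ATOMIC */
		msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
		if (!msg)
			return -ENOMEM;

		hdr = genlmsg_put(msg, 0, 0, &drm->drm_genl_family, 0,
				  DRM_RAS_CMD_ERROR_EVENT);
		if (!hdr) {
			nlmsg_free(msg);
			return -EMSGSIZE;
		}

		genlmsg_end(msg, hdr);

		/* genlmsg_multicast() consumes msg on both success and failure */
		return genlmsg_multicast(&drm->drm_genl_family, msg, 0, group,
					 GFP_ATOMIC);
	}

With something like that in place, the xe side could shrink to a single
call, e.g.:

	drm_genl_send_error_event(&xe->drm, hw_err ?
				  DRM_GENL_MCAST_UNCORR_ERR :
				  DRM_GENL_MCAST_CORR_ERR);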

Thanks,
Tomer

> +}
> +
>   static void
>   xe_hw_error_source_handler(struct xe_tile *tile, const enum hardware_error hw_err)
>   {
> @@ -849,6 +880,8 @@ xe_hw_error_source_handler(struct xe_tile *tile, const enum hardware_error hw_er
>   	}
>   
>   	xe_mmio_write32(gt, DEV_ERR_STAT_REG(hw_err), errsrc);
> +
> +	generate_netlink_event(tile_to_xe(tile), hw_err);
>   unlock:
>   	spin_unlock_irqrestore(&tile_to_xe(tile)->irq.lock, flags);
>   }



