[PATCH v3 1/1] drm/xe/eustall: Add support for EU stall sampling
Dixit, Ashutosh
ashutosh.dixit at intel.com
Mon Sep 9 19:27:05 UTC 2024
On Mon, 09 Sep 2024 00:36:40 -0700, Harish Chegondi wrote:
>
Hi Harish,
A few new comments on just the uapi, which I missed out previously.
> diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> index b6fbe4988f2e..c836227d064f 100644
> --- a/include/uapi/drm/xe_drm.h
> +++ b/include/uapi/drm/xe_drm.h
> @@ -1395,6 +1395,8 @@ struct drm_xe_wait_user_fence {
> enum drm_xe_observation_type {
> /** @DRM_XE_OBSERVATION_TYPE_OA: OA observation stream type */
> DRM_XE_OBSERVATION_TYPE_OA,
> + /** @DRM_XE_OBSERVATION_TYPE_EU_STALL: EU stall sampling observation stream type */
> + DRM_XE_OBSERVATION_TYPE_EU_STALL,
> };
>
> /**
> @@ -1694,6 +1696,138 @@ struct drm_xe_oa_stream_info {
> __u64 reserved[3];
> };
>
> +/**
> + * enum drm_xe_eu_stall_property_id - EU stall data stream property ids.
> + *
> + * These properties are passed to the driver as a chain of
> + * @drm_xe_ext_set_property structures with @property set to these
> + * properties' enums and @value set to the corresponding values of these
> + * properties. @drm_xe_user_extension base.name should be set to
> + * @DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY.
> + */
> +enum drm_xe_eu_stall_property_id {
> +#define DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY 0
> + /**
> + * @DRM_XE_EU_STALL_PROP_BUF_SZ: Per DSS Memory Buffer Size.
> + * Valid values are 128 KB, 256 KB, and 512 KB.
> + */
> + DRM_XE_EU_STALL_PROP_BUF_SZ = 1,
> +
> + /**
> + * @DRM_XE_EU_STALL_PROP_SAMPLE_RATE: Sampling rate
> + * in multiples of 251 cycles. Valid values are 1 to 7.
> + * If the value is 1, sampling interval is 251 cycles.
> + * If the value is 7, sampling interval is 7 x 251 cycles.
> + */
> + DRM_XE_EU_STALL_PROP_SAMPLE_RATE,
> +
> + /**
> + * @DRM_XE_EU_STALL_PROP_POLL_PERIOD: EU stall data
> + * poll period in nanoseconds. should be at least 100000 ns.
> + */
> + DRM_XE_EU_STALL_PROP_POLL_PERIOD,
> +
> + /**
> + * @DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT: Minimum number of
> + * EU stall data rows to be present in the kernel buffer for
> + * poll() to set POLLIN (data present).
> + */
> + DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT,
> +
> + /**
> + * @DRM_XE_EU_STALL_PROP_GT_ID: GT ID of the GT on which
> + * EU stall data will be captured.
> + */
> + DRM_XE_EU_STALL_PROP_GT_ID,
> +
> + /**
> + * @DRM_XE_EU_STALL_PROP_OPEN_DISABLED: A value of 1 will open
> + * the EU stall data stream without enabling EU stall sampling.
> + */
> + DRM_XE_EU_STALL_PROP_OPEN_DISABLED,
> +};
If we add new properties in the future, UMD's would need to be able to
query if a particular kernel supports these new properties. For this, we
need a query even before the EU Stall stream is opened.
Here we should follow the same scheme as in OA and introduce an EU stall
'capabilities'. For how this is done in OA, see 'struct drm_xe_oa_unit' in
xe_drm.h which is returned from DRM_XE_DEVICE_QUERY_OA_UNITS. For example,
after the https://patchwork.freedesktop.org/series/137058/ series, OA
capabilities will look like this:
/** @capabilities: OA capabilities bit-mask */
__u64 capabilities;
#define DRM_XE_OA_CAPS_BASE (1 << 0)
#define DRM_XE_OA_CAPS_SYNCS (1 << 1)
Let's call this DRM_XE_DEVICE_QUERY_EU_STALL for EU stall.
> +
> +/**
> + * struct drm_xe_eu_stall_stream_info - EU stall stream info returned from
> + * @DRM_XE_OBSERVATION_IOCTL_INFO observation stream fd ioctl
> + */
> +struct drm_xe_eu_stall_stream_info {
> + /** @extensions: Pointer to the first extension struct, if any */
> + __u64 extensions;
> +
> + /** @record_size: size of each EU stall data record */
> + __u64 record_size;
If this is only platform specific, not stream specific, we should add this
also to the DRM_XE_DEVICE_QUERY_EU_STALL. If we do this, we won't have a
'struct drm_xe_eu_stall_stream_info', unless we need to return some other
stream specific information in the future.
> +
> + /** @reserved: reserved for future use */
> + __u64 reserved[3];
> +};
Also, in response to the EIO return code, if we ever want to return a
status different from "RECORDS DROPPED" (say a status that GPU was reset
during data collection), we need to support DRM_XE_OBSERVATION_IOCTL_STATUS
for EU stall, similar to how it is done for OA.
IMO, DRM_XE_DEVICE_QUERY_EU_STALL and DRM_XE_OBSERVATION_IOCTL_STATUS need
to be introduced now, as part of this patch. Otherwise UMD's will see error
returns for older kernels, if these are introduced later.
> +
> +/**
> + * struct drm_xe_eu_stall_data_pvc - EU stall data format for PVC
> + *
> + * Bits Field
> + * 0 to 28 IP (addr)
> + * 29 to 36 active count
> + * 37 to 44 other count
> + * 45 to 52 control count
> + * 53 to 60 pipestall count
> + * 61 to 68 send count
> + * 69 to 76 dist_acc count
> + * 77 to 84 sbid count
> + * 85 to 92 sync count
> + * 93 to 100 inst_fetch count
> + */
> +struct drm_xe_eu_stall_data_pvc {
> + __u64 ip_addr:29;
> + __u64 active_count:8;
> + __u64 other_count:8;
> + __u64 control_count:8;
> + __u64 pipestall_count:8;
> + __u64 send_count:8;
> + __u64 dist_acc_count:8;
> + __u64 sbid_count:8;
> + __u64 sync_count:8;
> + __u64 inst_fetch_count:8;
> + __u64 unused_bits:27;
> + __u64 unused[6];
> +} __attribute__((packed));
> +
> +/**
> + * struct drm_xe_eu_stall_data_xe2 - EU stall data format for LNL, BMG
> + *
> + * Bits Field
> + * 0 to 28 IP (addr)
> + * 29 to 36 Tdr count
> + * 37 to 44 other count
> + * 45 to 52 control count
> + * 53 to 60 pipestall count
> + * 61 to 68 send count
> + * 69 to 76 dist_acc count
> + * 77 to 84 sbid count
> + * 85 to 92 sync count
> + * 93 to 100 inst_fetch count
> + * 101 to 108 Active count
> + * 109 to 111 Exid
> + * 112 EndFlag (is always 1)
> + */
> +struct drm_xe_eu_stall_data_xe2 {
> + __u64 ip_addr:29;
> + __u64 tdr_count:8;
> + __u64 other_count:8;
> + __u64 control_count:8;
> + __u64 pipestall_count:8;
> + __u64 send_count:8;
> + __u64 dist_acc_count:8;
> + __u64 sbid_count:8;
> + __u64 sync_count:8;
> + __u64 inst_fetch_count:8;
> + __u64 active_count:8;
> + __u64 ex_id:3;
> + __u64 end_flag:1;
> + __u64 unused_bits:15;
> + __u64 unused[6];
> +} __attribute__((packed));
IMO, these data struct's are uapi between HW and UMD, not between KMD and
UMD, so these structs should *not* be in xe_drm.h. KMD just forwards opaque
records to UMD, UMD uses "outside" knowledge to make sense of these
records.
We have not included such record formats for OA either. Record formats are
documented in Bspec and UMD's know how to interpret the records. Also note
that Linux kernel does not use bit-fields (to avoid issues related to
endianness).
Thanks.
--
Ashutosh
More information about the Intel-xe
mailing list