[Intel-xe] [PATCH 01/10] drm/xe/oa: Introduce OA uapi

Dixit, Ashutosh ashutosh.dixit at intel.com
Mon Aug 21 16:48:33 UTC 2023


On Tue, 15 Aug 2023 19:14:40 -0700, Umesh Nerlige Ramappa wrote:
>

Hi Umesh,

Some more thoughts.

> On Tue, Aug 15, 2023 at 12:29:59PM -0700, Dixit, Ashutosh wrote:
> > On Tue, 08 Aug 2023 15:59:24 -0700, Umesh Nerlige Ramappa wrote:
> >>
> >> This looks good in terms of what existed in i915. I have some comments here
> >> from an improvement perspective. Please let me know what you think.
> >>
> >> On Mon, Aug 07, 2023 at 06:31:50PM -0700, Ashutosh Dixit wrote:
> >> > OA uapi allows userspace to:
> >> > * Read streams of performance counters written by hardware
> >> > * Configure (and reconfigure) which sets of perf counters are captured as
> >> >  part of OA streams
> >> > * Configure other properties (such as format and periodicity) of such
> >> >  captures.
> >> > * Query associated parameters such as OA unit timestamp freq, oa_unit_id's
> >> >  for hw engines and OA ioctl version
> >> >
> >> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit at intel.com>
> >> > ---
> >> > include/uapi/drm/xe_drm.h | 257 +++++++++++++++++++++++++++++++++++++-
> >> > 1 file changed, 256 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> >> > index 86f16d50e9ccc..b4ab07c285245 100644
> >> > --- a/include/uapi/drm/xe_drm.h
> >> > +++ b/include/uapi/drm/xe_drm.h
> >> > @@ -111,6 +111,9 @@ struct xe_user_extension {
> >> > #define DRM_XE_WAIT_USER_FENCE		0x0b
> >> > #define DRM_XE_VM_MADVISE		0x0c
> >> > #define DRM_XE_EXEC_QUEUE_GET_PROPERTY	0x0d
> >> > +#define DRM_XE_OA_OPEN			0x36
> >> > +#define DRM_XE_OA_ADD_CONFIG		0x37
> >> > +#define DRM_XE_OA_REMOVE_CONFIG		0x38
> >>
> >> No holes in ioctl numbers above, as in use 0xe 0xf 0x10 instead of 0x36,
> >> 0x37 and 0x38. Any reason you have left it same as i915?
> >
> > Just wanted to get the patches out on the mailing list soon, having the
> > ioctl numbers same as i915 saves a bunch of IGT changes.
> >
> > Also, I am providing these patches to UMD's to test and the holes allow the
> > ioctls to remain undisturbed even if new ioctls are introduced in non-oa.
> >
> > Even the oa_formats below are the same value as i915 :)
> >
> > But yes this will eventually have to be fixed before merging.
>
> sounds good
>
> >
> >>
> >> >
> >> > /* Must be kept compact -- no holes */
> >> > #define DRM_IOCTL_XE_DEVICE_QUERY		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_DEVICE_QUERY, struct drm_xe_device_query)
> >> > @@ -127,6 +130,9 @@ struct xe_user_extension {
> >> > #define DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY	 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_EXEC_QUEUE_SET_PROPERTY, struct drm_xe_exec_queue_set_property)
> >> > #define DRM_IOCTL_XE_WAIT_USER_FENCE		DRM_IOWR(DRM_COMMAND_BASE + DRM_XE_WAIT_USER_FENCE, struct drm_xe_wait_user_fence)
> >> > #define DRM_IOCTL_XE_VM_MADVISE			 DRM_IOW(DRM_COMMAND_BASE + DRM_XE_VM_MADVISE, struct drm_xe_vm_madvise)
> >> > +#define DRM_IOCTL_XE_OA_OPEN			DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OA_OPEN, struct drm_xe_oa_open_param)
> >> > +#define DRM_IOCTL_XE_OA_ADD_CONFIG		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OA_ADD_CONFIG, struct drm_xe_oa_config)
> >> > +#define DRM_IOCTL_XE_OA_REMOVE_CONFIG		DRM_IOW(DRM_COMMAND_BASE + DRM_XE_OA_REMOVE_CONFIG, __u64)
> >> >
> >> > /**
> >> >  * enum drm_xe_memory_class - Supported memory classes.
> >> > @@ -261,7 +267,8 @@ struct drm_xe_query_config {
> >> > #define XE_QUERY_CONFIG_GT_COUNT		4
> >> > #define XE_QUERY_CONFIG_MEM_REGION_COUNT	5
> >> > #define XE_QUERY_CONFIG_MAX_ENGINE_PRIORITY	6
> >> > -#define XE_QUERY_CONFIG_NUM_PARAM		(XE_QUERY_CONFIG_MAX_ENGINE_PRIORITY + 1)
> >> > +#define XE_QUERY_OA_IOCTL_VERSION		7
> >>
> >> I am thinking we should just have a query for oa_info where things like
> >> revision, oa_timestamp_frequency and other OA specific info like supported
> >> formats (spec - 52198) can be queried by the user.

This is also simplified somewhat if we go with the one char device file per
OA unit I mentioned in the other mail. So we need info about just that one
OA unit, rather than multiplex information about all OA units into the
ioctl.

Not sure if I mentioned earlier that this oa_info is a little bit
complicated otherwise: e.g. oa_timestamp_frequency is per gt, oa_unit_id is
per engine etc. Also we may need to exposed mapping from oa_unit_id to
engines attached to that oa unit.

E.g. if we don't expose oa_format in xe_drm.h (see below) I think oa_info
for each oa unit is something like:

{ oa_unit_id, oa_timestamp_freq, list-of-attached-engines}


> >> Also iirc, revision was just added as a way to determine new
> >> extensions/features to OA, but I don't see that mechanism used by any other
> >> interface here. Maybe we should see how other interfaces are adding
> >> features and detecting their presence and use something similar.
> >
> > I believe that would be doing this using xe_user_extension's, both set and
> > get. See exec_queue_set_property and exec_queue_get_property. Revision
> > etc. won't be needed if done this way.

Actually this is not strictly true. xe_user_extension's make the interface
more extensible but don't really eliminate the need for a version.

But I agree, we need to drop this version. Worst case userland can just use
the kernel version (and maintain a mapping of which OA features are
available in which kernel version if needed)?

Also, for a simple property <key, value> interface, xe_user_extension's
seem a bit of an overkill (but drm_xe_ext_exec_queue_set_property seems to
have done exactly that). To me xe_user_extension seems useful if properties
are associated with bigger data structs.

So I am still thinking about it. Maybe we should check if the convention in
XE is to expose even property <key, value> interfaces with
xe_user_extension. Certainly looks like that.

Or maybe we can separate out the data structs: One for basic OA params and
one for the query params (OAR/MI_RPC) etc. Or we can retain the one
property per data struct approach used in
drm_xe_ext_exec_queue_set_property. Simplest would be to keep the current
struct but add a xe_user_extension field at the top, so that it is the same
as now but can be extended in the future.

> >
> >>
> >> > +#define XE_QUERY_CONFIG_NUM_PARAM		(XE_QUERY_OA_IOCTL_VERSION + 1)
> >> >	/** @info: array of elements containing the config info */
> >> >	__u64 info[];
> >> > };
> >> > @@ -298,6 +305,7 @@ struct drm_xe_query_gts {
> >> >		__u64 native_mem_regions;	/* bit mask of instances from drm_xe_query_mem_usage */
> >> >		__u64 slow_mem_regions;		/* bit mask of instances from drm_xe_query_mem_usage */
> >> >		__u64 inaccessible_mem_regions;	/* bit mask of instances from drm_xe_query_mem_usage */
> >> > +		__u64 oa_timestamp_freq;
> >> >		__u64 reserved[8];
> >> >	} gts[];
> >> > };
> >> > @@ -753,6 +761,7 @@ struct drm_xe_engine_class_instance {
> >> >
> >> >	__u16 engine_instance;
> >> >	__u16 gt_id;
> >> > +	__u16 oa_unit_id;
> >> > };
> >> >
> >> > struct drm_xe_exec_queue_create {
> >> > @@ -1056,6 +1065,252 @@ struct drm_xe_vm_madvise {
> >> >	__u64 reserved[2];
> >> > };
> >> >
> >> > +enum drm_xe_oa_format {
> >> > +	XE_OA_FORMAT_C4_B8 = 7,
> >> > +
> >> > +	/* Gen8+ */
> >> > +	XE_OA_FORMAT_A12,
> >> > +	XE_OA_FORMAT_A12_B8_C8,
> >> > +	XE_OA_FORMAT_A32u40_A4u32_B8_C8,
> >> > +
> >> > +	/* DG2 */
> >> > +	XE_OAR_FORMAT_A32u40_A4u32_B8_C8,
> >> > +	XE_OA_FORMAT_A24u40_A14u32_B8_C8,
> >> > +
> >> > +	/* MTL OAM */
> >> > +	XE_OAM_FORMAT_MPEC8u64_B8_C8,
> >> > +	XE_OAM_FORMAT_MPEC8u32_B8_C8,
> >> > +
> >> > +	XE_OA_FORMAT_MAX	    /* non-ABI */
> >> > +};
> >>
> >> If we support a query to get the supported OA formats, we can do away with
> >> this.
> >
> > OK, yeah get using user_ext's.
> >
> >> Also thinking if we can provide the format definition in such a way
> >> that UMD can just decode the counters from it with some generic code. That
> >> way, we don't have to constantly keep updating UMD code for every new
> >> platforms (may be far-fetched, but could save us a lot of
> >> time). Thoughts?
> >
> > Ok, but I am not sure how to do this.
>
> Just a thought, we can ignore that for now.

I am not sure if exposing valid counter_select values has much use since
userland has metrics files so already knows supported counter_select
values. How about not exposing anything at all regarding this in xe_drm.h?
Userland can just send the counter_select values it already knows about and
XE can just validate these. LNL etc. need a couple of other fields, so we
can just define the bitfields XE is expecting and have userspace fill these
in. XE will validate all of these.

I think we need a little bit of encapsulation (provided the OA format) if
we remove this, but if it causes problems in uapi updates etc. we can
probably get rid of it.

Again, assuming we'll change the python scripts etc. for this. Maybe we
/should/ prefer a clean interface to these userspace modifications?

> >
> >> > +enum drm_xe_oa_property_id {
> >> > +	/**
> >> > +	 * Open the stream for a specific exec queue id (as used with
> >> > +	 * drm_xe_exec). A stream opened for a specific exec queue id this
> >> > +	 * way won't typically require root privileges.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_EXEC_QUEUE_ID = 1,
> >> > +
> >> > +	/**
> >> > +	 * A value of 1 requests the inclusion of raw OA unit reports as
> >> > +	 * part of stream samples.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_SAMPLE_OA,
> >> > +
> >> > +	/**
> >> > +	 * The value specifies which set of OA unit metrics should be
> >> > +	 * configured, defining the contents of any OA unit reports.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_OA_METRICS_SET,
> >> > +
> >> > +	/**
> >> > +	 * The value specifies the size and layout of OA unit reports.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_OA_FORMAT,
> >> > +
> >> > +	/**
> >> > +	 * Specifying this property implicitly requests periodic OA unit
> >> > +	 * sampling and (at least on Haswell) the sampling frequency is derived
> >> > +	 * from this exponent as follows:
> >> > +	 *
> >> > +	 *   80ns * 2^(period_exponent + 1)
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_OA_EXPONENT,
> >> > +
> >> > +	/**
> >> > +	 * Specifying this property is only valid when specify a context to
> >> > +	 * filter with DRM_XE_OA_PROP_ENGINE_ID. Specifying this property
> >> > +	 * will hold preemption of the particular engine we want to gather
> >> > +	 * performance data about.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_HOLD_PREEMPTION,
> >> > +
> >> > +	/**
> >> > +	 * Specifying this pins all contexts to the specified SSEU power
> >> > +	 * configuration for the duration of the recording.
> >> > +	 *
> >> > +	 * This parameter's value is a pointer to a struct
> >> > +	 * drm_xe_gem_context_param_sseu (TBD).
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_GLOBAL_SSEU,
> >> > +
> >> > +	/**
> >> > +	 * This optional parameter specifies the timer interval in nanoseconds
> >> > +	 * at which the xe driver will check the OA buffer for available data.
> >> > +	 * Minimum allowed value is 100 microseconds. A default value is used by
> >> > +	 * the driver if this parameter is not specified. Note that larger timer
> >> > +	 * values will reduce cpu consumption during OA perf captures. However,
> >> > +	 * excessively large values would potentially result in OA buffer
> >> > +	 * overwrites as captures reach end of the OA buffer.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_POLL_OA_PERIOD,
> >> > +
> >> > +	/**
> >> > +	 * Multiple engines may be mapped to the same OA unit. The OA unit is
> >> > +	 * identified by class:instance of any engine mapped to it.
> >> > +	 *
> >> > +	 * This parameter specifies the engine class and must be passed along
> >> > +	 * with DRM_XE_OA_PROP_OA_ENGINE_INSTANCE.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_OA_ENGINE_CLASS,
> >> > +
> >> > +	/**
> >> > +	 * This parameter specifies the engine instance and must be passed along
> >> > +	 * with DRM_XE_OA_PROP_OA_ENGINE_CLASS.
> >> > +	 */
> >> > +	DRM_XE_OA_PROP_OA_ENGINE_INSTANCE,
> >> > +
> >> > +	DRM_XE_OA_PROP_MAX /* non-ABI */
> >> > +};
> >>
> >> Wish we could replace the prop-value interface here. Most ioctls just use a
> >> struct.  We could do so with some room for additional rsvd params or some
> >> room for extensions similar to other interfaces here.
> >
> > Yes again use user_ext's for these.

Already covered the options for these above.

> >
> >>
> >> > +
> >> > +struct drm_xe_oa_open_param {
> >> > +	__u32 flags;
> >> > +#define XE_OA_FLAG_FD_CLOEXEC	BIT(0)
> >> > +#define XE_OA_FLAG_FD_NONBLOCK	BIT(1)
> >> > +#define XE_OA_FLAG_DISABLED	BIT(2)
> >> > +
> >> > +	/** The number of u64 (id, value) pairs */
> >> > +	__u32 num_properties;
> >> > +
> >> > +	/**
> >> > +	 * Pointer to array of u64 (id, value) pairs configuring the stream
> >> > +	 * to open.
> >> > +	 */
> >> > +	__u64 properties_ptr;
> >> > +};
> >> > +
> >> > +/*
> >> > + * Enable data capture for a stream that was either opened in a disabled state
> >> > + * via I915_PERF_FLAG_DISABLED or was later disabled via
> >> > + * I915_PERF_IOCTL_DISABLE.
> >> > + *
> >> > + * It is intended to be cheaper to disable and enable a stream than it may be
> >> > + * to close and re-open a stream with the same configuration.
> >> > + *
> >> > + * It's undefined whether any pending data for the stream will be lost.
> >> > + */
> >> > +#define XE_OA_IOCTL_ENABLE	_IO('i', 0x0)
> >> > +
> >> > +/*
> >> > + * Disable data capture for a stream.
> >> > + *
> >> > + * It is an error to try and read a stream that is disabled.
> >> > + */
> >> > +#define XE_OA_IOCTL_DISABLE	_IO('i', 0x1)
> >> > +
> >> > +/*
> >> > + * Change metrics_set captured by a stream.
> >> > + *
> >> > + * If the stream is bound to a specific context, the configuration change
> >> > + * will performed inline with that context such that it takes effect before
> >> > + * the next execbuf submission.
> >> > + *
> >> > + * Returns the previously bound metrics set id, or a negative error code.
> >> > + */
> >> > +#define XE_OA_IOCTL_CONFIG	_IO('i', 0x2)
> >> > +
> >> > +struct drm_xe_oa_record_header {
> >> > +	__u32 type;
> >> > +	__u16 pad;
> >> > +	__u16 size;
> >> > +};
> >>
> >> Note to myself to ensure we are adding the header per read call now and not
> >> per report.
> >>
> >> > +
> >> > +enum drm_xe_oa_record_type {
> >> > +	/**
> >> > +	 * Samples are the work horse record type whose contents are
> >> > +	 * extensible and defined when opening an xe oa stream based on the
> >> > +	 * given properties.
> >> > +	 *
> >> > +	 * Boolean properties following the naming convention
> >> > +	 * DRM_XE_OA_SAMPLE_xyz_PROP request the inclusion of 'xyz' data in
> >> > +	 * every sample.
> >> > +	 *
> >> > +	 * The order of these sample properties given by userspace has no
> >> > +	 * affect on the ordering of data within a sample. The order is
> >> > +	 * documented here.
> >> > +	 *
> >> > +	 * struct {
> >> > +	 *     struct drm_xe_oa_record_header header;
> >> > +	 *
> >> > +	 *     { u32 oa_report[]; } && DRM_XE_OA_PROP_SAMPLE_OA
> >> > +	 * };
> >> > +	 */
> >> > +	DRM_XE_OA_RECORD_SAMPLE = 1,
> >> > +
> >> > +	/**
> >> > +	 * Indicates that one or more OA reports were not written by the
> >> > +	 * hardware. This can happen for example if an MI_REPORT_PERF_COUNT
> >> > +	 * command collides with periodic sampling - which would be more likely
> >> > +	 * at higher sampling frequencies.
> >> > +	 */
> >> > +	DRM_XE_OA_RECORD_OA_REPORT_LOST = 2,
> >> > +
> >> > +	/**
> >> > +	 * An error occurred that resulted in all pending OA reports being lost.
> >> > +	 */
> >> > +	DRM_XE_OA_RECORD_OA_BUFFER_LOST = 3,
> >> > +
> >> > +	DRM_XE_OA_RECORD_MAX /* non-ABI */
> >> > +};
> >> > +
> >> > +struct drm_xe_oa_config {
> >> > +	/**
> >> > +	 * @uuid:
> >> > +	 *
> >> > +	 * String formatted like "%\08x-%\04x-%\04x-%\04x-%\012x"
> >> > +	 */
> >> > +	char uuid[36];
> >> > +
> >> > +	/**
> >> > +	 * @n_mux_regs:
> >> > +	 *
> >> > +	 * Number of mux regs in &mux_regs_ptr.
> >> > +	 */
> >> > +	__u32 n_mux_regs;
> >> > +
> >> > +	/**
> >> > +	 * @n_boolean_regs:
> >> > +	 *
> >> > +	 * Number of boolean regs in &boolean_regs_ptr.
> >> > +	 */
> >> > +	__u32 n_boolean_regs;
> >> > +
> >> > +	/**
> >> > +	 * @n_flex_regs:
> >> > +	 *
> >> > +	 * Number of flex regs in &flex_regs_ptr.
> >> > +	 */
> >> > +	__u32 n_flex_regs;
> >> > +
> >> > +	/**
> >> > +	 * @mux_regs_ptr:
> >> > +	 *
> >> > +	 * Pointer to tuples of u32 values (register address, value) for mux
> >> > +	 * registers.  Expected length of buffer is (2 * sizeof(u32) *
> >> > +	 * &n_mux_regs).
> >> > +	 */
> >> > +	__u64 mux_regs_ptr;
> >> > +
> >> > +	/**
> >> > +	 * @boolean_regs_ptr:
> >> > +	 *
> >> > +	 * Pointer to tuples of u32 values (register address, value) for mux
> >> > +	 * registers.  Expected length of buffer is (2 * sizeof(u32) *
> >> > +	 * &n_boolean_regs).
> >> > +	 */
> >> > +	__u64 boolean_regs_ptr;
> >> > +
> >> > +	/**
> >> > +	 * @flex_regs_ptr:
> >> > +	 *
> >> > +	 * Pointer to tuples of u32 values (register address, value) for mux
> >> > +	 * registers.  Expected length of buffer is (2 * sizeof(u32) *
> >> > +	 * &n_flex_regs).
> >> > +	 */
> >> > +	__u64 flex_regs_ptr;
> >> > +};
> >>
> >> We could just have n_regs and regs_ptr for the above struct instead of
> >> breaking this down into mux, b and flex. The kernel implementation does not
> >> care because it only checks for these.
> >
> > But the kmd checks for separate addr ranges for mux, b and flex? How to
> > handle that if have just n_regs and regs_ptr?
>
> The per platform ranges will still exist, but the registers will be
> compared against all ranges (mux can be compared against b and flex
> ranges. We don't have too many ranges to be concerned about the comparison
> overhead.
>
> Well.. any changes would impact metrics scripts, so we can leave it as is.

Once again, I think we should prefer a clean uapi over the work for
userspace changes, the uapi is "forever".

> >> The metrics XML processing may need to be tweaked though. Worth it if it
> >> simplifies the KMD code.

Let's make some decisions and close on these things soon.

Thanks.
--
Ashutosh


More information about the Intel-xe mailing list