[Intel-xe] [PATCH 3/3] drm/xe: Correlate engine and cpu timestamps with better accuracy
Dixit, Ashutosh
ashutosh.dixit at intel.com
Fri Aug 11 23:32:32 UTC 2023
On Fri, 04 Aug 2023 14:47:59 -0700, Umesh Nerlige Ramappa wrote:
>
Hi Umesh,
> On Fri, Aug 04, 2023 at 02:32:53PM -0700, Umesh Nerlige Ramappa wrote:
> > +/**
> > + * struct drm_xe_query_cs_cycles - correlate CPU and GPU timestamps
> > + *
> > + * If a query is made with a struct drm_xe_device_query where .query
> > + * is equal to DRM_XE_QUERY_CS_CYCLES, then the reply uses
> > + * struct drm_xe_query_cs_cycles in .data.
> > + *
> > + * struct drm_xe_query_cs_cycles is allocated by the user and .data points to
> > + * this allocated structure. The user must pass .eci and .clockid as inputs to
> > + * this query. eci determines the engine and tile info required to fetch the
> > + * relevant GPU timestamp. clockid is used to return the specific CPU
> > + * timestamp.
> > + *
> > + * The query returns the command streamer cycles and the frequency that can
> > + * be used to calculate the command streamer timestamp. In addition the
> > + * query returns a set of cpu timestamps that indicate when the command
> > + * streamer cycle count was captured.
> > + */
> > +struct drm_xe_query_cs_cycles {
> > + /** Engine for which command streamer cycles is queried. */
> > + struct drm_xe_engine_class_instance eci;
> > +
> > + /** MBZ (pad eci to 64 bit) */
> > + __u16 rsvd;
>
> I need some inputs on the rsvd field here. Looks like struct
> drm_xe_engine_class_instance may need padding (64 bit aligned) if used this
> way. Is this the right way to pad it? Should the padding be moved to
> struct drm_xe_engine_class_instance? OR should struct
> drm_xe_engine_class_instance be packed?
Does it need to be 64 bit aligned? The next field can be an unaligned u64
field, it should be ok, x86 at least should be able to handle it.
But yes, nice to have all struct's 64 bit aligned. IMO the right way to do
it would be to pad 'struct drm_xe_engine_class_instance' to be 64 bit
aligned. Because if 'struct drm_xe_engine_class_instance' changes, the
above '__u16 rsvd' field might not work and is error prone and will also
need to be modified. So better to pad 'struct drm_xe_engine_class_instance'
to be 64 bit aligned.
Thanks.
--
Ashutosh
>
> > +
> > + /**
> > + * Command streamer cycles as read from the command streamer
> > + * register at 0x358 offset.
> > + */
> > + __u64 cs_cycles;
> > +
> > + /** Frequency of the cs cycles in Hz. */
> > + __u64 cs_frequency;
> > +
> > + /**
> > + * CPU timestamp in ns. The timestamp is captured before reading the
> > + * cs_cycles register using the reference clockid set by the user.
> > + */
> > + __u64 cpu_timestamp;
> > +
> > + /**
> > + * Time delta in ns captured around reading the lower dword of the
> > + * cs_cycles register.
> > + */
> > + __u64 cpu_delta;
> > +
> > + /**
> > + * Reference clock id for CPU timestamp. For definition, see
> > + * clock_gettime(2) and perf_event_open(2). Supported clock ids are
> > + * CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW, CLOCK_REALTIME, CLOCK_BOOTTIME,
> > + * CLOCK_TAI.
> > + */
> > + __s32 clockid;
> > +
> > + /** Width of the cs cycle counter in bits. */
> > + __u32 width;
> > +};
> > +
> > /**
> > * struct drm_xe_query_mem_usage - describe memory regions and usage
> > *
> > @@ -395,6 +471,7 @@ struct drm_xe_device_query {
> > #define DRM_XE_DEVICE_QUERY_GTS 3
> > #define DRM_XE_DEVICE_QUERY_HWCONFIG 4
> > #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY 5
> > +#define DRM_XE_QUERY_CS_CYCLES 6
> > /** @query: The type of data to query */
> > __u32 query;
> >
> > @@ -737,24 +814,6 @@ struct drm_xe_exec_queue_set_property {
> > __u64 reserved[2];
> > };
> >
> > -/** struct drm_xe_engine_class_instance - instance of an engine class */
> > -struct drm_xe_engine_class_instance {
> > -#define DRM_XE_ENGINE_CLASS_RENDER 0
> > -#define DRM_XE_ENGINE_CLASS_COPY 1
> > -#define DRM_XE_ENGINE_CLASS_VIDEO_DECODE 2
> > -#define DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE 3
> > -#define DRM_XE_ENGINE_CLASS_COMPUTE 4
> > - /*
> > - * Kernel only class (not actual hardware engine class). Used for
> > - * creating ordered queues of VM bind operations.
> > - */
> > -#define DRM_XE_ENGINE_CLASS_VM_BIND 5
> > - __u16 engine_class;
> > -
> > - __u16 engine_instance;
> > - __u16 gt_id;
> > -};
> > -
> > struct drm_xe_exec_queue_create {
> > #define XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY 0
> > /** @extensions: Pointer to the first extension struct, if any */
> > --
> > 2.38.1
> >
More information about the Intel-xe
mailing list