[Intel-xe] [PATCH 3/3] drm/xe: Correlate engine and cpu timestamps with better accuracy

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Sat Aug 12 00:30:18 UTC 2023


On Fri, Aug 11, 2023 at 04:32:32PM -0700, Dixit, Ashutosh wrote:
>On Fri, 04 Aug 2023 14:47:59 -0700, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>> On Fri, Aug 04, 2023 at 02:32:53PM -0700, Umesh Nerlige Ramappa wrote:
>> > +/**
>> > + * struct drm_xe_query_cs_cycles - correlate CPU and GPU timestamps
>> > + *
>> > + * If a query is made with a struct drm_xe_device_query where .query
>> > + * is equal to DRM_XE_QUERY_CS_CYCLES, then the reply uses
>> > + * struct drm_xe_query_cs_cycles in .data.
>> > + *
>> > + * struct drm_xe_query_cs_cycles is allocated by the user and .data points to
>> > + * this allocated structure. The user must pass .eci and .clockid as inputs to
>> > + * this query. eci determines the engine and tile info required to fetch the
>> > + * relevant GPU timestamp. clockid is used to return the specific CPU
>> > + * timestamp.
>> > + *
>> > + * The query returns the command streamer cycles and the frequency that can
>> > + * be used to calculate the command streamer timestamp. In addition the
>> > + * query returns a set of cpu timestamps that indicate when the command
>> > + * streamer cycle count was captured.
>> > + */
>> > +struct drm_xe_query_cs_cycles {
>> > +	/** Engine for which command streamer cycles is queried. */
>> > +	struct drm_xe_engine_class_instance eci;
>> > +
>> > +	/** MBZ (pad eci to 64 bit) */
>> > +	__u16 rsvd;
>>
>> I need some inputs on the rsvd field here. Looks like struct
>> drm_xe_engine_class_instance may need padding (64 bit aligned) if used this
>> way.  Is this the right way to pad it? Should the padding be moved to
>> struct drm_xe_engine_class_instance? OR should struct
>> drm_xe_engine_class_instance be packed?
>
>Does it need to be 64 bit aligned? The next field can be an unaligned u64
>field, it should be ok, x86 at least should be able to handle it.

Then compiler would potentially insert a pad between eci and the next 
u64. The idea is to avoid an uninitialized padding.

>
>But yes, nice to have all struct's 64 bit aligned. IMO the right way to do
>it would be to pad 'struct drm_xe_engine_class_instance' to be 64 bit
>aligned. Because if 'struct drm_xe_engine_class_instance' changes, the
>above '__u16 rsvd' field might not work and is error prone and will also
>need to be modified. So better to pad 'struct drm_xe_engine_class_instance'
>to be 64 bit aligned.

Agree, padding drm_xe_engine_class_instance makes sense.

Thanks,
Umesh

>
>Thanks.
>--
>Ashutosh
>
>
>>
>> > +
>> > +	/**
>> > +	 * Command streamer cycles as read from the command streamer
>> > +	 * register at 0x358 offset.
>> > +	 */
>> > +	__u64 cs_cycles;
>> > +
>> > +	/** Frequency of the cs cycles in Hz. */
>> > +	__u64 cs_frequency;
>> > +
>> > +	/**
>> > +	 * CPU timestamp in ns. The timestamp is captured before reading the
>> > +	 * cs_cycles register using the reference clockid set by the user.
>> > +	 */
>> > +	__u64 cpu_timestamp;
>> > +
>> > +	/**
>> > +	 * Time delta in ns captured around reading the lower dword of the
>> > +	 * cs_cycles register.
>> > +	 */
>> > +	__u64 cpu_delta;
>> > +
>> > +	/**
>> > +	 * Reference clock id for CPU timestamp. For definition, see
>> > +	 * clock_gettime(2) and perf_event_open(2). Supported clock ids are
>> > +	 * CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW, CLOCK_REALTIME, CLOCK_BOOTTIME,
>> > +	 * CLOCK_TAI.
>> > +	 */
>> > +	__s32 clockid;
>> > +
>> > +	/** Width of the cs cycle counter in bits. */
>> > +	__u32 width;
>> > +};
>> > +
>> > /**
>> >  * struct drm_xe_query_mem_usage - describe memory regions and usage
>> >  *
>> > @@ -395,6 +471,7 @@ struct drm_xe_device_query {
>> > #define DRM_XE_DEVICE_QUERY_GTS		3
>> > #define DRM_XE_DEVICE_QUERY_HWCONFIG	4
>> > #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY	5
>> > +#define DRM_XE_QUERY_CS_CYCLES		6
>> >	/** @query: The type of data to query */
>> >	__u32 query;
>> >
>> > @@ -737,24 +814,6 @@ struct drm_xe_exec_queue_set_property {
>> >	__u64 reserved[2];
>> > };
>> >
>> > -/** struct drm_xe_engine_class_instance - instance of an engine class */
>> > -struct drm_xe_engine_class_instance {
>> > -#define DRM_XE_ENGINE_CLASS_RENDER		0
>> > -#define DRM_XE_ENGINE_CLASS_COPY		1
>> > -#define DRM_XE_ENGINE_CLASS_VIDEO_DECODE	2
>> > -#define DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE	3
>> > -#define DRM_XE_ENGINE_CLASS_COMPUTE		4
>> > -	/*
>> > -	 * Kernel only class (not actual hardware engine class). Used for
>> > -	 * creating ordered queues of VM bind operations.
>> > -	 */
>> > -#define DRM_XE_ENGINE_CLASS_VM_BIND		5
>> > -	__u16 engine_class;
>> > -
>> > -	__u16 engine_instance;
>> > -	__u16 gt_id;
>> > -};
>> > -
>> > struct drm_xe_exec_queue_create {
>> > #define XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY               0
>> >	/** @extensions: Pointer to the first extension struct, if any */
>> > --
>> > 2.38.1
>> >


More information about the Intel-xe mailing list