[Intel-xe] [PATCH 3/3] drm/xe: Correlate engine and cpu timestamps with better accuracy
Umesh Nerlige Ramappa
umesh.nerlige.ramappa at intel.com
Sat Aug 12 00:30:18 UTC 2023
On Fri, Aug 11, 2023 at 04:32:32PM -0700, Dixit, Ashutosh wrote:
>On Fri, 04 Aug 2023 14:47:59 -0700, Umesh Nerlige Ramappa wrote:
>>
>
>Hi Umesh,
>
>> On Fri, Aug 04, 2023 at 02:32:53PM -0700, Umesh Nerlige Ramappa wrote:
>> > +/**
>> > + * struct drm_xe_query_cs_cycles - correlate CPU and GPU timestamps
>> > + *
>> > + * If a query is made with a struct drm_xe_device_query where .query
>> > + * is equal to DRM_XE_QUERY_CS_CYCLES, then the reply uses
>> > + * struct drm_xe_query_cs_cycles in .data.
>> > + *
>> > + * struct drm_xe_query_cs_cycles is allocated by the user and .data points to
>> > + * this allocated structure. The user must pass .eci and .clockid as inputs to
>> > + * this query. eci determines the engine and tile info required to fetch the
>> > + * relevant GPU timestamp. clockid is used to return the specific CPU
>> > + * timestamp.
>> > + *
>> > + * The query returns the command streamer cycles and the frequency that can
>> > + * be used to calculate the command streamer timestamp. In addition the
>> > + * query returns a set of cpu timestamps that indicate when the command
>> > + * streamer cycle count was captured.
>> > + */
>> > +struct drm_xe_query_cs_cycles {
>> > + /** Engine for which command streamer cycles is queried. */
>> > + struct drm_xe_engine_class_instance eci;
>> > +
>> > + /** MBZ (pad eci to 64 bit) */
>> > + __u16 rsvd;
>>
>> I need some inputs on the rsvd field here. Looks like struct
>> drm_xe_engine_class_instance may need padding (64 bit aligned) if used this
>> way. Is this the right way to pad it? Should the padding be moved to
>> struct drm_xe_engine_class_instance? OR should struct
>> drm_xe_engine_class_instance be packed?
>
>Does it need to be 64 bit aligned? The next field can be an unaligned u64
>field, it should be ok, x86 at least should be able to handle it.
Then compiler would potentially insert a pad between eci and the next
u64. The idea is to avoid an uninitialized padding.
>
>But yes, nice to have all struct's 64 bit aligned. IMO the right way to do
>it would be to pad 'struct drm_xe_engine_class_instance' to be 64 bit
>aligned. Because if 'struct drm_xe_engine_class_instance' changes, the
>above '__u16 rsvd' field might not work and is error prone and will also
>need to be modified. So better to pad 'struct drm_xe_engine_class_instance'
>to be 64 bit aligned.
Agree, padding drm_xe_engine_class_instance makes sense.
Thanks,
Umesh
>
>Thanks.
>--
>Ashutosh
>
>
>>
>> > +
>> > + /**
>> > + * Command streamer cycles as read from the command streamer
>> > + * register at 0x358 offset.
>> > + */
>> > + __u64 cs_cycles;
>> > +
>> > + /** Frequency of the cs cycles in Hz. */
>> > + __u64 cs_frequency;
>> > +
>> > + /**
>> > + * CPU timestamp in ns. The timestamp is captured before reading the
>> > + * cs_cycles register using the reference clockid set by the user.
>> > + */
>> > + __u64 cpu_timestamp;
>> > +
>> > + /**
>> > + * Time delta in ns captured around reading the lower dword of the
>> > + * cs_cycles register.
>> > + */
>> > + __u64 cpu_delta;
>> > +
>> > + /**
>> > + * Reference clock id for CPU timestamp. For definition, see
>> > + * clock_gettime(2) and perf_event_open(2). Supported clock ids are
>> > + * CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW, CLOCK_REALTIME, CLOCK_BOOTTIME,
>> > + * CLOCK_TAI.
>> > + */
>> > + __s32 clockid;
>> > +
>> > + /** Width of the cs cycle counter in bits. */
>> > + __u32 width;
>> > +};
>> > +
>> > /**
>> > * struct drm_xe_query_mem_usage - describe memory regions and usage
>> > *
>> > @@ -395,6 +471,7 @@ struct drm_xe_device_query {
>> > #define DRM_XE_DEVICE_QUERY_GTS 3
>> > #define DRM_XE_DEVICE_QUERY_HWCONFIG 4
>> > #define DRM_XE_DEVICE_QUERY_GT_TOPOLOGY 5
>> > +#define DRM_XE_QUERY_CS_CYCLES 6
>> > /** @query: The type of data to query */
>> > __u32 query;
>> >
>> > @@ -737,24 +814,6 @@ struct drm_xe_exec_queue_set_property {
>> > __u64 reserved[2];
>> > };
>> >
>> > -/** struct drm_xe_engine_class_instance - instance of an engine class */
>> > -struct drm_xe_engine_class_instance {
>> > -#define DRM_XE_ENGINE_CLASS_RENDER 0
>> > -#define DRM_XE_ENGINE_CLASS_COPY 1
>> > -#define DRM_XE_ENGINE_CLASS_VIDEO_DECODE 2
>> > -#define DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE 3
>> > -#define DRM_XE_ENGINE_CLASS_COMPUTE 4
>> > - /*
>> > - * Kernel only class (not actual hardware engine class). Used for
>> > - * creating ordered queues of VM bind operations.
>> > - */
>> > -#define DRM_XE_ENGINE_CLASS_VM_BIND 5
>> > - __u16 engine_class;
>> > -
>> > - __u16 engine_instance;
>> > - __u16 gt_id;
>> > -};
>> > -
>> > struct drm_xe_exec_queue_create {
>> > #define XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY 0
>> > /** @extensions: Pointer to the first extension struct, if any */
>> > --
>> > 2.38.1
>> >
More information about the Intel-xe
mailing list