[PATCH i-g-t 3/3] tests/intel/xe_query: Clarify delta engine_cycles calculation

Lucas De Marchi lucas.demarchi at intel.com
Fri Oct 11 03:05:07 UTC 2024


Let the compiler do the wrap around width instead of doing different
calculations depending on if the second value is lower than the first.

The only reason for ts2.engine_cycles to be lower than ts1.engine_cycles
is if a wrap around ocurred. The wrap around should happen
wrt the width reported by the kernel. However that is not really true
and kernel simply reports the value "as is". With Xe2, the value is
actually 64-bits and width is currently reported as 36 (that's being
changed in the kernel side). So, if engine_cycles would have to wrap
around 64 bits for that condition to hold true, and in that case we'd
calculate the delta by shifting (1 << ts2.width), giving a wrong result.

Luckly, wrapping 64 bits means at 19.2Mhz takes forever and is not a
problem in real life.

Anyway, fix the math which also removes the ugly check by b >= a.
Example:

	#include <stdio.h>
	#include <inttypes.h>

	int main(void)
	{
		uint64_t mask = (1ull << 36) - 1;
		uint64_t a, b;

		a = mask - 1;
		b = mask;
		printf("%"PRIu64"\n", (b - a) & mask);

		a = mask;
		b = (mask + 1) & mask;
		printf("%"PRIu64"\n", (b - a) & mask);

		a = UINT64_MAX & mask;
		b = 0 & mask;
		printf("%"PRIu64"\n", (b - a) & mask);

		return 0;
	}

which does the correct wrap around and prints 1 in all cases.

Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
---
 tests/intel/xe_query.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/tests/intel/xe_query.c b/tests/intel/xe_query.c
index 87ddb58bb..1566680e7 100644
--- a/tests/intel/xe_query.c
+++ b/tests/intel/xe_query.c
@@ -15,6 +15,7 @@
 #include <string.h>
 
 #include "igt.h"
+#include "linux_scaffold.h"
 #include "xe_drm.h"
 #include "xe/xe_ioctl.h"
 #include "xe/xe_query.h"
@@ -738,6 +739,7 @@ __engine_cycles(int fd, struct drm_xe_engine_class_instance *hwe)
 #define NUM_SNAPSHOTS 10
 	for (i = 0; i < NUM_SNAPSHOTS * ARRAY_SIZE(clock); i++) {
 		int index = i / NUM_SNAPSHOTS;
+		uint64_t width_mask;
 
 		ts1.eci = *hwe;
 		ts1.clockid = clock[index].id;
@@ -763,12 +765,12 @@ __engine_cycles(int fd, struct drm_xe_engine_class_instance *hwe)
 
 		delta_cpu = ts2.cpu_timestamp - ts1.cpu_timestamp;
 
-		if (ts2.engine_cycles >= ts1.engine_cycles)
-			delta_cs = (ts2.engine_cycles - ts1.engine_cycles) *
-				   NSEC_PER_SEC / eng_ref_clock;
-		else
-			delta_cs = (((1 << ts2.width) - ts2.engine_cycles) + ts1.engine_cycles) *
-				   NSEC_PER_SEC / eng_ref_clock;
+		igt_assert_eq(ts1.width, ts2.width);
+		width_mask = GENMASK_ULL(ts1.width - 1, 0);
+		ts1.engine_cycles &= width_mask;
+		ts2.engine_cycles &= width_mask;
+		delta_cs = ((ts2.engine_cycles - ts1.engine_cycles) & width_mask) *
+			NSEC_PER_SEC / eng_ref_clock;
 
 		calc_freq = (ts2.engine_cycles - ts1.engine_cycles) * NSEC_PER_SEC / delta_cpu;
 
-- 
2.46.2



More information about the igt-dev mailing list