[PATCH i-g-t 3/3] tests/intel/xe_query: Clarify delta engine_cycles calculation

Kamil Konieczny kamil.konieczny at linux.intel.com
Fri Oct 11 17:11:48 UTC 2024


Hi Lucas,
On 2024-10-10 at 22:05:07 -0500, Lucas De Marchi wrote:
> Let the compiler do the wrap around width instead of doing different
> calculations depending on if the second value is lower than the first.
> 
> The only reason for ts2.engine_cycles to be lower than ts1.engine_cycles
> is if a wrap around ocurred. The wrap around should happen
> wrt the width reported by the kernel. However that is not really true
> and kernel simply reports the value "as is". With Xe2, the value is
> actually 64-bits and width is currently reported as 36 (that's being
> changed in the kernel side). So, if engine_cycles would have to wrap
> around 64 bits for that condition to hold true, and in that case we'd
> calculate the delta by shifting (1 << ts2.width), giving a wrong result.
> 
> Luckly, wrapping 64 bits means at 19.2Mhz takes forever and is not a
> problem in real life.
> 
> Anyway, fix the math which also removes the ugly check by b >= a.
> Example:
> 
> 	#include <stdio.h>
> 	#include <inttypes.h>
> 
> 	int main(void)
> 	{
> 		uint64_t mask = (1ull << 36) - 1;
> 		uint64_t a, b;
> 
> 		a = mask - 1;
> 		b = mask;
> 		printf("%"PRIu64"\n", (b - a) & mask);
> 
> 		a = mask;
> 		b = (mask + 1) & mask;
> 		printf("%"PRIu64"\n", (b - a) & mask);
> 
> 		a = UINT64_MAX & mask;
> 		b = 0 & mask;

Just zero is enough.

> 		printf("%"PRIu64"\n", (b - a) & mask);

Instead of repetition here, why not call a function?

void delta(uint64_t a, uint64 b, uint64 mask) {
	printf("%"PRIu64"\n", ((b & mask) - (a & mask)) & mask);
}

	delta(mask - 1,   mask,     mask);
	delta(mask,       mask + 1, mask);
	delta(UINT64_MAX, 0,        mask);

> 
> 		return 0;
> 	}
> 
> which does the correct wrap around and prints 1 in all cases.

Do we need this example code in commit message?

With or without it

Reviewed-by: Kamil Konieczny <kamil.konieczny at linux.intel.com>

> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
>  tests/intel/xe_query.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/intel/xe_query.c b/tests/intel/xe_query.c
> index 87ddb58bb..1566680e7 100644
> --- a/tests/intel/xe_query.c
> +++ b/tests/intel/xe_query.c
> @@ -15,6 +15,7 @@
>  #include <string.h>
>  
>  #include "igt.h"
> +#include "linux_scaffold.h"
>  #include "xe_drm.h"
>  #include "xe/xe_ioctl.h"
>  #include "xe/xe_query.h"
> @@ -738,6 +739,7 @@ __engine_cycles(int fd, struct drm_xe_engine_class_instance *hwe)
>  #define NUM_SNAPSHOTS 10
>  	for (i = 0; i < NUM_SNAPSHOTS * ARRAY_SIZE(clock); i++) {
>  		int index = i / NUM_SNAPSHOTS;
> +		uint64_t width_mask;
>  
>  		ts1.eci = *hwe;
>  		ts1.clockid = clock[index].id;
> @@ -763,12 +765,12 @@ __engine_cycles(int fd, struct drm_xe_engine_class_instance *hwe)
>  
>  		delta_cpu = ts2.cpu_timestamp - ts1.cpu_timestamp;
>  
> -		if (ts2.engine_cycles >= ts1.engine_cycles)
> -			delta_cs = (ts2.engine_cycles - ts1.engine_cycles) *
> -				   NSEC_PER_SEC / eng_ref_clock;
> -		else
> -			delta_cs = (((1 << ts2.width) - ts2.engine_cycles) + ts1.engine_cycles) *
> -				   NSEC_PER_SEC / eng_ref_clock;
> +		igt_assert_eq(ts1.width, ts2.width);
> +		width_mask = GENMASK_ULL(ts1.width - 1, 0);
> +		ts1.engine_cycles &= width_mask;
> +		ts2.engine_cycles &= width_mask;
> +		delta_cs = ((ts2.engine_cycles - ts1.engine_cycles) & width_mask) *
> +			NSEC_PER_SEC / eng_ref_clock;
>  
>  		calc_freq = (ts2.engine_cycles - ts1.engine_cycles) * NSEC_PER_SEC / delta_cpu;
>  
> -- 
> 2.46.2
> 


More information about the igt-dev mailing list