[PATCH i-g-t v2 05/10] tests/intel/xe_drm_fdinfo: Add single engine tests

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Thu Jul 11 19:13:22 UTC 2024


On Thu, Jul 11, 2024 at 08:20:17AM -0500, Lucas De Marchi wrote:
>On Tue, Jul 02, 2024 at 05:25:27PM GMT, Umesh Nerlige Ramappa wrote:
>>Add simple tests that submit work to one engine and measure utilization
>>per class.
>>
>>v2:
>>- Drop measured_usleep since return value is not used
>>- s/parallel engines/parallel submission/ in comment
>>- Use NSEC_PER_SEC for batch_duration_ns
>>- Percent should not be > 100
>
>but as we chatted later in the review, it's actually possible to have it
>greater than 100, so I think we need to leave some room
>
>>- Check utilization for both clients for isolation case
>>
>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>---
>>tests/intel/xe_drm_fdinfo.c | 97 +++++++++++++++++++++++++++++++++++++
>>1 file changed, 97 insertions(+)
>>
>>diff --git a/tests/intel/xe_drm_fdinfo.c b/tests/intel/xe_drm_fdinfo.c
>>index 037f25e53..410c885e7 100644
>>--- a/tests/intel/xe_drm_fdinfo.c
>>+++ b/tests/intel/xe_drm_fdinfo.c
>>@@ -28,6 +28,15 @@
>> * SUBTEST: basic-engine-utilization
>> * Description: Check if basic fdinfo content is present for engine utilization
>> *
>>+ * SUBTEST: drm-idle
>>+ * Description: Check that engines show no load when idle
>>+ *
>>+ * SUBTEST: drm-busy-idle
>>+ * Description: Check that engines show load when idle after busy
>>+ *
>>+ * SUBTEST: drm-busy-idle-isolation
>>+ * Description: Check that engine load does not spill over to other drm clients
>>+ *
>> * SUBTEST: drm-total-resident
>> * Description: Create and compare total and resident memory consumption by client
>> *
>>@@ -42,11 +51,18 @@ IGT_TEST_DESCRIPTION("Read and verify drm client memory consumption and engine u
>>
>>#define BO_SIZE (65536)
>>
>>+/* flag masks */
>>+#define TEST_BUSY		(1 << 0)
>>+#define TEST_TRAILING_IDLE	(1 << 1)
>>+#define TEST_ISOLATION		(1 << 2)
>>+
>>struct pceu_cycles {
>>	uint64_t cycles;
>>	uint64_t total_cycles;
>>};
>>
>>+const unsigned long batch_duration_ns = (1 * NSEC_PER_SEC) / 2;
>>+
>>static const char *engine_map[] = {
>>	"rcs",
>>	"bcs",
>>@@ -464,8 +480,77 @@ spin_ctx_destroy(int fd, struct spin_ctx *ctx)
>>	free(ctx);
>>}
>>
>>+static void
>>+check_results(struct pceu_cycles *s1, struct pceu_cycles *s2,
>>+	      int class, int width, unsigned int flags)
>>+{
>>+	double percent;
>>+
>>+	igt_debug("%s: sample 1: cycles %lu, total_cycles %lu\n",
>>+		  engine_map[class], s1[class].cycles, s1[class].total_cycles);
>>+	igt_debug("%s: sample 2: cycles %lu, total_cycles %lu\n",
>>+		  engine_map[class], s2[class].cycles, s2[class].total_cycles);
>>+
>>+	percent = ((s2[class].cycles - s1[class].cycles) * 100) /
>>+		  ((s2[class].total_cycles + 1) - s1[class].total_cycles);
>>+
>>+	/* for parallel submission scale the busyness with width */
>>+	percent = percent / width;
>>+
>>+	igt_debug("%s: percent: %f\n", engine_map[class], percent);
>>+
>>+	if (flags & TEST_BUSY)
>>+		igt_assert(percent >= 95 && percent <= 100);
>
><= 105 like you had before seems good, otherwise we may have too much
>noise in CI.  What are the typical numbers you're seeing?

I see 101 whenever it fails. Also I see it only when I run batches on 
all classes (all_busy_all/most_busy_all etc.). The difference in 
numerator and denominator deltas is close to 5ms though. Not convinced 
that we should see 100 still because we are ending the spinners (and 
waiting for fences and all) before taking sample 2. Still debugging a 
little more on this one.

>
>>+	else
>>+		igt_assert(!percent);
>>+}
>>+
>>+static void
>>+single(int fd, struct drm_xe_engine_class_instance *hwe, int width, int count,
>>+       unsigned int flags)
>>+{
>>+	struct pceu_cycles pceu1[2][DRM_XE_ENGINE_CLASS_COMPUTE + 1];
>>+	struct pceu_cycles pceu2[2][DRM_XE_ENGINE_CLASS_COMPUTE + 1];
>
>nit: either name it pceu_start/pceu_end:
>
>pceu_start[0] is the pceu for client0 on start
>pceu_start[1] is the pceu for client1 on start
>pceu_end[0] is the pceu for client0 on end
>pceu_end[1] is the pceu for client1 on end
>
>or pceu_client0/pceu_client1 if grouping by client
>
>pceu_client0[0] is the pceu for client0 on start
>pceu_client0[1] is the pceu for client0 on end
>pceu_client1[1] is the pceu for client1 on start
>pceu_client1[2] is the pceu for client1 on end
>
>... this makes it less prone to typos.
>
>But see below as I think 2 samples are not sufficient.
>
>>+	struct spin_ctx *ctx = NULL;
>>+	uint32_t vm;
>>+	int new_fd;
>>+
>>+	if (flags & TEST_ISOLATION)
>>+		new_fd = drm_reopen_driver(fd);
>>+
>>+	vm = xe_vm_create(fd, 0, 0);
>>+	if (flags & TEST_BUSY) {
>>+		ctx = spin_ctx_init(fd, hwe, vm, width, count);
>>+		spin_sync_start(fd, ctx);
>>+	}
>>+
>>+	read_engine_cycles(fd, pceu1[0]);
>>+	if (flags & TEST_ISOLATION)
>>+		read_engine_cycles(new_fd, pceu1[1]);
>>+
>>+	usleep(batch_duration_ns / 1000);
>
>NSEC_PER_USEC
>
>>+	if (flags & TEST_TRAILING_IDLE)
>>+		spin_sync_end(fd, ctx);
>>+
>>+	read_engine_cycles(fd, pceu2[0]);
>
>... and here we have another source of noise.... when doing
>TEST_TRAILING_IDLE, should we read the first sample after the spin
>ended, sleep, and read the second sample?

That would always make utilization = 0% (numerator delta is 0). We 
should take sample 1 when the spinner has started and sample 2 when it 
has ended (for PCEU).

TEST_TRAILING_IDLE is only valid when TEST_BUSY is also specified.  
Maybe I should enforce that in the test. This is the flow for common 
flag configs:

1) flag = TEST_BUSY | TEST_TRAILING_IDLE
- wait till spinner starts
- take sample 1
- usleep
- end the batch
- take sample 2.
- check_results (expected 100% util)

2) flag = TEST_BUSY
- wait till spinner starts
- take sample 1
- usleep
- take sample 2.
- end the batch
- check_results (expected 100% util)

3) flag = 0 means truly IDLE since there is no spinner.

>
>but then how are we testing all at the same time here? I can't see any
>igt_assert() for the trailing idle test and it will actually impact the
>busy test since it will wait the spin to end.... Am I missing anything
>here?

Case (2) above does not exist for PCEU tests because CTX_TIMESTAMP is 
only updates when the context switches out, which means we must end the 
spinner before taking the second sample (or preempt it, but ending is 
simpler). This series only has tests for case (1) and (3) implemented.

Case (2) is applicable for a completely different feature which is still 
wip - per engine utilization via kernel perf framework.

>I think we need to add a third sample when testing trailing idle
>and ensure:
>
>considering the indexing as sample[client][idx]
>
>	sample[0][1] - sample[0][0] == 100% if BUSY else 0%
>	sample[0][2] - sample[0][1] == 0%  -> idle on trailing test
>
>and for isolation:
>
>	sample[1][1] - sample[0][0] == 0%
>	sample[1][2] - sample[1][1] == 0%
>
>>+	if (flags & TEST_ISOLATION)
>>+		read_engine_cycles(new_fd, pceu2[1]);
>>+
>>+	check_results(pceu1[0], pceu2[0], hwe->engine_class, width, flags);

This check_results will check for TEST_BUSY case.

>>+
>>+	if (flags & TEST_ISOLATION) {
>>+		check_results(pceu1[1], pceu2[1], hwe->engine_class, width, 0);

This check_results will check for idle case (flags = 0).

>>+		close(new_fd);
>>+	}
>>+
>>+	spin_sync_end(fd, ctx);
>>+	spin_ctx_destroy(fd, ctx);
>
>if !BUSY you are not even calling spin_ctx_init()

:) yeah, sorry about the readability here. The spin_sync_end and 
spin_ctx_destroy bail out if the ctx is NULL which is the case for 
!BUSY. I can add and explicit check to not call these instead.

Thanks,
Umesh
>
>Lucas De Marchi
>
>>+	xe_vm_destroy(fd, vm);
>>+}
>>+
>>igt_main
>>{
>>+	struct drm_xe_engine_class_instance *hwe;
>>	int xe;
>>
>>	igt_fixture {
>>@@ -484,6 +569,18 @@ igt_main
>>	igt_subtest("basic-engine-utilization")
>>		basic_engine_utilization(xe);
>>
>>+	igt_subtest("drm-idle")
>>+		xe_for_each_engine(xe, hwe)
>>+			single(xe, hwe, 1, 1, 0);
>>+
>>+	igt_subtest("drm-busy-idle")
>>+		xe_for_each_engine(xe, hwe)
>>+			single(xe, hwe, 1, 1, TEST_BUSY | TEST_TRAILING_IDLE);
>>+
>>+	igt_subtest("drm-busy-idle-isolation")
>>+		xe_for_each_engine(xe, hwe)
>>+			single(xe, hwe, 1, 1, TEST_BUSY | TEST_TRAILING_IDLE | TEST_ISOLATION);
>>+
>>	igt_describe("Create and compare total and resident memory consumption by client");
>>	igt_subtest("drm-total-resident")
>>		test_total_resident(xe);
>>-- 
>>2.38.1
>>


More information about the igt-dev mailing list