<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Hi Kamil, <br>
</p>
<div class="moz-cite-prefix">On 6/20/2024 4:20 PM, Kamil Konieczny
wrote:<br>
</div>
<blockquote type="cite" cite="mid:20240620142007.ywhmpjokrnnpbhnv@kamilkon-DESK.igk.intel.com">
<pre class="moz-quote-pre" wrap="">Hi Nirmoy,
On 2024-06-18 at 19:32:47 +0200, Nirmoy Das wrote:
[PATCH i-g-t v2] tests/intel/xe_exec_store: Add basic_inst_benchmark
-----------------------------------------------------^
You should use '-', also your subtest has 'store', so:
[PATCH i-g-t v2] tests/intel/xe_exec_store: Add basic-store-benchmark</pre>
</blockquote>
<p>Will do that.</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:20240620142007.ywhmpjokrnnpbhnv@kamilkon-DESK.igk.intel.com">
<pre class="moz-quote-pre" wrap="">
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Add basic_inst_benchmark to benchmark this basic operation
for various BO sizes, to get a basic understanding of how long it
takes to bind a BO and run a simple GPU command on it.
This is not a CI test but rather a tool for developers to identify
bottlenecks/regressions in BO binding.
Signed-off-by: Nirmoy Das <a class="moz-txt-link-rfc2396E" href="mailto:nirmoy.das@intel.com">&lt;nirmoy.das@intel.com&gt;</a>
---
tests/intel/xe_exec_store.c | 110 ++++++++++++++++++++++++++++++------
1 file changed, 92 insertions(+), 18 deletions(-)
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
This didn't work on my ATS-M machine:
sudo build/tests/xe_exec_store --r basic-store-benchmark
IGT-Version: 1.28-g64414e4da (x86_64) (Linux: 6.9.0-rc6-xe-public-513ea833c201+ x86_64)
Using IGT_SRANDOM=1718890917 for randomisation
Opened device: /dev/dri/card0
Starting subtest: basic-store-benchmark
Starting dynamic subtest: WC
Dynamic subtest WC: SUCCESS (0.000s)
(xe_exec_store:15881) xe/xe_ioctl-CRITICAL: Test assertion failure function xe_bo_create_caching, file ../lib/xe/xe_ioctl.c:311:
(xe_exec_store:15881) xe/xe_ioctl-CRITICAL: Failed assertion: __xe_bo_create_caching(fd, vm, size, placement, flags, cpu_caching, &handle) == 0
(xe_exec_store:15881) xe/xe_ioctl-CRITICAL: Last errno: 22, Invalid argument
(xe_exec_store:15881) xe/xe_ioctl-CRITICAL: error: -1 != 0</pre>
</blockquote>
<p>I guess WC caching is not allowed on dGFX; dmesg should tell us.
I will check and fix it. Thanks for trying it out.</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:20240620142007.ywhmpjokrnnpbhnv@kamilkon-DESK.igk.intel.com">
<pre class="moz-quote-pre" wrap="">
Also, where are benchmarks? Do you print them only after
WC and WB runs? </pre>
</blockquote>
It should look something like this: <br>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">sudo ~/igt-gpu-tools/build/tests/xe_exec_store
--run basic-store-benchmark</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">IGT-Version:
1.28-g2ed908c0b (x86_64) (Linux: 6.10.0-rc2-xe+ x86_64)</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Using
IGT_SRANDOM=1718739607 for randomisation</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Opened
device: /dev/dri/card0</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Starting
subtest: basic-store-benchmark</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Starting
dynamic subtest: WC</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Dynamic
subtest WC: SUCCESS (0.000s)</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_4K: 8 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_2M: 7 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_64M: 15 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_128M: 16 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_256M: 33 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_1G: 110 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Starting
dynamic subtest: WB</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Dynamic
subtest WB: SUCCESS (0.000s)</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_4K: 4 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_2M: 5 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_64M: 16 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_128M: 29 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_256M: 49 ms</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">Time
taken for size SZ_1G: 176 ms</p>
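<p>For reference, the patch takes these timings with gettimeofday(). A minimal sketch of the same measurement with a monotonic clock, which is not affected by wall-clock adjustments, could look like this. elapsed_ms() and time_call_ms() are hypothetical helper names, not part of IGT or of the patch:</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Milliseconds between two samples, rounded to the nearest millisecond. */
static int64_t elapsed_ms(const struct timespec *start,
			  const struct timespec *end)
{
	int64_t ns = (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL +
		     (end->tv_nsec - start->tv_nsec);

	return (ns + 500000) / 1000000;
}

/* Time a single call of fn(arg) using CLOCK_MONOTONIC. */
static int64_t time_call_ms(void (*fn)(void *), void *arg)
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	fn(arg);
	clock_gettime(CLOCK_MONOTONIC, &end);

	return elapsed_ms(&start, &end);
}
```

<p>In the benchmark loop, the call being timed would be basic_inst_size() for each BO size; wrapping it in a small callback is one option, calling clock_gettime() directly around it is another.</p>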
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US"><br>
</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US"><br>
</p>
<p style="margin:0in;font-family:Calibri;font-size:11.0pt" lang="en-US">By the way, I realized that --dyn doesn't work: the test
always executes WC, even with --dyn. Any ideas?<br>
</p>
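<p>One possible explanation, offered as a guess rather than a confirmed diagnosis: the patch ends the igt_dynamic_f() line with a semicolon, while the existing basic-all subtest uses it as a block macro with the body following directly. If the macro guards only the next statement, the semicolon becomes an empty body and the code after it runs regardless of the filter, which would also fit the "SUCCESS (0.000s)" line being printed before the timings. The toy below reproduces that effect in plain C. TOY_DYNAMIC, dyn_selected() and dyn_filter are made-up stand-ins, not IGT's real macro or API:</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the --dyn filter; NULL means "run everything". */
static const char *dyn_filter;

static int dyn_selected(const char *name)
{
	return dyn_filter == NULL || strcmp(name, dyn_filter) == 0;
}

/*
 * Toy block macro (an assumption, not IGT's definition): the statement
 * that follows runs once, and only when the name passes the filter.
 */
#define TOY_DYNAMIC(name) \
	for (int once_ = dyn_selected(name); once_; once_ = 0)

/* Mirrors the shape used in the patch: macro line ends with ';'. */
static int run_with_semicolon(const char *filter)
{
	const char *names[] = { "WC", "WB" };
	int bodies = 0;

	dyn_filter = filter;
	for (int i = 0; i < 2; i++) {
		TOY_DYNAMIC(names[i]);	/* stray ';' is the whole guarded body */
		bodies++;		/* not guarded: runs on every iteration */
	}
	return bodies;
}

/* Same loop with the body wrapped in braces after the macro. */
static int run_with_braces(const char *filter)
{
	const char *names[] = { "WC", "WB" };
	int bodies = 0;

	dyn_filter = filter;
	for (int i = 0; i < 2; i++) {
		TOY_DYNAMIC(names[i]) {
			bodies++;	/* guarded: runs only for selected names */
		}
	}
	return bodies;
}
```

<p>With the filter set to "WB", run_with_semicolon() still executes the body for both entries, while run_with_braces() executes it only for WB. If the same applies to the real macro, wrapping the xe_engine() and basic_inst_benchmark() calls in braces after igt_dynamic_f() would be worth trying.</p>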
<blockquote type="cite" cite="mid:20240620142007.ywhmpjokrnnpbhnv@kamilkon-DESK.igk.intel.com">
<pre class="moz-quote-pre" wrap="">See also one nit below.
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">diff --git a/tests/intel/xe_exec_store.c b/tests/intel/xe_exec_store.c
index c872c22d5..4b1c619e0 100644
--- a/tests/intel/xe_exec_store.c
+++ b/tests/intel/xe_exec_store.c
@@ -93,15 +93,10 @@ static void persistance_batch(struct data *data, uint64_t addr)
data->addr = batch_addr;
}
-/**
- * SUBTEST: basic-store
- * Description: Basic test to verify store dword.
- * SUBTEST: basic-cond-batch
- * Description: Basic test to verify cond batch end instruction.
- * SUBTEST: basic-all
- * Description: Test to verify store dword on all available engines.
- */
-static void basic_inst(int fd, int inst_type, struct drm_xe_engine_class_instance *eci)
+
+static void basic_inst_size(int fd, int inst_type,
+ struct drm_xe_engine_class_instance *eci,
+ uint16_t cpu_caching, size_t bo_size)
{
struct drm_xe_sync sync[2] = {
{ .type = DRM_XE_SYNC_TYPE_SYNCOBJ, .flags = DRM_XE_SYNC_FLAG_SIGNAL, },
@@ -117,7 +112,6 @@ static void basic_inst(int fd, int inst_type, struct drm_xe_engine_class_instanc
uint32_t exec_queue;
uint32_t bind_engine;
uint32_t syncobj;
- size_t bo_size;
int value = 0x123456;
uint64_t addr = 0x100000;
uint32_t bo = 0;
@@ -127,12 +121,16 @@ static void basic_inst(int fd, int inst_type, struct drm_xe_engine_class_instanc
sync[1].handle = syncobj;
vm = xe_vm_create(fd, 0, 0);
- bo_size = sizeof(*data);
- bo_size = xe_bb_size(fd, bo_size);
- bo = xe_bo_create(fd, vm, bo_size,
- vram_if_possible(fd, eci->gt_id),
- DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
+ if (cpu_caching)
+ bo = xe_bo_create_caching(fd, vm, bo_size,
+ vram_if_possible(fd, eci->gt_id),
+ DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM,
+ cpu_caching);
+ else
+ bo = xe_bo_create(fd, vm, bo_size,
+ vram_if_possible(fd, eci->gt_id),
+ DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
exec_queue = xe_exec_queue_create(fd, vm, eci, 0);
bind_engine = xe_bind_exec_queue_create(fd, vm, 0);
@@ -167,6 +165,66 @@ static void basic_inst(int fd, int inst_type, struct drm_xe_engine_class_instanc
xe_vm_destroy(fd, vm);
}
+
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Remove extra line.</pre>
</blockquote>
<p>Will remove it.</p>
<p><br>
</p>
<p>Thanks!</p>
<p>Nirmoy<br>
</p>
<blockquote type="cite" cite="mid:20240620142007.ywhmpjokrnnpbhnv@kamilkon-DESK.igk.intel.com">
<pre class="moz-quote-pre" wrap="">
Regards,
Kamil
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">+/**
+ * SUBTEST: basic-store
+ * Description: Basic test to verify store dword.
+ * SUBTEST: basic-cond-batch
+ * Description: Basic test to verify cond batch end instruction.
+ * SUBTEST: basic-all
+ * Description: Test to verify store dword on all available engines.
+ */
+static void basic_inst(int fd, int inst_type,
+ struct drm_xe_engine_class_instance *eci,
+ uint16_t cpu_caching)
+{
+ size_t bo_size;
+
+ bo_size = sizeof(struct data);
+ bo_size = xe_bb_size(fd, bo_size);
+
+ basic_inst_size(fd, inst_type, eci, cpu_caching, bo_size);
+}
+
+/**
+ * SUBTEST: basic-store-benchmark
+ * Description: Basic test to verify time taken for doing store dword with various size.
+ */
+static void basic_inst_benchmark(int fd, int inst_type,
+ struct drm_xe_engine_class_instance *eci,
+ uint16_t cpu_caching)
+{
+ struct {
+ size_t size;
+ const char *name;
+ } sizes[] = {
+ {SZ_4K, "SZ_4K"},
+ {SZ_2M, "SZ_2M"},
+ {SZ_64M, "SZ_64M"},
+ {SZ_128M, "SZ_128M"},
+ {SZ_256M, "SZ_256M"},
+ {SZ_1G, "SZ_1G"}
+ };
+
+ struct timeval start, end;
+ long seconds, useconds, mtime;
+
+ for (size_t i = 0; i < ARRAY_SIZE(sizes); ++i) {
+ size_t bo_size = sizes[i].size;
+ const char *size_name = sizes[i].name;
+
+ gettimeofday(&start, NULL);
+ basic_inst_size(fd, inst_type, eci, cpu_caching, bo_size);
+ gettimeofday(&end, NULL);
+
+ seconds = end.tv_sec - start.tv_sec;
+ useconds = end.tv_usec - start.tv_usec;
+ mtime = ((seconds) * 1000 + useconds / 1000.0) + 0.5;
+
+ igt_info("Time taken for size %s: %ld ms\n", size_name, mtime);
+ }
+}
+
#define PAGES 1
#define NCACHELINES (4096/64)
/**
@@ -342,12 +400,28 @@ igt_main
igt_subtest("basic-store") {
engine = xe_engine(fd, 1);
- basic_inst(fd, STORE, &engine->instance);
+ basic_inst(fd, STORE, &engine->instance, 0);
+ }
+
+ igt_subtest_with_dynamic("basic-store-benchmark") {
+ struct dyn {
+ const char *name;
+ int cache;
+ } tests[] = {
+ {"WC", DRM_XE_GEM_CPU_CACHING_WC},
+ {"WB", DRM_XE_GEM_CPU_CACHING_WB}
+ };
+
+ for (int i = 0; i < ARRAY_SIZE(tests); i++) {
+ igt_dynamic_f("%s", tests[i].name);
+ engine = xe_engine(fd, 1);
+ basic_inst_benchmark(fd, STORE, &engine->instance, tests[i].cache);
+ }
}
igt_subtest("basic-cond-batch") {
engine = xe_engine(fd, 1);
- basic_inst(fd, COND_BATCH, &engine->instance);
+ basic_inst(fd, COND_BATCH, &engine->instance, 0);
}
igt_subtest_with_dynamic("basic-all") {
@@ -356,7 +430,7 @@ igt_main
xe_engine_class_string(hwe->engine_class),
hwe->engine_instance,
hwe->gt_id);
- basic_inst(fd, STORE, hwe);
+ basic_inst(fd, STORE, hwe, 0);
}
}
--
2.42.0
</pre>
</blockquote>
</blockquote>
</body>
</html>