[PATCH] RFC: tests/xe_svm: basic svm test
Zeng, Oak
oak.zeng at intel.com
Mon Jan 15 19:42:03 UTC 2024
Hi Krishna,
See two more comments below.
Oak
> -----Original Message-----
> From: Zeng, Oak
> Sent: Thursday, January 11, 2024 5:27 PM
> To: Bommu, Krishnaiah <krishnaiah.bommu at intel.com>; igt-
> dev at lists.freedesktop.org; Chehab, Mauro <mauro.chehab at intel.com>
> Cc: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>
> Subject: RE: [PATCH] RFC: tests/xe_svm: basic svm test
>
>
>
> > -----Original Message-----
> > From: Bommu, Krishnaiah <krishnaiah.bommu at intel.com>
> > Sent: Thursday, January 11, 2024 4:43 AM
> > To: igt-dev at lists.freedesktop.org
> > Cc: Bommu, Krishnaiah <krishnaiah.bommu at intel.com>; Zeng, Oak
> > <oak.zeng at intel.com>; Ghimiray, Himal Prasad
> > <himal.prasad.ghimiray at intel.com>
> > Subject: [PATCH] RFC: tests/xe_svm: basic svm test
> >
> > The test verifies basic SVM functionality by allocating memory with the
> > system allocator (malloc()) and accessing it from the GPU.
> >
> > Kernel changes:
> > https://lore.kernel.org/dri-devel/20231221043812.3783313-1-oak.zeng at intel.com/
> >
> > Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu at intel.com>
> > Cc: Oak Zeng <oak.zeng at intel.com>
> > Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > ---
> > tests/intel/xe_svm.c | 136 +++++++++++++++++++++++++++++++++++++++++++
> > tests/meson.build    |   1 +
> > 2 files changed, 137 insertions(+)
> > create mode 100644 tests/intel/xe_svm.c
> >
> > diff --git a/tests/intel/xe_svm.c b/tests/intel/xe_svm.c
> > new file mode 100644
> > index 000000000..f82cb8fc7
> > --- /dev/null
> > +++ b/tests/intel/xe_svm.c
> > @@ -0,0 +1,136 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2021 Intel Corporation
>
> The copyright year should be 2024.
>
> > + */
> > +
> > +/**
> > + * TEST: Basic tests for svm functionality
> > + * Category: Hardware building block
> > + * Sub-category: svm-basic
> > + * Functionality: svm
>
> I don't know much about the category definitions for IGT tests, but I feel this
> doesn't belong under hardware building block. SVM is mainly a software feature;
> of course it depends on hardware features such as GPU recoverable page faults to work.
>
> Maybe the sub-category should be svm?
>
> @Chehab, Mauro should be able to categorize the test better.
>
> > + */
> > +
> > +#include "igt.h"
> > +#include "lib/igt_syncobj.h"
> > +#include "lib/intel_reg.h"
> > +#include "xe_drm.h"
> > +
> > +#include "xe/xe_ioctl.h"
> > +#include "xe/xe_query.h"
> > +#include <string.h>
> > +
> > +/**
> > + * SUBTEST: svm-basic
> > + * Description: Run svm basic test only once
>
> How about: Test basic SVM functionality: malloc() memory and access it directly from the GPU.
>
> > + * Test category: functionality test
> > + */
> > +
> > +static void
> > +test_exec(int fd)
> Rename the function to test_svm_basic.
>
> > +{
> > + uint32_t vm;
> > + uint64_t addr = 0x1a0000;
> > +#define USER_FENCE_VALUE 0xdeadbeefdeadbeefull
> > + struct drm_xe_sync sync[1] = {
> > + { .type = DRM_XE_SYNC_TYPE_USER_FENCE, .flags =
> > DRM_XE_SYNC_FLAG_SIGNAL,
>
> Please reformat the above two lines for better readability.
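>
> Something like this, purely a formatting suggestion (same fields and values as in your patch):
>
> 	struct drm_xe_sync sync[1] = {
> 		{ .type = DRM_XE_SYNC_TYPE_USER_FENCE,
> 		  .flags = DRM_XE_SYNC_FLAG_SIGNAL,
> 		  .timeline_value = USER_FENCE_VALUE },
> 	};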
>
> > + .timeline_value = USER_FENCE_VALUE },
> > + };
> > + struct drm_xe_exec exec = {
> > + .num_batch_buffer = 1,
> > + .num_syncs = 1,
> > + .syncs = to_user_pointer(sync),
> > + };
> > + uint32_t exec_queues;
>
> Rename to exec_queue, since only one queue is created.
> > + uint32_t bind_exec_queues;
>
> Rename to bind_exec_queue.
> > + uint64_t batch_offset;
> > + uint64_t batch_addr;
> > + size_t bo_size;
> > + uint32_t bo = 0;
> > + struct {
> > + uint32_t batch[16];
> > + uint64_t pad;
> > + uint64_t vm_sync;
>
>
> Rename vm_sync to bind_sync
>
> Actually, I think we can simplify this test by not using a user fence. The purpose of
> this test is to exercise SVM, so we could call xe_vm_bind_sync() directly instead
> of the _async variant. The _sync function internally uses a syncobj (rather than a user
> fence) for synchronization, so we wouldn't need to care about user fences in this
> test.
>
> Same for xe_exec: we could use xe_exec_wait() instead.
Please ignore my comments above. I think we should still use a user fence instead of a syncobj. The reason is that a syncobj internally uses a dma-fence, and dma-fences don't work for long-running compute work. Even though SVM is supposed to work for both long-running and short-running workloads, our priority in the first stage is long-running workloads, so let's continue to use user fences.
You also need to create the VM with the LR and FAULT flags enabled; see below.
>
> Oak
>
> > + uint64_t exec_sync;
> > + uint32_t data;
> > + } *data;
> > + int b;
> > + uint64_t dst = (uint64_t *)malloc(4);
> > +
> > + vm = xe_vm_create(fd, 0, 0);
The VM needs to be created with at least the DRM_XE_VM_CREATE_FLAG_FAULT_MODE flag; SVM depends on GPU page faults to work.

We should also set the DRM_XE_VM_CREATE_FLAG_LR_MODE flag, as we want to prioritize long-running compute workloads at this stage. Without DRM_XE_VM_CREATE_FLAG_LR_MODE we wouldn't have to use user fences, since a syncobj works for short-running workloads. But since we want to prioritize LR, let's continue to use user fences.
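
A minimal sketch of the creation call I have in mind (untested; flag names as in the current xe_drm.h uapi):

	/* Fault mode is required for SVM; LR mode because long-running
	 * compute is the priority, synchronized via user fences. */
	vm = xe_vm_create(fd, DRM_XE_VM_CREATE_FLAG_LR_MODE |
			  DRM_XE_VM_CREATE_FLAG_FAULT_MODE, 0);
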
Oak
> > + bo_size = sizeof(*data) ;
> > + bo_size = ALIGN(bo_size, xe_get_default_alignment(fd));
> > +
> > + bo = xe_bo_create(fd, 0, bo_size,
> > + all_memory_regions(fd) |
> > + vram_if_possible(fd, 0),
> > + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM);
> > +
> > + data = xe_bo_map(fd, bo, bo_size);
> > + memset(data, 0, bo_size);
> > +
> > + exec_queues = xe_exec_queue_create_class(fd, vm, DRM_XE_ENGINE_CLASS_COPY);
> > + bind_exec_queues = xe_bind_exec_queue_create(fd, vm, 0);
> > +
> > + sync[0].addr = to_user_pointer(&data[0].vm_sync);
> > +
> > + xe_vm_bind_async_flags(fd, vm, bind_exec_queues, bo, 0,
> > + addr, bo_size, sync, 1,
> > + DRM_XE_VM_BIND_FLAG_IMMEDIATE);
> > +
> > +#define ONE_SEC MS_TO_NS(1000)
> > + xe_wait_ufence(fd, &data[0].vm_sync, USER_FENCE_VALUE, bind_exec_queues, ONE_SEC);
> > + data[0].vm_sync = 0;
> > +
> > + batch_offset = (char *)&data[0].batch - (char *)data;
> > + batch_addr = addr + batch_offset;
> > +
> > + b = 0;
> > + data[0].batch[b++] = MI_STORE_DWORD_IMM_GEN4;
> > + data[0].batch[b++] = dst;
> > + data[0].batch[b++] = dst >> 32;
> > + data[0].batch[b++] = 0xc0ffee;
> > + data[0].batch[b++] = MI_BATCH_BUFFER_END;
> > + igt_assert(b <= ARRAY_SIZE(data[0].batch));
> > +
> > + sync[0].addr = addr + (char *)&data[0].exec_sync - (char *)data;
> > +
> > + exec.exec_queue_id = exec_queues;
> > + exec.address = batch_addr;
> > + xe_exec(fd, &exec);
> > +
> > + xe_wait_ufence(fd, &data[0].exec_sync, USER_FENCE_VALUE, exec_queues, ONE_SEC);
> > +
> > + sync[0].addr = to_user_pointer(&data[0].vm_sync);
> > +
> > + xe_vm_unbind_async(fd, vm, bind_exec_queues, 0, addr, bo_size, sync, 1);
> > +
> > + xe_wait_ufence(fd, &data[0].vm_sync, USER_FENCE_VALUE, bind_exec_queues, ONE_SEC);
> > +
> > + igt_assert_eq(dst, 0xc0ffee);
> > +
> > + if (exec_queues)
> > + xe_exec_queue_destroy(fd, exec_queues);
> > + if (bind_exec_queues)
> > + xe_exec_queue_destroy(fd, bind_exec_queues);
> > +
> > + free(data);
> > + free(dst);
> > + gem_close(fd, bo);
> > +
> > + xe_vm_destroy(fd, vm);
> > +}
> > +
> > +igt_main
> > +{
> > + int fd;
> > +
> > + igt_fixture {
> > + fd = drm_open_driver(DRIVER_XE);
> > + }
> > +
> > + igt_subtest_f("svm-basic")
> > + test_exec(fd);
> > +
> > + igt_fixture
> > + drm_close_driver(fd);
> > +}
> > diff --git a/tests/meson.build b/tests/meson.build
> > index a6a8498e2..13cc7acae 100644
> > --- a/tests/meson.build
> > +++ b/tests/meson.build
> > @@ -313,6 +313,7 @@ intel_xe_progs = [
> > 'xe_spin_batch',
> > 'xe_sysfs_defaults',
> > 'xe_sysfs_scheduler',
> > + 'xe_svm',
> > ]
> >
> > msm_progs = [
> > --
> > 2.25.1