[PATCH i-g-t 1/2] tests/amdgpu: userq skip waiting for signal fence

Fri May 16 14:20:27 UTC 2025

[AMD Official Use Only - AMD Internal Distribution Only]

Hi Sunil,

Assert will be triggered if the test fails. If the test passes, then it will be reported as pass.

Fwm packet preemption should not skip fences, this is tested here.

Thank you,
Yogesh

________________________________
From: Khatri, Sunil <Sunil.Khatri at amd.com>
Sent: Friday, May 16, 2025 4:08 PM
To: Mohan Marimuthu, Yogesh <Yogesh.Mohanmarimuthu at amd.com>; Khatri, Sunil <Sunil.Khatri at amd.com>; igt-dev at lists.freedesktop.org <igt-dev at lists.freedesktop.org>
Cc: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>
Subject: Re: [PATCH i-g-t 1/2] tests/amdgpu: userq skip waiting for signal fence



On 5/16/2025 3:47 PM, Khatri, Sunil wrote:


On 5/16/2025 3:33 PM, Mohan Marimuthu, Yogesh wrote:

[AMD Official Use Only - AMD Internal Distribution Only]

Hi Sunil,

I did not test with true as logically the code flow does not change if skip_signal is true.
The indentation is off because of the editor I had used. I will fix in the next patch update.

If you are planning to use this in a new test case then we should handle/declare the test as success on failure as this is a negative test case. So this needs to be handled while writing a new test case.

Also here i am not sure what we are achieving with this test case. What is observed here till now is if we skip the amdgpu_cs_syncobj_wait function then test always fail because when we check the memory the values expected are not written in memory. cpu execution is faster that gpu writing in memory and we read the memory before gpu has written it and we declare the test as fail.

If this is a negative test case where expectation is failure but and assert will be triggered here and test will be failed eventually but for negative test the test eventually should be reported as pass. That part isnt handled here.


@Jesse

Have you validated with a different timeout value because when i wrote it at first place i observed that with some decent timeout values we get timeout hit before we get the signal due to fence signal and thats why we gave this big value to make sure the value is written as per expectations.


Regards
Sunil Khatri


Regards
Sunil Khatri

Thank you,
Yogesh

________________________________
From: Khatri, Sunil <Sunil.Khatri at amd.com><mailto:Sunil.Khatri at amd.com>
Sent: Friday, May 16, 2025 11:12 AM
To: Mohan Marimuthu, Yogesh <Yogesh.Mohanmarimuthu at amd.com><mailto:Yogesh.Mohanmarimuthu at amd.com>; igt-dev at lists.freedesktop.org<mailto:igt-dev at lists.freedesktop.org> <igt-dev at lists.freedesktop.org><mailto:igt-dev at lists.freedesktop.org>
Subject: Re: [PATCH i-g-t 1/2] tests/amdgpu: userq skip waiting for signal fence


@yogesh

Functionally code looks good to me but i have a question,

Have you validated the any test case with the skip flag set to true. Test should not fail with that as its a negative test.
How the igt test work here is that when you return from the amdgpu_user_queue_submit function the next function checks the value written in the memory and test might fail if the values dont match.


Also, Indentation seems little off to me.


On 5/15/2025 12:13 PM, Mohan Marimuthu, Yogesh wrote:

[Public]


[Public]

For negative test cases where the job will not complete need to
skip signal fence. Pass a flag to amdgpu_user_queue_submit() to
skip signal fence wait.

Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu at amd.com><mailto:yogesh.mohanmarimuthu at amd.com>
---
 lib/amdgpu/amd_command_submission.c |  2 +-
 lib/amdgpu/amd_compute.c            |  2 +-
 lib/amdgpu/amd_userq.c              | 36 +++++++++++++++--------------
 lib/amdgpu/amd_userq.h              |  2 +-
 tests/amdgpu/amd_basic.c            |  4 ++--
 tests/amdgpu/amd_cs_nop.c           |  2 +-
 6 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/lib/amdgpu/amd_command_submission.c b/lib/amdgpu/amd_command_submission.c
index 80d03a498..74091da5a 100644
--- a/lib/amdgpu/amd_command_submission.c
+++ b/lib/amdgpu/amd_command_submission.c
@@ -68,7 +68,7 @@ int amdgpu_test_exec_cs_helper(amdgpu_device_handle device, unsigned int ip_type
      memcpy(ring_ptr, ring_context->pm4, ring_context->pm4_dw * sizeof(*ring_context->pm4));

      if (user_queue)
-           amdgpu_user_queue_submit(device, ring_context, ip_type, ib_result_mc_address);
+           amdgpu_user_queue_submit(device, ring_context, ip_type, ib_result_mc_address, false);
Indentation here
      else {
            ring_context->ib_info.ib_mc_address = ib_result_mc_address;
            ring_context->ib_info.size = ring_context->pm4_dw;
diff --git a/lib/amdgpu/amd_compute.c b/lib/amdgpu/amd_compute.c
index 95bfa53aa..008186049 100644
--- a/lib/amdgpu/amd_compute.c
+++ b/lib/amdgpu/amd_compute.c
@@ -91,7 +91,7 @@ void amdgpu_command_submission_compute_nop(amdgpu_device_handle device, bool use

            if (user_queue) {
                  amdgpu_user_queue_submit(device, ring_context, AMD_IP_COMPUTE,
-                                    ib_result_mc_address);
+                                    ib_result_mc_address, false);
            } else {
                  memset(&ib_info, 0, sizeof(struct amdgpu_cs_ib_info));
                  ib_info.ib_mc_address = ib_result_mc_address;
diff --git a/lib/amdgpu/amd_userq.c b/lib/amdgpu/amd_userq.c
index 50d058609..727df8222 100644
--- a/lib/amdgpu/amd_userq.c
+++ b/lib/amdgpu/amd_userq.c
@@ -127,7 +127,7 @@ int amdgpu_timeline_syncobj_wait(amdgpu_device_handle device_handle,
 }

 void amdgpu_user_queue_submit(amdgpu_device_handle device, struct amdgpu_ring_context *ring_context,
-                       unsigned int ip_type, uint64_t mc_address)
+                       unsigned int ip_type, uint64_t mc_address, bool skip_signal)
Indentation here too.
 {
      int r;
      uint32_t control = ring_context->pm4_dw;
@@ -166,22 +166,24 @@ void amdgpu_user_queue_submit(amdgpu_device_handle device, struct amdgpu_ring_co
      /* Update the door bell */
      ring_context->doorbell_cpu[DOORBELL_INDEX] = *ring_context->wptr_cpu;

-     /* Add a fence packet for signal */
-     syncarray[0] = ring_context->timeline_syncobj_handle;
-     signal_data.queue_id = ring_context->queue_id;
-     signal_data.syncobj_handles = (uintptr_t)syncarray;
-     signal_data.num_syncobj_handles = 1;
-     signal_data.bo_read_handles = 0;
-     signal_data.bo_write_handles = 0;
-     signal_data.num_bo_read_handles = 0;
-     signal_data.num_bo_write_handles = 0;
-
-     r = amdgpu_userq_signal(device, &signal_data);
-     igt_assert_eq(r, 0);
+     if (!skip_signal) {
+           /* Add a fence packet for signal */
+           syncarray[0] = ring_context->timeline_syncobj_handle;
+           signal_data.queue_id = ring_context->queue_id;
+           signal_data.syncobj_handles = (uintptr_t)syncarray;
+           signal_data.num_syncobj_handles = 1;
+           signal_data.bo_read_handles = 0;
+           signal_data.bo_write_handles = 0;
+           signal_data.num_bo_read_handles = 0;
+           signal_data.num_bo_write_handles = 0;
+
+           r = amdgpu_userq_signal(device, &signal_data);
+           igt_assert_eq(r, 0);

-     r = amdgpu_cs_syncobj_wait(device, &ring_context->timeline_syncobj_handle, 1, INT64_MAX,
-                          DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL, NULL);
-     igt_assert_eq(r, 0);
+           r = amdgpu_cs_syncobj_wait(device, &ring_context->timeline_syncobj_handle, 1, INT64_MAX,
+                                                                 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL, NULL);
Indentation.
+           igt_assert_eq(r, 0);
+     }
 }

 void amdgpu_user_queue_destroy(amdgpu_device_handle device_handle, struct amdgpu_ring_context *ctxt,
@@ -456,7 +458,7 @@ int amdgpu_timeline_syncobj_wait(amdgpu_device_handle device_handle,
 }

 void amdgpu_user_queue_submit(amdgpu_device_handle device, struct amdgpu_ring_context *ring_context,
-     unsigned int ip_type, uint64_t mc_address)
+     unsigned int ip_type, uint64_t mc_address, bool skip_signal)
 {
 }

diff --git a/lib/amdgpu/amd_userq.h b/lib/amdgpu/amd_userq.h
index b29e97ccf..dc39c1ca4 100644
--- a/lib/amdgpu/amd_userq.h
+++ b/lib/amdgpu/amd_userq.h
@@ -50,6 +50,6 @@ void amdgpu_user_queue_destroy(amdgpu_device_handle device_handle, struct amdgpu
                         unsigned int ip_type);

 void amdgpu_user_queue_submit(amdgpu_device_handle device, struct amdgpu_ring_context *ring_context,
-                       unsigned int ip_type, uint64_t mc_address);
+                       unsigned int ip_type, uint64_t mc_address, bool skip_signal);

 #endif
diff --git a/tests/amdgpu/amd_basic.c b/tests/amdgpu/amd_basic.c
index 97a08a9a3..914d27909 100644
--- a/tests/amdgpu/amd_basic.c
+++ b/tests/amdgpu/amd_basic.c
@@ -607,7 +607,7 @@ amdgpu_sync_dependency_test(amdgpu_device_handle device_handle, bool user_queue)
      if (user_queue) {
            ring_context->pm4_dw = ib_info.size;
            amdgpu_user_queue_submit(device_handle, ring_context, ip_block->type,
-                              ib_result_mc_address);
+                              ib_result_mc_address, false);
      } else {
            r = amdgpu_cs_submit(context_handle[1], 0, &ibs_request, 1);
      }
@@ -647,7 +647,7 @@ amdgpu_sync_dependency_test(amdgpu_device_handle device_handle, bool user_queue)
      if (user_queue) {
            ring_context->pm4_dw = ib_info.size;
            amdgpu_user_queue_submit(device_handle, ring_context, ip_block->type,
-                             ib_info.ib_mc_address);
+                              ib_info.ib_mc_address, false);
      } else {
            r = amdgpu_cs_submit(context_handle[0], 0, &ibs_request, 1);
            igt_assert_eq(r, 0);
diff --git a/tests/amdgpu/amd_cs_nop.c b/tests/amdgpu/amd_cs_nop.c
index 268bc9201..658c8d050 100644
--- a/tests/amdgpu/amd_cs_nop.c
+++ b/tests/amdgpu/amd_cs_nop.c
@@ -108,7 +108,7 @@ static void nop_cs(amdgpu_device_handle device,
                  if (user_queue) {
                        ring_context->pm4_dw = ib_info.size;
                        amdgpu_user_queue_submit(device, ring_context, ip_type,
-                                          ib_info.ib_mc_address);
+                                          ib_info.ib_mc_address, false);
Not sure if its the mail editor or what i see its shifted by one. But make sure indentation is correct and you run checkpatch.pl.
                        igt_assert_eq(r, 0);
                  } else {
                        r = amdgpu_cs_submit(context, 0, &ibs_request, 1);
--
2.43.0


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/igt-dev/attachments/20250516/a2f8bccb/attachment-0001.htm>