[PATCH i-g-t] test/amdgpu: fix unknow test issue for amdgpu queue test

vitaly prosyak vprosyak at amd.com
Wed Aug 28 02:05:53 UTC 2024


Thanks for catching this! You're correct; I've reverted my request to remove the sh_mem != NULL check, since the igt_fixture isn't executed when the --list-subtests parameter is passed. I'll merge the changes tomorrow. Thanks again!


Vitaly

On 2024-08-27 22:00, Zhang, Jesse(Jie) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Vitaly,
>
> -----Original Message-----
> From: Prosyak, Vitaly <Vitaly.Prosyak at amd.com>
> Sent: Wednesday, August 28, 2024 9:51 AM
> To: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; Kamil Konieczny <kamil.konieczny at linux.intel.com>; igt-dev at lists.freedesktop.org
> Cc: Prosyak, Vitaly <Vitaly.Prosyak at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
> Subject: Re: [PATCH i-g-t] test/amdgpu: fix unknow test issue for amdgpu queue test
>
> Hi Jesse,
>
> The changes look good.
>
> Could you please remove the condition check for sh_mem? This check is redundant because we already have igt_require(sh_mem != NULL); in the igt_fixture.
>
>
> When we run sudo ./tests/amdgpu/amd_queue_reset --list-subtests, sh_mem is NULL, so set_next_test_to_skip must not be called.
>
> If the check for sh_mem is removed, the test hits a segmentation fault, like this:
>
> jenkins at image-update:~/workspace/tools/igt-gpu-tools/6code/igt-gpu-tools/build$ sudo ./tests/amdgpu/amd_queue_reset --list-subtests
> amdgpu-COMPUTE-CMD_STREAM_EXEC_INVALID_PACKET_LENGTH
> amdgpu-COMPUTE-CMD_STREAM_EXEC_INVALID_OPCODE
> amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_PROGRAM_ADDR
> amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_USER_DATA
> amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_SHADER
> amdgpu-GFX-CMD_STREAM_EXEC_INVALID_PACKET_LENGTH
> amdgpu-GFX-CMD_STREAM_EXEC_INVALID_OPCODE
> amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_PROGRAM_ADDR
> amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_USER_DATA
> amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_SHADER
> Received signal SIGSEGV.
> Stack trace:
>  #0 [fatal_sig_handler+0x17b]
>  #1 [__sigaction+0x50]
>  #2 [__igt_unique____real_main1025+0x27e]
>  #3 [main+0x2d]
>  #4 [__libc_init_first+0x90]
>  #5 [__libc_start_main+0x80]
>  #6 [_start+0x25]
> Segmentation fault
>
> Thanks
> Jesse
>
> With that adjustment, the patch is:
>
> Reviewed-by: Vitaly Prosyak <vitaly.prosyak at amd.com>
>
> Thanks
>
>
> Vitaly
>
>
>
>
> On 2024-08-27 03:54, Zhang, Jesse(Jie) wrote:
>> Hi Kamil
>>
>> -----Original Message-----
>> From: Kamil Konieczny <kamil.konieczny at linux.intel.com>
>> Sent: Tuesday, August 27, 2024 3:24 PM
>> To: igt-dev at lists.freedesktop.org
>> Cc: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; Prosyak, Vitaly
>> <Vitaly.Prosyak at amd.com>; Deucher, Alexander
>> <Alexander.Deucher at amd.com>; Koenig, Christian
>> <Christian.Koenig at amd.com>
>> Subject: Re: [PATCH i-g-t] test/amdgpu: fix unknow test issue for
>> amdgpu queue test
>>
>> Hi Jesse.zhang,
>> On 2024-08-27 at 13:19:32 +0800, Jesse.zhang at amd.com wrote:
>>> Queue reset does not exit properly when executing unknown subtests,
>>> because the other processes are still running.
>>>
>>> It should exit the other three processes (test, background, and
>>> monitor) for this case.
>>>
>>> Cc: Vitaly Prosyak <vitaly.prosyak at amd.com>
>>> Cc: Alex Deucher <alexander.deucher at amd.com>
>>> Cc: Christian Koenig <christian.koenig at amd.com>
>>>
>>> Signed-off-by: Jesse Zhang <jesse.zhang at amd.com>
>>> ---
>>>  tests/amdgpu/amd_queue_reset.c | 10 ++++++++--
>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tests/amdgpu/amd_queue_reset.c b/tests/amdgpu/amd_queue_reset.c
>>> index 60208e085..85408e3ff 100644
>>> --- a/tests/amdgpu/amd_queue_reset.c
>>> +++ b/tests/amdgpu/amd_queue_reset.c
>>> @@ -70,6 +70,7 @@ struct shmbuf {
>>>       int count;
>>>       bool sub_test_completed;
>>>       bool sub_test_is_skipped;
>>> +     bool sub_test_is_existed;
>>>       unsigned int test_flags;
>>>       int test_error_code;
>>>       bool reset_completed;
>>> @@ -148,6 +149,7 @@ skip_sub_test(struct shmbuf *sh_mem)
>>>  {
>>>       sem_wait(&sh_mem->sem_state_mutex);
>>>       sh_mem->sub_test_is_skipped = true;
>>> +     sh_mem->sub_test_is_existed = true;
>>>       sem_post(&sh_mem->sem_state_mutex);
>>>  }
>> Do you re-implement igt infra?
>>
>> Hi Kamil
>>
>> No, in the queue reset test, we start three processes (test process,
>> background process, and monitoring process) when running any test (including unknown tests, such as: sudo amd_queue_reset --run-subtest amdgpu_testxxx).
>>
>> For a known subtest, the test exits together with the other processes.
>>
>> For an unknown subtest, the test exits, but the other processes do not.
>>
>> This patch fixes that, so the other processes also exit in the unknown case.
>>
>> Regards
>> Jesse
>>
>> Regards,
>> Kamil
>>
>>> @@ -327,6 +329,7 @@ static void set_next_test_to_run(struct shmbuf *sh_mem, unsigned int error,
>>>       sh_mem->good_job.ip = ip_good;
>>>       sh_mem->good_job.ring_id = ring_id_good;
>>>       sh_mem->sub_test_is_skipped = false;
>>> +     sh_mem->sub_test_is_existed = true;
>>>       sem_post(&sh_mem->sem_state_mutex);
>>>
>>>       //sync and wait for complete
>>> @@ -405,6 +408,7 @@ shared_mem_create(struct shmbuf **ppbuf)
>>>       shmp->sub_test_completed = false;
>>>       shmp->reset_completed = false;
>>>       shmp->sub_test_is_skipped = false;
>>> +     shmp->sub_test_is_existed = false;
>>>
>>>       *ppbuf = shmp;
>>>       return shm_fd;
>>> @@ -1128,7 +1132,6 @@ igt_main
>>>                       create_contexts(device, &arr_context_handle, const_num_of_tests);
>>>               else if (process == PROCESS_BACKGROUND)
>>>                       fd_shm = shared_mem_open(&sh_mem);
>>> -
>>>               igt_require(fd_shm != -1);
>>>               igt_require(sh_mem != NULL);
>>>
>>> @@ -1136,7 +1139,6 @@ igt_main
>>>                       process, sh_mem, const_num_of_tests, info[0].hw_ip_version_major,
>>>                       &monitor_child, &test_child);
>>>       }
>>> -
>>>       for (int i = 0; i < ARRAY_SIZE(ip_tests); i++) {
>>>               reset_rings_numbers(&ring_id_good, &ring_id_bad, &ring_id_job_good, &ring_id_job_bad);
>>>               for (struct dynamic_test *it = &arr_err[0]; it->name; it++) {
>>> @@ -1154,6 +1156,10 @@ igt_main
>>>                       }
>>>               }
>>>       }
>>> +
> Please, remove
>
> sh_mem
>
>>> +     if (sh_mem &&( !sh_mem->sub_test_is_existed))
>>> +             set_next_test_to_skip(sh_mem);
>>> +
>>>       igt_fixture {
>>>               if (process == PROCESS_TEST) {
>>>                       waitpid(monitor_child, &monitorExitMethod, 0);
>>> --
>>> 2.25.1
>>>