[PATCH i-g-t] test/amdgpu: fix unknow test issue for amdgpu queue test
Zhang, Jesse(Jie)
Jesse.Zhang at amd.com
Wed Aug 28 02:00:46 UTC 2024
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Vitaly,
-----Original Message-----
From: Prosyak, Vitaly <Vitaly.Prosyak at amd.com>
Sent: Wednesday, August 28, 2024 9:51 AM
To: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; Kamil Konieczny <kamil.konieczny at linux.intel.com>; igt-dev at lists.freedesktop.org
Cc: Prosyak, Vitaly <Vitaly.Prosyak at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
Subject: Re: [PATCH i-g-t] test/amdgpu: fix unknow test issue for amdgpu queue test
Hi Jesse,
The changes look good.
Could you please remove the condition check for sh_mem? This check is redundant because we already have igt_require(sh_mem != NULL); in the igt_fixture.
when we run sudo ./tests/amdgpu/amd_queue_reset --list-subtests, the sh_mem is NULL, and it should not call set_next_test_to_skip.
if remove the check for sh_mem, it will have segmentation fault, like this:
jenkins at image-update:~/workspace/tools/igt-gpu-tools/6code/igt-gpu-tools/build$ sudo ./tests/amdgpu/amd_queue_reset --list-subtests
amdgpu-COMPUTE-CMD_STREAM_EXEC_INVALID_PACKET_LENGTH
amdgpu-COMPUTE-CMD_STREAM_EXEC_INVALID_OPCODE
amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_PROGRAM_ADDR
amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_USER_DATA
amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_SHADER
amdgpu-GFX-CMD_STREAM_EXEC_INVALID_PACKET_LENGTH
amdgpu-GFX-CMD_STREAM_EXEC_INVALID_OPCODE
amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_PROGRAM_ADDR
amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_USER_DATA
amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_SHADER
Received signal SIGSEGV.
Stack trace:
#0 [fatal_sig_handler+0x17b]
#1 [__sigaction+0x50]
#2 [__igt_unique____real_main1025+0x27e]
#3 [main+0x2d]
#4 [__libc_init_first+0x90]
#5 [__libc_start_main+0x80]
#6 [_start+0x25]
Segmentation fault
Thanks
Jesse
With that adjustment, the patch is:
Reviewed-by: Vitaly Prosyak <vitaly.prosyak at amd.com>
Thanks
Vitaly
On 2024-08-27 03:54, Zhang, Jesse(Jie) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Kamil
>
> -----Original Message-----
> From: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> Sent: Tuesday, August 27, 2024 3:24 PM
> To: igt-dev at lists.freedesktop.org
> Cc: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; Prosyak, Vitaly
> <Vitaly.Prosyak at amd.com>; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Koenig, Christian
> <Christian.Koenig at amd.com>
> Subject: Re: [PATCH i-g-t] test/amdgpu: fix unknow test issue for
> amdgpu queue test
>
> Hi Jesse.zhang,
> On 2024-08-27 at 13:19:32 +0800, Jesse.zhang at amd.com wrote:
>> Queue reset does not exit properly when executing unknown subtests.
>> Because other processes are still functioning.
>>
>> It should exit the other three processes (test, background, and
>> monitor) for this case.
>>
>> Cc: Vitaly Prosyak <vitaly.prosyak at amd.com>
>> Cc: Alex Deucher <alexander.deucher at amd.com>
>> Cc: Christian Koenig <christian.koenig at amd.com>
>>
>> Signed-off-by: Jesse Zhang <jesse.zhang at amd.com>
>> ---
>> tests/amdgpu/amd_queue_reset.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/amdgpu/amd_queue_reset.c
>> b/tests/amdgpu/amd_queue_reset.c index 60208e085..85408e3ff 100644
>> --- a/tests/amdgpu/amd_queue_reset.c
>> +++ b/tests/amdgpu/amd_queue_reset.c
>> @@ -70,6 +70,7 @@ struct shmbuf {
>> int count;
>> bool sub_test_completed;
>> bool sub_test_is_skipped;
>> + bool sub_test_is_existed;
>> unsigned int test_flags;
>> int test_error_code;
>> bool reset_completed;
>> @@ -148,6 +149,7 @@ skip_sub_test(struct shmbuf *sh_mem) {
>> sem_wait(&sh_mem->sem_state_mutex);
>> sh_mem->sub_test_is_skipped = true;
>> + sh_mem->sub_test_is_existed = true;
>> sem_post(&sh_mem->sem_state_mutex);
>> }
> Do you re-implement igt infra?
>
> Hi Kamil
>
> No, in the queue reset test, we start three processes (test process,
> background process, and monitoring process) when running any test (including unknown tests, such as such as: sudo amd_queue_reset --run-subtest amdgpu_testxxx).
>
> The known process can exit with the other three processes.
>
> The unknown process can exit, but the other processes will not exit.
>
> This patch fixes the issue of other processes exiting in the unknown case.
>
> Regards
> Jesse
>
> Regards,
> Kamil
>
>> @@ -327,6 +329,7 @@ static void set_next_test_to_run(struct shmbuf *sh_mem, unsigned int error,
>> sh_mem->good_job.ip = ip_good;
>> sh_mem->good_job.ring_id = ring_id_good;
>> sh_mem->sub_test_is_skipped = false;
>> + sh_mem->sub_test_is_existed = true;
>> sem_post(&sh_mem->sem_state_mutex);
>>
>> //sync and wait for complete
>> @@ -405,6 +408,7 @@ shared_mem_create(struct shmbuf **ppbuf)
>> shmp->sub_test_completed = false;
>> shmp->reset_completed = false;
>> shmp->sub_test_is_skipped = false;
>> + shmp->sub_test_is_existed = false;
>>
>> *ppbuf = shmp;
>> return shm_fd;
>> @@ -1128,7 +1132,6 @@ igt_main
>> create_contexts(device, &arr_context_handle, const_num_of_tests);
>> else if (process == PROCESS_BACKGROUND)
>> fd_shm = shared_mem_open(&sh_mem);
>> -
>> igt_require(fd_shm != -1);
>> igt_require(sh_mem != NULL);
>>
>> @@ -1136,7 +1139,6 @@ igt_main
>> process, sh_mem, const_num_of_tests, info[0].hw_ip_version_major,
>> &monitor_child, &test_child);
>> }
>> -
>> for (int i = 0; i < ARRAY_SIZE(ip_tests); i++) {
>> reset_rings_numbers(&ring_id_good, &ring_id_bad, &ring_id_job_good, &ring_id_job_bad);
>> for (struct dynamic_test *it = &arr_err[0]; it->name;
>> it++) { @@
>> -1154,6 +1156,10 @@ igt_main
>> }
>> }
>> }
>> +
Please, remove
sh_mem
>> + if (sh_mem &&( !sh_mem->sub_test_is_existed))
>> + set_next_test_to_skip(sh_mem);
>> +
>> igt_fixture {
>> if (process == PROCESS_TEST) {
>> waitpid(monitor_child, &monitorExitMethod, 0);
>> --
>> 2.25.1
>>
More information about the igt-dev
mailing list