[PATCH i-g-t] test/amdgpu: fix unknow test issue for amdgpu queue test

Zhang, Jesse(Jie) Jesse.Zhang at amd.com
Wed Aug 28 02:00:46 UTC 2024


[AMD Official Use Only - AMD Internal Distribution Only]

Hi Vitaly,

-----Original Message-----
From: Prosyak, Vitaly <Vitaly.Prosyak at amd.com>
Sent: Wednesday, August 28, 2024 9:51 AM
To: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; Kamil Konieczny <kamil.konieczny at linux.intel.com>; igt-dev at lists.freedesktop.org
Cc: Prosyak, Vitaly <Vitaly.Prosyak at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
Subject: Re: [PATCH i-g-t] test/amdgpu: fix unknow test issue for amdgpu queue test

Hi Jesse,

The changes look good.

Could you please remove the condition check for sh_mem? This check is redundant because we already have igt_require(sh_mem != NULL); in the igt_fixture.


When we run sudo ./tests/amdgpu/amd_queue_reset --list-subtests, sh_mem is NULL, so set_next_test_to_skip must not be called.

If the check for sh_mem is removed, the run ends in a segmentation fault, like this:

jenkins at image-update:~/workspace/tools/igt-gpu-tools/6code/igt-gpu-tools/build$ sudo ./tests/amdgpu/amd_queue_reset --list-subtests
amdgpu-COMPUTE-CMD_STREAM_EXEC_INVALID_PACKET_LENGTH
amdgpu-COMPUTE-CMD_STREAM_EXEC_INVALID_OPCODE
amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_PROGRAM_ADDR
amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_USER_DATA
amdgpu-COMPUTE-BACKEND_SE_GC_SHADER_INVALID_SHADER
amdgpu-GFX-CMD_STREAM_EXEC_INVALID_PACKET_LENGTH
amdgpu-GFX-CMD_STREAM_EXEC_INVALID_OPCODE
amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_PROGRAM_ADDR
amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_USER_DATA
amdgpu-GFX-BACKEND_SE_GC_SHADER_INVALID_SHADER
Received signal SIGSEGV.
Stack trace:
 #0 [fatal_sig_handler+0x17b]
 #1 [__sigaction+0x50]
 #2 [__igt_unique____real_main1025+0x27e]
 #3 [main+0x2d]
 #4 [__libc_init_first+0x90]
 #5 [__libc_start_main+0x80]
 #6 [_start+0x25]
Segmentation fault

Thanks
Jesse

With that adjustment, the patch is:

Reviewed-by: Vitaly Prosyak <vitaly.prosyak at amd.com>

Thanks


Vitaly




On 2024-08-27 03:54, Zhang, Jesse(Jie) wrote:
> Hi Kamil
>
> -----Original Message-----
> From: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> Sent: Tuesday, August 27, 2024 3:24 PM
> To: igt-dev at lists.freedesktop.org
> Cc: Zhang, Jesse(Jie) <Jesse.Zhang at amd.com>; Prosyak, Vitaly
> <Vitaly.Prosyak at amd.com>; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Koenig, Christian
> <Christian.Koenig at amd.com>
> Subject: Re: [PATCH i-g-t] test/amdgpu: fix unknow test issue for
> amdgpu queue test
>
> Hi Jesse.zhang,
> On 2024-08-27 at 13:19:32 +0800, Jesse.zhang at amd.com wrote:
>> Queue reset does not exit properly when executing unknown subtests,
>> because the other processes are still running.
>>
>> It should exit the other three processes (test, background, and
>> monitor) for this case.
>>
>> Cc: Vitaly Prosyak <vitaly.prosyak at amd.com>
>> Cc: Alex Deucher <alexander.deucher at amd.com>
>> Cc: Christian Koenig <christian.koenig at amd.com>
>>
>> Signed-off-by: Jesse Zhang <jesse.zhang at amd.com>
>> ---
>>  tests/amdgpu/amd_queue_reset.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/amdgpu/amd_queue_reset.c b/tests/amdgpu/amd_queue_reset.c
>> index 60208e085..85408e3ff 100644
>> --- a/tests/amdgpu/amd_queue_reset.c
>> +++ b/tests/amdgpu/amd_queue_reset.c
>> @@ -70,6 +70,7 @@ struct shmbuf {
>>       int count;
>>       bool sub_test_completed;
>>       bool sub_test_is_skipped;
>> +     bool sub_test_is_existed;
>>       unsigned int test_flags;
>>       int test_error_code;
>>       bool reset_completed;
>> @@ -148,6 +149,7 @@ skip_sub_test(struct shmbuf *sh_mem)
>>  {
>>       sem_wait(&sh_mem->sem_state_mutex);
>>       sh_mem->sub_test_is_skipped = true;
>> +     sh_mem->sub_test_is_existed = true;
>>       sem_post(&sh_mem->sem_state_mutex);
>>  }
> Are you re-implementing the igt infrastructure?
>
> Hi Kamil
>
> No. In the queue reset test, we start three processes (test process,
> background process, and monitoring process) when running any test, including unknown tests such as: sudo amd_queue_reset --run-subtest amdgpu_testxxx.
>
> For a known subtest, the process running it exits along with the other processes.
>
> For an unknown subtest, that process exits, but the other processes do not.
>
> This patch makes the other processes exit in the unknown-subtest case as well.
>
> Regards
> Jesse
>
> Regards,
> Kamil
>
>> @@ -327,6 +329,7 @@ static void set_next_test_to_run(struct shmbuf *sh_mem, unsigned int error,
>>       sh_mem->good_job.ip = ip_good;
>>       sh_mem->good_job.ring_id = ring_id_good;
>>       sh_mem->sub_test_is_skipped = false;
>> +     sh_mem->sub_test_is_existed = true;
>>       sem_post(&sh_mem->sem_state_mutex);
>>
>>       //sync and wait for complete
>> @@ -405,6 +408,7 @@ shared_mem_create(struct shmbuf **ppbuf)
>>       shmp->sub_test_completed = false;
>>       shmp->reset_completed = false;
>>       shmp->sub_test_is_skipped = false;
>> +     shmp->sub_test_is_existed = false;
>>
>>       *ppbuf = shmp;
>>       return shm_fd;
>> @@ -1128,7 +1132,6 @@ igt_main
>>                       create_contexts(device, &arr_context_handle, const_num_of_tests);
>>               else if (process == PROCESS_BACKGROUND)
>>                       fd_shm = shared_mem_open(&sh_mem);
>> -
>>               igt_require(fd_shm != -1);
>>               igt_require(sh_mem != NULL);
>>
>> @@ -1136,7 +1139,6 @@ igt_main
>>                       process, sh_mem, const_num_of_tests, info[0].hw_ip_version_major,
>>                       &monitor_child, &test_child);
>>       }
>> -
>>       for (int i = 0; i < ARRAY_SIZE(ip_tests); i++) {
>>               reset_rings_numbers(&ring_id_good, &ring_id_bad, &ring_id_job_good, &ring_id_job_bad);
>>               for (struct dynamic_test *it = &arr_err[0]; it->name; it++) {
>> @@ -1154,6 +1156,10 @@ igt_main
>>                       }
>>               }
>>       }
>> +
Please remove the sh_mem check:

>> +     if (sh_mem &&( !sh_mem->sub_test_is_existed))
>> +             set_next_test_to_skip(sh_mem);
>> +
>>       igt_fixture {
>>               if (process == PROCESS_TEST) {
>>                       waitpid(monitor_child, &monitorExitMethod, 0);
>> --
>> 2.25.1
>>

