[PATCH i-g-t v10] tests/intel/xe_exec_capture: Add xe_exec_capture test
Dong, Zhanjun
zhanjun.dong at intel.com
Wed Jan 8 23:50:36 UTC 2025
On 2025-01-08 1:23 p.m., Cavitt, Jonathan wrote:
> -----Original Message-----
> From: igt-dev <igt-dev-bounces at lists.freedesktop.org> On Behalf Of Zhanjun Dong
> Sent: Tuesday, January 7, 2025 4:04 PM
> To: igt-dev at lists.freedesktop.org
> Cc: Dong, Zhanjun <zhanjun.dong at intel.com>
> Subject: [PATCH i-g-t v10] tests/intel/xe_exec_capture: Add xe_exec_capture test
>>
>> Submit cmds to GPU to cause engine reset, check generated devcoredump
>> register dump, check against expected values or within the range.
>>
>> Signed-off-by: Zhanjun Dong <zhanjun.dong at intel.com>
>>
>> ---
>>
>> Changes from prior revs:
>> v10:- Move job timeout save/restore out of subtest, to avoid being bypassed
>> by failed assertion
>> Save/restore job timeout for each engine class
>> Remove testing on multiple GPUs, to be put back after further discussion.
>> v9:- Reduced job timeout to 2 seconds to speedup test
>> Add info print to show test is running on single/multiple GPU
>> v8:- Move change list below ---
>> v7:- Fix typo and removed unused macros
>> v6:- Adjust start_line to start from 0
>> Use 7 bit engine_cid, start with random number
>> Add ioerror detect on fgets
>> Reorgnize the regular expression
>> Remove unnecessary radom seed init
>> v5:- Detect devcoredump matches the testing engine
>> Engine will run with random cid
>> v4:- Support runs on multiple GPU
>> Load all devcoredump content to buffer
>> Alloc line buffer dynamic vs static global memory
>> Changed to igt_assert_f to provide more info if failed
>> v3:- Remove call to bash and awk
>> Add regular express parse
>> Detect devcoredump through card index
>> Add devcoredump removal check
>> v2:- Fix CI.build error
>> Add multiple GPU card support
>> ---
>> tests/intel/xe_exec_capture.c | 519 ++++++++++++++++++++++++++++++++++
>> tests/meson.build | 1 +
>> 2 files changed, 520 insertions(+)
>> create mode 100644 tests/intel/xe_exec_capture.c
>>
>> diff --git a/tests/intel/xe_exec_capture.c b/tests/intel/xe_exec_capture.c
>> new file mode 100644
>> index 000000000..b0642b406
>> --- /dev/null
>> +++ b/tests/intel/xe_exec_capture.c
>> @@ -0,0 +1,519 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2024 Intel Corporation
>> + */
>> +
>> +/**
>> + * TEST: Basic tests for GuC based register capture
>> + * Category: Core
>> + * Mega feature: General Core features
>> + * Sub-category: CMD submission
>> + * Functionality: Debug
>> + * Test category: functionality test
>> + */
>> +
>> +#include <ctype.h>
>> +#include <fcntl.h>
...
>> +
>> +igt_main
>> +{
>> + int xe;
>> + struct drm_xe_engine_class_instance *hwe;
>> + u64 timeouts[DRM_XE_ENGINE_CLASS_VM_BIND] = {0};
>> +
>> + igt_fixture {
>> + xe = drm_open_driver(DRIVER_XE);
>> + xe_for_each_engine(xe, hwe) {
>> + /* Skip kernel only classes */
>> + if (hwe->engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND)
>> + continue;
>> + /* Skip classes already set */
>> + if (timeouts[hwe->engine_class])
>> + continue;
>> + /* Save original timeout value */
>> + timeouts[hwe->engine_class] = xe_sysfs_get_job_timeout_ms(xe, hwe);
>> + /* Reduce timeout value to speedup test */
>> + xe_sysfs_set_job_timeout_ms(xe, hwe, CAPTURE_JOB_TIMEOUT);
>> +
>> + igt_debug("Reduced %s class timeout from %ld to %d\n",
>> + xe_engine_class_name(hwe->engine_class),
>> + timeouts[hwe->engine_class], CAPTURE_JOB_TIMEOUT);
>> + }
>> + }
>> +
>> + igt_subtest("reset")
>> + test_card(xe);
>> +
>> + igt_fixture {
>> + xe_for_each_engine(xe, hwe) {
>> + /* Skip kernel only classes */
>> + if (hwe->engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND)
>> + continue;
>> + /* Skip classes already set */
>> + if (timeouts[hwe->engine_class] == 0)
>> + continue;
>> + /* Restore original timeout value */
>> + xe_sysfs_set_job_timeout_ms(xe, hwe, timeouts[hwe->engine_class]);
>
> We should probably assert that we're correctly setting the timeout to the expected
> value after writing, and abort further IGT testing if the sysfs setting is improperly
> reset:
>
> """
> xe_for_each_engine(xe, hwe) {
> u64 store, timeout;
>
> /* Skip kernel only classes */
> if (hwe->engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND)
> continue;
>
> timeout = timeouts[hwe->engine_class]);
> /* Skip classes already set */
> if (!timeout)
> continue;
>
> /* Restore original timeout value */
> xe_sysfs_set_job_timeout_ms(xe, hwe, timeout);
>
> /* Assert successful restore */
> store = xe_sysfs_get_job_timeout_ms(xe, hwe);
> igt_abort_on_f(timeout != store,
> " job_timeout_ms not restored!\n");
>
> timeouts[hwe->engine_class] = 0;
> }
> """
>
> I recognize that igt_sysfs_set_u64 is internally performing an assert on the write,
> but we need to perform an abort here for the following reasons:
> 1) A failure to restore the job_timeout_ms value can cause spurious test failures
> later in IGT execution, which we need to abort to prevent.
> 2) The assert on the write only determines that the write occurred, not necessarily
> that it was entirely successful (I.E. if we attempt to write 100 ms into job_timeout_ms,
> we could get a partial write of only 10 ms or 1 ms and not catch it).
Good point, it is important to abort the IGT test if restore failed,
save us from debug on false alarm.
>
> I don't see any reason to block on this patch beyond this, however, so just add
> the necessary assert, and this is
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt
Thanks for review.
I will post next rev as recommended.
Regards,
Zhanjun Dong
>
>> + igt_debug("Restored %s class timeout to %ld\n",
>> + xe_engine_class_name(hwe->engine_class),
>> + timeouts[hwe->engine_class]);
>> +
>> + timeouts[hwe->engine_class] = 0;
>> + }
>> +
>> + drm_close_driver(xe);
>> + }
>> +}
>> diff --git a/tests/meson.build b/tests/meson.build
>> index 89bba6454..895d911f8 100644
>> --- a/tests/meson.build
>> +++ b/tests/meson.build
>> @@ -286,6 +286,7 @@ intel_xe_progs = [
>> 'xe_exec_atomic',
>> 'xe_exec_balancer',
>> 'xe_exec_basic',
>> + 'xe_exec_capture',
>> 'xe_exec_compute_mode',
>> 'xe_exec_fault_mode',
>> 'xe_exec_mix_modes',
>> --
>> 2.34.1
>>
>>
More information about the igt-dev
mailing list