[PATCH i-g-t v10] tests/intel/xe_exec_capture: Add xe_exec_capture test

Dong, Zhanjun zhanjun.dong at intel.com
Wed Jan 8 23:50:36 UTC 2025



On 2025-01-08 1:23 p.m., Cavitt, Jonathan wrote:
> -----Original Message-----
> From: igt-dev <igt-dev-bounces at lists.freedesktop.org> On Behalf Of Zhanjun Dong
> Sent: Tuesday, January 7, 2025 4:04 PM
> To: igt-dev at lists.freedesktop.org
> Cc: Dong, Zhanjun <zhanjun.dong at intel.com>
> Subject: [PATCH i-g-t v10] tests/intel/xe_exec_capture: Add xe_exec_capture test
>>
>> Submit cmds to GPU to cause engine reset, check generated devcoredump
>> register dump, check against expected values or within the range.
>>
>> Signed-off-by: Zhanjun Dong <zhanjun.dong at intel.com>
>>
>> ---
>>
>> Changes from prior revs:
>>   v10:- Move job timeout save/restore out of subtest, to avoid being bypassed
>>         by failed assertion
>>         Save/restore job timeout for each engine class
>>         Remove testing on multiple GPUs, to be put back after further discussion.
>>   v9:-  Reduced job timeout to 2 seconds to speedup test
>>         Add info print to show test is running on single/multiple GPU
>>   v8:-  Move change list below ---
>>   v7:-  Fix typo and removed unused macros
>>   v6:-  Adjust start_line to start from 0
>>         Use 7 bit engine_cid, start with random number
>>         Add ioerror detect on fgets
>>         Reorganize the regular expression
>>         Remove unnecessary random seed init
>>   v5:-  Detect devcoredump matches the testing engine
>>         Engine will run with random cid
>>   v4:-  Support runs on multiple GPU
>>         Load all devcoredump content to buffer
>>         Alloc line buffer dynamic vs static global memory
>>         Changed to igt_assert_f to provide more info if failed
>>   v3:-  Remove call to bash and awk
>>         Add regular expression parsing
>>         Detect devcoredump through card index
>>         Add devcoredump removal check
>>   v2:-  Fix CI.build error
>>         Add multiple GPU card support
>> ---
>>   tests/intel/xe_exec_capture.c | 519 ++++++++++++++++++++++++++++++++++
>>   tests/meson.build             |   1 +
>>   2 files changed, 520 insertions(+)
>>   create mode 100644 tests/intel/xe_exec_capture.c
>>
>> diff --git a/tests/intel/xe_exec_capture.c b/tests/intel/xe_exec_capture.c
>> new file mode 100644
>> index 000000000..b0642b406
>> --- /dev/null
>> +++ b/tests/intel/xe_exec_capture.c
>> @@ -0,0 +1,519 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2024 Intel Corporation
>> + */
>> +
>> +/**
>> + * TEST: Basic tests for GuC based register capture
>> + * Category: Core
>> + * Mega feature: General Core features
>> + * Sub-category: CMD submission
>> + * Functionality: Debug
>> + * Test category: functionality test
>> + */
>> +
>> +#include <ctype.h>
>> +#include <fcntl.h>
...

>> +
>> +igt_main
>> +{
>> +	int xe;
>> +	struct drm_xe_engine_class_instance *hwe;
>> +	u64 timeouts[DRM_XE_ENGINE_CLASS_VM_BIND] = {0};
>> +
>> +	igt_fixture {
>> +		xe = drm_open_driver(DRIVER_XE);
>> +		xe_for_each_engine(xe, hwe) {
>> +			/* Skip kernel only classes */
>> +			if (hwe->engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND)
>> +				continue;
>> +			/* Skip classes already set */
>> +			if (timeouts[hwe->engine_class])
>> +				continue;
>> +			/* Save original timeout value */
>> +			timeouts[hwe->engine_class] = xe_sysfs_get_job_timeout_ms(xe, hwe);
>> +			/* Reduce timeout value to speedup test */
>> +			xe_sysfs_set_job_timeout_ms(xe, hwe, CAPTURE_JOB_TIMEOUT);
>> +
>> +			igt_debug("Reduced %s class timeout from %ld to %d\n",
>> +				  xe_engine_class_name(hwe->engine_class),
>> +				  timeouts[hwe->engine_class], CAPTURE_JOB_TIMEOUT);
>> +		}
>> +	}
>> +
>> +	igt_subtest("reset")
>> +		test_card(xe);
>> +
>> +	igt_fixture {
>> +		xe_for_each_engine(xe, hwe) {
>> +			/* Skip kernel only classes */
>> +			if (hwe->engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND)
>> +				continue;
>> +			/* Skip classes whose timeout was never saved */
>> +			if (timeouts[hwe->engine_class] == 0)
>> +				continue;
>> +			/* Restore original timeout value */
>> +			xe_sysfs_set_job_timeout_ms(xe, hwe, timeouts[hwe->engine_class]);
> 
> We should probably assert that we're correctly setting the timeout to the expected
> value after writing, and abort further IGT testing if the sysfs setting is improperly
> reset:
> 
> """
> 		xe_for_each_engine(xe, hwe) {
> 			u64 store, timeout;
> 
> 			/* Skip kernel only classes */
> 			if (hwe->engine_class >= DRM_XE_ENGINE_CLASS_VM_BIND)
> 				continue;
> 
> 			timeout = timeouts[hwe->engine_class];
> 			/* Skip classes whose timeout was never saved */
> 			if (!timeout)
> 				continue;
> 
> 			/* Restore original timeout value */
> 			xe_sysfs_set_job_timeout_ms(xe, hwe, timeout);
> 
> 			/* Assert successful restore */
> 			store = xe_sysfs_get_job_timeout_ms(xe, hwe);
> 			igt_abort_on_f(timeout != store,
> 				       " job_timeout_ms not restored!\n");
> 
> 			timeouts[hwe->engine_class] = 0;
> 		}
> """
> 
> I recognize that igt_sysfs_set_u64 already asserts on the write internally,
> but we still need to abort here for the following reasons:
> 1) A failure to restore the job_timeout_ms value can cause spurious test failures
> later in the IGT run, and aborting prevents them.
> 2) The assert on the write only confirms that a write occurred, not necessarily
> that the full value was stored (i.e. if we attempt to write 100 ms into job_timeout_ms,
> a partial write could leave only 10 ms or 1 ms there and go uncaught).
Good point. It is important to abort the IGT run if the restore fails;
that saves us from debugging false alarms later.
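
For illustration only, the write-then-read-back idea can be sketched in
standalone C. This is not the IGT code: it uses plain stdio on an ordinary
file instead of the sysfs helpers, and the names write_u64, read_u64 and
set_and_verify_u64 are hypothetical. The caller decides what to do when
verification fails (in the test, that would be the abort).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write a decimal u64 to a file. */
static bool write_u64(const char *path, uint64_t value)
{
	FILE *f = fopen(path, "w");
	bool ok;

	if (!f)
		return false;

	ok = fprintf(f, "%" PRIu64 "\n", value) > 0;

	/* fclose() flushes, so a failed flush also counts as a failed write. */
	return (fclose(f) == 0) && ok;
}

/* Hypothetical helper: read a decimal u64 back from a file. */
static bool read_u64(const char *path, uint64_t *value)
{
	FILE *f = fopen(path, "r");
	bool ok;

	if (!f)
		return false;

	ok = fscanf(f, "%" SCNu64, value) == 1;
	fclose(f);

	return ok;
}

/*
 * Write the value, then read it back and compare. A write that "succeeds"
 * but stores a different value is reported as a failure.
 */
static bool set_and_verify_u64(const char *path, uint64_t value)
{
	uint64_t readback;

	assert(path != NULL);

	if (!write_u64(path, value))
		return false;
	if (!read_u64(path, &readback))
		return false;

	return readback == value;
}
```

A caller would treat a false return from set_and_verify_u64() as the
condition to abort on, rather than relying on the write status alone.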

> 
> I don't see any reason to block on this patch beyond this, however, so just add
> the necessary assert, and this is
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> -Jonathan Cavitt

Thanks for the review.

I will post the next revision as recommended.

Regards,
Zhanjun Dong

> 
>> +			igt_debug("Restored %s class timeout to %ld\n",
>> +				  xe_engine_class_name(hwe->engine_class),
>> +				  timeouts[hwe->engine_class]);
>> +
>> +			timeouts[hwe->engine_class] = 0;
>> +		}
>> +
>> +		drm_close_driver(xe);
>> +	}
>> +}
>> diff --git a/tests/meson.build b/tests/meson.build
>> index 89bba6454..895d911f8 100644
>> --- a/tests/meson.build
>> +++ b/tests/meson.build
>> @@ -286,6 +286,7 @@ intel_xe_progs = [
>>   	'xe_exec_atomic',
>>   	'xe_exec_balancer',
>>   	'xe_exec_basic',
>> +	'xe_exec_capture',
>>   	'xe_exec_compute_mode',
>>   	'xe_exec_fault_mode',
>>   	'xe_exec_mix_modes',
>> -- 
>> 2.34.1
>>
>>


