[PATCH v9] tests/intel/xe_exec_capture: Add xe_exec_capture test

Teres Alexis, Alan Previn alan.previn.teres.alexis at intel.com
Wed Dec 11 22:08:38 UTC 2024


Re-adding my Reviewed-by after the latest change, which manually lowers the job timeout on each engine
before running the tests on that engine, in order to limit the execution time of this test:

Reviewed-by: Alan Previn <alan.previn.teres.alexis at intel.com>


On Fri, 2024-12-06 at 14:59 -0800, Dong, Zhanjun wrote:
> Submit cmds to the GPU that result in a GuC engine reset and check that
> devcoredump register dump is generated, by the GuC, and includes the
> full register range.
> 
> Signed-off-by: Zhanjun Dong <zhanjun.dong at intel.com>
> Cc: Alan Previn <alan.previn.teres.alexis at intel.com>
> Cc: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> ---
> Changes from prior revs:
>  v9:-  Reduced job timeout to 2 seconds to speedup test
>        Add info print to show test is running on single/multiple GPU
>  v8:-  Move change list below ---


alan: I just reviewed the difference between the last two revs (diff of diffs
farther below). We hope that change addresses Kamil's concern by reducing the
execution time dramatically. IIRC Zhanjun couldn't designate any subtest to declare
pass or fail without running on multiple engines back to back, since the test needs
to ensure that XE-KMD catches the correct guc-error-dump for the exact batch on the
exact engine we expect, amidst multiple back-to-back runs of
different-batches-same-engine vs different-engines. (The test uses the ring buffer
batch buffer address as a way to differentiate and determine this precisely.)
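
To make the address-matching idea concrete, here is a minimal standalone sketch of
how a per-engine batch address can be built and later matched against a dump. It
mirrors the expression quoted in the diff below, but the constant values and the
helper names (make_batch_addr, addr_to_cid, dump_matches) are assumptions for
illustration only and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed values, for illustration only; the real ones are defined in the test. */
#define BASE_ADDRESS		0x1a0000ULL
#define CID_ADDRESS_MASK	0x7FULL
#define ADDRESS_SHIFT		40

/* Build a per-engine batch address; the context id sits in the bits from ADDRESS_SHIFT up. */
static uint64_t make_batch_addr(uint32_t engine_cid)
{
	return BASE_ADDRESS | ((uint64_t)(engine_cid % CID_ADDRESS_MASK) << ADDRESS_SHIFT);
}

/* Recover the context id from an address read back out of a dump. */
static uint32_t addr_to_cid(uint64_t addr)
{
	return (uint32_t)((addr >> ADDRESS_SHIFT) & CID_ADDRESS_MASK);
}

/* Nonzero if the dumped address belongs to the batch submitted with this context id. */
static int dump_matches(uint64_t dumped_addr, uint32_t engine_cid)
{
	return dumped_addr == make_batch_addr(engine_cid);
}
```

Because the quoted expression reduces the counter with `%` rather than `&`, the
encoded ids run from 0 to CID_ADDRESS_MASK - 1 before wrapping.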


28a29
> +#include "igt_sysfs.h"
37a39,40
> +#define CAPTURE_JOB_TIMEOUT		2000
> +#define JOB_TIMOUT_ENTRY		"job_timeout_ms"
83a87,109
> +static u64
> +xe_sysfs_get_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci)
> +{
> +	int engine_fd = -1;
> +	u64 ret;
> +
> +	engine_fd = xe_sysfs_engine_open(fd, eci->gt_id, eci->engine_class);
> +	ret = igt_sysfs_get_u64(engine_fd, JOB_TIMOUT_ENTRY);
> +	close(engine_fd);
> +
> +	return ret;
> +}
> +
> +static void xe_sysfs_set_job_timeout_ms(int fd, struct drm_xe_engine_class_instance *eci,
> +					u64 timeout)
> +{
> +	int engine_fd = -1;
> +
> +	engine_fd = xe_sysfs_engine_open(fd, eci->gt_id, eci->engine_class);
> +	igt_sysfs_set_u64(engine_fd, JOB_TIMOUT_ENTRY, timeout);
> +	close(engine_fd);
> +}
> +

...

> 	xe_for_each_engine(fd, hwe) {
> 		/*
> 		 * To test devcoredump register data, the test batch address is
> 		 * used to compare with the dump. Address bits 40 to 46 act as
> 		 * the context id, which starts at a random number and increases
> 		 * by 1 per engine. This way, the address is unique for each
> 		 * engine and starts at a random number on each run.
> 		 */
> 		const u64 addr = BASE_ADDRESS | ((u64)(engine_cid++ % CID_ADDRESS_MASK) <<
> 						 ADDRESS_SHIFT);
413a440
> +		u64 job_timeout = xe_sysfs_get_job_timeout_ms(fd, hwe);
417a445,447
> +		/* Reduce timeout value to speedup test */
> +		xe_sysfs_set_job_timeout_ms(fd, hwe, CAPTURE_JOB_TIMEOUT);
> +
419a450,452
> +		/* Restore timeout value */
> +		xe_sysfs_set_job_timeout_ms(fd, hwe, job_timeout);
> +
460a494,495
> +			igt_info("Running test on multiple GPU\n");
> +
473a509
> +			igt_info("Running test on single GPU\n");
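
The save / lower / restore sequence in the hunks above can be sketched in
isolation as follows. This is only an illustration under stated assumptions: it
uses plain stdio on an arbitrary file path instead of the igt_sysfs helpers, and
the names read_u64, write_u64, lower_timeout and restore_timeout are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Read a decimal u64 from a file; returns 0 on success, -1 on failure. */
static int read_u64(const char *path, uint64_t *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%" SCNu64, val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Write a decimal u64 to a file; returns 0 on success, -1 on failure. */
static int write_u64(const char *path, uint64_t val)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%" PRIu64 "\n", val) > 0;
	/* fclose flushes the buffer; a failed flush means the write did not land. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Save the current value into *saved, then write the lowered one. */
static int lower_timeout(const char *path, uint64_t lowered, uint64_t *saved)
{
	if (read_u64(path, saved))
		return -1;
	return write_u64(path, lowered);
}

/* Write back the value that was saved before lowering. */
static int restore_timeout(const char *path, uint64_t saved)
{
	return write_u64(path, saved);
}
```

The restore step only has an effect if the setter writes the value it is passed,
so the saved timeout has to reach the final write unchanged.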


