[PATCH i-g-t v4 1/5] tests/gem_ctx_exec: Fail on unsuccessful preempt timeout update

Kamil Konieczny kamil.konieczny at linux.intel.com
Thu Jul 18 12:05:51 UTC 2024


Hi Janusz,
On 2024-07-18 at 10:55:12 +0200, Janusz Krzysztofik wrote:
> CI reports the following failures from basic-nohangcheck subtest:
> 
> (gem_ctx_exec:1115) CRITICAL: Test assertion failure function nohangcheck_hostile, file ../../../usr/src/igt-gpu-tools/tests/intel/gem_ctx_exec.c:374:
> (gem_ctx_exec:1115) CRITICAL: Failed assertion: err == 0
> (gem_ctx_exec:1115) CRITICAL: Last errno: 2, No such file or directory
> (gem_ctx_exec:1115) CRITICAL: Hostile unpreemptable context was not cancelled immediately upon closure
> 
> The subtest sets a 50 ms preempt timeout on each engine before proceeding
> with submission of spins, then it waits up to 1 second for those spins to
> be terminated.  However, a dump of the engines' debugfs data performed by
> the subtest after the failure shows preempt timeouts still at their default
> values: 7500 ms on rcs0 and 640 ms on engines of other classes.  Dmesg
> records confirm that preemption timeouts triggered on the other engines
> after 640 ms, but not on rcs0 within the 1 second limit.
> 
> As a first step, let the subtest check the return values of the function
> calls that are supposed to update the preempt timeouts with the new values.
> If an update fails on any engine, report that at debug level as a useful
> hint, displayed when the test times out waiting for spin termination.
> 
> v2: No changes.
> v3: Don't fail on unsuccessful update of preempt_timeout_ms, older
>     platforms don't support it but can still succeed.
> 
> Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/6268
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>

LGTM,

Reviewed-by: Kamil Konieczny <kamil.konieczny at linux.intel.com>


> ---
>  tests/intel/gem_ctx_exec.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/intel/gem_ctx_exec.c b/tests/intel/gem_ctx_exec.c
> index d6aa8ba0aa..f3e252d10e 100644
> --- a/tests/intel/gem_ctx_exec.c
> +++ b/tests/intel/gem_ctx_exec.c
> @@ -308,8 +308,7 @@ static void nohangcheck_hostile(int i915)
>  	igt_hang_t hang;
>  	int fence = -1;
>  	const intel_ctx_t *ctx;
> -	int err = 0;
> -	int dir;
> +	int dir, err;
>  	uint64_t ahnd;
>  
>  	/*
> @@ -333,8 +332,11 @@ static void nohangcheck_hostile(int i915)
>  		int new;
>  
>  		/* Set a fast hang detection for a dead context */
> -		gem_engine_property_printf(i915, e->name,
> -					   "preempt_timeout_ms", "%d", 50);
> +		err = gem_engine_property_printf(i915, e->name,
> +						 "preempt_timeout_ms", "%d", 50);
> +		igt_debug_on_f(err < 0,
> +			       "%s preempt_timeout_ms update failed: %d\n",
> +			       e->name, err);
>  
>  		spin = __igt_spin_new(i915,
>  				      .ahnd = ahnd,
> @@ -362,6 +364,7 @@ static void nohangcheck_hostile(int i915)
>  	intel_ctx_destroy(i915, ctx);
>  	igt_assert(fence != -1);
>  
> +	err = 0;
>  	if (sync_fence_wait(fence, MSEC_PER_SEC)) { /* 640ms preempt-timeout */
>  		igt_debugfs_dump(i915, "i915_engine_info");
>  		err = -ETIME;
> -- 
> 2.45.2
> 
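
For readers without an IGT tree at hand, the "log the failed update, but do
not fail the test" pattern described in the commit message can be sketched in
standalone C. This is only an illustration under stated assumptions:
set_engine_property() and shorten_preempt_timeouts() are hypothetical
stand-ins, not IGT functions, the <base>/<engine>/<attr> path layout is
assumed, and the real helper is only assumed to return a negative value on
failure, as the patch's "err < 0" check implies.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for gem_engine_property_printf(): writes an integer
 * to <base>/<engine>/<attr>. Returns the number of characters written, or a
 * negative errno-style value on failure.
 */
static int set_engine_property(const char *base, const char *engine,
			       const char *attr, int value)
{
	char path[512];
	FILE *f;
	int ret;

	ret = snprintf(path, sizeof(path), "%s/%s/%s", base, engine, attr);
	if (ret < 0 || (size_t)ret >= sizeof(path))
		return -ENAMETOOLONG;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	ret = fprintf(f, "%d", value);
	if (ret < 0)
		ret = -EIO;
	if (fclose(f) != 0 && ret >= 0)
		ret = -errno;

	return ret;
}

/*
 * Try to shorten the preempt timeout on every listed engine. A failed update
 * is only logged, never fatal, because platforms without the attribute may
 * still pass. Returns the number of engines whose update failed.
 */
static int shorten_preempt_timeouts(const char *base,
				    const char *const *engines, int count)
{
	int failed = 0;

	for (int i = 0; i < count; i++) {
		int err = set_engine_property(base, engines[i],
					      "preempt_timeout_ms", 50);

		if (err < 0) {
			/* Hint only: mirrors the debug-level message in the patch */
			fprintf(stderr,
				"%s preempt_timeout_ms update failed: %d\n",
				engines[i], err);
			failed++;
		}
	}

	return failed;
}
```

Returning a count instead of aborting on the first error keeps the hint
available for every engine, which matters here because the defaults differ
between rcs0 and the other engines.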

