[PATCH i-g-t v4 1/5] tests/gem_ctx_exec: Fail on unsuccessful preempt timeout update
Kamil Konieczny
kamil.konieczny at linux.intel.com
Thu Jul 18 12:05:51 UTC 2024
Hi Janusz,
On 2024-07-18 at 10:55:12 +0200, Janusz Krzysztofik wrote:
> CI reports the following failures from basic-nohangcheck subtest:
>
> (gem_ctx_exec:1115) CRITICAL: Test assertion failure function nohangcheck_hostile, file ../../../usr/src/igt-gpu-tools/tests/intel/gem_ctx_exec.c:374:
> (gem_ctx_exec:1115) CRITICAL: Failed assertion: err == 0
> (gem_ctx_exec:1115) CRITICAL: Last errno: 2, No such file or directory
> (gem_ctx_exec:1115) CRITICAL: Hostile unpreemptable context was not cancelled immediately upon closure
>
> The subtest sets a 50 ms preempt timeout on each engine before proceeding
> with submission of spins, then it waits up to 1 second for those spins to
> be terminated. However, the dump of engines' debugfs data performed by the
> subtest after the failure shows preempt timeouts still at their default
> values: 7500 ms on rcs0 and 640 ms on other class engines. Dmesg records
> confirm preemption timeouts triggered on other engines after 640 ms and
> not on rcs0 within the 1 second limit.
>
> As a first step, let the subtest verify the return values of the function
> calls supposed to update the preempt timeouts with the new values. If an
> update fails on any engine, report that at debug level as a useful hint
> displayed when the test times out waiting for spin termination.
>
> v2: No changes.
> v3: Don't fail on unsuccessful update of preempt_timeout_ms, older
> platforms don't support it but can still succeed.
>
> Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/6268
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
LGTM,
Reviewed-by: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> ---
> tests/intel/gem_ctx_exec.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/tests/intel/gem_ctx_exec.c b/tests/intel/gem_ctx_exec.c
> index d6aa8ba0aa..f3e252d10e 100644
> --- a/tests/intel/gem_ctx_exec.c
> +++ b/tests/intel/gem_ctx_exec.c
> @@ -308,8 +308,7 @@ static void nohangcheck_hostile(int i915)
> igt_hang_t hang;
> int fence = -1;
> const intel_ctx_t *ctx;
> - int err = 0;
> - int dir;
> + int dir, err;
> uint64_t ahnd;
>
> /*
> @@ -333,8 +332,11 @@ static void nohangcheck_hostile(int i915)
> int new;
>
> /* Set a fast hang detection for a dead context */
> - gem_engine_property_printf(i915, e->name,
> - "preempt_timeout_ms", "%d", 50);
> + err = gem_engine_property_printf(i915, e->name,
> + "preempt_timeout_ms", "%d", 50);
> + igt_debug_on_f(err < 0,
> + "%s preempt_timeout_ms update failed: %d\n",
> + e->name, err);
>
> spin = __igt_spin_new(i915,
> .ahnd = ahnd,
> @@ -362,6 +364,7 @@ static void nohangcheck_hostile(int i915)
> intel_ctx_destroy(i915, ctx);
> igt_assert(fence != -1);
>
> + err = 0;
> if (sync_fence_wait(fence, MSEC_PER_SEC)) { /* 640ms preempt-timeout */
> igt_debugfs_dump(i915, "i915_engine_info");
> err = -ETIME;
> --
> 2.45.2
>
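For readers without an IGT tree at hand, the control flow the patch ends up
with can be sketched in plain C. This is only an illustration under stated
assumptions, not the test's code: set_preempt_timeout_ms() and check_engine()
are hypothetical stand-ins (they are not IGT functions), the -ENOENT return
value is assumed, and "the spin is cancelled in time if the effective timeout
fits within the wait limit" is a deliberate simplification. The 50, 640, 7500
and 1000 ms figures come from the commit message above.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for an engine property write.
 * Returns 0 on success or a negative errno-style value on failure
 * (the -ENOENT value is an assumption made for this sketch).
 */
static int set_preempt_timeout_ms(int supported, int *timeout_ms, int value)
{
	if (!supported)
		return -ENOENT;

	*timeout_ms = value;
	return 0;
}

/*
 * Simplified mirror of the patched flow: a failed update is only
 * logged, and the verdict depends solely on whether the effective
 * timeout fits within the wait limit.
 */
static int check_engine(int supported, int default_ms, int wait_limit_ms)
{
	int timeout_ms = default_ms;
	int err;

	err = set_preempt_timeout_ms(supported, &timeout_ms, 50);
	if (err < 0)
		fprintf(stderr, "preempt_timeout_ms update failed: %d\n", err);

	/* Reset err so a failed update alone does not fail the check */
	err = 0;
	if (timeout_ms > wait_limit_ms)
		err = -ETIME;

	return err;
}
```

With a 1000 ms wait limit, check_engine(0, 640, 1000) still returns 0 even
though the update failed, while check_engine(0, 7500, 1000) returns -ETIME
with the debug hint already printed, which matches the rcs0 case described
in the commit message.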