[Intel-gfx] [PATCH i-g-t v3 3/3] tests/gem_exec_fence: Restore pre-hang checks in *await-hang scenarios
Mauro Carvalho Chehab
mauro.chehab at linux.intel.com
Wed Aug 17 13:01:29 UTC 2022
On Fri, 12 Aug 2022 11:53:46 +0200
Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com> wrote:
> Commit c8f6aaf32d83 "tests/gem_exec_fence: Check stored values only for
> valid workloads" resolved an issue, observed in *await-hang scenarios,
> where a fence exposed by an invalid spin batch was signaled asynchronously
> to pending checks for dependent test batches still waiting for that fence.
> Those checks have been disabled, weakening those scenarios.
>
> This change restores the pre-hang checks to the extent possible when the
> invalid spin batch may trigger an immediate reset. If we are lucky enough
> to take a snapshot of the object supposed to be still not modified by
> store batches after we confirm that the spin batch has started and before
> the fence is signaled, we use that copy to verify that the fence-dependent
> batches are still blocked. Running the *await-hang subtests multiple
> times in CI should build our confidence in their results.
>
> v2: preserve checking the pipeline runs ahead of the hang (Chris)
> v3: use a simpler 'best effort' approach suggested by Chris
>
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Mauro Carvalho Chehab <mauro.carvalho.chehab at intel.com>
LGTM.
Reviewed-by: Mauro Carvalho Chehab <mchehab at kernel.org>
> ---
> tests/i915/gem_exec_fence.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/tests/i915/gem_exec_fence.c b/tests/i915/gem_exec_fence.c
> index 78d83460f7..f24bebdb7d 100644
> --- a/tests/i915/gem_exec_fence.c
> +++ b/tests/i915/gem_exec_fence.c
> @@ -21,6 +21,7 @@
> * IN THE SOFTWARE.
> */
>
> +#include <string.h>
> #include <sys/ioctl.h>
> #include <sys/poll.h>
> #include <sys/signal.h>
> @@ -307,12 +308,12 @@ static void test_fence_await(int fd, const intel_ctx_t *ctx,
> const struct intel_execution_engine2 *e,
> unsigned flags)
> {
> + uint64_t scratch_offset, ahnd = get_reloc_ahnd(fd, ctx->id);
> const struct intel_execution_engine2 *e2;
> uint32_t scratch = gem_create(fd, 4096);
> + uint32_t *out, tmp[4096 / sizeof(*out)];
> igt_spin_t *spin;
> - uint32_t *out;
> - uint64_t scratch_offset, ahnd = get_reloc_ahnd(fd, ctx->id);
> - int i;
> + int i, n;
>
> scratch_offset = get_offset(ahnd, scratch, 4096, 0);
>
> @@ -353,11 +354,20 @@ static void test_fence_await(int fd, const intel_ctx_t *ctx,
> /* Long, but not too long to anger preemption disable checks */
> usleep(50 * 1000); /* 50 ms, typical preempt reset is 150+ms */
>
> + /*
> + * Check for invalidly completing the task early.
> + * In -hang variants, invalid spin batch may trigger an immediate reset,
> + * then we are able to verify if store batches haven't been started yet
> + * only if the fence of the spin batch is still busy.
> + * Just run *await-hang subtest multiple times to build confidence.
> + */
> + memcpy(tmp, out, (i + 1) * sizeof(*out));
> + if (fence_busy(spin->out_fence)) {
> + for (n = 0; n <= i; n++)
> + igt_assert_eq_u32(tmp[n], 0);
> + }
> if ((flags & HANG) == 0) {
> - /* Check for invalidly completing the task early */
> igt_assert(fence_busy(spin->out_fence));
> - for (int n = 0; n <= i; n++)
> - igt_assert_eq_u32(out[n], 0);
>
> igt_spin_end(spin);
> }
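
The ordering in the hunk above (copy the buffer first, then test the fence, and only then trust the copy) can be sketched in isolation. What follows is a minimal standalone sketch, not IGT code: struct fake_fence, fake_fence_busy(), check_untouched() and the check_result values are hypothetical names invented for illustration, and the fence is modelled as a plain flag rather than a sync file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a fence; NOT the IGT fence_busy() helper. */
struct fake_fence {
	bool signaled;
};

static bool fake_fence_busy(const struct fake_fence *f)
{
	return !f->signaled;
}

enum check_result {
	CHECK_SKIPPED,	/* fence already signaled, snapshot proves nothing */
	CHECK_PASSED,	/* snapshot is all zeros while the fence was busy */
	CHECK_FAILED,	/* a dependent batch wrote before the fence signaled */
};

#define MAX_SLOTS 64

/*
 * Best-effort check: snapshot first, then confirm the fence is still busy.
 * If the fence is busy *after* the copy was taken, the copy reflects a
 * state in which dependent batches must still have been blocked.
 */
static enum check_result
check_untouched(const uint32_t *out, size_t count, const struct fake_fence *f)
{
	uint32_t tmp[MAX_SLOTS];
	size_t n;

	assert(out && f);
	if (count > MAX_SLOTS)
		return CHECK_SKIPPED;

	/* 1. Take the snapshot before looking at the fence. */
	memcpy(tmp, out, count * sizeof(*out));

	/* 2. Only trust the snapshot if the fence is still busy afterwards. */
	if (!fake_fence_busy(f))
		return CHECK_SKIPPED;

	/* 3. Verify the snapshot, never the live buffer. */
	for (n = 0; n < count; n++) {
		if (tmp[n] != 0)
			return CHECK_FAILED;
	}

	return CHECK_PASSED;
}
```

Checking the live buffer after the fence test would reopen the race: the fence could signal between the test and the reads. Reading from the copy keeps the verified data older than the fence observation, at the cost of skipping the check whenever the reset wins the race.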