[PATCH v2 i-g-t 3/6] tests/intel/xe_sriov_flr: Add skeleton for clear and isolation tests

Marcin Bernatowicz marcin.bernatowicz at linux.intel.com
Mon Oct 21 20:07:34 UTC 2024


Introduce a skeleton with basic structures for subcheck execution and
a `verify_flr` template method to orchestrate the verification of
Function Level Reset (FLR) across multiple Virtual Functions (VFs).

The goal is to reduce runtime by limiting the total number of FLRs. Instead
of repeating the FLR for each subcheck (clear-lmem, clear-ggtt,
clear-scratch-regs, clear-media-scratch-regs), one FLR per VF serves all
subchecks. After each reset, every subcheck verifies its own state and
records any failure, and the results are reported together at the end.
The skeleton ensures that when one subcheck stops due to a failure or
a skip condition, the other subchecks continue execution.
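
The stop-reason bookkeeping described above can be sketched in isolation.
This is a minimal illustration, not the code from this patch: `struct
check_state`, `set_stop_reason` and `count_failed` are hypothetical names,
the message is built with plain `snprintf` instead of `asprintf`, and a
second call is silently ignored where the patch emits a warning.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the patch's struct subcheck_data. */
struct check_state {
	char *stop_reason;	/* NULL while the subcheck may still run */
};

/* Record "<prefix> : <message>" once; a second call is ignored. */
static void set_stop_reason(struct check_state *s, const char *prefix,
			    const char *fmt, ...)
{
	char msg[128];
	va_list args;
	size_t len;

	if (s->stop_reason)
		return;

	va_start(args, fmt);
	vsnprintf(msg, sizeof(msg), fmt, args);
	va_end(args);

	len = strlen(prefix) + strlen(" : ") + strlen(msg) + 1;
	s->stop_reason = malloc(len);
	assert(s->stop_reason);
	snprintf(s->stop_reason, len, "%s : %s", prefix, msg);
}

static bool can_proceed(const struct check_state *s)
{
	return !s->stop_reason;
}

/* A stop reason that starts with "SKIP" is a skip; any other is a failure. */
static bool is_skipped(const struct check_state *s)
{
	return s->stop_reason && !strncmp("SKIP", s->stop_reason, strlen("SKIP"));
}

/* Count the subchecks that stopped for a reason other than a skip. */
static int count_failed(const struct check_state *checks, int n)
{
	int fails = 0;

	for (int i = 0; i < n; i++)
		if (checks[i].stop_reason && !is_skipped(&checks[i]))
			fails++;

	return fails;
}
```

Because each subcheck carries its own stop reason, a skip recorded for one
entry leaves `can_proceed()` true for the others, so they keep running
against the same reset.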

Concrete subcheck implementations (clear-lmem, clear-ggtt,
clear-scratch-regs, clear-media-scratch-regs) will be introduced
in subsequent patches.
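
As a rough idea of how a concrete subcheck could build on the base
structure, the sketch below embeds `struct subcheck_data` in a larger
structure and recovers the outer structure inside a callback. Only
`struct subcheck_data` mirrors this patch; `struct demo_data`, `to_demo`,
`demo_init` and the `region_size` field are hypothetical and say nothing
about how the follow-up patches actually do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as the base structure introduced by this patch. */
struct subcheck_data {
	int pf_fd;
	int num_vfs;
	int gt;
	char *stop_reason;
};

/* Hypothetical extension: the base is embedded, extra state follows it. */
struct demo_data {
	struct subcheck_data base;
	uint64_t region_size;	/* illustrative per-subcheck state */
};

/* Recover the extended structure from a pointer to its embedded base. */
static struct demo_data *to_demo(struct subcheck_data *data)
{
	assert(data);
	return (struct demo_data *)((char *)data -
				    offsetof(struct demo_data, base));
}

/* Matches the init callback signature: void (*init)(struct subcheck_data *) */
static void demo_init(struct subcheck_data *data)
{
	struct demo_data *demo = to_demo(data);

	/* Illustrative only: reserve one 4 KiB page per VF. */
	demo->region_size = 4096u * (uint64_t)data->num_vfs;
}
```

With this layout the generic callbacks keep taking `struct subcheck_data *`,
while each subcheck reaches its private state through the conversion helper.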

Proposed IGT tests (each reports the status of every subcheck):

flr-vf1-clear
    Verifies that LMEM, GGTT, and SCRATCH_REGS are properly cleared on VF1
    (with only VF1 enabled) following a Function Level Reset (FLR). This
    test can be included in the BAT (Basic Acceptance Test) suite.

flr-each-isolation
    Sequentially performs FLR on each VF to verify isolation and
    clearing of LMEM, GGTT, and SCRATCH_REGS on the reset VF only.
    This test is better suited for FULL runs.
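
The isolation expectation can be illustrated with a small simulation. This
is a sketch under stated assumptions, not driver code: per-VF state is
modelled as one 32-bit value, a reset is assumed to read back as zero, and
`vf_state`, `PATTERN`, `simulate_flr` and `run_isolation` are hypothetical
names. The loop order follows `verify_flr`: prepare all VFs, reset one,
verify all, then re-arm the reset VF before moving on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_VFS 3
#define PATTERN(vf) (0xA5A50000u | (uint32_t)(vf))

/* Hypothetical per-VF state, indexed by VF id (index 0 is unused). */
static uint32_t vf_state[NUM_VFS + 1];

/* Counterpart of prepare_vf: write a recognizable pattern for this VF. */
static void prepare_vf(int vf_id)
{
	vf_state[vf_id] = PATTERN(vf_id);
}

/* Simulated FLR: assumed to clear only the targeted VF. */
static void simulate_flr(int vf_id)
{
	vf_state[vf_id] = 0;
}

/* The reset VF must read back cleared; every other VF must keep its pattern. */
static bool verify_vf(int vf_id, int flr_vf_id)
{
	uint32_t expected = (vf_id == flr_vf_id) ? 0 : PATTERN(vf_id);

	return vf_state[vf_id] == expected;
}

/* Reset each VF in turn and return the number of isolation violations. */
static int run_isolation(void)
{
	int violations = 0;

	for (int vf_id = 1; vf_id <= NUM_VFS; vf_id++)
		prepare_vf(vf_id);

	for (int flr_vf_id = 1; flr_vf_id <= NUM_VFS; flr_vf_id++) {
		simulate_flr(flr_vf_id);

		for (int vf_id = 1; vf_id <= NUM_VFS; vf_id++)
			if (!verify_vf(vf_id, flr_vf_id))
				violations++;

		/* Re-arm the reset VF so later iterations can check it. */
		if (flr_vf_id < NUM_VFS)
			prepare_vf(flr_vf_id);
	}

	return violations;
}
```

Re-arming the reset VF is what lets a later iteration detect a reset that
leaks into a VF that was already tested.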

v2: Correct subtest run type, use uppercase for GT (Lukasz)
    Add set_skip_reason, set_fail_reason helpers for readability

Signed-off-by: Marcin Bernatowicz <marcin.bernatowicz at linux.intel.com>
Cc: Adam Miszczak <adam.miszczak at linux.intel.com>
Cc: C V Narasimha <narasimha.c.v at intel.com>
Cc: Jakub Kolakowski <jakub1.kolakowski at intel.com>
Cc: K V P Satyanarayana <satyanarayana.k.v.p at intel.com>
Cc: Lukasz Laguna <lukasz.laguna at intel.com>
Cc: Michał Wajdeczko <michal.wajdeczko at intel.com>
Cc: Michał Winiarski <michal.winiarski at intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski at intel.com>
Cc: Tomasz Lis <tomasz.lis at intel.com>
---
 tests/intel/xe_sriov_flr.c | 331 +++++++++++++++++++++++++++++++++++++
 tests/meson.build          |   1 +
 2 files changed, 332 insertions(+)
 create mode 100644 tests/intel/xe_sriov_flr.c

diff --git a/tests/intel/xe_sriov_flr.c b/tests/intel/xe_sriov_flr.c
new file mode 100644
index 000000000..a9830e274
--- /dev/null
+++ b/tests/intel/xe_sriov_flr.c
@@ -0,0 +1,331 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2024 Intel Corporation. All rights reserved.
+ */
+
+#include "drmtest.h"
+#include "igt_core.h"
+#include "igt_sriov_device.h"
+
+/**
+ * TEST: xe_sriov_flr
+ * Category: Core
+ * Mega feature: SR-IOV
+ * Sub-category: Reset tests
+ * Functionality: FLR
+ * Description: Examine behavior of SR-IOV VF FLR
+ *
+ * SUBTEST: flr-vf1-clear
+ * Run type: BAT
+ * Description:
+ *   Verifies that LMEM, GGTT, and SCRATCH_REGS are properly cleared
+ *   on VF1 following a Function Level Reset (FLR).
+ *
+ * SUBTEST: flr-each-isolation
+ * Run type: FULL
+ * Description:
+ *   Sequentially performs FLR on each VF to verify isolation and
+ *   clearing of LMEM, GGTT, and SCRATCH_REGS on the reset VF only.
+ */
+
+IGT_TEST_DESCRIPTION("Xe tests for SR-IOV VF FLR (Function Level Reset)");
+
+const char *SKIP_REASON = "SKIP";
+
+/**
+ * struct subcheck_data - Base structure for subcheck data.
+ *
+ * This structure serves as a foundational data model for various subchecks. It is designed
+ * to be extended by more specific subcheck structures as needed. The structure includes
+ * essential information about the subcheck environment and conditions, which are used
+ * across different testing operations.
+ *
+ * @pf_fd: File descriptor for the Physical Function.
+ * @num_vfs: Number of Virtual Functions (VFs) enabled and under test. This count is
+ *           used to iterate over and manage the VFs during the testing process.
+ * @gt: GT under test. This identifier is used to specify a particular GT
+ *      for operations when GT-specific testing is required.
+ * @stop_reason: Pointer to a string that indicates why a subcheck should skip or fail.
+ *               This field is crucial for controlling the flow of subcheck execution.
+ *               If set, it should prevent further execution of the current subcheck,
+ *               allowing subcheck operations to check this field and return early if
+ *               a skip or failure condition is indicated. This mechanism ensures
+ *               that while one subcheck may stop due to a failure or a skip condition,
+ *               other subchecks can continue execution.
+ *
+ * Example usage:
+ * A typical use of this structure involves initializing it with the necessary test setup
+ * parameters, checking the `stop_reason` field before proceeding with each subcheck operation,
+ * and using `pf_fd`, `num_vfs`, and `gt` as needed based on the specific subcheck requirements.
+ */
+struct subcheck_data {
+	int pf_fd;
+	int num_vfs;
+	int gt;
+	char *stop_reason;
+};
+
+/**
+ * struct subcheck - Defines operations for managing a subcheck scenario.
+ *
+ * This structure holds function pointers for the key operations required
+ * to manage the lifecycle of a subcheck scenario. It is used by the `verify_flr`
+ * function, which acts as a template method, to call these operations in a
+ * specific sequence.
+ *
+ * @data: Shared data necessary for all operations in the subcheck.
+ *
+ * @name: Name of the subcheck operation, used for identification and reporting.
+ *
+ * @init: Initialize the subcheck environment.
+ *   Sets up the initial state required for the subcheck, including preparing
+ *   resources and ensuring the system is ready for testing.
+ *   @param data: Shared data needed for initialization.
+ *
+ * @prepare_vf: Prepare subcheck data for a specific VF.
+ *   Called for each VF before FLR is performed. It might involve marking
+ *   specific memory regions or setting up PTE addresses.
+ *   @param vf_id: Identifier of the VF being prepared.
+ *   @param data: Shared common data.
+ *
+ * @verify_vf: Verify the state of a VF after FLR.
+ *   Checks the VF's state post FLR to ensure the expected results,
+ *   such as verifying that only the FLRed VF has its state reset.
+ *   @param vf_id: Identifier of the VF to verify.
+ *   @param flr_vf_id: Identifier of the VF that underwent FLR.
+ *   @param data: Shared common data.
+ *
+ * @cleanup: Clean up the subcheck environment.
+ *   Releases resources and restores the system to its original state
+ *   after the subchecks, ensuring no resource leaks and preparing the system
+ *   for subsequent tests.
+ *   @param data: Shared common data.
+ */
+struct subcheck {
+	struct subcheck_data *data;
+	const char *name;
+	void (*init)(struct subcheck_data *data);
+	void (*prepare_vf)(int vf_id, struct subcheck_data *data);
+	void (*verify_vf)(int vf_id, int flr_vf_id, struct subcheck_data *data);
+	void (*cleanup)(struct subcheck_data *data);
+};
+
+__attribute__((format(printf, 3, 0)))
+static void set_stop_reason_v(struct subcheck_data *data, const char *prefix,
+			      const char *format, va_list args)
+{
+	char *formatted_message;
+	int result;
+
+	if (igt_warn_on_f(data->stop_reason, "Stop reason already set\n"))
+		return;
+
+	result = vasprintf(&formatted_message, format, args);
+	igt_assert_neq(result, -1);
+
+	result = asprintf(&data->stop_reason, "%s : %s", prefix,
+			  formatted_message);
+	igt_assert_neq(result, -1);
+
+	free(formatted_message);
+}
+
+__attribute__((format(printf, 2, 3)))
+static void set_skip_reason(struct subcheck_data *data, const char *format, ...)
+{
+	va_list args;
+
+	va_start(args, format);
+	set_stop_reason_v(data, SKIP_REASON, format, args);
+	va_end(args);
+}
+
+__attribute__((format(printf, 2, 3)))
+static void set_fail_reason(struct subcheck_data *data, const char *format, ...)
+{
+	va_list args;
+
+	va_start(args, format);
+	set_stop_reason_v(data, "FAIL", format, args);
+	va_end(args);
+}
+
+static bool subcheck_can_proceed(const struct subcheck *check)
+{
+	return !check->data->stop_reason;
+}
+
+static int count_subchecks_with_stop_reason(struct subcheck *checks, int num_checks)
+{
+	int subchecks_with_stop_reason = 0;
+
+	for (int i = 0; i < num_checks; ++i)
+		if (!subcheck_can_proceed(&checks[i]))
+			subchecks_with_stop_reason++;
+
+	return subchecks_with_stop_reason;
+}
+
+static bool no_subchecks_can_proceed(struct subcheck *checks, int num_checks)
+{
+	return count_subchecks_with_stop_reason(checks, num_checks) == num_checks;
+}
+
+static bool is_subcheck_skipped(struct subcheck *subcheck)
+{
+	return subcheck->data && subcheck->data->stop_reason &&
+	       !strncmp(SKIP_REASON, subcheck->data->stop_reason, strlen(SKIP_REASON));
+}
+
+static void subchecks_report_results(struct subcheck *checks, int num_checks)
+{
+	int fails = 0, skips = 0;
+
+	for (int i = 0; i < num_checks; ++i) {
+		if (checks[i].data->stop_reason) {
+			if (is_subcheck_skipped(&checks[i])) {
+				igt_info("%s: %s", checks[i].name,
+					 checks[i].data->stop_reason);
+				skips++;
+			} else {
+				igt_critical("%s: %s", checks[i].name,
+					     checks[i].data->stop_reason);
+				fails++;
+			}
+		} else {
+			igt_info("%s: SUCCESS\n", checks[i].name);
+		}
+	}
+
+	igt_fail_on_f(fails, "%d out of %d checks failed\n", fails, num_checks);
+	igt_skip_on(skips == num_checks);
+}
+
+/**
+ * verify_flr - Orchestrates the verification of Function Level Reset (FLR)
+ *              across multiple Virtual Functions (VFs).
+ *
+ * This function performs FLR on each VF to ensure that only the reset VF has
+ * its state cleared, while other VFs remain unaffected. It handles initialization,
+ * preparation, verification, and cleanup for each test operation defined in `checks`.
+ *
+ * @pf_fd: File descriptor for the Physical Function (PF).
+ * @num_vfs: Total number of Virtual Functions (VFs) to test.
+ * @checks: Array of subchecks.
+ * @num_checks: Number of subchecks.
+ *
+ * Detailed Workflow:
+ * - Initializes and prepares VFs for testing.
+ * - Iterates through each VF, performing FLR, and verifies that only
+ *   the reset VF is affected while others remain unchanged.
+ * - Reinitializes test data for the FLRed VF if there are more VFs to test.
+ * - Continues the process until all VFs are tested.
+ * - Handles any test failures or early exits, cleans up, and reports results.
+ *
+ * A fixed delay (wait_flr_ms) is assumed to be long enough for each FLR to complete.
+ */
+static void verify_flr(int pf_fd, int num_vfs, struct subcheck *checks, int num_checks)
+{
+	const int wait_flr_ms = 200;
+	int i, vf_id, flr_vf_id = -1;
+
+	igt_sriov_disable_driver_autoprobe(pf_fd);
+	igt_sriov_enable_vfs(pf_fd, num_vfs);
+	if (igt_warn_on(!igt_sriov_device_reset_exists(pf_fd, 1)))
+		goto disable_vfs;
+	/* Refresh PCI state */
+	if (igt_warn_on(igt_pci_system_reinit()))
+		goto disable_vfs;
+
+	for (i = 0; i < num_checks; ++i)
+		checks[i].init(checks[i].data);
+
+	for (vf_id = 1; vf_id <= num_vfs; ++vf_id)
+		for (i = 0; i < num_checks; ++i)
+			if (subcheck_can_proceed(&checks[i]))
+				checks[i].prepare_vf(vf_id, checks[i].data);
+
+	if (no_subchecks_can_proceed(checks, num_checks))
+		goto cleanup;
+
+	flr_vf_id = 1;
+
+	do {
+		if (igt_warn_on_f(!igt_sriov_device_reset(pf_fd, flr_vf_id),
+				  "Initiating VF%d FLR failed\n", flr_vf_id))
+			goto cleanup;
+
+		/* assume FLR is finished after wait_flr_ms */
+		usleep(wait_flr_ms * 1000);
+
+		for (vf_id = 1; vf_id <= num_vfs; ++vf_id)
+			for (i = 0; i < num_checks; ++i)
+				if (subcheck_can_proceed(&checks[i]))
+					checks[i].verify_vf(vf_id, flr_vf_id, checks[i].data);
+
+		/* reinitialize test data for FLRed VF */
+		if (flr_vf_id < num_vfs)
+			for (i = 0; i < num_checks; ++i)
+				if (subcheck_can_proceed(&checks[i]))
+					checks[i].prepare_vf(flr_vf_id, checks[i].data);
+
+		if (no_subchecks_can_proceed(checks, num_checks))
+			goto cleanup;
+
+	} while (++flr_vf_id <= num_vfs);
+
+cleanup:
+	for (i = 0; i < num_checks; ++i)
+		checks[i].cleanup(checks[i].data);
+
+disable_vfs:
+	igt_sriov_disable_vfs(pf_fd);
+
+	if (flr_vf_id > 1 || no_subchecks_can_proceed(checks, num_checks))
+		subchecks_report_results(checks, num_checks);
+	else
+		igt_skip("No checks executed\n");
+}
+
+static void clear_tests(int pf_fd, int num_vfs)
+{
+	verify_flr(pf_fd, num_vfs, NULL, 0);
+}
+
+igt_main
+{
+	int pf_fd;
+	bool autoprobe;
+
+	igt_fixture {
+		pf_fd = drm_open_driver(DRIVER_XE);
+		igt_require(igt_sriov_is_pf(pf_fd));
+		igt_require(igt_sriov_get_enabled_vfs(pf_fd) == 0);
+		autoprobe = igt_sriov_is_driver_autoprobe_enabled(pf_fd);
+	}
+
+	igt_describe("Verify LMEM, GGTT, and SCRATCH_REGS are properly cleared after VF1 FLR");
+	igt_subtest("flr-vf1-clear") {
+		clear_tests(pf_fd, 1);
+	}
+
+	igt_describe("Perform sequential FLR on each VF, verifying that LMEM, GGTT, and SCRATCH_REGS are cleared only on the reset VF.");
+	igt_subtest("flr-each-isolation") {
+		unsigned int total_vfs = igt_sriov_get_total_vfs(pf_fd);
+
+		igt_require(total_vfs > 1);
+
+		clear_tests(pf_fd, total_vfs > 3 ? 3 : total_vfs);
+	}
+
+	igt_fixture {
+		igt_sriov_disable_vfs(pf_fd);
+		/* abort to avoid execution of next tests with enabled VFs */
+		igt_abort_on_f(igt_sriov_get_enabled_vfs(pf_fd) > 0, "Failed to disable VF(s)\n");
+		autoprobe ? igt_sriov_enable_driver_autoprobe(pf_fd) :
+			    igt_sriov_disable_driver_autoprobe(pf_fd);
+		igt_abort_on_f(autoprobe != igt_sriov_is_driver_autoprobe_enabled(pf_fd),
+			       "Failed to restore sriov_drivers_autoprobe value\n");
+		close(pf_fd);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 34b87b125..2724c7a9a 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -315,6 +315,7 @@ intel_xe_progs = [
 	'xe_vm',
 	'xe_waitfence',
 	'xe_spin_batch',
+	'xe_sriov_flr',
 	'xe_sysfs_defaults',
 	'xe_sysfs_preempt_timeout',
 	'xe_sysfs_scheduler',
-- 
2.31.1


