[PATCH v3 1/3] tests/kms_vrr: Bucketize refresh rate tolerance

Thu Apr 3 05:14:48 UTC 2025

Reduce false failures while preserving timing accuracy. Introduce
a small tolerance buffer based on refresh rate which accounts for
HW/SW latency without compromising validation on HRR panel.
Although an imperical number but already IGT is living with that.
This also ensures that asked refresh rate is not too off and always
catch the real HW/software issues.

--v2:
Refresh rate criteria changes.

--v3:
Comment changes (Uma).

Signed-off-by: Mitul Golani <mitulkumar.ajitkumar.golani at intel.com>
---
 tests/kms_vrr.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 47 insertions(+), 3 deletions(-)

diff --git a/tests/kms_vrr.c b/tests/kms_vrr.c
index c4bb30f6a..83f326a4d 100644
--- a/tests/kms_vrr.c
+++ b/tests/kms_vrr.c
@@ -410,6 +410,52 @@ do_flip(data_t *data, igt_fb_t *fb)
 	igt_reset_timeout();
 }
 
+static void
+calculate_tolerance(uint64_t *threshold_hi, uint64_t *threshold_lo, uint64_t rates_ns)
+{
+	uint32_t refresh_rate = NSECS_PER_SEC / rates_ns;
+
+	if (refresh_rate < 0)
+		return;
+
+	/*
+	 * Current IGT implementation follows this sequence:
+	 * 1. Perform a page flip (`do_flip`).
+	 * 2. Wait for the flip completion event.
+	 * 3. Compare the timestamp of the flip completion event with the previous frame’s
+	 *    completion timestamp.
+	 * 4. Adjust CPU cycle burning based on the relative frame time.
+	 *
+	 * If a flip completes too early or too late, it is marked as out of tolerance.
+	 * As a result, additional CPU cycles are burned to match the `target_ns`.
+	 * Even if the next frame is on time, the total frame time now includes:
+	 * Burned CPU cycle time (from the previous frame) + Flip completion event time.
+	 * This leads to miscalculation, causing **false out-of-range detections**.
+	 * The impact is more significant on High Refresh Rate (HRR) panels, where:
+	 * The allowed tolerance window is smaller and more correction time is required.
+	 * i.e. for 210hz (4.762ms), allowed range is 209hz(4.784ms) to 211hz(4.739ms).
+	 * This comes just 23 microsecond tolerance, which is much lesser
+	 * for accounting HW/SW latency, CPU burn cycle latency and correction logic
+	 * applied in igt for validation.
+	 *
+	 * To address this implement a Bucketing Strategy:
+	 * Provide a small tolerance buffer to allow IGT tests to account for correction.
+	 * Based on range of asked refresh rate. This prevents excessive failures due to minor
+	 * timing adjustments.
+	 */
+
+	if (refresh_rate <= 120) {
+		*threshold_hi = NSECS_PER_SEC / (((float)NSECS_PER_SEC / rates_ns) + 1);
+		*threshold_lo = NSECS_PER_SEC / (((float)NSECS_PER_SEC / rates_ns) - 1);
+	} else if (refresh_rate >= 120 && refresh_rate <= 240) {
+		*threshold_hi = NSECS_PER_SEC / (((float)NSECS_PER_SEC / rates_ns) + 5);
+		*threshold_lo = NSECS_PER_SEC / (((float)NSECS_PER_SEC / rates_ns) - 5);
+	} else {
+		*threshold_hi = NSECS_PER_SEC / (((float)NSECS_PER_SEC / rates_ns) + 10);
+		*threshold_lo = NSECS_PER_SEC / (((float)NSECS_PER_SEC / rates_ns) - 10);
+	}
+}
+
 /*
  * Flips at the given rate and measures against the expected value.
  * Returns the pass rate as a percentage from 0 - 100.
@@ -439,9 +485,7 @@ flip_and_measure(data_t *data, igt_output_t *output, enum pipe pipe,
 		else
 			exp_rate_ns = vtest_ns.max;
 
-		/* Allow ~1 Hz deviation for different reasons causing delay. */
-		threshold_hi[i] = NSECS_PER_SEC / (((float)NSECS_PER_SEC / exp_rate_ns) + 1);
-		threshold_lo[i] = NSECS_PER_SEC / (((float)NSECS_PER_SEC / exp_rate_ns) - 1);
+		calculate_tolerance(&threshold_hi[i], &threshold_lo[i], exp_rate_ns);
 
 		igt_info("Requested rate[%d]: %"PRIu64" ns, Expected rate between: %"PRIu64" ns to %"PRIu64" ns\n",
 				i, rates_ns[i], threshold_hi[i], threshold_lo[i]);
-- 
2.48.1