[Intel-gfx] [PATCH i-g-t 2/2] tests/perf_pmu: Add tests for engine queued stat

Wed Nov 22 12:56:04 UTC 2017

Quoting Tvrtko Ursulin (2017-11-22 12:47:05)
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> Simple test to check correct queue-depth is reported per engine.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
>  tests/perf_pmu.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 79 insertions(+)
> 
> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> index 8585ed7bcee8..17f0afca6fe1 100644
> --- a/tests/perf_pmu.c
> +++ b/tests/perf_pmu.c
> @@ -87,6 +87,17 @@ static uint64_t pmu_read_single(int fd)
>         return data[0];
>  }
>  
> +static uint64_t pmu_sample_single(int fd, uint64_t *val)
> +{
> +       uint64_t data[2];
> +
> +       igt_assert_eq(read(fd, data, sizeof(data)), sizeof(data));
> +
> +       *val = data[0];
> +
> +       return data[1];
> +}
> +
>  static void pmu_read_multi(int fd, unsigned int num, uint64_t *val)
>  {
>         uint64_t buf[2 + num];
> @@ -655,6 +666,65 @@ multi_client(int gem_fd, const struct intel_execution_engine2 *e)
>         assert_within_epsilon(val[1], slept, tolerance);
>  }
>  
> +static double calc_queued(uint64_t d_val, uint64_t d_ns)
> +{
> +       return (double)d_val * 1e9 * I915_SAMPLE_QUEUED_SCALE / d_ns;
> +}
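
For reference, the arithmetic in calc_queued() can be checked in
isolation. This is only a sketch: EXAMPLE_QUEUED_SCALE and
example_calc_queued() are hypothetical names, and the 1e-2 value is an
assumption chosen for illustration. The real I915_SAMPLE_QUEUED_SCALE
comes from the uAPI header and may differ.

```c
#include <stdint.h>

/* Assumed value, for illustration only; not taken from the uAPI header. */
#define EXAMPLE_QUEUED_SCALE 1e-2

/*
 * Same formula as calc_queued() in the patch: counter delta scaled to an
 * average queue depth over the sampled interval (d_ns in nanoseconds).
 */
static double example_calc_queued(uint64_t d_val, uint64_t d_ns)
{
	return (double)d_val * 1e9 * EXAMPLE_QUEUED_SCALE / d_ns;
}
```

Under that assumed scale, a counter delta of 50 over 0.5s works out to
a queue depth of about 1.0, and a delta of 0 gives exactly 0.0.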
> +
> +static void
> +queued(int gem_fd, const struct intel_execution_engine2 *e)
> +{
> +       const unsigned long duration_ns = 500e6;

0.5s.

> +       igt_spin_t *spin[2];
> +       uint64_t val[2];
> +       uint64_t ts[2];
> +       int fd;
> +
> +       fd = open_pmu(I915_PMU_ENGINE_QUEUED(e->class, e->instance));
> +
> +       /*
> +        * First check on an idle engine.
> +        */
> +       ts[0] = pmu_sample_single(fd, &val[0]);
> +       usleep(duration_ns / 3000);
> +       ts[1] = pmu_sample_single(fd, &val[1]);
> +       assert_within_epsilon(calc_queued(val[1] - val[0], ts[1] - ts[0]),
> +                             0.0, tolerance);
> +
> +       /*
> +        * First spin batch will be immediately executing.
> +        */
> +       spin[0] = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> +       igt_spin_batch_set_timeout(spin[0], duration_ns);
> +
> +       ts[0] = pmu_sample_single(fd, &val[0]);
> +       usleep(duration_ns / 3000);
> +       ts[1] = pmu_sample_single(fd, &val[1]);
> +       assert_within_epsilon(calc_queued(val[1] - val[0], ts[1] - ts[0]),
> +                             1.0, tolerance);
> +

What I would like here is a for (n = 1; n < max_n; n++) loop, with max_n
(say 10) chosen so that we terminate within 5s, and the sample intervals
adjusted to match if we want to increase max_n.

Hmm.

for (n = 1; n < 10; n++)
	ctx = gem_context_create()
	for (m = 0; m < n; m++)
		...etc...

(We probably want to either measure ring_size and avoid filling it, or
use a timeout that interrupts the last execbuf... Ok, that's better
overall.)

And have the queue depth (qd) increase geometrically. Basically we just
want to avoid hitting magic numbers inside the HW, the ELSP/GuC depth of
2 being the first magic number we want to miss.
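
Something like the following could pick the depths. It is a rough
sketch, not IGT code: plan_queue_depths() and its parameters are
hypothetical, and the ratio of 3 is just one assumed way to get a
geometric series that skips a depth of 2.

```c
#include <assert.h>

/*
 * Hypothetical helper, not an IGT function: fill qd[] with queue depths
 * 1, 3, 9, 27, ... (ratio 3, so a depth of 2 is never used) and return
 * how many sample periods fit inside the time budget.
 */
static unsigned int
plan_queue_depths(unsigned int *qd, unsigned int max_steps,
		  unsigned int budget_ms, unsigned int sample_ms)
{
	unsigned int n = 0, depth = 1, spent = 0;

	assert(sample_ms > 0);

	/* Stop when either the array or the time budget is exhausted. */
	while (n < max_steps && spent + sample_ms <= budget_ms) {
		qd[n++] = depth;
		spent += sample_ms;
		depth *= 3;
	}

	return n;
}
```

With a 5000ms budget and 1000ms per sample this plans five steps at
depths 1, 3, 9, 27 and 81. Whether the deeper ones fit in the ring is
the ring_size question above.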
-Chris