[igt-dev] [PATCH i-g-t 2/3] tests/perf_pmu: More busy measurement tightening

Thu Feb 1 16:48:33 UTC 2018

Quoting Tvrtko Ursulin (2018-02-01 16:37:29)
> 
> On 01/02/2018 12:59, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-02-01 12:47:45)
> >> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >> Where we use measured sleeps, take PMU samples immediately before and
> >> after and look at their delta in order to minimize the effect of any
> >> test setup delays.
> > 
> > The system and pmu were meant to be idle at the start of the test,
> > right? So val should always be zero?
> 
> Yes, but there is a time delay between starting the counters and 
> applying busyness. For instance, busy-check-all, current version:
> 
>         ... pmu open somewhere before ...
> 
>         spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
>         slept = measured_usleep(batch_duration_ns / 1000);
>         pmu_read_multi(fd[0], num_engines, val);
> 
> In this case the slept value vs the read busyness will miss a tiny bit 
> between igt_spin_batch_new and measured_usleep. Probably minimal indeed, 
> but I thought just for extra safety to take an explicit initial read just 
> before the sleep, so:
> 
>         spin = igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
>         pmu_read_multi(fd[0], num_engines, tval[0]);

Time gap here as well. How do we know this is better than before?

>         slept = measured_usleep(batch_duration_ns / 1000);
>         pmu_read_multi(fd[0], num_engines, tval[1]);
> 
> More importantly, it is a potentially larger time delta in tests which 
> open multiple counters after starting the spinner. Like 
> most_busy_check_all for instance:
> 
>         ... start spin batch...
> 
>         for (i = 0; i < num_engines; i++)
>                 fd[i] = open_group(val[i], fd[0]);
> 
>         slept = measured_usleep(batch_duration_ns / 1000);
>         pmu_read_multi(fd[0], num_engines, val);
> 
> So the counter value relative to slept value will include time spent 
> opening num_engines events. Once again, a change to take an explicit 
> initial value just before the sleep looked reasonable to me.

I was working on the assumption that open is a minimal, insignificant
delay. I have no idea what the relative costs are. That would be nice to
know.
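
One way to get a first number for those relative costs is a tiny timing
loop. This is only a sketch under assumptions: time_calls_ns() and
open_close_devnull() are hypothetical helpers, not IGT functions, and
/dev/null stands in for a real PMU event open, which would need the i915
PMU to be present.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical stand-in for the operation under test (e.g. opening one
 * PMU event). Returns 0 on success, -1 on failure.
 */
static int open_close_devnull(void)
{
	int fd = open("/dev/null", O_RDONLY);

	if (fd < 0)
		return -1;
	close(fd);
	return 0;
}

/*
 * Average wall-clock cost of one call to fn, in nanoseconds, over iters
 * calls. Returns -1.0 if iters is zero or any call fails.
 */
static double time_calls_ns(int (*fn)(void), unsigned int iters)
{
	uint64_t start, end;
	unsigned int i;

	if (!iters)
		return -1.0;

	start = now_ns();
	for (i = 0; i < iters; i++) {
		if (fn() < 0)
			return -1.0;
	}
	end = now_ns();

	return (double)(end - start) / iters;
}
```

Calling time_calls_ns(open_close_devnull, 1000) gives an average per-call
cost; swapping in a function that opens or reads a real event would give
the open vs read comparison. Averaging over many iterations hides
outliers caused by preemption, so a min or median may be more telling.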

The issue I have is that the scheduler can preempt us at any time. So,
other than the argument that one sequence is quicker and so gives less
systematic error in the ideal case, we are at the mercy of the scheduler,
which can inject unknown sleeps between any two points. I fear we may
need RT, mlocking and more?, but would much rather avoid it.
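
To make the two sampling schemes concrete, here is a minimal sketch
under loud assumptions: the monotonic clock stands in for the busyness
counter of a 100% busy engine, and sample_busy(), sleep_us() and struct
busy_sample are hypothetical names, not code from perf_pmu. It shows
that the before/after delta drops the setup delay, while any gap between
the first read and the start of the sleep remains.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

struct busy_sample {
	uint64_t slept_ns; /* duration measured around the sleep itself */
	uint64_t abs_ns;   /* counter read once, relative to "open" */
	uint64_t delta_ns; /* counter delta across the sleep */
};

/* Monotonic clock in ns; doubles as the simulated busy counter. */
static uint64_t mono_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Sleep for at least usec microseconds, restarting on EINTR. */
static void sleep_us(unsigned int usec)
{
	struct timespec ts = {
		.tv_sec = usec / 1000000,
		.tv_nsec = (usec % 1000000) * 1000L,
	};

	while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
		;
}

static struct busy_sample sample_busy(unsigned int setup_us,
				      unsigned int sleep_us_len)
{
	struct busy_sample s;
	uint64_t t_open, before, after, t0, t1;

	t_open = mono_ns();        /* counter starts at "open" */
	sleep_us(setup_us);        /* simulated setup, e.g. more opens */

	before = mono_ns();        /* explicit initial read */
	t0 = mono_ns();
	sleep_us(sleep_us_len);    /* the measured sleep */
	t1 = mono_ns();
	after = mono_ns();         /* final read */

	s.slept_ns = t1 - t0;
	s.abs_ns = after - t_open; /* includes the setup delay */
	s.delta_ns = after - before;
	return s;
}
```

With sample_busy(2000, 10000), abs_ns exceeds slept_ns by at least the
2ms of setup, whereas delta_ns exceeds it only by the read-to-sleep
gaps. Those gaps are exactly where a preemption would still land, so the
delta reduces the systematic error but not the scheduler noise.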

Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>

(in exchange for a small test to benchmark open_(single|group),
read_(single|group), pretty please :)
-Chris