[Intel-gfx] [PATCH i-g-t 2/2] tests/perf_pmu: Exercise busy stats and lite-restore
Chris Wilson
chris at chris-wilson.co.uk
Wed Jan 10 13:45:07 UTC 2018
Quoting Tvrtko Ursulin (2018-01-10 10:37:51)
>
> On 09/01/2018 21:39, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-01-09 16:16:21)
> >> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >> While developing a fix for an accounting hole in busy stats, we realized
> >> lite-restore is a potential edge case which would be interesting to
> >> check is handled properly.
> >>
> >> It is unfortunately quite timing sensitive to hit lite-restore in the
> >> fashion the test needs, so the downside of this test is that it suffers
> >> from a high rate of false negatives.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >> ---
> >> tests/perf_pmu.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> 1 file changed, 85 insertions(+)
> >>
> >> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
> >> index e1f449d48808..7db5059d202e 100644
> >> --- a/tests/perf_pmu.c
> >> +++ b/tests/perf_pmu.c
> >> @@ -192,6 +192,84 @@ busy_start(int gem_fd, const struct intel_execution_engine2 *e)
> >> gem_quiescent_gpu(gem_fd);
> >> }
> >>
> >> +/*
> >> + * This test has a potentially low rate of catching the issue it is trying
> >> + * to catch; in other words, quite a high rate of false-negative successes.
> >> + * We will depend on the CI systems running it a lot to detect issues.
> >> + */
> >> +static void
> >> +busy_double_start(int gem_fd, const struct intel_execution_engine2 *e)
> >> +{
> >> + unsigned long slept;
> >> + igt_spin_t *spin[2];
> >> + uint64_t val, val2;
> >> + bool execlists;
> >> + uint32_t ctx;
> >> + int fd;
> >> +
> >> + ctx = gem_context_create(gem_fd);
> >> +
> >> + if (gem_has_execlists(gem_fd)) {
> >> + /*
> >> + * Defeat the busy stats delayed disable, we need to guarantee
> >> + * we are the first user.
> >> + */
> >> + execlists = true;
> >> + sleep(2);
> >> + } else {
> >> + execlists = false;
> >> + }
> >> +
> >> + /*
> >> + * Submit two contexts, with a pause in between targeting the ELSP
> >> + * re-submission in execlists mode. Make sure busyness is correctly
> >> + * reported with the engine busy, and after the engine went idle.
> >> + */
> >> + spin[0] = __igt_spin_batch_new(gem_fd, 0, e2ring(gem_fd, e), 0);
> >> + usleep(500000);
> >> + spin[1] = __igt_spin_batch_new(gem_fd, ctx, e2ring(gem_fd, e), 0);
> >
> > If you want to avoid the second spinner, you can resubmit the first with
> > a second context.
>
> I know you like that but I prefer the pedestrian way. :)
>
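Fair enough. Just for reference, the resubmit trick is only a handful of
lines; a rough sketch (untested, and assuming the spinner's batch is
self-contained so it can simply be re-run under the second context):

	struct drm_i915_gem_exec_object2 obj = {
		.handle = spin[0]->handle,
	};
	struct drm_i915_gem_execbuffer2 eb = {
		.buffers_ptr = to_user_pointer(&obj),
		.buffer_count = 1,
		.flags = e2ring(gem_fd, e),
		.rsvd1 = ctx, /* second context -> ELSP[1], i.e. lite-restore */
	};

	/* Re-run the already spinning batch under the second context. */
	gem_execbuf(gem_fd, &eb);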
> >> + /*
> >> + * Open PMU as fast as possible after the second spin batch in attempt
> >> + * to be faster than the driver handling lite-restore.
> >> + */
> >> + fd = open_pmu(I915_PMU_ENGINE_BUSY(e->class, e->instance));
> >> +
> >> + slept = measured_usleep(batch_duration_ns / 1000);
> >> + val = pmu_read_single(fd);
> >> +
> >> + igt_spin_batch_end(spin[0]);
> >> + igt_spin_batch_end(spin[1]);
> >> +
> >> + gem_sync(gem_fd, spin[0]->handle);
> >> + gem_sync(gem_fd, spin[1]->handle);
> >> +
> >> + /* Wait for context complete to avoid accounting residual busyness. */
> >> + if (execlists)
> >> + usleep(100000);
> >
> > Then you want a gem_quiescent_gpu() instead of the gem_sync()? Though
> > I'm scratching my head here as to what you mean. If there's an issue
> > here, won't we see it in our other measurements as well?
>
> Primarily because we want to measure idle after busy here and not in
> other tests, *and*, with execlists the time between the user interrupt
> and the context save counts as busyness. So without this delay it is
> possible to observe a few microseconds of engine busyness in the below
> check.
Ok, so with a gem_quiescent_gpu() we shouldn't need an arbitrary sleep,
as we are stating the gpu is then idle (ELSP drained, etc.).
> >> + val2 = pmu_read_single(fd);
> >> + usleep(batch_duration_ns / 1000);
> >> + val2 = pmu_read_single(fd) - val2;
> >
> > But the engine is idle at this point? Why sleep for execlists and not
> > legacy? Isn't the execlists case where we know that just by waiting for
> > idle (gem_quiescent_gpu(), not an arbitrary usleep) then the pmu counter
> > is idle? It's legacy where we would have to wait for a sample interval
> > following gpu idle to be sure the pmu is idle.
>
> Hm, legacy and sampling... I think we don't have to wait: since there
> is no context save, the idleness comes straight with the user
> interrupt, or no?
Hmm. Right. Since the sample after completion doesn't add any time to
the counter, you only need to ensure that pmu_read_single() is after the
seqno completion (i.e. gem_sync) to assert it shouldn't increment anymore.
Ok. I think you can just replace the
> >> + gem_sync(gem_fd, spin[0]->handle);
> >> + gem_sync(gem_fd, spin[1]->handle);
> >> +
> >> + /* Wait for context complete to avoid accounting residual busyness. */
> >> + if (execlists)
> >> + usleep(100000);
with a gem_quiescent_gpu() and everybody is happy.
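Something along these lines (sketch):

	igt_spin_batch_end(spin[0]);
	igt_spin_batch_end(spin[1]);

	/* Wait for both batches, and the context save, to complete. */
	gem_quiescent_gpu(gem_fd);

	val2 = pmu_read_single(fd);
	usleep(batch_duration_ns / 1000);
	val2 = pmu_read_single(fd) - val2;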
With that tweak,
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
though I'd be happy if the initial sleep(2) was unconditional (pending
suitable information exported from the kernel).
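i.e. just

	/*
	 * Defeat the busy stats delayed disable; make sure we are the
	 * first user, regardless of the submission backend.
	 */
	sleep(2);

until the kernel gives us something better to key off.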
-Chris