[Intel-gfx] [PATCH 2/2] igt/gem_exec_nop: clarify & extend output from parallel execution test
John Harrison
John.C.Harrison at Intel.com
Mon Aug 22 14:39:17 UTC 2016
On 03/08/2016 16:36, Dave Gordon wrote:
> To make sense of the output of the parallel execution test (preferably
> without reading the source!), we need to see the various measurements
> that it makes, specifically: time/batch on each engine separately, total
> time across all engines sequentially, and the time/batch when the work
> is distributed over all engines in parallel.
>
> Since we know the per-batch time on the slowest engine (which will
> determine the minimum possible execution time of any equal-split
> parallel test), we can also calculate a new figure representing the
> degree to which work on the faster engines is overlapped with that on
> the slowest engine, and therefore does not contribute to the total time.
> Here we choose to present it as a percentage, with parallel-time==serial
> time giving 0% overlap, up to parallel-time==slowest-engine-
> time/n_engines being 100%. Note that negative values are possible;
> values greater than 100% may also be possible, although less likely.
>
> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
> ---
> tests/gem_exec_nop.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
> index c2bd472..05aa383 100644
> --- a/tests/gem_exec_nop.c
> +++ b/tests/gem_exec_nop.c
> @@ -137,7 +137,9 @@ static void all(int fd, uint32_t handle, int timeout)
> if (ignore_engine(fd, engine))
> continue;
>
> - time = nop_on_ring(fd, handle, engine, 1, &count) / count;
> + time = nop_on_ring(fd, handle, engine, 2, &count) / count;
> + igt_info("%s: %'lu cycles: %.3fus/batch\n",
> + e__->name, count, time*1e6);
> if (time > max) {
> name = e__->name;
> max = time;
> @@ -148,8 +150,9 @@ static void all(int fd, uint32_t handle, int timeout)
> engines[nengine++] = engine;
> }
> igt_require(nengine);
> - igt_info("Maximum execution latency on %s, %.3fus, total %.3fus per cycle\n",
> - name, max*1e6, sum*1e6);
> + igt_info("Slowest engine was %s, %.3fus/batch\n", name, max*1e6);
> + igt_info("Total for all %d engines is %.3fus per cycle, average %.3fus/batch\n",
> + nengine, sum*1e6, sum*1e6/nengine);
>
> memset(&obj, 0, sizeof(obj));
> obj.handle = handle;
> @@ -187,8 +190,10 @@ static void all(int fd, uint32_t handle, int timeout)
> igt_assert_eq(intel_detect_and_clear_missed_interrupts(fd), 0);
>
> time = elapsed(&start, &now) / count;
> - igt_info("All (%d engines): %'lu cycles, average %.3fus per cycle\n",
> - nengine, count, 1e6*time);
> + igt_info("All %d engines (parallel/%d): %'lu cycles, "
> + "average %.3fus/batch, overlap %.1f%\n",
> + nengine, BURST, count,
> + 1e6*time, 100*(sum-time)/(sum-(max/nengine)));
>
> /* The rate limiting step is how fast the slowest engine can
> * its queue of requests, if we wait upon a full ring all dispatch
I'm not entirely convinced about the overlap calculation. The other info
is definitely useful though.
Reviewed-by: John Harrison <john.c.harrison at intel.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20160822/01085026/attachment.html>
More information about the Intel-gfx
mailing list