[Intel-gfx] [PATCH 1/2] igt/gem_exec_nop: add burst submission to parallel execution test

John Harrison John.C.Harrison at Intel.com
Mon Aug 22 14:28:48 UTC 2016


On 03/08/2016 16:36, Dave Gordon wrote:
> The parallel execution test in gem_exec_nop chooses a pessimal
> distribution of work to multiple engines; specifically, it
> round-robins one batch to each engine in turn. As the workloads
> are trivial (NOPs), this results in each engine becoming idle
> between batches. Hence parallel submission is seen to take LONGER
> than the same number of batches executed sequentially.
>
> If, on the other hand, we send enough work to each engine to keep
> it busy until the next time we add to its queue (i.e. round-robin
> some larger number of batches to each engine in turn), then we can
> get true parallel execution and should find that it is FASTER than
> sequential execution.
>
> By experiment, burst sizes of between 8 and 256 are sufficient to
> keep multiple engines loaded, with the optimum (for this trivial
> workload) being around 64. This is expected to be lower (possibly
> as low as one) for more realistic (heavier) workloads.
>
> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
> ---
>   tests/gem_exec_nop.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
> index 9b89260..c2bd472 100644
> --- a/tests/gem_exec_nop.c
> +++ b/tests/gem_exec_nop.c
> @@ -166,14 +166,17 @@ static void all(int fd, uint32_t handle, int timeout)
>   	gem_sync(fd, handle);
>   	intel_detect_and_clear_missed_interrupts(fd);
>   
> +#define	BURST	64
> +
>   	count = 0;
>   	clock_gettime(CLOCK_MONOTONIC, &start);
>   	do {
> -		for (int loop = 0; loop < 1024; loop++) {
> +		for (int loop = 0; loop < 1024/BURST; loop++) {
>   			for (int n = 0; n < nengine; n++) {
>   				execbuf.flags &= ~ENGINE_FLAGS;
>   				execbuf.flags |= engines[n];
> -				gem_execbuf(fd, &execbuf);
> +				for (int b = 0; b < BURST; ++b)
> +					gem_execbuf(fd, &execbuf);
>   			}
>   		}
>   		count += nengine * 1024;

Would be nice to have the burst size configurable but either way...

Reviewed-by: John Harrison <john.c.harrison at intel.com>
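
For illustration only, here is a minimal sketch of how a configurable
burst size might look. It is not part of the patch and does not use the
IGT API: submit_fn, submit_bursts(), parse_burst(), struct trace and
record() are all hypothetical names, and the callback merely stands in
for gem_execbuf(). One thing it does differently from the hunk above:
with "loop < 1024/BURST" the count of 1024 batches per engine is exact
only when BURST divides 1024, so the sketch submits a short final burst
and returns the number of batches actually submitted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for gem_execbuf(): called once per batch. */
typedef void (*submit_fn)(int engine, void *ctx);

/* Hypothetical recorder, used here only to observe the submission order. */
struct trace {
	int order[64];
	int len;
};

static void record(int engine, void *ctx)
{
	struct trace *t = ctx;

	if (t->len < 64)
		t->order[t->len++] = engine;
}

/*
 * Parse a burst size from a string (e.g. an option argument).
 * Falls back to `fallback` if the string is missing, malformed,
 * or outside the range 1..1024.
 */
static int parse_burst(const char *s, int fallback)
{
	char *end;
	long v;

	if (!s || !*s)
		return fallback;

	v = strtol(s, &end, 10);
	if (*end != '\0' || v < 1 || v > 1024)
		return fallback;

	return (int)v;
}

/*
 * Submit `total` batches to each of `nengine` engines, round-robining
 * up to `burst` batches to each engine in turn. The last round is
 * shortened when `burst` does not divide `total`, so every engine
 * receives exactly `total` batches. Returns the number submitted.
 */
static unsigned long submit_bursts(int nengine, int total, int burst,
				   submit_fn submit, void *ctx)
{
	unsigned long count = 0;

	assert(nengine > 0 && total >= 0);
	if (burst < 1)
		burst = 1;

	for (int done = 0; done < total; done += burst) {
		int chunk = total - done < burst ? total - done : burst;

		for (int n = 0; n < nengine; n++) {
			for (int b = 0; b < chunk; b++) {
				submit(n, ctx);
				count++;
			}
		}
	}

	return count;
}
```

With something like this, the test could pass parse_burst(arg, 64) as
the burst size and add the return value of submit_bursts() to count,
instead of assuming nengine * 1024.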


