[Intel-gfx] [PATCH 1/5] drm/i915/selftests: Force bonded submission to overlap
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Nov 22 12:55:10 UTC 2019
On 22/11/2019 11:21, Chris Wilson wrote:
> Bonded request submission is designed to allow requests to execute in
> parallel as laid out by the user. If the master request is already
> finished before its bonded pair is submitted, the pair were not destined
> to run in parallel and we lose the information about the master engine
> to dictate selection of the secondary. If the second request was
> required to be run on a particular engine in a virtual set, that should
> have been specified, rather than left to the whims of a random
> unconnected requests!
>
> In the selftest, I made the mistake of not ensuring the master would
> overlap with its bonded pairs, meaning that it could indeed complete
> before we submitted the bonds. Those bonds were then free to select any
> available engine in their virtual set, and not the one expected by the
> test.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
> drivers/gpu/drm/i915/gt/selftest_lrc.c | 62 ++++++++++++++++++++++++--
> 1 file changed, 59 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> index 2baeedd5953f..e0ea930bee19 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
> @@ -3081,15 +3081,60 @@ static int bond_virtual_engine(struct intel_gt *gt,
> struct i915_gem_context *ctx;
> struct i915_request *rq[16];
> enum intel_engine_id id;
> + struct igt_spinner spin;
> unsigned long n;
> int err;
>
> + /*
> + * A set of bonded requests is intended to be run concurrently
> + * across a number of engines. We use one request per-engine
> + * and a magic fence to schedule each of the bonded requests
> + * at the same time. A consequence of our currently scheduler
> + * is that we only move requests to the HW ready queue when
> + * the request becomes ready, that is when all of its prerequisite
> + * fences have been signaled. As one of those fences is the master
> + * submit fence, there is a delay on all secondary fences as the
> + * HW may be currently busy. Equally, as all the requests are
> + * independent, they may have other fences that delay individual
> + * request submission to HW. Ergo, we do not guarantee that
> + * all requests are immediately submitted to HW at the same time,
> + * just that if the rules are abided by, they are ready at the
> + * same time as the first is submitted. Userspace can embed semaphores
> + * in its batch to ensure parallel execution of phases as it requires.
> + * Though naturally it gets requested that perhaps the scheduler should
> + * take care of parallel execution, even across preemption events
> + * on different HW. (The proper answer is of course "lalalala".)
> + *
> + * With the submit-fence, we have identified three possible phases
> + * of synchronisation depending on the master fence: queued (not
> + * ready), ready or executing, signaled. The first two are quite
> + * simple and checked below. However, the signaled master fence
> + * handling is contentious. Currently we do not distinguish between
> + * a signaled fence and an expired fence, as once signaled it does
> + * not convey any information about the previous execution, it may
> + * be freed and hence checking later it may not exist at all. Ergo
> + * we currently do not apply the bonding constraint for an already
> + * signaled fence, as our expectation is that it should not constrain
> + * the secondaries and is outside of the scope of the bonded request
> + * API (i.e. all requests are meant to be running in parallel). As
> + * it imposes no constraint, and is effectively a no-op, we do not
> + * check below as normal execution flows are checked extensively above.
> + *
> + * XXX Is the degenerate handling of signaled submit fences the
> + * expected behaviour for userpace?
> + */
> +
> GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
>
> - ctx = kernel_context(gt->i915);
> - if (!ctx)
> + if (igt_spinner_init(&spin, gt))
> return -ENOMEM;
>
> + ctx = kernel_context(gt->i915);
> + if (!ctx) {
> + err = -ENOMEM;
> + goto err_spin;
> + }
> +
> err = 0;
> rq[0] = ERR_PTR(-ENOMEM);
> for_each_engine(master, gt, id) {
> @@ -3100,7 +3145,7 @@ static int bond_virtual_engine(struct intel_gt *gt,
>
> memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
>
> - rq[0] = igt_request_alloc(ctx, master);
> + rq[0] = spinner_create_request(&spin, ctx, master, MI_NOOP);
> if (IS_ERR(rq[0])) {
> err = PTR_ERR(rq[0]);
> goto out;
> @@ -3113,10 +3158,17 @@ static int bond_virtual_engine(struct intel_gt *gt,
> &fence,
> GFP_KERNEL);
> }
> +
> i915_request_add(rq[0]);
> if (err < 0)
> goto out;
>
> + if (!(flags & BOND_SCHEDULE) &&
> + !igt_wait_for_spinner(&spin, rq[0])) {
> + err = -EIO;
> + goto out;
> + }
> +
> for (n = 0; n < nsibling; n++) {
> struct intel_context *ve;
>
> @@ -3164,6 +3216,8 @@ static int bond_virtual_engine(struct intel_gt *gt,
> }
> }
> onstack_fence_fini(&fence);
> + intel_engine_flush_submission(master);
> + igt_spinner_end(&spin);
>
> if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
> pr_err("Master request did not execute (on %s)!\n",
> @@ -3201,6 +3255,8 @@ static int bond_virtual_engine(struct intel_gt *gt,
> err = -EIO;
>
> kernel_context_close(ctx);
> +err_spin:
> + igt_spinner_fini(&spin);
> return err;
> }
>
>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Regards,
Tvrtko
More information about the Intel-gfx
mailing list