[Intel-gfx] [PATCH 6/9] drm/i915/selftests: Force bonded submission to overlap

Fri Nov 22 10:03:49 UTC 2019

Quoting Tvrtko Ursulin (2019-11-22 09:34:34)
> 
> On 20/11/2019 13:29, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-11-20 13:18:27)
> >>
> >> On 20/11/2019 12:59, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2019-11-20 12:55:42)
> >>>>
> >>>> On 20/11/2019 09:32, Chris Wilson wrote:
> >>>>> Bonded request submission is designed to allow requests to execute in
> >>>>> parallel as laid out by the user. If the master request is already
> >>>>> finished before its bonded pair is submitted, the pair were not destined
> >>>>> to run in parallel and we lose the information about the master engine
> >>>>> to dictate selection of the secondary. If the second request was
> >>>>> required to be run on a particular engine in a virtual set, that should
> >>>>> have been specified, rather than left to the whims of a random
> >>>>> unconnected requests!
> >>>>>
> >>>>> In the selftest, I made the mistake of not ensuring the master would
> >>>>> overlap with its bonded pairs, meaning that it could indeed complete
> >>>>> before we submitted the bonds. Those bonds were then free to select any
> >>>>> available engine in their virtual set, and not the one expected by the
> >>>>> test.
> >>>>
> >>>> There is a submit await which ensures master is not runnable before
> >>>> bonded pairs are submitted. Why was that not enough? Are the sporadic
> >>>> test failures?
> >>>
> >>> One test is using the submit_await, the other does not. It takes the
> >>> background retire worker to run as we are submitting the secondaries...
> >>> But I have not noticed this failure before hooking up retirement to
> >>> process_csb. However, the issue is definitely present in the current
> >>> test.
> >>
> >> So what happens? Is this another issue limited to selftests? Because I
> >> don't see that uAPI itself can't be used in this way.
> > 
> > Since the master batch is already completed & signaled by the time we
> > submit the secondaries, the submit-fence is a dud and the secondaries
> > are not constrained in their engine selection.
> > 
> > i915_request_await_execution:
> >       if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> >               continue;
> >       else
> >               __i915_request_await_execution()
> > 
> > Now, our choice is either to drop the check on the signaled bit (and so
> > we will apply the bonding constrained from the already finished batch)
> > or not. Given that the master is already complete, I feel justified in
> > the current decision to ignore the constraint (since equally the fence
> > could already have been retired and so completely inaccessible), so chose
> > to fix the test instead.
> 
> Yes I agree it sounds okay to skip/ignore the constraint. But also seems 
> a valid test what this test was doing before since it exercises a 
> slightly different code path, or at least set of conditions.

What's the verification step? If we submit a bonded pair without a
submit-fence or a completed submit-fence, it's free to run on either?

We're just testing that it degenerates into a normal submit.

	create stub (completed) fence [on bonded engines]
	rq = i915_request_create();
	i915_request_await_execution(rq, stub);
	if (fence_exists(rq, stub)) {
		FAIL;
	}
?

It doesn't even have to be a bonded setup.

> What do you think? Would it be hard to add this as 3rd flavour? Maybe 
> just a new flag and then allow spinner to finish as soon as is created 
> to keep the existing flow?

I'm a little worried that maybe it's enshrining an implementation detail
without a thorough userspace use case. I'm half waiting for somebody to
ask and then being able to determine what is the best approach here with
somebody that an has actual example and anticipated behaviour.

E.g. maybe we should make it a uAPI error? Although we can not
completely detect all signaled fences prior to submission -- as in some
cases the secondary engine may have to be delayed until it has a slot
(back to the nightmare of pipeline bubbles and whether we should stall
the bonded engines until the entire set is ready).

Maybe we should be keeping the bonding details around for as long as the
submit-fence exists and apply them even if the master is already
completed. I'm torn as to whether or not that is the better idea -- the
argument that one cannot tell between a completed fence and a
non-existent fence is compelling to me (given the implementation of
fences), but I can see how that would cause confusion to the user.

I'll try and summarise the discussion here and add that to the test.
-Chris