[Intel-gfx] [PATCH 6/9] drm/i915/selftests: Force bonded submission to overlap

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Fri Nov 22 09:34:34 UTC 2019


On 20/11/2019 13:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-20 13:18:27)
>>
>> On 20/11/2019 12:59, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-11-20 12:55:42)
>>>>
>>>> On 20/11/2019 09:32, Chris Wilson wrote:
>>>>> Bonded request submission is designed to allow requests to execute in
>>>>> parallel as laid out by the user. If the master request is already
>>>>> finished before its bonded pair is submitted, the pair were not destined
>>>>> to run in parallel and we lose the information about the master engine
>>>>> to dictate selection of the secondary. If the second request was
>>>>> required to be run on a particular engine in a virtual set, that should
>>>>> have been specified, rather than left to the whims of a random
>>>>> unconnected requests!
>>>>>
>>>>> In the selftest, I made the mistake of not ensuring the master would
>>>>> overlap with its bonded pairs, meaning that it could indeed complete
>>>>> before we submitted the bonds. Those bonds were then free to select any
>>>>> available engine in their virtual set, and not the one expected by the
>>>>> test.
>>>>
>>>> There is a submit await which ensures master is not runnable before
>>>> bonded pairs are submitted. Why was that not enough? Are the sporadic
>>>> test failures?
>>>
>>> One test is using the submit_await, the other does not. It takes the
>>> background retire worker to run as we are submitting the secondaries...
>>> But I have not noticed this failure before hooking up retirement to
>>> process_csb. However, the issue is definitely present in the current
>>> test.
>>
>> So what happens? Is this another issue limited to selftests? Because I
>> don't see that uAPI itself can't be used in this way.
> 
> Since the master batch is already completed & signaled by the time we
> submit the secondaries, the submit-fence is a dud and the secondaries
> are not constrained in their engine selection.
> 
> i915_request_await_execution:
> 	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> 		continue;
> 	else
> 		__i915_request_await_execution()
> 
> Now, our choice is either to drop the check on the signaled bit (and so
> we will apply the bonding constrained from the already finished batch)
> or not. Given that the master is already complete, I feel justified in
> the current decision to ignore the constraint (since equally the fence
> could already have been retired and so completely inaccessible), so chose
> to fix the test instead.

Yes I agree it sounds okay to skip/ignore the constraint. But also seems 
a valid test what this test was doing before since it exercises a 
slightly different code path, or at least set of conditions.

What do you think? Would it be hard to add this as 3rd flavour? Maybe 
just a new flag and then allow spinner to finish as soon as is created 
to keep the existing flow?

Regards,

Tvrtko




More information about the Intel-gfx mailing list