[igt-dev] [i-g-t] tests/i915/exec_balancer: Added Skip Guc Submission
John Harrison
john.c.harrison at intel.com
Wed Dec 22 21:02:13 UTC 2021
On 12/22/2021 01:30, Tvrtko Ursulin wrote:
> I see this has been merged despite my complaint that the commit
> message is not transparent enough and without public arch level acks.
I would ack a revert if you want to post it...
>
> On 10/12/2021 10:02, Tvrtko Ursulin wrote:
>>
>> Old commit message:
>>
>> """
>> This a known failure when running
>> igt at gem_exec_balancer@bonded-(dual|pair|sync)
>> tests with GuC submission.The hang is expected with GuC submission
>> since the
>> test was written to expect execlist scheduling hence added skip if Guc
>> submission enabled.
>> """
>>
>> New commit message:
>>
>> On 09/12/2021 08:20, Mastan Katragadda wrote:
>>> This test makes assumptions about the backend scheduling algorithm that
>>> a real world UMD would never assume. This test is not testing a UMD-KMD
>>> contract, rather a specific backend.It is an invalid test case thus we
>>> are skipping.
>>>
>>> Changes Since v1:
>>> -Updated commit message
>>
>> Well that hasn't been really improved at all an is still totally
>> opaque. No mention whether it is all or some subtests, no mention of
>> what assumptions, what exact usage of the ABI is discouraged, not
>> much more than pure rewording.
>>
>> No desire to split into passing and non passing tests and only skip
>> latter group?
It's not clear from the patch itself, but looking at the code this
change appears to only affect a specific set of subtests. Those being
'sliced' and 'bonded-(pair|dual|sync)'. It's not actually disabling the
entire gem_exec_balancer suite. That is really not clear from the
subject or commit description, though. And I totally agree that it
should be documenting a) what the test is doing and b) why that
behaviour is both backend specific and unreasonable for a UMD.
My understanding is that gem_exec_balancer had already been updated to
support the new parallel extension as well as the old bonding extension.
So any subtest with 'parallel' as the prefix is supposed to work with
GuC submission. Having said that, there are odd other subtests that are
not obviously related to either bonded nor parallel submission (such as
persistence, noheartbeat, nohangcheck) which I recall do rely on
execlist backend implementations and therefore have issues when using
GuC submission.
I totally agree that we should not be blanket disabling an entire test
suite or even sub-group with such a generic stated reason. The bonded
tests (and parallel tests) are meant to check for the presence of the
appropriate API and skip if not available. They should not be skipping
based on arbitrary assumptions about the platform.
>>
>> Or to add a twist, what about making the specific subtests which hang
>> with GuC, expect to hang and therefore pass?
Seems unnecessarily complicated. Why require that something is broken?
If a test is specific to a backend implementation then the behaviour is
undefined on any other implementation. It may hang, it may pass, it may
be random each time you run it. If a test is not relevant to a given
platform, I would say that it should just be skipped.
>> That would a) document the failing corner case of the ABI and more
>> importantly b) keep exercising the i915 code paths which sit between
>> the uAPI and the GuC firmware?
You can't exercise code paths which don't exist. Bonding is
fundamentally an i915 only implementation. Persistence is partially
implemented inside the execlist backend. So you can't get to the low
level code of either when GuC submission is enabled no matter what
hideousness you put in the test.
>>
>> If that is not acceptable, and I do acknowledge to the cost to CI
>> time, then architects should ack and in my mind the ack should mean
>> one of the two things. Either:
>>
>> 1) We know there is coverage for the same i915 code paths elsewhere
>> in IGT. Or:
That exists when running on older platforms that support the feature
being tested.
>>
>> 2) We are willingly dropping coverage for these i915 code paths on
>> GuC platforms with the risk that the code might unknowingly bit rot
>> and cause more serious security issues down the line.
But we don't care if the code rots for platforms it does not support
because it doesn't support them.
>
> And no acks to the effect of either 1) or 2) were given. So git
> history will not show what is this allegedly backend specific
> behavior, neither it will clarify which parts of the uapi are stopped
> being tested in GuC mode.
I agree entirely with this. And would go further to say that the test
code itself should be documenting what it is doing, what it is trying to
test and why it things that is not applicable to a given platform if it
is going to skip that platform.
John.
>
> Regards,
>
> Tvrtko
More information about the igt-dev
mailing list