[PATCH i-g-t 1/1] lib/xe/xe_query: Wait for xe_supports_faults
Welty, Brian
brian.welty at intel.com
Thu May 9 01:27:59 UTC 2024
On 5/8/2024 10:11 AM, Kamil Konieczny wrote:
> Hi,
> On 2024-05-08 at 16:43:04 +0000, Cavitt, Jonathan wrote:
>> -----Original Message-----
>> From: Kamil Konieczny <kamil.konieczny at linux.intel.com>
>> Sent: Wednesday, May 8, 2024 9:24 AM
>> To: igt-dev at lists.freedesktop.org
>> Cc: Cavitt, Jonathan <jonathan.cavitt at intel.com>; Gupta, saurabhg <saurabhg.gupta at intel.com>; Welty, Brian <brian.welty at intel.com>; Mistat, Tomasz <tomasz.mistat at intel.com>; Girotra, Himanshu <himanshu.girotra at intel.com>
>> Subject: Re: [PATCH i-g-t 1/1] lib/xe/xe_query: Wait for xe_supports_faults
>>>
>>> Hi Jonathan,
>>> On 2024-05-03 at 12:37:14 -0700, Jonathan Cavitt wrote:
>>>> It's possible for xe_supports_faults to return false if the system is
>>>> busy with multiple running tests. This is because the check looks for
>>>> all active VMs and searches for VMs that do not have faults enabled,
>>>> returning false if any exist. Recently, this check has been changed to
>>>> return EBUSY when the check fails in this way, so wait for up to ten
>>>> seconds for all the active VMs to flush out before proceeding.
>>>>
>>>> Suggested-by: Brian Welty <brian.welty at intel.com>
>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>>> ---
>>>> lib/xe/xe_query.c | 9 +++++++--
>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
>>>> index 6df8f42649..5458c73417 100644
>>>> --- a/lib/xe/xe_query.c
>>>> +++ b/lib/xe/xe_query.c
>>>> @@ -314,8 +314,13 @@ bool xe_supports_faults(int fd)
>>>> .flags = DRM_XE_VM_CREATE_FLAG_LR_MODE |
>>>> DRM_XE_VM_CREATE_FLAG_FAULT_MODE,
>>>> };
>>>> -
>>>> - supports_faults = !igt_ioctl(fd, DRM_IOCTL_XE_VM_CREATE, &create);
>>>> + struct timespec tv = {};
>>>> + int result, timeout;
>>>> + do {
>>>> + result = igt_ioctl(fd, DRM_IOCTL_XE_VM_CREATE, &create);
>>>> + supports_faults = !result;
>>>> + timeout = igt_seconds_elapsed(&tv);
>>>> + } while (result == -EBUSY && timeout < 10);
>>> -------------------------------- ^^^^^^^^^^^^
>>>
>>> Waiting any number of seconds in library function is way too much,
>>> imho this is ok in test itself, not on lib.
>>
>> Is the suggestion here that we should perform the wait "on" xe_supports_faults,
>> rather than "in" xe_supports_faults? I.E. that we do something like this instead
>> in xe_exec_fault_mode.c:
>>
>> igt_fixture {
>> bool supports_faults;
>> struct timespec tv = {};
>> fd = drm_open_driver(DRIVER_XE);
>> do {
>> supports_faults = xe_supports_faults(fd);
>> } while (!supports_faults && igt_seconds_elapsed(&tv) < 10);
>> igt_require(supports_faults);
>> }
>>
>> I can do this, though if xe_supports_faults returns false for any non-EBUSY related
>> reasons, we won't be able to detect it from here and we'll spend ten seconds waiting
>> for xe_supports_faults to return true when it strictly cannot. Is this an acceptable
>> tradeoff?
>> -Jonathan Cavitt
>>
>
> If you are doing this just after igt_main you do not expect
> it will return EBUSY. Other way would be to return errno from
> lib function and let subtest decide if it want to wait.
> There you could also account for simulation, where it can
> take longer. Btw is there any sysfs param for it?
>
> Regards,
> Kamil
>
Problem is the issue is that 2 VMs cannot be created in the 2 different
modes.
Cannot have one VM in 'fault mode' at same time as one is in 'non-fault
mode'. So the correct place for the check is not even inside
xe_supports_fault_mode(), but in xe_vm_create().
As every single IGT test that creates an IGT can fail with -EBUSY from
xe_vm_create() as creating 2 such VMs is not allowed. Whoever tries to
create the VM second loses (fails).
Maybe a good first step is to have xe_vm_create() raise a SKIP on -EBUSY
errors?
Maybe a good second step is to figure when and where adding a
timeout+retry might make sense..... maybe since this is a shard run (I
think?), this test can be attempted again later in the batch of tests?
-Brian
>>>
>>> Regards,
>>> Kamil
>>>
>>>>
>>>> if (supports_faults)
>>>> xe_vm_destroy(fd, create.vm_id);
>>>> --
>>>> 2.25.1
>>>>
>>>
More information about the igt-dev
mailing list