[PATCH i-g-t] tests/kms_async_flip: skip subtest if invalid driver-specific condition
Melissa Wen
mwen at igalia.com
Thu May 1 17:13:40 UTC 2025
On 21/04/2025 13:59, Borah, Chaitanya Kumar wrote:
> Hi Melissa,
>
>> -----Original Message-----
>> From: Melissa Wen <mwen at igalia.com>
>> Sent: Monday, April 21, 2025 8:35 PM
>> To: B S, Karthik <karthik.b.s at intel.com>
>> Cc: Development mailing list for IGT GPU Tools <igt-
>> dev at lists.freedesktop.org>; Petri Latvala <adrinael at adrinael.net>; Arkadiusz
>> Hiler <arek at hiler.eu>; Kamil Konieczny <kamil.konieczny at linux.intel.com>;
>> Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>; Bhanuprakash
>> Modem <bhanuprakash.modem at gmail.com>; Dixit, Ashutosh
>> <ashutosh.dixit at intel.com>; Thadeu Lima de Souza Cascardo
>> <cascardo at igalia.com>; Rodrigo Siqueira <siqueira at igalia.com>; Andre
>> Almeida <andrealmeid at igalia.com>; Alex Hung <alex.hung at amd.com>; Leo
>> Li <sunpeng.li at amd.com>; Simon Ser <contact at emersion.fr>; Dmitry
>> Baryshkov <dmitry.baryshkov at linaro.org>; Ville Syrjala
>> <ville.syrjala at linux.intel.com>; Reddy Guddati, Santhosh
>> <santhosh.reddy.guddati at intel.com>; Borah, Chaitanya Kumar
>> <chaitanya.kumar.borah at intel.com>; Xaver Hugl <xaver.hugl at gmail.com>;
>> kernel-dev at igalia.com
>> Subject: Re: [PATCH i-g-t] tests/kms_async_flip: skip subtest if invalid driver-
>> specific condition
>>
>> On 04/17, Karthik B S wrote:
>>> Hi Melissa,
>>>
>>> Although I agree that async flips can be rejected for multiple
>>> driver-specific reasons, my concern here is that there could
>>> potentially be a genuine failure(as EINVAL is a very generic failure
>>> check) as well which might get caught in this category and be treated
>>> as expected skip, whereas it ideally was something which should've been
>> allowed by the driver.
>>> IMHO, if we need to have this, we need to have some more checks
>>> together with EINVAL (though I'm not aware of what this could be
>>> currently) to treat it as a valid skip.
>> Hi Karthik,
>>
>> Thanks for raising these points. I understand your concerns and I agree that
>> we should not ignore genuine failures, but I don't see a way to currently get
>> from the kernel a more accurate reason for async flip failures.
>>
>> Also, I don't think the current test implementation prevents misleading skips
>> even for i915 drivers. While the test attempts to meet all valid conditions for
>> i915 drivers, it allows the test to ignore the first flip failure, which can be
>> questionable when a driver advertises that it can do async flip, but then the
>> first flip fails (EINVAL) and we cannot guarantee (control) that the cause is one
>> or another unsupported modifier or what else.
>>
> We will be removing this quirk soon. With the addition of IN_FORMAT_ASYNC [1]
> we won't need to "allow_fail" for the first frame. Please see [2].
So, I think that case is in line with what I'm trying to discuss.
For example, why did you add the quirk and skip the test instead of
letting the test fail when [1] wasn't available?
>
> The idea behind falling back to a sync flip for an unsupported modifier was that, in view
> of user-space, it is still performing just a flip with a new FB (albeit with a new modifier)
> without changing any other parameters. It is a valid expectation for an async flip.
>
> With [1], user space can check which modifiers are supported for ASYNC and flip appropriately.
>
> However, there needs to be a larger discussion regarding what should be the "uapi contract"
> for async flips. Some of it going on here [3]
>
>
>> Unfortunately, cataloging all failure conditions and create a valid and generic
>> enough test configuration for all drivers in the subsystem doesn't seem a
>> feasible task.
>>
>> So, if the current setup are suitable for validating i915 drivers, but are not
>> accurate for other drivers, an alternative may be limiting this patch chane to
>> AMD or any non-Intel device. WDYT?
>>
> Won't all drivers be interested in knowing when async failed (particularly because there can be
> so many reasons for it)? At least, it has helped us debug some issues.
My concern here is that we cannot guarantee that the EINVAL is actually
a driver bug or a hw restriction.
So, why are we asserting something that isn't well-defined?
And every time we figure out that EINVAL is a hw restriction and not a
bug, we will add another quirk, similar to the one that intel added for
unsupported modifiers... and we can end up with a test with multiple
quirks for multiple hw vendors and versions.
Melissa
>
> Downgrade to sync is anyway covered by the flip per frame test criteria.
>
> Regards
>
> Chaitanya
>
> [1] https://patchwork.freedesktop.org/series/140935/
> [2] https://patchwork.freedesktop.org/series/146853/
> [3] https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13834
>
>> Anyway, I can't think of anything better right now, but I'm open to other
>> suggestions that make the test cases valid for more vendors and their
>> hardware versions.
>>
>> BR,
>>
>> Melissa
>>
>>
>>> Regards,
>>> Karthik.B.S
>>> On 4/16/2025 1:17 AM, Melissa Wen wrote:
>>>> Async page flips can fail for any driver-specific reasons, i.e. when
>>>> the driver can't commit to async page flip request for hw
>>>> limitations, the commit is rejected with -EINVAL since there is one
>>>> or more conditions that make the request invalid. These limitations
>>>> varies between vendors and hw versions. Skip a subtest if any
>>>> invalid driver-specific condition happens. This behavior is in line
>>>> with userspace expectations and fits better the plurality of drivers in the
>> DRM subsystem.
>>>> Signed-off-by: Melissa Wen <mwen at igalia.com>
>>>>
>>>> ---
>>>>
>>>> Hi,
>>>>
>>>> Recent changes to async page flip conditions and limitations in the
>>>> DRM subsystem shed light on unexpected restrictions of async page
>>>> flip tests. The current approach doesn't make the test generic
>>>> enough to correctly deal with the diversity of reasons for a async
>>>> page flip being rejected by drivers of any HW vendor.
>>>>
>>>> As an example, AMD display driver can't perform async flip when the
>>>> memory type changes [1][2][3]. After further investigation performed
>>>> by my colleague Thadeu Cascardo, he elucidated that this situation
>>>> can happen whenever there is not enough space for FBs in VRAM, so
>>>> they are allocated in GTT. Moreover, the space available in VRAM
>>>> varies between hw families and other system needs. Therefore, the
>>>> driver rejects the async flip due to memory type change, instead of
>>>> silently downgrade the page flip from async to sync - which seems correct.
>>>>
>>>> The userspace expects the rejection (-EINVAL), but not the downgrade.
>>>> So, I understand the test_async_flip should just validate if async
>>>> page flips aren't downgrade in acceptable conditions and skip for
>>>> any invalid driver-specific condition.
>>>>
>>>> [1]
>>>> https://lore.kernel.org/amd-gfx/20230621202459.979661-2-
>> andrealmeid@
>>>> igalia.com/ [2]
>>>> https://lore.kernel.org/amd-gfx/20230804182054.142988-1-
>> hamza.mahfoo
>>>> z at amd.com/ [3]
>>>> https://lore.kernel.org/amd-gfx/20250107152855.2953302-17-chiahsuan.
>>>> chung at amd.com/
>>>>
>>>> Let me know your thoughts!
>>>>
>>>> Melissa
>>>>
>>>> ---
>>>> tests/kms_async_flips.c | 8 +++++---
>>>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/tests/kms_async_flips.c b/tests/kms_async_flips.c index
>>>> 031417ed8..0de299c2b 100644
>>>> --- a/tests/kms_async_flips.c
>>>> +++ b/tests/kms_async_flips.c
>>>> @@ -397,8 +397,10 @@ static void test_async_flip(data_t *data)
>>>> ret = perform_flip(data, frame, flags);
>>>> - if (frame == 1 && data->allow_fail)
>>>> - igt_skip_on(ret == -EINVAL);
>>>> + if (data->allow_fail)
>>>> + igt_skip_on_f(ret == -EINVAL,
>>>> + "Skipping, async flip not supported at
>> frame %d " \
>>>> + "due to invalid driver-specific
>> conditions.\n", frame);
>>>> else
>>>> igt_assert_eq(ret, 0);
>>>> @@ -768,7 +770,7 @@ static void run_test(data_t *data, void
>> (*test)(data_t *))
>>>> continue;
>>>> test_init(data);
>>>> - data->allow_fail = false;
>>>> + data->allow_fail = true;
>>>> data->modifier = default_modifier(data);
>>>> igt_dynamic_f("pipe-%s-%s", kmstest_pipe_name(data-
>>> pipe), data->output->name) {
>>>> /*
More information about the igt-dev
mailing list