[Intel-xe] [PATCH 1/5] drm/xe: Introduce new selftests framework

Thomas Hellström thomas.hellstrom at linux.intel.com
Fri Mar 17 07:20:46 UTC 2023


Hi,

On 3/16/23 07:46, Lucas De Marchi wrote:
> On Mon, Mar 13, 2023 at 07:18:34PM +0100, Michal Wajdeczko wrote:
>>
>>
>> On 13.03.2023 17:23, Lucas De Marchi wrote:
>>> On Mon, Mar 13, 2023 at 09:13:12AM +0100, Michal Wajdeczko wrote:
>>>> Our initial attempt to run Kunit based tests on live devices is hard
>>>> to use and will not work well with multiple devices attached.
>>>>
>>>> Instead of relying on separate test module and duplicating support
>>>> for multiple live devices in every test entry point, create new set
>>>> of helpers and use test->priv to pass reference of the device under
>>>> test (DUT) directly to every test function.
>>>>
>>>> The same use pattern is supported for both mock and live selftests.
>>>>
>>>> By using bus_notifier we can start testing each new bound device
>>>> separatelly, subject to filtering using dut_filter modparam.
>>>>
>>>> All test cases can be included directly from .c file, allowing test
>>>> code to access and use any private functions.
>>>>
>>>> As Kunit will attempt to run all suites on module load, including
>>>> our new live test suites, we may want to allow to execute them on
>>>> any available device. This is controlled by anonymous_dut modparam.
>>>>
>>>> Support for either mock or live selftests, only one at time, is
>>>> controlled by new Kconfig options.
>>>>
>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>> ---
>>>> drivers/gpu/drm/xe/Kconfig.debug              |  22 ++++
>>>> drivers/gpu/drm/xe/Makefile                   |  11 ++
>>>> drivers/gpu/drm/xe/selftests/xe_selftests.h   |  35 ++++++
>>>> .../gpu/drm/xe/selftests/xe_selftests_auto.c  |  46 +++++++
>>>> .../drm/xe/selftests/xe_selftests_executor.c  | 118 ++++++++++++++++++
>>>> .../drm/xe/selftests/xe_selftests_helpers.c   | 101 +++++++++++++++
>>>> .../drm/xe/selftests/xe_selftests_internal.h  |  22 ++++
>>>> .../gpu/drm/xe/selftests/xe_selftests_mock.c  |  32 +++++
>>>> .../drm/xe/selftests/xe_selftests_params.c    |  23 ++++
>>>> drivers/gpu/drm/xe/xe_module.c                |   2 +
>>>> 10 files changed, 412 insertions(+)
>>>> create mode 100644 drivers/gpu/drm/xe/selftests/xe_selftests.h
>>>> create mode 100644 drivers/gpu/drm/xe/selftests/xe_selftests_auto.c
>>>> create mode 100644 
>>>> drivers/gpu/drm/xe/selftests/xe_selftests_executor.c
>>>> create mode 100644 drivers/gpu/drm/xe/selftests/xe_selftests_helpers.c
>>>> create mode 100644 
>>>> drivers/gpu/drm/xe/selftests/xe_selftests_internal.h
>>>> create mode 100644 drivers/gpu/drm/xe/selftests/xe_selftests_mock.c
>>>> create mode 100644 drivers/gpu/drm/xe/selftests/xe_selftests_params.c
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/Kconfig.debug
>>>> b/drivers/gpu/drm/xe/Kconfig.debug
>>>> index 93b284cdd0a2..aeba43977dca 100644
>>>> --- a/drivers/gpu/drm/xe/Kconfig.debug
>>>> +++ b/drivers/gpu/drm/xe/Kconfig.debug
>>>> @@ -61,6 +61,28 @@ config DRM_XE_SIMPLE_ERROR_CAPTURE
>>>>
>>>>       If in doubt, say "N".
>>>>
>>>> +config DRM_XE_SELFTESTS_MOCK
>>>> +    bool "Mock selftests for Xe" if !KUNIT_ALL_TESTS
>>>> +    depends on DRM_XE && KUNIT
>>>> +    default KUNIT_ALL_TESTS
>>>> +    help
>>>> +        Choose this option to allow the driver to perform mock
>>>> +        selftests based on the Kunit framework.
>>>> +
>>>> +        Recommended for driver developers only.
>>>> +        If in doubt, say "N".
>>>> +
>>>> +config DRM_XE_SELFTESTS_LIVE
>>>> +    bool "Live selftests for Xe" if !DRM_XE_SELFTESTS_MOCK
>>>
>>> why?  I think we should aim to run live tests in *addition* to
>>> mock ones.  The mock tests are more testing our SW infra and can be 
>>> used
>>> in a true "unit test" sense. While the live ones are testing the
>>> integration with the HW and making sure they work together.
>>
>> I've added that just to simplify the code logic as I assumed we will be
>> running mock tests only in "true kunit" fashion as part of the build.
>>
>> So then we could avoid mixing mock/live "kunit suites" in single driver
>> module, especially that as of today there is no nice support to filter
>> out undesired suites from the execution. Thus adding compile time flag
>> to only have one category of selftests at time was looking promising.
>
> I'm using this in the patches I'm sending to unit test rtp and wa parts:
>
>     ./tools/testing/kunit/kunit.py run --raw_output  \
>                 --kunitconfig drivers/gpu/drm/xe/.kunitconfig xe_rtp
>
>>
>> So I was using .kunitconfig with
>>     CONFIG_DRM_XE_SELFTESTS_MOCK=y
>> and normal .config with
>>     CONFIG_DRM_XE_SELFTESTS_LIVE=y
>>
>>>
>>> I have changes pending to test some WA and RTP infra we added,
>>> regardless of the HW. But I don't want to have to choose either one or
>>> the other.
>>
>> And I guess you still can do that, just it wont be automatically
>> compiled out, nor you be able to benefit from the helpers, but they will
>> be executed
>>
>>>
>>> Also Cc Thomas as original author of the kunit integration.
>>>
>>> We also have some overloaded terms that need better explanation:
>>> mock, selftest, live.
>>
>> I reused terms from i915, assumed everyone is familiar with that.
>>
>> "mock selftest" - a kunit test suite to be run on abstract xe device
>> "live selftest" - a kunit test suite to be executed on real xe device
>
> Things I like on the current kunit integration we have in xe:
>
>     1) The ability to move unit test code out to a separate module
>     2) Not having a big infra that is xe-only
>     3) Share terminology with the rest of drm subsystem
>
> This patch series is basically killing (1) and (3). And for (2) it seems
> to be heading another direction.  Looking at the current available
> drivers,  I think we should be more like the drm tests and the vc4 tests
> rather than the i915 ones.  If we need more helpers we may probably
> extend the ones in DRM_KUNIT_TEST_HELPERS
>
> I'd love to hear others' opinions though.

I basically agree with Lucas here. Some points:

  * For mock selftests IMO we don't need anything on top of what we
    already have, except perhaps consolidate initialization of a mock
    device as a helper each test can optionally use.  I think, though,
    that Mauro's suggestion of coalescing multiple mock tests together
    in a single separate kernel module with multiple test suites to
    avoid too many separate kernel module might be worth considering if
    needed.
  * For live selftests it seems like where the big disconnect is. I
    still believe that we should have separate modules, and depending on
    how the IGT runner end up working have enough information in that
    module for IGT to execute the tests with the --run-subtest and
    --dynamic-subtest options.
  * Admiddedly the xe_for_each_device() helper is very limited, but
    couldn't we just make a more elaborate helper that can add
    functionality like filter out devices if needed? I'm thinking if we
    enforce a middle layer that tells each test what device it should be
    testing, things like testing PCI p2p dma between devices will become
    awkward (but still possible I guess).
  * Do we really need to support compiled-in live selftests that run a
    device on device binding? I guess the use-case isn't fully clear to me.
  * And yes the test parameters shouldn't be needing the test->priv, but
    neither should we need to pass the device here, right?

/Thomas





>
> Lucas De Marchi
>
>>
>> If you have better suggestions, then you're welcomed
>>
>>>
>>>> +    depends on DRM_XE && KUNIT && EXPERT
>>>> +    default n
>>>> +    help
>>>> +        Choose this option to allow the driver to perform live
>>>> +        selftests based on the Kunit framework.
>>>> +
>>>> +        Recommended for driver developers only.
>>>> +        If in doubt, say "N".
>>>> +
>>>> config DRM_XE_KUNIT_TEST
>>>>         tristate "KUnit tests for the drm xe driver" if 
>>>> !KUNIT_ALL_TESTS
>>>>     depends on DRM_XE && KUNIT && DEBUG_FS
>>>> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
>>>> index b94e65c91101..b193af2d9930 100644
>>>> --- a/drivers/gpu/drm/xe/Makefile
>>>> +++ b/drivers/gpu/drm/xe/Makefile
>>>> @@ -211,6 +211,17 @@ ifeq ($(CONFIG_DRM_FBDEV_EMULATION),y)
>>>>     xe-$(CONFIG_DRM_XE_DISPLAY) += display/intel_fbdev.o
>>>> endif
>>>>
>>>> +xe-$(CONFIG_DRM_XE_SELFTESTS_MOCK) += \
>>>> +    selftests/xe_selftests_mock.o \
>>>> +
>>>> +xe-$(CONFIG_DRM_XE_SELFTESTS_LIVE) += \
>>>> +    selftests/xe_selftests_auto.o \
>>>> +    selftests/xe_selftests_executor.o \
>>>> +    selftests/xe_selftests_params.o \
>>>> +
>>>> +xe-$(CONFIG_KUNIT) += \
>>>> +    selftests/xe_selftests_helpers.o \
>>>> +
>>>> obj-$(CONFIG_DRM_XE) += xe.o
>>>> obj-$(CONFIG_DRM_XE_KUNIT_TEST) += tests/
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/selftests/xe_selftests.h
>>>> b/drivers/gpu/drm/xe/selftests/xe_selftests.h
>>>> new file mode 100644
>>>> index 000000000000..6b13b998e789
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/xe/selftests/xe_selftests.h
>>>> @@ -0,0 +1,35 @@
>>>> +/* SPDX-License-Identifier: MIT */
>>>> +/*
>>>> + * Copyright © 2023 Intel Corporation
>>>> + */
>>>> +
>>>> +#ifndef _XE_SELFTESTS_H_
>>>> +#define _XE_SELFTESTS_H_
>>>> +
>>>> +struct kunit_suite;
>>>> +struct kunit;
>>>> +
>>>> +int xe_selftests_run(const char *dev_name);
>>>> +
>>>> +#if IS_ENABLED(CONFIG_DRM_XE_SELFTESTS_LIVE)
>>>> +int init_xe_live_selftest_suite(struct kunit_suite *suite);
>>>> +#endif
>>>> +
>>>> +int init_xe_device_selftest(struct kunit *test);
>>>> +void exit_xe_device_selftest(struct kunit *test);
>>>> +
>>>> +int init_xe_gt_selftest(struct kunit *test);
>>>> +void exit_xe_gt_selftest(struct kunit *test);
>>>> +
>>>> +int init_xe_guc_selftest(struct kunit *test);
>>>> +void exit_xe_guc_selftest(struct kunit *test);
>>>> +
>>>> +#if IS_ENABLED(CONFIG_DRM_XE_SELFTESTS_LIVE)
>>>> +int xe_selftests_module_init(void);
>>>> +void xe_selftests_module_exit(void);
>>>> +#else
>>>> +static inline int xe_selftests_module_init(void) { return 0; }
>>>> +static inline void xe_selftests_module_exit(void) { }
>>>> +#endif
>>>> +
>>>> +#endif /* _XE_SELFTESTS_H_ */
>>>> diff --git a/drivers/gpu/drm/xe/selftests/xe_selftests_auto.c
>>>> b/drivers/gpu/drm/xe/selftests/xe_selftests_auto.c
>>>> new file mode 100644
>>>> index 000000000000..ef98f3599eac
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/xe/selftests/xe_selftests_auto.c
>>>> @@ -0,0 +1,46 @@
>>>> +// SPDX-License-Identifier: MIT
>>>> +/*
>>>> + * Copyright © 2023 Intel Corporation
>>>> + */
>>>> +
>>>> +#include <linux/device.h>
>>>> +#include <linux/glob.h>
>>>> +#include <linux/pci.h>
>>>> +
>>>> +#include "xe_drv.h"
>>>> +#include "selftests/xe_selftests.h"
>>>> +#include "selftests/xe_selftests_internal.h"
>>>> +
>>>> +static int bus_cb(struct notifier_block *nb, unsigned long action,
>>>> void *data)
>>>> +{
>>>> +    struct device *dev = data;
>>>> +    struct device_driver *drv = dev->driver;
>>>> +
>>>> +    if (action != BUS_NOTIFY_BOUND_DRIVER)
>>>> +        return 0;
>>>> +
>>>> +    if (strcmp(DRIVER_NAME, drv->name))
>>>> +        return 0;
>>>> +
>>>> +    /* only matching DUT names will be tested */
>>>> +    if (!dut_filter || !glob_match(dut_filter, dev_name(dev)))
>>>
>>> "DUT" name could be avoided and just use what exactly it means.
>>> When I see "DUT" I'm wondering... is this the platform name, the pci 
>>> ID,
>>> the pci slot, the device node?
>>
>> will rename to "dut_name_filter" do the trick ?
>>
>> or do you want to drop "dut" completely and have "device_name_filter" ?
>> but then we would need to add "selftest" or similar word somewhere to
>> give context and then full modparam name will no longer be that friendly
>>
>> Thanks,
>> Michal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-xe/attachments/20230317/4cc17c56/attachment-0001.htm>


More information about the Intel-xe mailing list