[igt-dev] [PATCH i-g-t 10/10] KUnit: replace abort with graceful skip

Tue Jun 13 14:25:01 UTC 2023

On Tuesday, 13 June 2023 15:34:00 CEST Mauro Carvalho Chehab wrote:
> On Tue, 13 Jun 2023 13:15:50 +0200
> Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com> wrote:
> 
> > On Tuesday, 13 June 2023 11:58:47 CEST Mauro Carvalho Chehab wrote:
> > > On Tue, 13 Jun 2023 11:11:00 +0200
> > > Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com> wrote:
> > >   
> > > > On Tuesday, 13 June 2023 10:41:03 CEST Mauro Carvalho Chehab wrote:  
> > > > > On Tue, 13 Jun 2023 09:27:26 +0200
> > > > > Dominik Karol Piatkowski <dominik.karol.piatkowski at intel.com> wrote:
> > > > >     
> > > > > > Sample drm_buddy output with missing requirements:
> > > > > > 	Starting subtest: drm_buddy_test
> > > > > > 	(drm_buddy:32218) igt_kmod-WARNING: Unable to load KUnit
> > > > > > 	Subtest drm_buddy_test: SKIP (0.001s)
> > > > > > 
> > > > > > Signed-off-by: Dominik Karol Piątkowski     
> > > > <dominik.karol.piatkowski at intel.com>  
> > > > > > Cc: Janusz Krzysztofik <janusz.krzysztofik at linux.intel.com>
> > > > > > Cc: Mauro Carvalho Chehab <mauro.chehab at linux.intel.com>
> > > > > > ---
> > > > > >  lib/igt_kmod.c | 20 +++++++++++++-------
> > > > > >  1 file changed, 13 insertions(+), 7 deletions(-)
> > > > > > 
> > > > > > diff --git a/lib/igt_kmod.c b/lib/igt_kmod.c
> > > > > > index 1511bdef4..04220f404 100644
> > > > > > --- a/lib/igt_kmod.c
> > > > > > +++ b/lib/igt_kmod.c
> > > > > > @@ -763,27 +763,27 @@ static void __igt_kunit(const char *module_name,     
> > > > const char *opts)  
> > > > > >  	int ret;
> > > > > >  	struct ktap_test_results *results;
> > > > > >  	struct ktap_test_results_element *temp;
> > > > > > +	bool skip = false;
> > > > > >  
> > > > > >  	/* get normalized module name */
> > > > > >  	if (igt_ktest_init(&tst, module_name) != 0) {
> > > > > > -		igt_warn("Unable to initialize ktest for %s\n",     
> > > > module_name);  
> > > > > > -		igt_fail(IGT_EXIT_SKIP);
> > > > > > +		igt_skip("Unable to initialize ktest for %s\n",     
> > > > module_name);  
> > > > > >  	}
> > > > > >  
> > > > > >  	if (igt_ktest_begin(&tst) != 0) {
> > > > > > -		igt_warn("Unable to begin ktest for %s\n", module_name);
> > > > > > -
> > > > > >  		igt_ktest_fini(&tst);
> > > > > > -		igt_fail(IGT_EXIT_SKIP);
> > > > > > +		igt_skip("Unable to begin ktest for %s\n", module_name);
> > > > > >  	}
> > > > > >  
> > > > > >  	if (tst.kmsg < 0) {
> > > > > >  		igt_warn("Could not open /dev/kmsg\n");
> > > > > > +		skip = true;    
> > > > > 
> > > > > Hmm.... not being able to read from /dev/kmsg doesn't seem a reason for
> > > > > skipping the test.
> > > > > 
> > > > > IMO, the test should be skipped only if:
> > > > > 
> > > > > 1. The Kernel is not compiled with KUnit support;
> > > > > 2. The KUnit test was not compiled.
> > > > > 
> > > > > Errors for all other conditions should be considered as failures.
> > > > > 
> > > > > Now, I'm not sure what would be the proper way to test for such
> > > > > skip condition. I suspect you could check the error code at
> > > > > igt_kmod_load(), calling igt_skip() if it returns -ENOENT,
> > > > > e. g. on this part of __igt_kunit():
> > > > > 
> > > > >         /* The KUnit module is required for running any KUnit tests */
> > > > >         if (igt_kmod_load("kunit", NULL) != 0 ||
> > > > >             kmod_module_new_from_name(kmod_ctx(), "kunit", &kunit_kmod) !=     
> > > > 0) {  
> > > > >                 igt_warn("Unable to load KUnit\n");
> > > > >                 igt_fail(IGT_EXIT_FAILURE);
> > > > >         }
> > > > > 
> > > > >         is_builtin = kmod_module_get_initstate(kunit_kmod) ==     
> > > > KMOD_MODULE_BUILTIN;  
> > > > > 
> > > > >         results = ktap_parser_start(f, is_builtin);
> > > > > 
> > > > >         if (igt_kmod_load(module_name, opts) != 0) {
> > > > >                 igt_warn("Unable to load %s module\n", module_name);
> > > > >                 ret = ktap_parser_stop();
> > > > >                 igt_fail(IGT_EXIT_FAILURE);
> > > > >         }
> > > > > 
> > > > > 
> > > > > This could be replaced by something similar to:
> > > > > 
> > > > > 	err = igt_kmod_load("kunit", NULL);
> > > > > 	igt_skip_on(err == ENOENT, "KUnit not supported");
> > > > > 	igt_fail_on_f(err != 0, "Error %d when trying to load kunit", err);
> > > > >         if (kmod_module_new_from_name(kmod_ctx(), "kunit", &kunit_kmod) !=     
> > > > 0) {  
> > > > >                 igt_warn("Unable to load KUnit\n");
> > > > >                 igt_fail(IGT_EXIT_FAILURE);
> > > > >         }
> > > > > 
> > > > > 
> > > > >         is_builtin = kmod_module_get_initstate(kunit_kmod) ==     
> > > > KMOD_MODULE_BUILTIN;  
> > > > > 
> > > > >         results = ktap_parser_start(f, is_builtin);
> > > > > 
> > > > >         err = igt_kmod_load(module_name, opts) 
> > > > > 	igt_skip_on(err == ENOENT, "Module %s not found", module_name);
> > > > > 	if (err) {
> > > > >                 igt_warn("Error %d when trying to load %s module\n", err,     
> > > > module_name);  
> > > > >                 ret = ktap_parser_stop();
> > > > >                 igt_fail(IGT_EXIT_FAILURE);
> > > > >         }
> > > > > 
> > > > > 
> > > > > In order to explicitly check for (1) and (2).    
> > > > 
> > > > Why do we reinvent the wheel instead of following the pattern from 
> > > > igt_kselftest()?   
> > > 
> > > Nobody is talking about reinventing the wheel. Selftests and KUnit tests
> > > are executed on different ways.
> > > 
> > > For selftests, the same driver that was loaded before needs to be unloaded
> > > and re-probed with a different argument.
> > > 
> > > For KUnit tests, a different module will be loaded, if the Kernel was
> > > compiled with KUnit support for a given test suite.
> > > 
> > > Looking at KUnit modprobe time, if a module doesn't load because the file
> > > doesn't exist (-ENOENT error), it means that:
> > > 
> > > - for KUnit: the KUnit module was not compiled.
> > > - for selftests: this is really unlikely to happen, as all other
> > >   previous IGT tests depend on having the driver module loaded.
> > >   So, no need for any explicit check for this specific condition.
> > >   
> > > > We used to follow it, before the whole igt_kunit() body was 
> > > > put under igt_subtest_with_dynamic() (patch 6/8 of v5, then squashed with 5/5, 
> > > > IIRC).  That forced us to introduce a series of changes inside igt_kunit(), 
> > > > which we now contest.  Was that really necessary?  
> > > 
> > > This has nothing to do with using (or not) a dynamic IGT subtest.  
> > 
> > It has.  As soon as you enter an igt_subtest or igt_subtest_with_dynamic 
> > section, your inform users (CI) that the subtest execution started.  And, you 
> > can't call return, only igt_success(), or igt_skip()/igt_fail() and friends, 
> > form inside, while you can before.
> 
> Well, igt selftest excecution also calls igt_subtest_with_dynamic().
> However, on kunit, the tests start running during modprobe time, 

The same as it was happening on modules that implemented the old DRM i915-
alike selftests, and still happens on i915 selftests.  AFAICT, kunit modules 
are not any different in that aspect.

> so
> igt_subtest_with_dynamic() needs to be called before that. Maybe some
> validations could happen earlier, but the most important one (checking
> that the module exists) happens when the module is probed, which is the
> same call that starts the test.
> 
> > 
> > The DRM selftests we are going to update are now using igt_kselftest(), which 
> > calls return on some initial conditions not met, before the 
> > igt_subtest_with_dynamic section is entered.
> 
> No. Since 2022 - specifically, since those commits:
> 
> 	- 932da861956a ("drm: selftest: convert drm_buddy selftest to KUnit")
> 	- fc8d29e298cf ("drm: selftest: convert drm_mm selftest to KUnit")
> 
> those tests don't work anymore, as upstream removed support for selftest,
> in favor of KUnit. So, those tests currently do nothing.

In my opinion -- no, those tests currently skip, which is different from doing 
nothing.

> 
> We are changing because those tests are currently broken on IGT. 

In my opinion -- no, they are not broken, they are outdated.

> That
> is the reason why we need KUnit support for such tests.
> 
> > CI expects that behavior, users 
> > are used to it.  We shouldn't change that only because we switch from an old 
> > underlying kernel selftest format to a new shiny one, I believe, unless we are 
> > able to prove that there was something wrong with that former approach, or we 
> > can explain why it no longer fits into the kunit variant.
> 
> Yes, there's something wrong with the former approach: 

Are you saying that error handling in igt_ksleftest() is broken?  In my 
opinion, i915 selftests successfully executing on CI now and again, as well as 
the outdated DRM selftests successfully skipping, prove something different.

I believe that igt_kunit() that follows as closely as possible the pattern of 
how igt_kselftest() is handling unmet conditions and errors, with full respect 
to real differences between kunit and i915-like selftests, would be the best 
solution.

Thanks,
Janusz

> the tests are
> currently broken, as upstream moved on.
> 
> > 
> > > 
> > > It is related to how to report IGT errors while trying to run KUnit
> > > tests.
> > > 
> > > Without this patch, every kind of errors will produce a failure;  
> > 
> > Yes, that was wrong, and thank you Dominik for fixing this.  But plaese also 
> > note that the badness was mostly caused by ad-hoc corrections to igt_kunit() 
> > body, forced by putting it as a whole insinde igt_subtest_with_dynamic() 
> > section while it was not ready for that.  IOW, my objections don't apply to 
> > this patch specifically, but rather the the whole series that started from 
> > igt_kunit() trying to provide the same used experience as igt_ksleftest(), 
> > then that approach being destroyed with patches subsequently added to the 
> > series.
> > 
> > > after this patch, they'll all produce a skip.
> > > 
> > > IMO, what it is needed is to split KUnit errors on two different
> > > categories:
> > > 
> > > 1. "real" errors while running KUnit (like troubles reading dmesg);  
> > 
> > I agree that troubles reading dmesg are critical to kunit variants, while they 
> > were negligible for old selftests, hence their handling specifically must now 
> > be different, but for me such a difference is not a sufficient justification 
> > for killing the whole igt_ksleftest() error handling pattern and inventing a 
> > new one from scratch.
> > 
> > Thanks,
> > Janusz
> > 
> > > 2. the KUnit module to be modprobed doesn't exist.
> > > 
> > > for (1), igt_fail() is the proper error handling.
> > > 
> > > For (2), igt_skip() is the right error handling, as the Kernel under
> > > tests doesn't have what it is needed to run KUnit, as the module was
> > > not compiled.
> > > 
> > > Regards,
> > > Mauro
> > > 
> > > 
> > >   
> > 
> > 
> > 
> > 
>