[i-g-t V2 1/2] lib/drmtest: Ensure the XE driver is idle before starting a test

Matthew Brost matthew.brost at intel.com
Thu Jul 25 16:35:16 UTC 2024


On Thu, Jul 25, 2024 at 03:45:19PM +0200, Bernatowicz, Marcin wrote:
> 
> 
> On 7/25/2024 6:18 AM, Bhanuprakash Modem wrote:
> > Re-use the existing i915's exit handler to make sure that the
> > XE driver is idle before starting the subtest.
> > 
> > V2:
> >   - Add some delay after attempting the gt reset
> >   - Cover drm render device path too
> > 
> > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/294
> > Cc: Matthew Brost <matthew.brost at intel.com>
> > Cc: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> > Cc: Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>
> > Signed-off-by: Bhanuprakash Modem <bhanuprakash.modem at intel.com>
> > Reviewed-by: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> > ---
> >   lib/drmtest.c | 24 +++++++++++++++---------
> >   1 file changed, 15 insertions(+), 9 deletions(-)
> > 
> > diff --git a/lib/drmtest.c b/lib/drmtest.c
> > index f8810da43..90885ec36 100644
> > --- a/lib/drmtest.c
> > +++ b/lib/drmtest.c
> > @@ -64,6 +64,7 @@
> >   #include "intel_reg.h"
> >   #include "ioctl_wrappers.h"
> >   #include "igt_dummyload.h"
> > +#include "xe/xe_gt.h"
> >   #include "xe/xe_query.h"
> >   /**
> > @@ -663,12 +664,17 @@ static void __cancel_work_at_exit(int fd)
> >   {
> >   	igt_terminate_spins(); /* for older kernels */
> > -	igt_params_set(fd, "reset", "%u", -1u /* any method */);
> > -	igt_drop_caches_set(fd,
> > -			    /* cancel everything */
> > -			    DROP_RESET_ACTIVE | DROP_RESET_SEQNO |
> > -			    /* cleanup */
> > -			    DROP_ACTIVE | DROP_RETIRE | DROP_IDLE | DROP_FREED);
> > +	if (is_xe_device(fd)) {
> > +		xe_force_gt_reset_all(fd);
> 
> It looks quite invasive; will we lose the GuC log?
> 
> > +		sleep(1);
> 
> How was the 1 second delay selected? Is it reliable enough?
> 

Definitely don't do this. Forcing a reset is not a great idea, and
sleeping is a horrible idea, as every test is going to run much slower.
I depend on tests running fast to be able to quickly test my code.

The question is also why this is required. A driver is built around
process isolation, so if the GPU not being idle before the test causes
issues, it means one of two things.

1. We have a bug in the KMD
2. The test is very poorly written

Idling the GPU will simply paper over the above issues.

It is quite troubling that there are RBs on this too.

Matt

> > +	} else {
> > +		igt_params_set(fd, "reset", "%u", -1u /* any method */);
> > +		igt_drop_caches_set(fd,
> > +				    /* cancel everything */
> > +				    DROP_RESET_ACTIVE | DROP_RESET_SEQNO |
> > +				    /* cleanup */
> > +				    DROP_ACTIVE | DROP_RETIRE | DROP_IDLE | DROP_FREED);
> > +	}
> >   }
> >   static void cancel_work_at_exit(int sig)
> > @@ -716,11 +722,11 @@ int drm_open_driver(int chipset)
> >   	igt_skip_on_f(fd<0, "No known gpu found for chipset flags 0x%u (%s)\n",
> >   		      chipset, chipset_to_str(chipset));
> > -	/* For i915, at least, we ensure that the driver is idle before
> > +	/* For i915 & xe, at least, we ensure that the driver is idle before
> >   	 * starting a test and we install an exit handler to wait until
> >   	 * idle before quitting.
> >   	 */
> > -	if (is_i915_device(fd)) {
> > +	if (is_intel_device(fd)) {
> >   		if (__sync_fetch_and_add(&open_count, 1) == 0) {
> >   			__cancel_work_at_exit(fd);
> >   			at_exit_drm_fd = drm_reopen_driver(fd);
> > @@ -836,7 +842,7 @@ int drm_open_driver_render(int chipset)
> >   		return fd;
> >   	at_exit_drm_render_fd = drm_reopen_driver(fd);
> > -	if (chipset & DRIVER_INTEL) {
> > +	if (chipset & (DRIVER_INTEL | DRIVER_XE)) {
> >   		__cancel_work_at_exit(fd);
> >   		igt_install_exit_handler(cancel_work_at_exit_render);
> >   	}

