[Intel-xe] [PATCH i-g-t] test/xe_pm: Assert on unbounded VSP

Gupta, Anshuman anshuman.gupta at intel.com
Mon Apr 24 13:50:00 UTC 2023



> -----Original Message-----
> From: Rodrigo Vivi <rodrigo.vivi at kernel.org>
> Sent: Monday, April 24, 2023 5:52 PM
> To: Gupta, Anshuman <anshuman.gupta at intel.com>
> Cc: Chehab, Mauro <mauro.chehab at intel.com>; intel-
> xe at lists.freedesktop.org; Vivi, Rodrigo <rodrigo.vivi at intel.com>
> Subject: Re: [Intel-xe] [PATCH i-g-t] test/xe_pm: Assert on unbounded VSP
> 
> On Fri, Apr 21, 2023 at 02:30:42PM +0000, Gupta, Anshuman wrote:
> >
> >
> > > -----Original Message-----
> > > From: Rodrigo Vivi <rodrigo.vivi at kernel.org>
> > > Sent: Friday, April 21, 2023 7:25 PM
> > > To: Gupta, Anshuman <anshuman.gupta at intel.com>
> > > Cc: intel-xe at lists.freedesktop.org; Chehab, Mauro
> > > <mauro.chehab at intel.com>; Vivi, Rodrigo <rodrigo.vivi at intel.com>
> > > Subject: Re: [Intel-xe] [PATCH i-g-t] test/xe_pm: Assert on
> > > unbounded VSP
> > >
> > > On Fri, Apr 21, 2023 at 03:45:11PM +0530, Anshuman Gupta wrote:
> > > > Discrete GFX cards gfx may have multiple PCIe endpoints, they
> > > > connects to root port via pcie upstream switch port(USP) and
> > > > virtual pcie switch port(VSP), sometimes VSP pcie devices doesn't
> > > > bind to pcieport driver. Without pcieport driver pcie PM comes
> > > > without any warranty and with unbounded VSP runtime pm and system
> > > > wide suspend/resume IGT
> > > test
> > > > cases may get pass as false positive.
> > >
> > > I'm not sure if I'm following here... you mean that in cases like
> > > this we cannot suspend/resume at all? What are the false positive?
> > > I'm not following the whole scenario here...
> > Recently Linux core kernel revert the L1 substate save/restore feature
> from Linux.
> > That eventually requires L1 substate to be  disable from BIOS,
> > otherwise s2idle fails on a HOST with DG2 card as DG2 was not able to exit
> form L1 substate. But due to the unbounded VSP our CI missed to report
> such issue.
> > As VSP were unbounded and DG2 Link was not transitioning to low power
> > Link State and our tests were passing instead of failing.
> 
> But then on this case we are just avoiding the test now?
> Why don't we cherry-pick the linux core kernel fix as a temporary topic/core-
> for-CI instead of blocking the test?
We don't have any fix yet, Atm Linux core Kernel does not support L1 substate during Suspend/Resume.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pcie/aspm.c?id=a7152be79b627428c628da2a887ca4b2512a78fd
> 
> Or why are we not handling this case in kernel side then?
> Why block even s3 tests if this is only about the s2idle?
This is about D0/D3 and Link state for our discrete graphics card not directly related to s2idle/S3.
Due to unbounded VSP, PM core will always keep the bridge devices in D0, L0 state,
There is no point to test D3 and Suspend/Resume with unbounded VSP.
Unbounded VSP was the one of reason on i915 that we missed runtime pm wakeref before accessing lmem. 

Br,
Anshuman Gupta.

> 
> >
> > Br,
> > Anshuman Gupta.
> > >
> > > >
> > > > Therefore assert the xe_pm igt tests on unbounded VSP.
> > > >
> > > > Signed-off-by: Anshuman Gupta <anshuman.gupta at intel.com>
> > > > ---
> > > >  tests/xe/xe_pm.c | 11 ++++++++++-
> > > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/tests/xe/xe_pm.c b/tests/xe/xe_pm.c index
> > > > 23b8246ed..78d9374cf 100644
> > > > --- a/tests/xe/xe_pm.c
> > > > +++ b/tests/xe/xe_pm.c
> > > > @@ -33,6 +33,7 @@
> > > >  typedef struct {
> > > >  	int fd_xe;
> > > >  	struct pci_device *pci_xe;
> > > > +	struct pci_device *pci_parent;
> > > >  	struct pci_device *pci_root;
> > > >  } device_t;
> > > >
> > > > @@ -344,6 +345,7 @@ NULL));
> > > >  igt_main
> > > >  {
> > > >  	struct drm_xe_engine_class_instance *hwe;
> > > > +	struct drm_xe_query_config *config;
> > > >  	device_t device;
> > > >  	char d3cold_allowed[2];
> > > >  	const struct s_state {
> > > > @@ -369,8 +371,15 @@ igt_main
> > > >  		device.fd_xe = drm_open_driver(DRIVER_XE);
> > > >  		device.pci_xe = igt_device_get_pci_device(device.fd_xe);
> > > >  		device.pci_root =
> > > igt_device_get_pci_root_port(device.fd_xe);
> > > > +		device.pci_parent =
> > > pci_device_get_parent_bridge(device.pci_xe);
> > > >
> > > > -		xe_device_get(device.fd_xe);
> > > > +		config = xe_device_get(device.fd_xe)->config;
> > > > +
> > > > +		if (config->info[XE_QUERY_CONFIG_FLAGS] &
> > > XE_QUERY_CONFIG_FLAGS_HAS_VRAM) {
> > > > +			igt_assert(device.pci_parent);
> > > > +
> > > 	igt_assert_f(!pci_device_has_kernel_driver(device.pci_parent),
> > > > +				     "pci bridge device does not bind with
> > > pcieport driver\n");
> > > > +		}
> > > >
> > > >  		/* Always perform initial once-basic exec checking for health
> > > */
> > > >  		xe_for_each_hw_engine(device.fd_xe, hwe)
> > > > --
> > > > 2.25.1
> > > >


More information about the Intel-xe mailing list