[igt-dev] [PATCH i-g-t] lib/igt_device_scan: Rescan pci properties if PCI_SLOT_NAME wasn't found

Petri Latvala petri.latvala at intel.com
Thu Aug 11 07:50:50 UTC 2022


On Thu, Aug 11, 2022 at 09:38:36AM +0200, Zbigniew Kempczyński wrote:
> On Wed, Aug 10, 2022 at 03:38:40PM +0300, Petri Latvala wrote:
> > On Wed, Aug 10, 2022 at 10:19:49AM +0200, Zbigniew Kempczyński wrote:
> > > References: https://gitlab.freedesktop.org/drm/intel/-/issues/6543
> > > 
> > > The above issue has no known reproduction path, so more data about the
> > > missing PCI_SLOT_NAME property is required.
> > > 
> > > What is extremely weird is that when PCI_SLOT_NAME is missing, udev still
> > > returns some properties, like:
> > > 
> > > [properties]
> > > DEVPATH                         : /devices/pci0000:00/0000:00:02.0
> > > DRIVER                          : i915
> > > PCI_CLASS                       : 30000
> > > PCI_ID                          : 8086:191E
> > > SUBSYSTEM                       : pci
> > > 
> > > To narrow down the problematic code when PCI_SLOT_NAME is missing, let's
> > > dump the kernel uevent file and retry scanning properties from udev. The
> > > retry path allows detecting whether udev returns the same list of properties.
> > > 
> > > The above doesn't fix the issue. It provides additional information about
> > > devices and their properties as reported by udev, especially when the
> > > missing PCI_SLOT_NAME strikes again. It also contains a warning which might
> > > detect a situation where two scans of the properties give different
> > > results.
> > > 
> > > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski at intel.com>
> > > Cc: Petri Latvala <petri.latvala at intel.com>
> > > ---
> > >  lib/igt_device_scan.c | 48 +++++++++++++++++++++++++++++++++++++------
> > >  1 file changed, 42 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/lib/igt_device_scan.c b/lib/igt_device_scan.c
> > > index d6fae0650c..c2ecfcd3d4 100644
> > > --- a/lib/igt_device_scan.c
> > > +++ b/lib/igt_device_scan.c
> > > @@ -568,28 +568,54 @@ static void dump_props_and_attrs(const struct igt_device *dev)
> > >  	printf("\n");
> > >  }
> > >  
> > > +static void dump_uevent_file(struct igt_device *dev)
> > > +{
> > > +	char filename[FILENAME_MAX];
> > > +	const char *devpath = get_prop(dev, "DEVPATH");
> > > +	char *line = NULL;
> > > +	FILE *in;
> > > +	size_t n;
> > > +
> > > +	igt_assert_f(devpath, "DEVPATH property doesn't exist\n");
> > > +	snprintf(filename, FILENAME_MAX, "/sys%s/uevent", devpath);
> > > +
> > > +	in = fopen(filename, "r");
> > > +	igt_assert(in);
> > > +
> > > +	printf("[uevent: %s]\n", filename);
> > > +	while (getline(&line, &n, in) >= 0)
> > > +		printf("%s", line);
> > 
> > Bleh, I was going to ask you to use igt_info instead so we get the log
> > buffer out in the correct order, but the file already has a lot of
> > printfs for lsgpu's sake. Let's consider that a cleanup TODO for
> > later...
> > 
> > Reviewed-by: Petri Latvala <petri.latvala at intel.com>
> > 
> > > +
> > > +	free(line);
> > > +	fclose(in);
> > > +}
> > > +
> > >  /*
> > >   * Get PCI_SLOT_NAME property, it should be in format of
> > >   * xxxx:yy:zz.z
> > >   */
> > > -static void set_pci_slot_name(struct igt_device *dev)
> > > +static bool set_pci_slot_name(struct igt_device *dev)
> > >  {
> > >  	const char *pci_slot_name = get_prop(dev, "PCI_SLOT_NAME");
> > >  	int len;
> > >  
> > >  	if (!pci_slot_name) {
> > >  		dump_props_and_attrs(dev);
> > > -		igt_assert_f(pci_slot_name, "PCI_SLOT_NAME property == NULL\n");
> > > +		igt_warn("PCI_SLOT_NAME property == NULL\n");
> > > +		dump_uevent_file(dev);
> > > +		return false;
> > >  	}
> > >  
> > >  	len = strlen(pci_slot_name);
> > >  	if (len != PCI_SLOT_NAME_SIZE) {
> > >  		dump_props_and_attrs(dev);
> > > -		igt_assert_f(len != PCI_SLOT_NAME_SIZE,
> > > -			     "PCI_SLOT_NAME length != %d [%s]\n", len, pci_slot_name);
> > > +		igt_warn("PCI_SLOT_NAME length != %d [%s]\n", len, pci_slot_name);
> > > +		dump_uevent_file(dev);
> > > +		return false;
> > >  	}
> > >  
> > >  	dev->pci_slot_name = strdup(pci_slot_name);
> > > +	return true;
> > >  }
> > >  
> > >  /*
> > > @@ -649,7 +675,16 @@ static struct igt_device *igt_device_new_from_udev(struct udev_device *dev)
> > >  		uint16_t vendor, device;
> > >  
> > >  		set_vendor_device(idev);
> > > -		set_pci_slot_name(idev);
> > > +
> > > +		/*
> > > +		 * Very rarely we observe that there's no PCI_SLOT_NAME property.
> > > +		 * We depend on it so retry acquiring properties from udev.
> > > +		 */
> > > +		if (!set_pci_slot_name(idev)) {
> > > +			g_hash_table_remove_all(idev->props_ht);
> > > +			get_props(dev, idev);
> > > +			igt_assert(set_pci_slot_name(idev));
> > > +		}
> > >  		get_pci_vendor_device(idev, &vendor, &device);
> > >  		idev->codename = __pci_codename(vendor, device);
> > >  		idev->dev_type = __pci_devtype(vendor, device, idev->pci_slot_name);
> > > @@ -1270,7 +1305,8 @@ igt_devs_print_detail(struct igt_list_head *view,
> > >  			_print_key_value("codename", dev->codename);
> > >  		}
> > >  
> > > -		dump_props_and_attrs(dev);
> > > +		if (is_pci_subsystem(dev))
> > > +			dump_props_and_attrs(dev);
> 
> This was accidentally left in the patch I sent (I had just narrowed debugging
> to the pci subsystem). Do you want me to resend, or can I just go back to the
> previous code and merge?

Up to you. Breakage potential very small I'd say.
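
For anyone skimming the thread later, the retry idea in the patch can be
sketched standalone roughly as below. This is only an illustration, not the
IGT code: fetch_slot_fn, slot_name_ok(), fetch_slot_with_retry(),
flaky_fetch() and SLOT_NAME_LEN are hypothetical names, the udev/glib calls
are replaced by a plain callback, and the slot name length is assumed to be
that of "0000:00:02.0".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed length of a slot name in xxxx:yy:zz.z format, e.g. "0000:00:02.0". */
#define SLOT_NAME_LEN 12

/* Hypothetical property source: returns the slot name, or NULL if missing. */
typedef const char *(*fetch_slot_fn)(void *ctx);

/* Warn and report failure instead of asserting, so the caller can retry. */
static bool slot_name_ok(const char *name)
{
	if (!name) {
		fprintf(stderr, "PCI_SLOT_NAME property == NULL\n");
		return false;
	}

	if (strlen(name) != SLOT_NAME_LEN) {
		fprintf(stderr, "PCI_SLOT_NAME length %zu != %d [%s]\n",
			strlen(name), SLOT_NAME_LEN, name);
		return false;
	}

	return true;
}

/* Fetch once and retry exactly once; returns NULL if both attempts fail. */
static const char *fetch_slot_with_retry(fetch_slot_fn fetch, void *ctx)
{
	const char *name;

	assert(fetch);

	name = fetch(ctx);
	if (slot_name_ok(name))
		return name;

	/* Second attempt: the source is asked again from scratch. */
	name = fetch(ctx);
	return slot_name_ok(name) ? name : NULL;
}

/* Stand-in for a flaky source: fails on the first call only. */
static const char *flaky_fetch(void *ctx)
{
	int *calls = ctx;

	return (*calls)++ == 0 ? NULL : "0000:00:02.0";
}
```

Calling fetch_slot_with_retry(flaky_fetch, &calls) with an int counter set to
0 goes through the warning path once and then returns the slot name. Unlike
the patch, which asserts on the second failure, this sketch returns NULL and
leaves that decision to the caller.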


-- 
Petri Latvala

