[PATCH v5 10/11] drm/xe/configfs: Only allow configurations for supported devices

Michal Wajdeczko michal.wajdeczko at intel.com
Wed Jul 30 19:57:01 UTC 2025



On 7/30/2025 8:28 PM, Rodrigo Vivi wrote:
> On Wed, Jul 30, 2025 at 10:12:06AM -0700, John Harrison wrote:
>> On 7/30/2025 9:30 AM, Rodrigo Vivi wrote:
>>> On Wed, Jul 30, 2025 at 10:23:22AM +0200, Michal Wajdeczko wrote:
>>>> On 7/30/2025 12:20 AM, John Harrison wrote:
>>>>> On 7/29/2025 2:40 PM, Rodrigo Vivi wrote:
>>>>>> On Tue, Jul 29, 2025 at 01:42:14PM +0200, Michal Wajdeczko wrote:
>>>>>>> Since we already lookup for the real PCI device before we allow
>>>>>>> to create its directory config, we might also check if the found
>>>>>>> device matches our driver PCI ID list. This will prevent creation
>>>>>>> of the directory configs for the unsupported devices.
>>>>>>>
>>>>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>>>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>>>>>> ---
>>>>>>> v2: rebased
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/xe/xe_configfs.c | 17 +++++++++++++++++
>>>>>>>    1 file changed, 17 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
>>>>>>> index 5f145ccdf535..766775772eef 100644
>>>>>>> --- a/drivers/gpu/drm/xe/xe_configfs.c
>>>>>>> +++ b/drivers/gpu/drm/xe/xe_configfs.c
>>>>>>> @@ -275,11 +275,22 @@ static const struct config_item_type xe_config_device_type = {
>>>>>>>        .ct_owner    = THIS_MODULE,
>>>>>>>    };
>>>>>>>    +static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
>>>>>>> +{
>>>>>>> +    struct device_driver *driver = driver_find("xe", &pci_bus_type);
>>>>>>> +    struct pci_driver *drv = to_pci_driver(driver);
>>>>>>> +    const struct pci_device_id *ids = drv ? drv->id_table : NULL;
>>>>>>> +    const struct pci_device_id *found = pci_match_id(ids, pdev);
>>>>>>> +
>>>>>>> +    return found ? (const void *)found->driver_data : NULL;
>>>>>>> +}
>>>>>>> +
>>>>>>>    static struct config_group *xe_config_make_device_group(struct config_group *group,
>>>>>>>                                const char *name)
>>>>>>>    {
>>>>>>>        unsigned int domain, bus, slot, function;
>>>>>>>        struct xe_config_group_device *dev;
>>>>>>> +    const struct xe_device_desc *match;
>>>>>>>        struct pci_dev *pdev;
>>>>>>>        char canonical[16];
>>>>>>>        int ret;
>>>>>>> @@ -297,8 +308,14 @@ static struct config_group *xe_config_make_device_group(struct config_group *gro
>>>>>>>        pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(slot, function));
>>>>>>>        if (!pdev)
>>>>>>>            return ERR_PTR(-ENODEV);
>>>>>>> +
>>>>>>> +    match = xe_match_desc(pdev);
>>>>>>> +
>>>>>>>        pci_dev_put(pdev);
>>>>>>>    +    if (!match)
>>>>>>> +        return ERR_PTR(-ECANCELED);
>>>>>> why ECANCELED instead of ENODEV?
>>>>> Or EINVAL? This is for when a user has created a directory for a PCI device that exists but it is not a Xe device? Seems like ENODEV is not the correct option given that the device does exist. But ECANCELED also seems odd.
>>>> we can always return -EINVAL but user will have no clue what went wrong
>>>>
>>>> so the idea was:
>>>>
>>>> -EINVAL     when config directory name is malformed
>>>> -ENODEV     config directory name is correct but
>>>>              there is no such device under such BDF
>>>> -ECANCELED  when device exists but it is not covered by Xe driver
>>>>
>>>> since you both don't like -ECANCELED for this case, what about:
>>>>
>>>> -ECONNREFUSED /* Connection refused */
>>>> -EOPNOTSUPP   /* Operation not supported on transport endpoint */
>>>> -ENXIO        /* No such device or address */
>>> What about -ENODEV, but with debug messages to differentiate the cases?
>>>
>>> Otherwise -ENXIO seems the best of the alternatives here.
>> I'm confused by what is wrong with EINVAL. Are there many other paths that
>> can return EINVAL for this operation? If not then how is any other return
>> code any different from a user confusion point of view? But yes, I think at
>> least a debug message to say "requested device is not supported by this
>> driver" would be good.
> 
> to me it looks more like a 'no device' than 'invalid device', hence my
> thought on the enodev, but no strong/hard preference to be honest.
> and we are definitely on the same page that there
> shouldn't be nothing wrong with reusing the error code and printing the
> debug message like 'requested device is not supported by this driver'...

looking at pci/iov.c where some diag messages are given:

	/* is PF driver loaded */
	if (!pdev->driver) {
		pci_info(pdev, "no driver bound to device; cannot configure SR-IOV\n");
		ret = -ENOENT;
		goto exit;
	}

	/* is PF driver loaded w/callback */
	if (!pdev->driver->sriov_configure) {
		pci_info(pdev, "driver does not support SR-IOV configuration via sysfs\n");
		ret = -ENOENT;
		goto exit;
	}

so maybe we should use -ENOENT

$ sudo mkdir /sys/kernel/config/xe/0000:00:07.0
mkdir: cannot create directory ‘/sys/kernel/config/xe/0000:00:07.0’: No such file or directory

and print message using pci_info() like this

pcieport 0000:00:07.0: xe driver does not support configuration of this device


> 
>>
>> John.
>>
>>
>>>>> John.
>>>>>
>>>>>>> +
>>>>>>>        dev = kzalloc(sizeof(*dev), GFP_KERNEL);
>>>>>>>        if (!dev)
>>>>>>>            return ERR_PTR(-ENOMEM);
>>>>>>> -- 
>>>>>>> 2.47.1
>>>>>>>
>>



More information about the Intel-xe mailing list