[PATCH v5 10/11] drm/xe/configfs: Only allow configurations for supported devices
John Harrison
john.c.harrison at intel.com
Wed Jul 30 20:10:59 UTC 2025
On 7/30/2025 1:08 PM, Rodrigo Vivi wrote:
> On Wed, Jul 30, 2025 at 09:57:01PM +0200, Michal Wajdeczko wrote:
>> On 7/30/2025 8:28 PM, Rodrigo Vivi wrote:
>>> On Wed, Jul 30, 2025 at 10:12:06AM -0700, John Harrison wrote:
>>>> On 7/30/2025 9:30 AM, Rodrigo Vivi wrote:
>>>>> On Wed, Jul 30, 2025 at 10:23:22AM +0200, Michal Wajdeczko wrote:
>>>>>> On 7/30/2025 12:20 AM, John Harrison wrote:
>>>>>>> On 7/29/2025 2:40 PM, Rodrigo Vivi wrote:
>>>>>>>> On Tue, Jul 29, 2025 at 01:42:14PM +0200, Michal Wajdeczko wrote:
>>>>>>>>> Since we already lookup for the real PCI device before we allow
>>>>>>>>> to create its directory config, we might also check if the found
>>>>>>>>> device matches our driver PCI ID list. This will prevent creation
>>>>>>>>> of the directory configs for the unsupported devices.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>>>>>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>>>>>>>> ---
>>>>>>>>> v2: rebased
>>>>>>>>> ---
>>>>>>>>> drivers/gpu/drm/xe/xe_configfs.c | 17 +++++++++++++++++
>>>>>>>>> 1 file changed, 17 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
>>>>>>>>> index 5f145ccdf535..766775772eef 100644
>>>>>>>>> --- a/drivers/gpu/drm/xe/xe_configfs.c
>>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_configfs.c
>>>>>>>>> @@ -275,11 +275,22 @@ static const struct config_item_type xe_config_device_type = {
>>>>>>>>> .ct_owner = THIS_MODULE,
>>>>>>>>> };
>>>>>>>>> +static const struct xe_device_desc *xe_match_desc(struct pci_dev *pdev)
>>>>>>>>> +{
>>>>>>>>> + struct device_driver *driver = driver_find("xe", &pci_bus_type);
>>>>>>>>> + struct pci_driver *drv = to_pci_driver(driver);
>>>>>>>>> + const struct pci_device_id *ids = drv ? drv->id_table : NULL;
>>>>>>>>> + const struct pci_device_id *found = pci_match_id(ids, pdev);
>>>>>>>>> +
>>>>>>>>> + return found ? (const void *)found->driver_data : NULL;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> static struct config_group *xe_config_make_device_group(struct config_group *group,
>>>>>>>>> const char *name)
>>>>>>>>> {
>>>>>>>>> unsigned int domain, bus, slot, function;
>>>>>>>>> struct xe_config_group_device *dev;
>>>>>>>>> + const struct xe_device_desc *match;
>>>>>>>>> struct pci_dev *pdev;
>>>>>>>>> char canonical[16];
>>>>>>>>> int ret;
>>>>>>>>> @@ -297,8 +308,14 @@ static struct config_group *xe_config_make_device_group(struct config_group *gro
>>>>>>>>> pdev = pci_get_domain_bus_and_slot(domain, bus, PCI_DEVFN(slot, function));
>>>>>>>>> if (!pdev)
>>>>>>>>> return ERR_PTR(-ENODEV);
>>>>>>>>> +
>>>>>>>>> + match = xe_match_desc(pdev);
>>>>>>>>> +
>>>>>>>>> pci_dev_put(pdev);
>>>>>>>>> + if (!match)
>>>>>>>>> + return ERR_PTR(-ECANCELED);
>>>>>>>> why ECANCELED instead of ENODEV?
>>>>>>> Or EINVAL? This is for when a user has created a directory for a PCI device that exists but it is not a Xe device? Seems like ENODEV is not the correct option given that the device does exist. But ECANCELED also seems odd.
>>>>>> we can always return -EINVAL but user will have no clue what went wrong
>>>>>>
>>>>>> so the idea was:
>>>>>>
>>>>>> -EINVAL when config directory name is malformed
>>>>>> -ENODEV config directory name is correct but
>>>>>> there is no such device under such BDF
>>>>>> -ECANCELED when device exists but it is not covered by Xe driver
>>>>>>
>>>>>> since you both don't like -ECANCELED for this case, what about:
>>>>>>
>>>>>> -ECONNREFUSED /* Connection refused */
>>>>>> -EOPNOTSUPP /* Operation not supported on transport endpoint */
>>>>>> -ENXIO /* No such device or address */
>>>>> What about -ENODEV, but with debug messages to differentiate the cases?
>>>>>
>>>>> Otherwise -ENXIO seems the best of the alternatives here.
>>>> I'm confused by what is wrong with EINVAL. Are there many other paths that
>>>> can return EINVAL for this operation? If not then how is any other return
>>>> code any different from a user confusion point of view? But yes, I think at
>>>> least a debug message to say "requested device is not supported by this
>>>> driver" would be good.
>>> to me it looks more like a 'no device' than 'invalid device', hence my
>>> thought on the enodev, but no strong/hard preference to be honest.
>>> and we are definitely on the same page that there
>>> shouldn't be nothing wrong with reusing the error code and printing the
>>> debug message like 'requested device is not supported by this driver'...
>> looking at pci/iov.c where some diag messages are given:
>>
>> /* is PF driver loaded */
>> if (!pdev->driver) {
>> pci_info(pdev, "no driver bound to device; cannot configure SR-IOV\n");
>> ret = -ENOENT;
>> goto exit;
>> }
>>
>> /* is PF driver loaded w/callback */
>> if (!pdev->driver->sriov_configure) {
>> pci_info(pdev, "driver does not support SR-IOV configuration via sysfs\n");
>> ret = -ENOENT;
>> goto exit;
>> }
>>
>> so maybe we should use -ENOENT
>>
>> $ sudo mkdir /sys/kernel/config/xe/0000:00:07.0
>> mkdir: cannot create directory ‘/sys/kernel/config/xe/0000:00:07.0’: No such file or directory
>>
>> and print message using pci_info() like this
>>
>> pcieport 0000:00:07.0: xe driver does not support configuration of this device
> works for me.
Likewise. Ship it.
John.
>
>
>>
>>>> John.
>>>>
>>>>
>>>>>>> John.
>>>>>>>
>>>>>>>>> +
>>>>>>>>> dev = kzalloc(sizeof(*dev), GFP_KERNEL);
>>>>>>>>> if (!dev)
>>>>>>>>> return ERR_PTR(-ENOMEM);
>>>>>>>>> --
>>>>>>>>> 2.47.1
>>>>>>>>>
More information about the Intel-xe
mailing list