[PATCH v3 2/3] drm/xe: Expose PCIe Gen4 downspeed attributes

Lucas De Marchi lucas.demarchi at intel.com
Wed Apr 23 15:26:46 UTC 2025


On Wed, Apr 23, 2025 at 06:01:42PM +0300, Raag Jadav wrote:
>On Wed, Apr 23, 2025 at 08:35:42AM -0500, Lucas De Marchi wrote:
>> On Wed, Apr 23, 2025 at 02:02:21PM +0300, Raag Jadav wrote:
>> > On Wed, Apr 23, 2025 at 03:29:41PM +0530, Riana Tauro wrote:
>> > > On 4/23/2025 2:18 PM, Raag Jadav wrote:
>> > > > On Wed, Apr 23, 2025 at 10:55:30AM +0530, Riana Tauro wrote:
>> > > > > Hi Raag
>> > > > >
>> > > > > On 4/17/2025 4:42 PM, Raag Jadav wrote:
>> > > > > > Expose sysfs attributes for PCIe Gen4 downspeed capability and status.
>> > > > > >
>> > > > > > v2: Move from debugfs to sysfs (Lucas, Rodrigo, Badal)
>> > > > > >       Rework macros and their naming (Rodrigo)
>> > > > > > v3: Use sysfs_create_files() (Riana)
>> > > > > >       Fix checkpatch warning (Riana)
>> > > > > >
>> > > > > > Signed-off-by: Raag Jadav <raag.jadav at intel.com>
>> > > > > > ---
>> > > > > >    drivers/gpu/drm/xe/xe_device.c       |   5 ++
>> > > > > >    drivers/gpu/drm/xe/xe_device_sysfs.c | 101 +++++++++++++++++++++++++++
>> > > > > >    drivers/gpu/drm/xe/xe_device_sysfs.h |   1 +
>> > > > > >    drivers/gpu/drm/xe/xe_pcode_api.h    |   5 ++
>> > > > > >    4 files changed, 112 insertions(+)
>> > > > > >
>> > > > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>> > > > > > index 6c9d3009aa03..79b7b0ecfbae 100644
>> > > > > > --- a/drivers/gpu/drm/xe/xe_device.c
>> > > > > > +++ b/drivers/gpu/drm/xe/xe_device.c
>> > > > > > @@ -26,6 +26,7 @@
>> > > > > >    #include "xe_bo_evict.h"
>> > > > > >    #include "xe_debugfs.h"
>> > > > > >    #include "xe_devcoredump.h"
>> > > > > > +#include "xe_device_sysfs.h"
>> > > > > >    #include "xe_dma_buf.h"
>> > > > > >    #include "xe_drm_client.h"
>> > > > > >    #include "xe_drv.h"
>> > > > > > @@ -916,6 +917,10 @@ int xe_device_probe(struct xe_device *xe)
>> > > > > >    	if (err)
>> > > > > >    		goto err_unregister_display;
>> > > > > > +	err = xe_device_sysfs_init(xe);
>> > > > > > +	if (err)
>> > > > > > +		goto err_unregister_display;
>> > > > > > +
>> > > > > >    	xe_debugfs_register(xe);
>> > > > > >    	err = xe_hwmon_register(xe);
>> > > > > > diff --git a/drivers/gpu/drm/xe/xe_device_sysfs.c b/drivers/gpu/drm/xe/xe_device_sysfs.c
>> > > > > > index 2d25e5b5d4bf..923612a0a2e0 100644
>> > > > > > --- a/drivers/gpu/drm/xe/xe_device_sysfs.c
>> > > > > > +++ b/drivers/gpu/drm/xe/xe_device_sysfs.c
>> > > > > > @@ -11,6 +11,9 @@
>> > > > > >    #include "xe_device.h"
>> > > > > >    #include "xe_device_sysfs.h"
>> > > > > > +#include "xe_mmio.h"
>> > > > > > +#include "xe_pcode_api.h"
>> > > > > > +#include "xe_pcode.h"
>> > > > > >    #include "xe_pm.h"
>> > > > > >    /**
>> > > > > > @@ -81,3 +84,101 @@ int xe_pm_sysfs_init(struct xe_device *xe)
>> > > > > >    	return devm_add_action_or_reset(dev, xe_pm_sysfs_fini, xe);
>> > > > > >    }
>> > > > > > +
>> > > > > > +/**
>> > > > > > + * DOC: PCIe Gen5 Update Limitations
>> > > > > > + *
>> > > > > > + * Default link speed of discrete GPUs is determined by configuration
>> > > > > > + * parameters stored in their flash memory, which are subject to override
>> > > > > > + * through user initiated firmware updates. It has been observed that devices
>> > > > > > + * configured with PCIe Gen5 as their default speed can come across link
>> > > > > > + * quality issues due to host or motherboard limitations and may have to
>> > > > > > + * auto-downspeed to PCIe Gen4 when faced with unstable link at Gen5, which
>> > > > > > + * makes firmware updates rather risky on such setups. It is required to
>> > > > > > + * ensure that the device is capable of auto-downspeeding to PCIe Gen4 link
>> > > > > > + * before pushing the image with PCIe Gen5 as default configuration. This
>> > > > > > + * can be done by reading ``pcie_gen4_downspeed_capable`` sysfs entry, which
>> > > > > > + * will denote PCIe Gen4 downspeed capability of the device with boolean output
>> > > > > > + * value of ``0`` or ``1``, meaning `incapable` or `capable` respectively.
>> > > > > > + *
>> > > > > > + * .. code-block:: shell
>> > > > > > + *
>> > > > > > + *    $ cat /sys/bus/pci/devices/<bdf>/pcie_gen4_downspeed_capable
>> > > > > > + *
>> > > > > > + * Pushing PCIe Gen5 update on a downspeed incapable device and facing link
>> > > > > > + * instability due to host or motherboard limitations can result in driver
>> > > > > > + * failing to bind to the device, making further firmware updates impossible
>> > > > > > + * with RMA being the only last resort.
>> > > > > > + *
>> > > > > > + * PCIe Gen4 downspeed status of downspeed capable devices is available through
>> > > > > > + * ``pcie_gen4_downspeed_status`` sysfs entry with boolean output value of
>> > > > > > + * ``0`` or ``1``, where ``0`` means no auto-downspeeding was required during
>> > > > > > + * link training (which is the optimal scenario) and ``1`` means the device
>> > > > > > + * has auto-downsped to PCIe Gen4 due to unstable Gen5 link.
>> > > > > > + *
>> > > > > > + * .. code-block:: shell
>> > > > > > + *
>> > > > > > + *    $ cat /sys/bus/pci/devices/<bdf>/pcie_gen4_downspeed_status
>> > > > > The code looks good. But i am not sure of the word downspeed.
>> > > > > Couldn't find downspeed used in Pcie generation context. For link,
>> > > > > it is mentioned as  'link downgrade'
>> > > > >
>> > > > > Could you share if you found any?
>>
>> I'm with Riana here and also found it strange. User is cat'ing a file
>> inside  /sys/bus/pci/devices/<bdf>/ - it's odd to use downspeed when
>> that word is not used in PCIe spec and...
>>
>> > > >
>> > > > Since we're describing both firmware and PCI link in the same document,
>> > > >
>> > > > 1. It helps distinguish between them.
>>
>> I also don't understand the distinction you are trying to make. The
>> firmware is there, monitoring the communication and and is capable to
>> auto downgrade the link to gen4 if it needs to. I imagine that if this
>> situation happens, the link speed is re-negotiated and we will end up
>> seeing the new one on LnkSta... LnkCap still shows what is possible. The
>> only difference I see here from "I'm connecting a gen5 capable device on
>> a max gen4 slot" is that initially the gen5 is attempted (since both the
>> host and device report they support it), but the firmware may make a decision
>> in runtime to "downgrade the link speed".
>
>And all of this will take place when the user is doing a "firmware upgrade",
>and this can potentially open the door for "link speed downgrade" to be
>confused with "firmware rollback (downgrade) done for the link speed",
>which is not the case.

I think it'd be a stretch for the user to read a file named
/sys/bus/pci/devices/<bdf>/auto_link_downgrade_capable and think it has
anything to do with firmware downgrade/upgrade... it's called **link**
downgrade. There are already other files there with similar naming:

$ ls -l /sys/bus/pci/devices/<bdf>/ | grep link
-r--r--r-- 1 root root 4096 Apr 23 08:19 current_link_speed
-r--r--r-- 1 root root 4096 Apr 23 08:19 current_link_width
drwxr-xr-x 2 root root    0 Apr 23 08:19 link
-r--r--r-- 1 root root 4096 Apr 23 08:19 max_link_speed
-r--r--r-- 1 root root 4096 Apr 23 08:19 max_link_width

>
>> > > > 2. This information is for the end user and has to be translatable enough
>> > > >     regardless of what spec says about it and the distinction reduces the
>> > > >     chances of misinterpretation.
>>
>> when you introduce a new term that is not known, I'd say the effect is
>> pretty much the opposite and it can be even worse if a future pcie spec
>> starts to use that term.
>
>Agree, unless we're expecting users to be informed about the spec to be
>able to flash their firmwares. But if you insist, sure will _upgrade_ the
>document ;)

the recommendation that the user should read any file like that before
trying to flash a firmware is completely broken - that should be done
by whatever is flashing the firmware.

Lucas De Marchi

>
>Raag


More information about the Intel-xe mailing list