[PATCH v3 2/3] drm/xe: Expose PCIe Gen4 downspeed attributes
Raag Jadav
raag.jadav at intel.com
Wed Apr 23 11:02:21 UTC 2025
On Wed, Apr 23, 2025 at 03:29:41PM +0530, Riana Tauro wrote:
> On 4/23/2025 2:18 PM, Raag Jadav wrote:
> > On Wed, Apr 23, 2025 at 10:55:30AM +0530, Riana Tauro wrote:
> > > Hi Raag
> > >
> > > On 4/17/2025 4:42 PM, Raag Jadav wrote:
> > > > Expose sysfs attributes for PCIe Gen4 downspeed capability and status.
> > > >
> > > > v2: Move from debugfs to sysfs (Lucas, Rodrigo, Badal)
> > > > Rework macros and their naming (Rodrigo)
> > > > v3: Use sysfs_create_files() (Riana)
> > > > Fix checkpatch warning (Riana)
> > > >
> > > > Signed-off-by: Raag Jadav <raag.jadav at intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_device.c | 5 ++
> > > > drivers/gpu/drm/xe/xe_device_sysfs.c | 101 +++++++++++++++++++++++++++
> > > > drivers/gpu/drm/xe/xe_device_sysfs.h | 1 +
> > > > drivers/gpu/drm/xe/xe_pcode_api.h | 5 ++
> > > > 4 files changed, 112 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > > index 6c9d3009aa03..79b7b0ecfbae 100644
> > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > @@ -26,6 +26,7 @@
> > > > #include "xe_bo_evict.h"
> > > > #include "xe_debugfs.h"
> > > > #include "xe_devcoredump.h"
> > > > +#include "xe_device_sysfs.h"
> > > > #include "xe_dma_buf.h"
> > > > #include "xe_drm_client.h"
> > > > #include "xe_drv.h"
> > > > @@ -916,6 +917,10 @@ int xe_device_probe(struct xe_device *xe)
> > > > if (err)
> > > > goto err_unregister_display;
> > > > + err = xe_device_sysfs_init(xe);
> > > > + if (err)
> > > > + goto err_unregister_display;
> > > > +
> > > > xe_debugfs_register(xe);
> > > > err = xe_hwmon_register(xe);
> > > > diff --git a/drivers/gpu/drm/xe/xe_device_sysfs.c b/drivers/gpu/drm/xe/xe_device_sysfs.c
> > > > index 2d25e5b5d4bf..923612a0a2e0 100644
> > > > --- a/drivers/gpu/drm/xe/xe_device_sysfs.c
> > > > +++ b/drivers/gpu/drm/xe/xe_device_sysfs.c
> > > > @@ -11,6 +11,9 @@
> > > > #include "xe_device.h"
> > > > #include "xe_device_sysfs.h"
> > > > +#include "xe_mmio.h"
> > > > +#include "xe_pcode_api.h"
> > > > +#include "xe_pcode.h"
> > > > #include "xe_pm.h"
> > > > /**
> > > > @@ -81,3 +84,101 @@ int xe_pm_sysfs_init(struct xe_device *xe)
> > > > return devm_add_action_or_reset(dev, xe_pm_sysfs_fini, xe);
> > > > }
> > > > +
> > > > +/**
> > > > + * DOC: PCIe Gen5 Update Limitations
> > > > + *
> > > > + * Default link speed of discrete GPUs is determined by configuration
> > > > + * parameters stored in their flash memory, which are subject to override
> > > > + * through user initiated firmware updates. It has been observed that devices
> > > > + * configured with PCIe Gen5 as their default speed can come across link
> > > > + * quality issues due to host or motherboard limitations and may have to
> > > > + * auto-downspeed to PCIe Gen4 when faced with unstable link at Gen5, which
> > > > + * makes firmware updates rather risky on such setups. It is required to
> > > > + * ensure that the device is capable of auto-downspeeding to PCIe Gen4 link
> > > > + * before pushing the image with PCIe Gen5 as default configuration. This
> > > > + * can be done by reading ``pcie_gen4_downspeed_capable`` sysfs entry, which
> > > > + * will denote PCIe Gen4 downspeed capability of the device with boolean output
> > > > + * value of ``0`` or ``1``, meaning `incapable` or `capable` respectively.
> > > > + *
> > > > + * .. code-block:: shell
> > > > + *
> > > > + * $ cat /sys/bus/pci/devices/<bdf>/pcie_gen4_downspeed_capable
> > > > + *
> > > > + * Pushing PCIe Gen5 update on a downspeed incapable device and facing link
> > > > + * instability due to host or motherboard limitations can result in driver
> > > > + * failing to bind to the device, making further firmware updates impossible
> > > > + * with RMA being the only last resort.
> > > > + *
> > > > + * PCIe Gen4 downspeed status of downspeed capable devices is available through
> > > > + * ``pcie_gen4_downspeed_status`` sysfs entry with boolean output value of
> > > > + * ``0`` or ``1``, where ``0`` means no auto-downspeeding was required during
> > > > + * link training (which is the optimal scenario) and ``1`` means the device
> > > > + * has auto-downsped to PCIe Gen4 due to unstable Gen5 link.
> > > > + *
> > > > + * .. code-block:: shell
> > > > + *
> > > > + * $ cat /sys/bus/pci/devices/<bdf>/pcie_gen4_downspeed_status
> > > The code looks good. But i am not sure of the word downspeed.
> > > Couldn't find downspeed used in Pcie generation context. For link,
> > > it is mentioned as 'link downgrade'
> > >
> > > Could you share if you found any?
> >
> > Since we're describing both firmware and PCI link in the same document,
> >
> > 1. It helps distinguish between them.
> >
> > 2. This information is for the end user and has to be translatable enough
> > regardless of what spec says about it and the distinction reduces the
> > chances of misinterpretation.
> Please get an ack for the naming from any maintainer before merging, as it
> cannot be changed later and differs from the spec
Lucas, Rodrigo, if you are okay with this, please let me know.
> Code looks good to me
> Reviewed-by: Riana Tauro <riana.tauro at intel.com>
Awesome.
Raag
More information about the Intel-xe
mailing list