[PATCH v1 1/2] drm/xe/debugfs: Expose PCIe Gen5 update telemetry
Raag Jadav
raag.jadav at intel.com
Thu Apr 3 03:38:44 UTC 2025
On Wed, Apr 02, 2025 at 11:54:26PM +0530, Nilawar, Badal wrote:
> On 31-03-2025 19:53, Raag Jadav wrote:
> > Expose debugfs telemetry required for PCIe Gen5 firmware update for
> > discrete GPUs.
> >
> > Signed-off-by: Raag Jadav <raag.jadav at intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_debugfs.c | 93 +++++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_pcode_api.h | 4 ++
> > 2 files changed, 97 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > index d0503959a8ed..67c941abf4fe 100644
> > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > @@ -17,6 +17,9 @@
> > #include "xe_gt_debugfs.h"
> > #include "xe_gt_printk.h"
> > #include "xe_guc_ads.h"
> > +#include "xe_mmio.h"
> > +#include "xe_pcode_api.h"
> > +#include "xe_pcode.h"
> > #include "xe_pm.h"
> > #include "xe_pxp_debugfs.h"
> > #include "xe_sriov.h"
> > @@ -191,6 +194,89 @@ static const struct file_operations wedged_mode_fops = {
> > .write = wedged_mode_set,
> > };
> > +/**
> > + * DOC: PCIe Gen5 Update Limitations
> > + *
> > + * Default link speed of discrete GPUs is determined by FIT parameters stored
> > + * in their flash memory, which are subject to override through user initiated
> > + * firmware updates. It has been observed that devices configured with PCIe
> > + * Gen5 as their default speed can come across link quality issues due to host
> > + * or motherboard limitations and may have to auto-downspeed to PCIe Gen4 when
> > + * faced with unstable link at Gen5. The users are required to ensure that the
> > + * device is capable of auto-downspeeding to PCIe Gen4 before pushing the image
> > + * with Gen5 as default configuration. This can be done by reading
> > + * ``pcie_gen4_downspeed_capable`` debugfs entry, which will denote PCIe Gen4
> > + * auto-downspeed capability of the device with boolean output value of ``0``
> > + * or ``1``, meaning `incapable` or `capable` respectively.
> > + *
> > + * .. code-block:: shell
> > + *
> > + * $ cat /sys/kernel/debug/dri/<N>/pcie_gen4_downspeed_capable
>
> Why not on sysfs?
>
> So how about simply using "downgrade" instead of "downspeed" throughout the
> code?
"Downgrade" would be ambiguous: it could refer to either the PCIe link or the firmware.
Raag
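
For reference, the check described in the quoted DOC comment could be scripted roughly as below. This is only a sketch under assumptions: the entry name and its 0/1 output are taken from the patch as posted (its location is still under review), while the function name check_downspeed_capable and the exit codes are made up for illustration. The DRM minor <N> is system specific, so the full path is passed in by the caller.

```shell
#!/bin/sh
# Sketch only: interpret the boolean reported by the proposed
# pcie_gen4_downspeed_capable debugfs entry (0 = incapable, 1 = capable).
# The function name and exit codes are illustrative, not part of the patch.

check_downspeed_capable() {
    entry="$1"

    # debugfs is normally readable by root only, so fail clearly if not.
    if [ ! -r "$entry" ]; then
        echo "cannot read $entry" >&2
        return 2
    fi

    # Command substitution strips the trailing newline.
    value=$(cat "$entry")

    case "$value" in
    1) echo "capable"; return 0 ;;
    0) echo "incapable"; return 1 ;;
    *) echo "unexpected value: $value" >&2; return 2 ;;
    esac
}

# When run with exactly one argument, treat it as the path to the entry.
if [ "$#" -eq 1 ]; then
    check_downspeed_capable "$1"
fi
```

A caller would pass the full path from the DOC comment and flash a Gen5-default image only when the exit status is 0.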