[PATCH] drm/xe/pf: Restore TotalVFs
Lucas De Marchi
lucas.demarchi at intel.com
Mon Jul 15 17:09:43 UTC 2024
On Mon, Jul 15, 2024 at 01:40:52PM GMT, Michal Wajdeczko wrote:
>If we update TotalVFs to a new value and fail the .probe() then
>the PCI subsystem will not restore original value, which might
>impact next .probe() attempt. As a backup plan, use managed action
>to restore it ourselves.
>
>Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>Cc: Piotr Piórkowski <piotr.piorkowski at intel.com>
>---
>Attempt to fix that at PCI layer is here [1], but until it will be
>available for us, we can still fix that on our side.
>
>[1] https://lore.kernel.org/linux-pci/20240715102835.1286-1-michal.wajdeczko@intel.com/
I don't think we should merge this while that is pending with no reply.
If that is not accepted, then we can think about doing this.
>---
> drivers/gpu/drm/xe/xe_sriov_pf.c | 48 +++++++++++++++++++++++++++-----
> 1 file changed, 41 insertions(+), 7 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
>index 0f721ae17b26..d61933770f1c 100644
>--- a/drivers/gpu/drm/xe/xe_sriov_pf.c
>+++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
>@@ -17,12 +17,40 @@ static unsigned int wanted_max_vfs(struct xe_device *xe)
> return xe_modparam.max_vfs;
> }
>
>-static int pf_reduce_totalvfs(struct xe_device *xe, int limit)
>+static void restore_totalvfs(void *arg)
>+{
>+ struct xe_device *xe = arg;
>+ struct device *dev = xe->drm.dev;
>+ struct pci_dev *pdev = to_pci_dev(dev);
>+ int totalvfs = xe->sriov.pf.device_total_vfs;
>+
>+ xe_sriov_dbg_verbose(xe, "restoring totalvfs to %d\n", totalvfs);
>+ pci_sriov_set_totalvfs(pdev, totalvfs);
>+}
>+
>+static int pf_reduce_totalvfs(struct xe_device *xe, u16 limit)
^^^^^^
so the current logic assumes the maximum comes from BIOS, but it could
rather come from a previous (failed) driver load? once we set the value
is the information lost from the device or can we recover it from somewhere?
somewhere != where we saved in the this driver.
> {
> struct device *dev = xe->drm.dev;
> struct pci_dev *pdev = to_pci_dev(dev);
> int err;
>
>+ xe->sriov.pf.device_total_vfs = pci_sriov_get_totalvfs(pdev);
>+ xe->sriov.pf.driver_max_vfs = limit;
>+
>+ xe_sriov_dbg_verbose(xe, "reducing totalvfs from %u to %u\n",
>+ xe->sriov.pf.device_total_vfs,
>+ xe->sriov.pf.driver_max_vfs);
>+
>+ /*
>+ * XXX: If we update TotalVFs to a new value and fail the .probe()
why XXX?
>+ * then the PCI subsystem might not restore original TotalVFs value,
>+ * which might impact next .probe() attempt. As a backup plan, use
>+ * managed action to restore it ourselves.
>+ *
>+ * Note that it's not critical if registering that action fails.
>+ */
>+ devm_add_action(xe->drm.dev, restore_totalvfs, xe);
>+
> err = pci_sriov_set_totalvfs(pdev, limit);
> if (err)
> xe_sriov_notice(xe, "Failed to set number of VFs to %d (%pe)\n",
>@@ -37,6 +65,14 @@ static bool pf_continue_as_native(struct xe_device *xe, const char *why)
> return false;
> }
>
>+static bool pf_continue_ready(struct xe_device *xe, u16 max_vfs)
>+{
>+ xe_assert(xe, max_vfs > 0);
>+ xe_sriov_dbg(xe, "preparing for %u VF%s\n", max_vfs, str_plural(max_vfs));
>+ pf_reduce_totalvfs(xe, max_vfs);
>+ return true;
what's the point of splitting this function that always return true? I'd
rather leave it where it was
Lucas De Marchi
>+}
>+
> /**
> * xe_sriov_pf_readiness - Check if PF functionality can be enabled.
> * @xe: the &xe_device to check
>@@ -58,18 +94,16 @@ bool xe_sriov_pf_readiness(struct xe_device *xe)
> if (!dev_is_pf(dev))
> return false;
>
>+ if (!totalvfs)
>+ return pf_continue_as_native(xe, "no VFs reported");
>+
> if (!xe_device_uc_enabled(xe))
> return pf_continue_as_native(xe, "Guc submission disabled");
>
> if (!newlimit)
> return pf_continue_as_native(xe, "all VFs disabled");
>
>- pf_reduce_totalvfs(xe, newlimit);
>-
>- xe->sriov.pf.device_total_vfs = totalvfs;
>- xe->sriov.pf.driver_max_vfs = newlimit;
>-
>- return true;
>+ return pf_continue_ready(xe, newlimit);
> }
>
> /**
>--
>2.43.0
>
More information about the Intel-xe
mailing list