[PATCH] drm/amdgpu: Add use_xgmi_p2p module parameter

Alex Deucher alexdeucher at gmail.com
Wed Feb 23 18:54:34 UTC 2022


On Wed, Feb 23, 2022 at 1:47 PM Felix Kuehling <felix.kuehling at amd.com> wrote:
>
> On 2022-02-23 12:47, Alex Sierra wrote:
> > This parameter controls xGMI p2p communication, which is enabled by
> > default. However, it can be disabled by setting it to 0. In case xGMI
> > p2p is disabled in a dGPU, PCIe p2p interface will be used instead.
> > This parameter is ignored in GPUs that do not support xGMI
> > p2p configuration.
> >
> > Signed-off-by: Alex Sierra <alex.sierra at amd.com>
> > Acked-by: Luben Tuikov <luben.tuikov at amd.com>
> > Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu.h      | 1 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  | 8 ++++++++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 3 ++-
> >   3 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index f97848a0ed14..7e95d8bd2338 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -217,6 +217,7 @@ extern int amdgpu_mes;
> >   extern int amdgpu_noretry;
> >   extern int amdgpu_force_asic_type;
> >   extern int amdgpu_smartshift_bias;
> > +extern int amdgpu_use_xgmi_p2p;
> >   #ifdef CONFIG_HSA_AMD
> >   extern int sched_policy;
> >   extern bool debug_evictions;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index 2f8eafb6cf22..6156265f3178 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -181,6 +181,7 @@ int amdgpu_tmz = -1; /* auto */
> >   int amdgpu_reset_method = -1; /* auto */
> >   int amdgpu_num_kcq = -1;
> >   int amdgpu_smartshift_bias;
> > +int amdgpu_use_xgmi_p2p = -1;
>
> We usually use 1- as default if some logic is needed to figure out the
> actual default value. I don't think that's applicable for this option.
> Why not make the default 1?

If we want to make -1 the default we should document it as "auto" so
we can allow for different configurations in different situations.
That said, I don't really see a case where we would want to disable
xgmi by default unless there was some hardware limitation, and at that
point, we could just change the default to -1.

Alex

>
>
> >
> >   static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
> >
> > @@ -677,6 +678,13 @@ MODULE_PARM_DESC(force_asic_type,
> >       "A non negative value used to specify the asic type for all supported GPUs");
> >   module_param_named(force_asic_type, amdgpu_force_asic_type, int, 0444);
> >
> > +/**
> > + * DOC: use_xgmi_p2p (int)
> > + * Enables/disables XGMI P2P interface (0 = disable, 1 = enable). The Default is -1 (enabled).
>
> Same as above.
>
>
> > + */
> > +MODULE_PARM_DESC(use_xgmi_p2p,
> > +     "Disable XGMI P2P interface (0 = disable; 1 = enable; -1 default, enabled)");
>
> Same as above.
>
> With that fixed, the patch is
>
> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>
>
> > +module_param_named(use_xgmi_p2p, amdgpu_use_xgmi_p2p, int, 0444);
> >
> >
> >   #ifdef CONFIG_HSA_AMD
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> > index 0afca51c3c0c..095921851fb5 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> > @@ -66,7 +66,8 @@ uint64_t amdgpu_xgmi_get_relative_phy_addr(struct amdgpu_device *adev,
> >   static inline bool amdgpu_xgmi_same_hive(struct amdgpu_device *adev,
> >               struct amdgpu_device *bo_adev)
> >   {
> > -     return (adev != bo_adev &&
> > +     return (amdgpu_use_xgmi_p2p &&
> > +             adev != bo_adev &&
> >               adev->gmc.xgmi.hive_id &&
> >               adev->gmc.xgmi.hive_id == bo_adev->gmc.xgmi.hive_id);
> >   }


More information about the amd-gfx mailing list