[PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and set CP_HQD_HQ_STATUS0[29]

Sider, Graham Graham.Sider at amd.com
Tue Apr 4 15:59:41 UTC 2023


[Public]

> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
> Russell, Kent
> Sent: Tuesday, April 4, 2023 9:43 AM
> To: Somasekharan, Sreekant <Sreekant.Somasekharan at amd.com>; amd-
> gfx at lists.freedesktop.org
> Cc: Somasekharan, Sreekant <Sreekant.Somasekharan at amd.com>
> Subject: RE: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and
> set CP_HQD_HQ_STATUS0[29]
> 
> 
> [AMD Official Use Only - General]
> 
> Comments inline
> 
> > -----Original Message-----
> > From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
> > Sreekant Somasekharan
> > Sent: Monday, April 3, 2023 3:59 PM
> > To: amd-gfx at lists.freedesktop.org
> > Cc: Somasekharan, Sreekant <Sreekant.Somasekharan at amd.com>
> > Subject: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and
> > set CP_HQD_HQ_STATUS0[29]
> >
> > On GFX11, CP_HQD_HQ_STATUS0[29] bit will be used by CPFW to
> > acknowledge whether PCIe atomics are supported. The default value of
> > this bit is set to 0. Driver will check whether PCIe atomics are
> > supported and set the bit to 1 if supported. This will force CPFW to use
> > real atomic ops. If the bit is not set, CPFW will default to
> > read/modify/write using the firmware itself.
> >
> > This is applicable only to RS64 based GFX11 with MEC FW greater than
> > or equal to 509. If MEC FW is less than 509, PCIe atomics need to be
> > supported, otherwise the device will be skipped.
> >
> > This commit also involves moving amdgpu_amdkfd_device_probe() function
> > call after per-IP early_init loop in amdgpu_device_ip_early_init()
> > function so as to check for RS64 enabled device.
> >
> > Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan at amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c       |  2 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_device.c          | 11 +++++++++++
> >  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c |  9 +++++++++
> >  3 files changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 7116119ed038..b3a754ca0923 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2150,7 +2150,6 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
> >               adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
> >       }
> >
> > -     amdgpu_amdkfd_device_probe(adev);
> >
> >       adev->pm.pp_feature = amdgpu_pp_feature_mask;
> >       if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
> > @@ -2206,6 +2205,7 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
> >       if (!total)
> >               return -ENODEV;
> >
> > +     amdgpu_amdkfd_device_probe(adev);
> >       adev->cg_flags &= amdgpu_cg_mask;
> >       adev->pg_flags &= amdgpu_pg_mask;
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > index 521dfa88aad8..64a295a35d37 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > @@ -204,6 +204,17 @@ static void kfd_device_info_init(struct kfd_dev *kfd,
> >                       /* Navi1x+ */
> >                       if (gc_version >= IP_VERSION(10, 1, 1))
> >                               kfd->device_info.needs_pci_atomics = true;
> > +             } else if (gc_version < IP_VERSION(12, 0, 0)) {
> 
> 
> What if we get a GFX9 with MEC v509? Wouldn't that trigger this too?
> Wondering if this should be if (gc_version>=IP_VERSION(11,0,0) &&
> gc_version < IP_VERSION(12,0,0)) thus ensuring it's only GFX11. Or maybe
> there is some better check than that. Either way, checking that it's < GFX11
> might false-positive on GFX10- too, so we should probably be explicit in our
> GFX check that it's only GFX11.

The previous condition is gc_version < IP_VERSION(11, 0, 0), so that branch will be (and currently is) taken for gfx9/gfx10/etc.

That is to say, the logic after this change will look like:

if (KFD_IS_SOC15(kfd)) {
	<...>
	if (gc_version < IP_VERSION(11, 0, 0)) {
		<...>
	} else if (gc_version < IP_VERSION(12, 0, 0)) {
		<...>
	}
}

So this new path will only be taken for gfx11.
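
To make the version ranges concrete, here is a minimal standalone sketch of the branch structure above. The enum and the pick_path() helper are hypothetical names used only for illustration; the IP_VERSION() macro is written out here with the major/minor/revision packing used in amdgpu so the snippet compiles outside the kernel tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Packs major/minor/revision so that versions compare numerically. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical labels for the branches discussed in this thread. */
enum atomics_path {
	PATH_NONE,      /* not SOC15, or GFX12 and newer */
	PATH_PRE_GFX11, /* existing gfx9/gfx10 handling  */
	PATH_GFX11      /* branch added by the patch     */
};

/* Mirrors the if / else-if chain: the second test is only reached
 * when gc_version >= 11.0.0, so it effectively selects
 * 11.0.0 <= gc_version < 12.0.0.
 */
static enum atomics_path pick_path(bool is_soc15, uint32_t gc_version)
{
	if (!is_soc15)
		return PATH_NONE;

	if (gc_version < IP_VERSION(11, 0, 0))
		return PATH_PRE_GFX11;
	else if (gc_version < IP_VERSION(12, 0, 0))
		return PATH_GFX11;

	return PATH_NONE;
}
```

A gfx9 part such as IP_VERSION(9, 4, 2) never reaches the second comparison, which is why no explicit lower bound is needed in the else-if.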

Best,
Graham

> 
>  Kent
> 
> > +                     /* On GFX11 running on RS64, MEC FW version must be greater than
> > +                      * or equal to version 509 to support acknowledging whether
> > +                      * PCIe atomics are supported. Before MEC version 509, PCIe
> > +                      * atomics are required. After that, the FW's use of atomics
> > +                      * is controlled by CP_HQD_HQ_STATUS0[29].
> > +                      * This will fail on GFX11 when PCIe atomics are not supported
> > +                      * and MEC FW version < 509 for RS64 based CPFW.
> > +                      */
> > +                     kfd->device_info.needs_pci_atomics = true;
> > +                     kfd->device_info.no_atomic_fw_version = kfd->adev->gfx.rs64_enable ? 509 : 0;
> >               }
> >       } else {
> >               kfd->device_info.doorbell_size = 4;
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > index 4a9af800b1f1..c5ea594abbf6 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > @@ -143,6 +143,15 @@ static void init_mqd(struct mqd_manager *mm, void **mqd,
> >                       1 << CP_HQD_QUANTUM__QUANTUM_SCALE__SHIFT |
> >                       1 << CP_HQD_QUANTUM__QUANTUM_DURATION__SHIFT;
> >
> > +     /*
> > +      * If PCIe atomics are supported, set CP_HQD_HQ_STATUS0[29] == 1
> > +      * to force CPFW to use atomics. This is supported only on MEC FW
> > +      * version >= 509 and on RS64 based CPFW only. On previous versions,
> > +      * platforms running on GFX11 must support atomics else will skip the device.
> > +      */
> > +     if (amdgpu_amdkfd_have_atomics_support((mm->dev->adev)))
> > +             m->cp_hqd_hq_status0 |= 1 << 29;
> > +
> >       if (q->format == KFD_QUEUE_FORMAT_AQL) {
> >               m->cp_hqd_aql_control =
> >                       1 << CP_HQD_AQL_CONTROL__CONTROL0__SHIFT;
> > --
> > 2.25.1
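
For reference, the behaviour described in the commit message can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions: struct dev_model, device_usable() and hq_status0() are hypothetical names, and the skip condition is inferred from the commit text ("If MEC FW is less than 509, PCIe atomics needs to be supported, else it will skip the device") rather than copied from the probe code, which this patch does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 29 of CP_HQD_HQ_STATUS0, per the patch. The macro name is hypothetical. */
#define HQ_STATUS0_PCIE_ATOMICS_BIT 29u

/* Hypothetical model of the fields the patch sets or consults. */
struct dev_model {
	bool needs_pci_atomics;        /* true on GFX11 after this patch         */
	uint32_t no_atomic_fw_version; /* 509 with RS64, 0 otherwise             */
	uint32_t mec_fw_version;       /* version of the loaded MEC firmware     */
	bool have_atomics;             /* platform supports PCIe atomics         */
};

/* Assumed skip rule: a device without PCIe atomics is usable only if a
 * firmware threshold exists (non-zero) and the loaded MEC FW meets it.
 */
static bool device_usable(const struct dev_model *d)
{
	if (d->have_atomics || !d->needs_pci_atomics)
		return true;

	return d->no_atomic_fw_version != 0 &&
	       d->mec_fw_version >= d->no_atomic_fw_version;
}

/* Mirrors the init_mqd() hunk: set bit 29 only when atomics are available,
 * otherwise leave the status word untouched so CPFW falls back to its own
 * read/modify/write sequence.
 */
static uint32_t hq_status0(uint32_t status, bool have_atomics)
{
	if (have_atomics)
		status |= UINT32_C(1) << HQ_STATUS0_PCIE_ATOMICS_BIT;
	return status;
}
```

Under this model an RS64 device with MEC FW 509 and no PCIe atomics is kept with bit 29 left at 0, while the same device with MEC FW 508 is skipped.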

