[PATCH] drm/amdkfd: fix partition query when setting up recommended sdma engines
Kim, Jonathan
Jonathan.Kim at amd.com
Thu Aug 8 14:29:02 UTC 2024
[Public]
> -----Original Message-----
> From: Lazar, Lijo <Lijo.Lazar at amd.com>
> Sent: Wednesday, August 7, 2024 11:46 PM
> To: Kim, Jonathan <Jonathan.Kim at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Kuehling, Felix <Felix.Kuehling at amd.com>
> Subject: Re: [PATCH] drm/amdkfd: fix partition query when setting up
> recommended sdma engines
>
>
>
> On 8/8/2024 2:04 AM, Jonathan Kim wrote:
> > When users dynamically set the partition mode through sysfs writes,
> > this can lead to a double lock situation where the KFD is trying to take
> > the partition lock when updating the recommended SDMA engines.
> > Have the KFD do a lockless query instead to avoid this.
> > This should work since the KFD always initializes synchronously after
> > the KGD partition mode is set regardless of user or system setup.
> >
> > Fixes: a0f548d7871e ("drm/amdkfd: allow users to target recommended
> SDMA engines")
> > Signed-off-by: Jonathan Kim <jonathan.kim at amd.com>
> > ---
> > drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > index 40771f8752cb..8fee89b8dd67 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > @@ -1287,7 +1287,7 @@ static void
> kfd_set_recommended_sdma_engines(struct kfd_topology_device *to_dev,
> > int num_xgmi_nodes = adev->gmc.xgmi.num_physical_nodes;
> > bool support_rec_eng = !amdgpu_sriov_vf(adev) && to_dev->gpu &&
> > adev->aid_mask && num_xgmi_nodes &&
> > - (amdgpu_xcp_query_partition_mode(adev->xcp_mgr,
> AMDGPU_XCP_FL_NONE) ==
> > + (amdgpu_xcp_query_partition_mode(adev->xcp_mgr,
> AMDGPU_XCP_FL_LOCKED) ==
> > AMDGPU_SPX_PARTITION_MODE) &&
>
> Replacing with (gpu->kfd->num_nodes == 1) may be better.
Thanks, that is a lot simpler. The code also assumes that all 14 SDMA xGMI engines are present, which may not hold for every dGPU in SPX mode.
I'll add that as a hard condition check as well.
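
For illustration only, a standalone sketch of the combined condition discussed here. The structs below are mock stand-ins, not the driver's actual types: only num_nodes and the count of 14 SDMA xGMI engines come from this thread, and the num_sdma_xgmi_engines field and the helper name are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Engine count mentioned in this thread; the macro name is hypothetical. */
#define EXPECTED_SDMA_XGMI_ENGINES 14

/* Mock stand-in for the KFD device: num_nodes == 1 is used here as the
 * lock-free indicator that the device is not partitioned (SPX mode). */
struct mock_kfd_dev {
	unsigned int num_nodes;
};

/* Mock stand-in for the GPU node; num_sdma_xgmi_engines is a hypothetical
 * field name used only for this sketch. */
struct mock_gpu {
	struct mock_kfd_dev *kfd;
	unsigned int num_sdma_xgmi_engines;
};

/* Returns true only when the device is a single KFD node and exposes all
 * of the expected SDMA xGMI engines. No lock is taken: both values are
 * plain reads of state assumed to be set before KFD initialization. */
static bool mock_spx_with_all_xgmi_engines(const struct mock_gpu *gpu)
{
	return gpu != NULL && gpu->kfd != NULL &&
	       gpu->kfd->num_nodes == 1 &&
	       gpu->num_sdma_xgmi_engines == EXPECTED_SDMA_XGMI_ENGINES;
}
```

Because the check reads only already-initialized fields, it avoids calling back into the partition query and so never touches the partition lock.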
Jon
>
> Thanks,
> Lijo
>
> > (!(adev->flags & AMD_IS_APU) && num_xgmi_nodes == 8);
> >