[PATCH 2/3 v5] drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush
Christian König
christian.koenig at amd.com
Thu Feb 27 16:00:33 UTC 2025
On 27.02.25 at 12:47, Jesse.zhang at amd.com wrote:
> From: "Jesse.zhang at amd.com" <Jesse.zhang at amd.com>
>
> - Modify the VM invalidation engine allocation logic to handle SDMA page rings.
> SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of
> allocating a separate engine. This change ensures efficient resource management and
> avoids the issue of insufficient VM invalidation engines.
>
> - Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
> Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions
> during TLB flush operations. This improves the stability and reliability of the driver,
> especially in multi-threaded environments.
>
> v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
> to check if a ring is an SDMA page queue.(Lijo)
>
> v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0
This needs to be the last patch in the series and not the second. Otherwise you have a broken state in between.
>
> Suggested-by: Lijo Lazar <lijo.lazar at amd.com>
> Signed-off-by: Jesse Zhang <jesse.zhang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 +++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 23 +++++++++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 1 +
> 3 files changed, 31 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index c6e5c50a3322..68088d731c23 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -602,8 +602,15 @@ int amdgpu_gmc_allocate_vm_inv_eng(struct amdgpu_device *adev)
> return -EINVAL;
> }
>
> + if(amdgpu_sdma_is_shared_inv_eng(adev, ring)) {
> + /* Do not allocate a separate VM invalidation engine for SDMA page rings.
> + * Shared VM invalid engine with sdma gfx ring.
> + */
First of all, that comment has style issues; please use checkpatch.pl.
Then you need to describe why this is done and what the prerequisites are to make it work.
E.g. something like "SDMA has a special packet which allows it to use the same invalidation engine for all the rings in one instance."
Christian.
> + ring->vm_inv_eng = inv_eng - 1;
> + } else {
> ring->vm_inv_eng = inv_eng - 1;
> vm_inv_engs[vmhub] &= ~(1 << ring->vm_inv_eng);
> + }
>
> dev_info(adev->dev, "ring %s uses VM inv eng %u on hub %u\n",
> ring->name, ring->vm_inv_eng, ring->vm_hub);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> index 39669f8788a7..019f670edc29 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> @@ -504,6 +504,29 @@ void amdgpu_sdma_sysfs_reset_mask_fini(struct amdgpu_device *adev)
> }
> }
>
> +/**
> +* amdgpu_sdma_is_shared_inv_eng - Check if a ring is an SDMA ring that shares a VM invalidation engine
> +* @adev: Pointer to the AMDGPU device structure
> +* @ring: Pointer to the ring structure to check
> +*
> +* This function checks if the given ring is an SDMA ring that shares a VM invalidation engine.
> +* It returns true if the ring is such an SDMA ring, false otherwise.
> +*/
> +bool amdgpu_sdma_is_shared_inv_eng(struct amdgpu_device *adev, struct amdgpu_ring* ring)
> +{
> + int i = ring->me;
> +
> + if (!adev->sdma.has_page_queue || i >= adev->sdma.num_instances)
> + return false;
> +
> + if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) ||
> + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) ||
> + amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0))
> + return (ring == &adev->sdma.instance[i].ring);
> + else
> + return false;
> +}
> +
> /**
> * amdgpu_sdma_register_on_reset_callbacks - Register SDMA reset callbacks
> * @funcs: Pointer to the callback structure containing pre_reset and post_reset functions
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> index 965169320065..dcc8fd7a6784 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> @@ -194,4 +194,5 @@ int amdgpu_sdma_ras_sw_init(struct amdgpu_device *adev);
> void amdgpu_debugfs_sdma_sched_mask_init(struct amdgpu_device *adev);
> int amdgpu_sdma_sysfs_reset_mask_init(struct amdgpu_device *adev);
> void amdgpu_sdma_sysfs_reset_mask_fini(struct amdgpu_device *adev);
> +bool amdgpu_sdma_is_shared_inv_eng(struct amdgpu_device *adev, struct amdgpu_ring* ring);
> #endif
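
For illustration only, the sharing scheme in the amdgpu_gmc.c hunk can be modelled outside the kernel. This is a minimal sketch, not the driver code: struct model_ring, lowest_set_bit() and model_allocate_inv_eng() are hypothetical stand-ins, and the model assumes that the page ring of an SDMA instance is processed directly after that instance's gfx ring, so that it picks up the engine whose bit was left set. The comment inside the loop uses the wording suggested in the review, in the multi-line comment format the kernel coding style asks for.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct amdgpu_ring. */
struct model_ring {
	const char *name;
	unsigned int hub;	/* index into the per-hub free-engine masks */
	bool shares_inv_eng;	/* true: use the engine but do not claim it */
	int vm_inv_eng;		/* result: engine assigned to this ring */
};

/* Return the 1-based position of the lowest set bit, or 0 if mask is 0. */
int lowest_set_bit(unsigned int mask)
{
	int pos = 1;

	if (!mask)
		return 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		pos++;
	}

	return pos;
}

/*
 * Assign an invalidation engine to every ring. free_engs[] holds one
 * bitmask of free engines per hub. Returns 0 on success, -1 if a hub
 * has no free engine left.
 */
int model_allocate_inv_eng(unsigned int *free_engs,
			   struct model_ring *rings, int num_rings)
{
	int i;

	assert(free_engs && rings);

	for (i = 0; i < num_rings; i++) {
		struct model_ring *ring = &rings[i];
		int inv_eng = lowest_set_bit(free_engs[ring->hub]);

		if (!inv_eng)
			return -1;

		ring->vm_inv_eng = inv_eng - 1;

		if (ring->shares_inv_eng) {
			/*
			 * SDMA has a special packet which allows it to use
			 * the same invalidation engine for all the rings in
			 * one instance, so leave the engine marked as free
			 * for the next ring of this instance.
			 */
			continue;
		}

		/* Claim the engine so that no later ring can get it. */
		free_engs[ring->hub] &= ~(1u << ring->vm_inv_eng);
	}

	return 0;
}
```

With three free engines on one hub and the rings ordered as SDMA gfx ring (shares_inv_eng set), SDMA page ring, then an unrelated ring, the first two rings end up on engine 0 and the third on engine 1. The sketch also makes the ordering prerequisite visible: if an unrelated ring were processed between the two SDMA rings, it would take the shared engine instead.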