[PATCH v4] drm/amdkfd: Use dynamic allocation for CU occupancy array in 'kfd_get_cu_occupancy()'

SRINIVASAN SHANMUGAM srinivasan.shanmugam at amd.com
Mon Oct 28 15:12:07 UTC 2024


Ping?

On 10/25/2024 8:13 AM, Srinivasan Shanmugam wrote:
> The `kfd_get_cu_occupancy` function previously declared a large
> `cu_occupancy` array as a local variable, which could lead to stack
> overflows due to excessive stack usage. This commit replaces the static
> array allocation with dynamic memory allocation using `kcalloc`,
> thereby reducing the stack size.
>
> This change avoids the risk of stack overflows in kernel space,  in
> scenarios where `AMDGPU_MAX_QUEUES` is large. The  allocated memory is
> freed using `kfree` before the function returns  to prevent memory
> leaks.
>
> Fixes the below with gcc W=1:
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c: In function ‘kfd_get_cu_occupancy’:
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:322:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
>    322 | }
>        | ^
>
> Fixes: 6fc91266b798 ("drm/amdkfd: Update logic for CU occupancy calculations")
> Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan at amd.com>
> Cc: Felix Kuehling <felix.kuehling at amd.com>
> Cc: Christian König <christian.koenig at amd.com>
> Cc: Alex Deucher <alexander.deucher at amd.com>
> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam at amd.com>
> Suggested-by: Mukul Joshi <mukul.joshi at amd.com>
> ---
> v4:
>   - Allocation is moved just before it's needed (Mukul)
>
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index d4aa843aacfd..6bab6fc6a35d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -271,11 +271,9 @@ static int kfd_get_cu_occupancy(struct attribute *attr, char *buffer)
>   	struct kfd_process *proc = NULL;
>   	struct kfd_process_device *pdd = NULL;
>   	int i;
> -	struct kfd_cu_occupancy cu_occupancy[AMDGPU_MAX_QUEUES];
> +	struct kfd_cu_occupancy *cu_occupancy;
>   	u32 queue_format;
>   
> -	memset(cu_occupancy, 0x0, sizeof(cu_occupancy));
> -
>   	pdd = container_of(attr, struct kfd_process_device, attr_cu_occupancy);
>   	dev = pdd->dev;
>   	if (dev->kfd2kgd->get_cu_occupancy == NULL)
> @@ -293,6 +291,10 @@ static int kfd_get_cu_occupancy(struct attribute *attr, char *buffer)
>   	wave_cnt = 0;
>   	max_waves_per_cu = 0;
>   
> +	cu_occupancy = kcalloc(AMDGPU_MAX_QUEUES, sizeof(*cu_occupancy), GFP_KERNEL);
> +	if (!cu_occupancy)
> +		return -ENOMEM;
> +
>   	/*
>   	 * For GFX 9.4.3, fetch the CU occupancy from the first XCC in the partition.
>   	 * For AQL queues, because of cooperative dispatch we multiply the wave count
> @@ -318,6 +320,7 @@ static int kfd_get_cu_occupancy(struct attribute *attr, char *buffer)
>   
>   	/* Translate wave count to number of compute units */
>   	cu_cnt = (wave_cnt + (max_waves_per_cu - 1)) / max_waves_per_cu;
> +	kfree(cu_occupancy);
>   	return snprintf(buffer, PAGE_SIZE, "%d\n", cu_cnt);
>   }
>   


More information about the amd-gfx mailing list