[PATCH 2/2] drm/amdkfd: drop struct kfd_cu_info

Felix Kuehling felix.kuehling at amd.com
Wed Sep 27 16:39:06 UTC 2023


On 2023-09-26 15:29, Arnd Bergmann wrote:
> On Tue, Sep 26, 2023, at 20:47, Deucher, Alexander wrote:
>>> From: Arnd Bergmann <arnd at kernel.org>
>>> Subject: Re: [PATCH 2/2] drm/amdkfd: drop struct kfd_cu_info
>>>
>>> On Tue, Sep 26, 2023, at 18:39, Alex Deucher wrote:
>>>> I think this was an abstraction back from when kfd supported both
>>>> radeon and amdgpu.  Since we just support amdgpu now, there is no more
>>>> need for this and we can use the amdgpu structures directly.
>>>>
>>>> This also avoids having the kfd_cu_info structures on the stack when
>>>> inlining which can blow up the stack.
>>>>
>>>> Cc: Arnd Bergmann <arnd at kernel.org>
>>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>> Nice cleanup!
>>>
>>> Acked-by: Arnd Bergmann <arnd at arndb.de>
>>>
>>> I guess you could fold patch 1/2 into this as it removes all the added code from
>>> that anyway.
>> I left it as a separate patch as I didn't get a chance to see when the
>> stack warning appeared and figured it might be a good way to mitigate
>> that on stable kernels if necessary without pulling in the whole
>> rework, but if not, I can just squash it into the second patch.
> Makes sense. FWIW, I had never seen the warning before updating
> to linux-next this week from an older snapshot from last month.
>
> My guess is that one of the recent changes made gcc take
> different inlining decisions so we end up with two copies
> of the cu_info in the same stack frame, even though the
> fundamental problem was there already.

I've seen this type of problem before, because our data structures keep
growing. When we need to support more GPUs, or bigger GPUs with more
CUs, the arrays in those structures grow and start blowing up the stack
in functions that didn't have a problem before.
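
To illustrate the effect, here is a minimal sketch in plain C. The names
(struct ex_cu_info, EX_MAX_SE, EX_MAX_SH_PER_SE, ex_count_*) and the
limits are made up for the example and are not the real amdgpu or kfd
definitions. It only shows how a by-value local copy scales with the
array bounds, while working through a pointer to the existing structure
does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits; raising either one grows every stack copy below. */
#define EX_MAX_SE        32
#define EX_MAX_SH_PER_SE 4

/* Hypothetical stand-in for a per-GPU CU info structure. */
struct ex_cu_info {
	uint32_t cu_active_number;
	uint32_t cu_bitmap[EX_MAX_SE][EX_MAX_SH_PER_SE];
};

/* Portable bit count so the sketch does not depend on compiler builtins. */
static unsigned int ex_popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		n += v & 1u;
		v >>= 1;
	}
	return n;
}

/*
 * Problematic pattern: the local copy costs sizeof(struct ex_cu_info)
 * bytes of stack, and that cost grows whenever the array bounds grow.
 * If the compiler inlines two such functions into one caller, both
 * copies can end up in the same stack frame.
 */
static unsigned int ex_count_on_stack(const struct ex_cu_info *src)
{
	struct ex_cu_info local = *src;
	unsigned int se, sh, total = 0;

	for (se = 0; se < EX_MAX_SE; se++)
		for (sh = 0; sh < EX_MAX_SH_PER_SE; sh++)
			total += ex_popcount32(local.cu_bitmap[se][sh]);
	return total;
}

/*
 * Alternative: read the existing structure through the pointer.
 * Stack usage no longer depends on the array bounds.
 */
static unsigned int ex_count_in_place(const struct ex_cu_info *src)
{
	unsigned int se, sh, total = 0;

	for (se = 0; se < EX_MAX_SE; se++)
		for (sh = 0; sh < EX_MAX_SH_PER_SE; sh++)
			total += ex_popcount32(src->cu_bitmap[se][sh]);
	return total;
}
```

Both functions return the same count; the difference is only where the
data lives. With the made-up limits above, the structure is already over
500 bytes, so a couple of inlined copies would take a noticeable share of
a typical frame-size warning limit.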

Regards,
   Felix


>
>      Arnd

