[PATCH] drm/amdkfd: increase max number of queues per process
Alex Deucher
alexdeucher at gmail.com
Mon Mar 24 21:21:32 UTC 2025
On Mon, Mar 24, 2025 at 5:07 PM Eric Huang <jinhuieric.huang at amd.com> wrote:
>
>
> On 2025-03-24 15:32, Alex Deucher wrote:
> > On Mon, Mar 24, 2025 at 1:26 PM Eric Huang <jinhuieric.huang at amd.com> wrote:
> >> kfdtest KFDQMTest.OverSubscribeCpQueues in multi-GPU mode
> >> fails on gfx v9.4.3+NPS4+CPX, which has 64 GPU nodes. The
> >> test creates 65x64=4160 queues, but the current
> >> KFD_MAX_NUM_OF_QUEUES_PER_PROCESS value of 1024 is not enough,
> >> so the test fails in find_available_queue_slot().
> >> Increasing the limit makes the test pass.
> >>
> >> Signed-off-by: Eric Huang <jinhuieric.huang at amd.com>
> >> ---
> >> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >> index f6aedf69c644..054a78207ffe 100644
> >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> >> @@ -94,7 +94,7 @@
> >> ((typeof(ptr_to_struct)) kzalloc(sizeof(*ptr_to_struct), GFP_KERNEL))
> >>
> >> #define KFD_MAX_NUM_OF_PROCESSES 512
> >> -#define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS 1024
> >> +#define KFD_MAX_NUM_OF_QUEUES_PER_PROCESS 4160
> > Doesn't this limit have more to do with the number of doorbells you
> > can fit into a 4K page? If you only allocate 4K for doorbells how can
> > you increase this?
>
> The doorbell area is allocated dynamically as multiple pages based on
> KFD_MAX_NUM_OF_QUEUES_PER_PROCESS in KFD. Currently, with this macro at
> 1024, 2 pages are allocated; after changing it to 4160, 9 pages will be
> allocated. Please refer to the function kfd_allocate_process_doorbells().
Thanks for the details. Since most apps don't use that many, it seems
like a waste of doorbells. Should this be limited to certain
partition modes?
Alex
>
> Thanks,
> Eric
>
> >
> > Alex
> >
> >> /*
> >> * Size of the per-process TBA+TMA buffer: 2 pages
> >> --
> >> 2.34.1
> >>
>
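The figures quoted in the thread (65x64=4160 queues, 2 pages at 1024, 9 pages
at 4160) can be checked with a small sketch. It assumes 8-byte doorbells and
4 KiB pages, which is consistent with the page counts given above but is not
stated in the thread. The names sketch_total_queues() and
sketch_doorbell_pages() are hypothetical helpers for illustration only; this
is not the code in kfd_allocate_process_doorbells().

```c
#include <stddef.h>

/* Assumptions (not stated in the thread): 4 KiB pages, 8-byte doorbells. */
#define SKETCH_PAGE_SIZE     4096u
#define SKETCH_DOORBELL_SIZE 8u

/* Total queues created by the test: queues per node times node count. */
static size_t sketch_total_queues(size_t queues_per_node, size_t nodes)
{
	return queues_per_node * nodes;
}

/* Pages needed for one doorbell per queue, rounded up to whole pages. */
static size_t sketch_doorbell_pages(size_t max_queues)
{
	size_t bytes = max_queues * SKETCH_DOORBELL_SIZE;

	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions, a limit of 1024 needs 8192 bytes (2 pages), while
4160 needs 33280 bytes, which rounds up to 9 pages. The extra 7 pages per
process are the doorbell cost raised in the reply above.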