[PATCH 0/2] Fit one IB struct amdgpu_job into a 512 byte slab
Tvrtko Ursulin
tvrtko.ursulin at igalia.com
Tue Feb 25 16:52:52 UTC 2025
On 24/02/2025 12:06, Tvrtko Ursulin wrote:
> A lot of the workloads create jobs with just one IB and if we re-order some
> struct members we can stop that allocation spilling into the 1k SLAB bucket.
>
> Before:
>
> sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib) = 480 + 40 = 520
>
> After:
>
> sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib) = 472 + 32 = 504
>
> It is not a huge gain in the big picture but every little helps.
FWIW it is also quite* possible to make two IB jobs fit into 512 by
converting booleans to flags and shrinking some fields:
/* size: 448, cachelines: 7, members: 24 */
/* forced alignments: 1 */
So 448 + 2 * 32 = 512!
That avoids spilling _any_ submissions, for example from Cyberpunk 2077,
into the 1k SLAB bucket.
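
The bucket arithmetic can be checked with a small sketch. It assumes the job and its IBs come from one allocation, as the sums in the cover letter suggest, and it models only power-of-two kmalloc buckets. pow2_bucket() and job_alloc_size() are hypothetical helpers for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: round an allocation size up to the
 * next power-of-two bucket. This ignores the 96 and 192 byte kmalloc caches
 * and any configuration-specific minimum cache size.
 */
static size_t pow2_bucket(size_t size)
{
	size_t bucket = 8;

	assert(size > 0);
	while (bucket < size)
		bucket <<= 1;
	return bucket;
}

/* Total bytes for one job followed by num_ibs IB structs. */
static size_t job_alloc_size(size_t job_size, size_t ib_size, size_t num_ibs)
{
	return job_size + num_ibs * ib_size;
}
```

With the numbers from this thread, job_alloc_size(480, 40, 1) is 520 and lands in the 1024 bucket, job_alloc_size(472, 32, 1) is 504 and lands in 512, and job_alloc_size(448, 32, 2) is exactly 512.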
*) I said quite because after I converted booleans to flags (which
required a u16 for 9 flags), and shrunk vmid and num_ibs to u8 and
job_run_counter to u16 (all of which seems completely fine), I still
needed just a tiny bit extra. So I shrank gws_size to u16. Being a size
in pages, that should also easily be large enough.
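
As a rough illustration of the kind of conversion described above, the sketch below packs nine booleans into a u16 and narrows the four named fields. Everything in it is hypothetical: the struct names, the flag numbering, the field order and the assumed 32-bit "before" widths are not taken from struct amdgpu_job.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before" layout: one bool per flag, assumed 32-bit fields. */
struct demo_wide {
	bool f0, f1, f2, f3, f4, f5, f6, f7, f8;
	uint32_t vmid;
	uint32_t num_ibs;
	uint32_t job_run_counter;
	uint32_t gws_size;
};

/* Hypothetical "after" layout: nine flag bits in a u16, narrower fields. */
#define DEMO_FLAG(n) ((uint16_t)(1u << (n)))

struct demo_packed {
	uint16_t flags;
	uint8_t vmid;
	uint8_t num_ibs;
	uint16_t job_run_counter;
	uint16_t gws_size;
};

/* Set or clear flag bit n (0..8 in this sketch). */
static inline void demo_set_flag(struct demo_packed *p, unsigned int n, bool on)
{
	if (on)
		p->flags |= DEMO_FLAG(n);
	else
		p->flags &= (uint16_t)~DEMO_FLAG(n);
}

/* Return true if flag bit n is set. */
static inline bool demo_test_flag(const struct demo_packed *p, unsigned int n)
{
	return (p->flags & DEMO_FLAG(n)) != 0;
}
```

On common ABIs the packed variant occupies 8 bytes, while the wide one needs 9 bytes of booleans plus padding plus four 32-bit fields. The real saving in struct amdgpu_job depends on where the fields sit relative to its existing holes, which pahole would show.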
Regards,
Tvrtko
> Tvrtko Ursulin (2):
> drm/amdgpu: Remove hole from struct amdgpu_ib
> drm/amdgpu: Reduce holes in struct amdgpu_job
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 19 ++++++++-----------
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++--
> 2 files changed, 10 insertions(+), 13 deletions(-)
>