[PATCH 0/2] Fit one IB struct amdgpu_job into a 512 byte slab

Tvrtko Ursulin tvrtko.ursulin at igalia.com
Tue Feb 25 16:52:52 UTC 2025


On 24/02/2025 12:06, Tvrtko Ursulin wrote:
> A lot of the workloads create jobs with just one IB and if we re-order some
> struct members we can stop that allocation spilling into the 1k SLAB bucket.
> 
> Before:
> 
>    sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib) = 480 + 40 = 520
> 
> After:
> 
>    sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib) = 472 + 32 = 504
> 
> It is not a huge gain in the big picture but every little helps.
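
For anyone unfamiliar with how re-ordering removes holes, here is a minimal userspace sketch. The struct and field names are hypothetical and are not the real struct amdgpu_ib layout; the sizes in the comments assume a typical 64-bit ABI where uint64_t is 8 byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a hole, not the real struct amdgpu_ib. */
struct ib_with_hole {
	uint32_t length_dw;	/* 4 bytes, typically followed by a 4 byte hole */
	uint64_t gpu_addr;	/* wants 8 byte alignment on most 64-bit ABIs */
	uint32_t flags;		/* 4 bytes, typically followed by tail padding */
};

/* Same members, largest first, so nothing needs padding. */
struct ib_reordered {
	uint64_t gpu_addr;
	uint32_t length_dw;
	uint32_t flags;
};

/* Re-ordering can only keep the size the same or shrink it here. */
static_assert(sizeof(struct ib_reordered) <= sizeof(struct ib_with_hole),
	      "re-ordering should not grow the struct");

/* Bytes saved per object by the re-ordering (8 on typical x86-64). */
size_t ib_bytes_saved(void)
{
	return sizeof(struct ib_with_hole) - sizeof(struct ib_reordered);
}
```

Running pahole on the real structs is the authoritative way to see the holes; the above only illustrates the mechanism.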

FWIW it is also quite* possible to make two-IB jobs fit into 512 bytes by 
converting booleans to flags and shrinking some fields:

             /* size: 448, cachelines: 7, members: 24 */
             /* forced alignments: 1 */

So 448 + 2 * 32 = 512 !

That avoids spilling _any_ submissions, for example from Cyberpunk 2077, 
into the 1k SLAB bucket.
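
To make the bucket arithmetic concrete, a rough sketch follows. It models the kmalloc size classes as plain powers of two, which is a simplification (the real allocator also has 96 and 192 byte caches, and the minimum size is arch dependent), and slab_bucket() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of kmalloc bucket selection: round the request up to
 * the next power of two. Ignores the 96 and 192 byte caches.
 */
size_t slab_bucket(size_t size)
{
	size_t bucket = 8;

	while (bucket < size)
		bucket <<= 1;

	return bucket;
}

/* The allocation sizes discussed in this thread. */
void slab_bucket_examples(void)
{
	assert(slab_bucket(480 + 40) == 1024);		/* before: one IB spills */
	assert(slab_bucket(472 + 32) == 512);		/* after: one IB fits */
	assert(slab_bucket(448 + 2 * 32) == 512);	/* two IBs fit exactly */
}
```

The point being that 520 bytes wastes almost half of a 1k object, while 504 and 512 both land in the 512 byte bucket.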

*) I said quite because after I converted booleans to flags, which 
required a u16 for 9 flags, and shrunk vmid and num_ibs to u8 and 
job_run_counter to u16 (all of which seems completely fine), I still 
needed just a tiny bit extra. So I shrank gws_size to u16. Being a size 
in pages, that could also easily be large enough.
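
A minimal sketch of the idea, only for illustration. The structs below are hypothetical cut-down stand-ins and not the real struct amdgpu_job; the "before" field widths are assumed to be 32-bit, and the flag helpers are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before": nine separate booleans and 32-bit fields. */
struct job_bools {
	bool		flag[9];	/* one byte each plus padding */
	uint32_t	vmid;
	uint32_t	num_ibs;
	uint32_t	job_run_counter;
	uint32_t	gws_size;
};

/* Hypothetical "after": nine flags in a u16 and narrower fields. */
struct job_flags {
	uint16_t	flags;		/* 9 flags need more than a u8 */
	uint16_t	job_run_counter;
	uint16_t	gws_size;	/* in pages, so up to 65535 pages */
	uint8_t		vmid;
	uint8_t		num_ibs;
};

/* Set flag number n (0..8 in this sketch). */
void job_set_flag(struct job_flags *job, unsigned int n)
{
	job->flags |= (uint16_t)(1u << n);
}

/* Test flag number n. */
bool job_test_flag(const struct job_flags *job, unsigned int n)
{
	return (job->flags & (1u << n)) != 0;
}
```

In this toy version the flags variant is 8 bytes, against typically 28 for the booleans variant. Whether the narrower fields are wide enough for the real driver is exactly the open question for gws_size.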

Regards,

Tvrtko

> Tvrtko Ursulin (2):
>    drm/amdgpu: Remove hole from struct amdgpu_ib
>    drm/amdgpu: Reduce holes in struct amdgpu_job
> 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h  | 19 ++++++++-----------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 ++--
>   2 files changed, 10 insertions(+), 13 deletions(-)
> 


