[RFC v4 00/12] Ring padding micro-optimisation etc
Tvrtko Ursulin
tursulin at igalia.com
Fri Dec 27 11:19:26 UTC 2024
From: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
[Re-send after fdo mailman got recovered out of the -ENOSPC state.]
There are a few ideas in this series and not all of them might stick.
Trivial stuff aside, the two main things to highlight are:
1) The departure from the existing state of "duplicate everything" by
consolidating some SDMA insert nop vfuncs.
2) Conversion of amdgpu_ring_write() to variadic to allow for more compact
compiled code.
For the latter I have only included VCE, GFX v10.0 and SDMA v5.2 as examples.
(But note the code shrink is already noticeable even with only those three.)
But it is churny and looks different so people may not like it. TBD.
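To illustrate the variadic idea outside of the driver, here is a minimal user-space sketch. It is not the code from the series: struct toy_ring, toy_ring_write_n() and the toy_ring_write() macro are hypothetical names, and the ring is assumed to have a power-of-two size. The point it shows is that a multi-dword write can load and store the write pointer once per call instead of once per dword, which is the kind of thing that can let the compiler emit more compact code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_DWORDS 16u /* assumed power of two */

/* Hypothetical stand-in for a ring; not the amdgpu structure. */
struct toy_ring {
	uint32_t buf[TOY_RING_DWORDS];
	uint32_t wptr; /* write pointer, in dwords */
	uint32_t mask; /* TOY_RING_DWORDS - 1 */
};

/* Write n dwords, touching ring->wptr only once on each side. */
static inline void toy_ring_write_n(struct toy_ring *ring,
				    const uint32_t *v, size_t n)
{
	uint32_t wptr = ring->wptr; /* single load */
	size_t i;

	assert(n <= TOY_RING_DWORDS);

	for (i = 0; i < n; i++)
		ring->buf[(wptr + i) & ring->mask] = v[i];

	ring->wptr = (uint32_t)(wptr + n) & ring->mask; /* single store */
}

/*
 * Variadic front end: the arguments become a compound literal array
 * and sizeof() yields the dword count at compile time.
 */
#define toy_ring_write(ring, ...) \
	toy_ring_write_n((ring), (const uint32_t[]){ __VA_ARGS__ }, \
			 sizeof((const uint32_t[]){ __VA_ARGS__ }) / \
			 sizeof(uint32_t))
```

With this, toy_ring_write(&ring, a, b, c) replaces three separate single-dword calls. The compound literal approach is just one way to build a variadic front end in C99 and may differ from what the patches actually do.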
Other than those two, the remaining general idea of the series is to consolidate
the padding approach to memset32, especially on rings with 64 or 256 dword
alignment.
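The padding idea can be sketched in user space as follows. This is an assumption-laden illustration, not the driver code: fill32() stands in for the kernel's memset32(), and struct pad_ring, pad_ring_align() and the NOP value used by a caller are hypothetical. It assumes the ring size is a power of two and a multiple of the alignment, so the padded run never straddles the wrap point and a single fill is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's memset32(). */
static void fill32(uint32_t *dst, uint32_t v, size_t n)
{
	while (n--)
		*dst++ = v;
}

/* Hypothetical ring description; not the amdgpu structure. */
struct pad_ring {
	uint32_t *buf;
	uint32_t size;       /* in dwords, assumed power of two */
	uint32_t wptr;       /* write pointer, in dwords */
	uint32_t align_mask; /* alignment in dwords, minus one */
	uint32_t nop;        /* NOP packet value used for padding */
};

/* Pad up to the next aligned boundary; returns dwords written. */
static uint32_t pad_ring_align(struct pad_ring *ring)
{
	uint32_t align = ring->align_mask + 1;
	uint32_t count = (align - (ring->wptr & ring->align_mask)) &
			 ring->align_mask;

	/* The alignment divides the size, so the run is contiguous. */
	assert(ring->size % align == 0);
	assert(ring->wptr < ring->size);

	/* One bulk fill instead of one ring write per NOP dword. */
	fill32(&ring->buf[ring->wptr], ring->nop, count);
	ring->wptr = (ring->wptr + count) & (ring->size - 1);

	return count;
}
```

The larger the alignment (64 or 256 dwords), the more per-dword writes a single bulk fill can replace.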
Binary size comparison:
    text    data     bss       dec     hex  filename
10439777  542501  188232  11170510  aa72ce  amdgpu.ko.before
10412793  542609  188232  11143634  aa09d2  amdgpu.ko.after
Cc: Christian König <christian.koenig at amd.com>
Cc: Sunil Khatri <sunil.khatri at amd.com>
Tvrtko Ursulin (12):
drm/amdgpu: Use memset32 for IB padding
drm/amdgpu: Use memset32 for ring clearing
drm/amdgpu: Cache SDMA instance and index in the ring
drm/amdgpu: Consolidate a bunch of similar sdma insert nop vfuncs
drm/amdgpu: Use memset32 for sdma insert nops
drm/amdgpu: Use amdgpu_ring_fill() for VPE padding
drm/amdgpu: Convert JPEG, VCE and UVD to more efficient ring padding
drm/amdgpu: Cache some values in ring emission helpers
drm/amdgpu: Optimise amdgpu_ring_write()
drm/amdgpu: Convert VCE to variadic amdgpu_ring_write()
drm/amdgpu: Convert GFX v10.0 to variadic amdgpu_ring_write()
drm/amdgpu: Convert SDMA v5.2 to variadic amdgpu_ring_write()
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 32 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 321 +++++++++++++++++-
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 43 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 4 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 22 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c | 13 +-
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 +-
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 399 ++++++++++++-----------
drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c | 8 +-
drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 8 +-
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 8 +-
drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 24 +-
drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 24 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 31 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 31 +-
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 28 +-
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 182 +++++------
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 31 +-
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 31 +-
drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 7 +-
drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 7 +-
drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c | 7 +-
drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 7 +-
drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 9 +-
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 8 +-
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 7 +-
26 files changed, 745 insertions(+), 551 deletions(-)
--
2.47.1