[RFC v4 00/12] Ring padding micro-optimisation etc

Tvrtko Ursulin tursulin at igalia.com
Fri Dec 27 11:19:26 UTC 2024


From: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>

[Re-send after fdo mailman got recovered out of the -ENOSPC state.]

There are a few ideas in this series and not all of them might stick.

Trivial stuff aside, the two main things to highlight are:

1) The departure from the existing state of "duplicate everything" by
consolidating some SDMA insert nop vfuncs.

2) Conversion of amdgpu_ring_write() to variadic to allow for more compact
compiled code.

For the latter I have only included VCE, GFX v10.0 and SDMA v5.2 as examples.
(But note the code shrink is already noticeable even with only those three.)

But it is churny and looks different so people may not like it. TBD.
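
To illustrate the general shape of a variadic ring write, here is a minimal
userspace sketch. It is not code from the series: struct demo_ring,
demo_ring_write_multi() and the demo_ring_write() macro are hypothetical
names, and the C99 compound literal is only one possible way to collect the
arguments. The point it shows is that several dwords are emitted with a single
wrap check and a single write pointer update, instead of one per dword.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy ring buffer, NOT the real struct amdgpu_ring. */
struct demo_ring {
	uint32_t *buf;
	uint32_t size_dw;	/* ring size in dwords, must be a power of two */
	uint32_t wptr;		/* write position in dwords, always < size_dw */
};

/*
 * Copy count dwords into the ring, splitting the copy at the end of the
 * buffer if needed. Assumes count <= size_dw. The write pointer is masked
 * and stored once per call rather than once per dword.
 */
static void demo_ring_write_multi(struct demo_ring *ring,
				  const uint32_t *src, size_t count)
{
	size_t first = ring->size_dw - ring->wptr;

	if (first > count)
		first = count;

	memcpy(&ring->buf[ring->wptr], src, first * sizeof(*src));
	memcpy(ring->buf, src + first, (count - first) * sizeof(*src));

	ring->wptr = (ring->wptr + count) & (ring->size_dw - 1);
}

/*
 * Hypothetical variadic front end: the arguments are gathered into an
 * anonymous array and the count is derived at compile time. sizeof does not
 * evaluate its operand here, so each argument is evaluated only once.
 */
#define demo_ring_write(ring, ...) \
	demo_ring_write_multi((ring), (const uint32_t[]){ __VA_ARGS__ }, \
		sizeof((const uint32_t[]){ __VA_ARGS__ }) / sizeof(uint32_t))
```

With this, a call site such as demo_ring_write(ring, header, lo, hi) replaces
three separate single-dword writes.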

Other than those two, the remaining general idea of the series is to consolidate
the padding approach to memset32, especially on rings with 64 or 256 dword
alignment.
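
As a rough userspace sketch of that padding idea (again not code from the
series): fill32() below is a hypothetical stand-in for the kernel's
memset32(), and struct toy_ring and toy_ring_pad() are made-up names. It
assumes both the ring size and the alignment are powers of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's memset32(); hypothetical helper. */
static void fill32(uint32_t *dst, uint32_t value, size_t count)
{
	while (count--)
		*dst++ = value;
}

/* Toy ring buffer, NOT the real struct amdgpu_ring. */
struct toy_ring {
	uint32_t *buf;
	uint32_t size_dw;	/* ring size in dwords, must be a power of two */
	uint32_t wptr;		/* write position in dwords, always < size_dw */
};

/*
 * Pad the ring with nop dwords up to the next multiple of align_dw (a power
 * of two). The number of nops is computed once and written with at most two
 * bulk fills, instead of looping over single-dword writes. Returns the number
 * of dwords written.
 */
static uint32_t toy_ring_pad(struct toy_ring *ring, uint32_t align_dw,
			     uint32_t nop)
{
	uint32_t mask = align_dw - 1;
	uint32_t count = (align_dw - (ring->wptr & mask)) & mask;
	uint32_t first = ring->size_dw - ring->wptr;

	if (first > count)
		first = count;

	/* Second fill only matters if the span crosses the end of the buffer. */
	fill32(&ring->buf[ring->wptr], nop, first);
	fill32(ring->buf, nop, count - first);

	ring->wptr = (ring->wptr + count) & (ring->size_dw - 1);

	return count;
}
```

The larger the alignment (64 or 256 dwords), the more a single bulk fill
should help compared to a per-dword loop, since up to align_dw - 1 nops may
be needed.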

Binary size comparison:

    text    data     bss     dec     hex filename
 10439777   542501  188232 11170510   aa72ce amdgpu.ko.before
 10412793   542609  188232 11143634   aa09d2 amdgpu.ko.after

Cc: Christian König <christian.koenig at amd.com>
Cc: Sunil Khatri <sunil.khatri at amd.com>

Tvrtko Ursulin (12):
  drm/amdgpu: Use memset32 for IB padding
  drm/amdgpu: Use memset32 for ring clearing
  drm/amdgpu: Cache SDMA instance and index in the ring
  drm/amdgpu: Consolidate a bunch of similar sdma insert nop vfuncs
  drm/amdgpu: Use memset32 for sdma insert nops
  drm/amdgpu: Use amdgpu_ring_fill() for VPE padding
  drm/amdgpu: Convert JPEG, VCE and UVD to more efficient ring padding
  drm/amdgpu: Cache some values in ring emission helpers
  drm/amdgpu: Optimise amdgpu_ring_write()
  drm/amdgpu: Convert VCE to variadic amdgpu_ring_write()
  drm/amdgpu: Convert GFX v10.0 to variadic amdgpu_ring_write()
  drm/amdgpu: Convert SDMA v5.2 to variadic amdgpu_ring_write()

 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |  32 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 321 +++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c |  43 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c  |  22 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c  |  13 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c    |   4 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 399 ++++++++++++-----------
 drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c   |   8 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c   |   8 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c |   8 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c   |  24 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c   |  24 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   |  31 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c |  31 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c   |  28 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c   | 182 +++++------
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c   |  31 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c   |  31 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c    |   7 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c    |   7 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c    |   7 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c    |   7 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c    |   9 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c    |   8 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c    |   7 +-
 26 files changed, 745 insertions(+), 551 deletions(-)

-- 
2.47.1


