[PATCH 0/4] Ring padding CPU optimisation and some RFC bits
Christian König
ckoenig.leichtzumerken at gmail.com
Tue Oct 8 18:10:45 UTC 2024
On 08.10.24 at 17:05, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
>
> I've noticed the hardware ring padding optimisations have landed so I decided
> to respin the CPU side optimisations.
>
> First two patches are simply adding ring fill helpers which deal with reducing
> the CPU cost of emitting hundreds of nops from the for-amdgpu_ring_write loops.
>
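Illustration (not part of the posted series): a minimal userspace sketch of the idea behind a ring fill helper, assuming a power-of-two ring. The names mock_ring, mock_ring_write, mock_ring_pad_loop, mock_fill32 and mock_ring_pad_fill are hypothetical stand-ins, not the amdgpu API. The loop version masks and stores once per dword, while the fill version does at most two contiguous fills around the wrap point.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_RING_DW 16u /* hypothetical ring size in dwords, power of two */

/* Hypothetical stand-in for the driver's ring; not the real layout. */
struct mock_ring {
	uint32_t buf[MOCK_RING_DW];
	uint32_t wptr; /* write pointer in dwords, always < MOCK_RING_DW */
	uint32_t nop;  /* value used as the padding packet */
};

/* Baseline: one masked store per dword. */
static void mock_ring_write(struct mock_ring *ring, uint32_t v)
{
	ring->buf[ring->wptr] = v;
	ring->wptr = (ring->wptr + 1) & (MOCK_RING_DW - 1);
}

/* Padding via a per-dword write loop. */
static void mock_ring_pad_loop(struct mock_ring *ring, uint32_t count)
{
	for (uint32_t i = 0; i < count; i++)
		mock_ring_write(ring, ring->nop);
}

/* Contiguous fill; in the kernel, memset32() could play this role. */
static void mock_fill32(uint32_t *dst, uint32_t v, uint32_t n)
{
	while (n--)
		*dst++ = v;
}

/* Padding via at most two contiguous fills, split at the wrap point. */
static void mock_ring_pad_fill(struct mock_ring *ring, uint32_t count)
{
	uint32_t first;

	assert(count <= MOCK_RING_DW);

	first = MOCK_RING_DW - ring->wptr; /* dwords left before the wrap */
	if (first > count)
		first = count;

	mock_fill32(&ring->buf[ring->wptr], ring->nop, first);
	mock_fill32(&ring->buf[0], ring->nop, count - first);
	ring->wptr = (ring->wptr + count) & (MOCK_RING_DW - 1);
}
```

Both paths leave the ring in the same state; the second only moves the wrap handling out of the inner loop. Whether the actual patches are structured this way is not shown in this mail.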
> If you are receptive to the idea, please double-check that I preserved the
> endianness behaviour as is.
I'm pretty sure that the endianness handling was already broken before, or
at least relies on HW features which are no longer guaranteed to work.
Sunil has already committed a set which does mostly the same as this
one. The only things missing are the improvements for the IB patching and
a bunch of things I've been working on recently.
I'm going to send those out in a minute. It would be great if you could run
some performance analysis on those patches as well, since you already seem
to have the setup for that.
Thanks,
Christian.
>
> The last two patches are new and RFC. Both are incomplete conversions to two
> new helpers intended to deal with an often-repeated pattern of:
>
> - amdgpu_ring_write(ring, lower_32_bits(addr));
> - amdgpu_ring_write(ring, upper_32_bits(addr));
> + amdgpu_ring_write_addr(ring, addr);
>
> Last patch is the most uncertain one where there seems to be some magic bit
> used only on big endian. It has no name so I couldn't figure out what it was
> about.
>
> - amdgpu_ring_write(ring,
> -#ifdef __BIG_ENDIAN
> - (2 << 0) |
> -#endif
> - lower_32_bits(ib->gpu_addr));
> - amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr));
> + amdgpu_ring_write_addr_xbe(ring, ib->gpu_addr);
>
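Illustration (not part of the posted series): a minimal userspace sketch of the two address helpers quoted above. All mock_* names are hypothetical, and the big_endian parameter is an assumption made so both variants can be exercised in one build; the quoted diff selects the extra bit at compile time with #ifdef __BIG_ENDIAN. What the bit means to the hardware is not established in this thread, so the sketch only reproduces the arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_RING_DW 8u /* hypothetical ring size in dwords */

/* Hypothetical stand-in for the driver's ring; no wrap handling here. */
struct mock_ring {
	uint32_t buf[MOCK_RING_DW];
	uint32_t wptr;
};

static void mock_ring_write(struct mock_ring *ring, uint32_t v)
{
	assert(ring->wptr < MOCK_RING_DW);
	ring->buf[ring->wptr++] = v;
}

/* Userspace equivalents of the kernel's lower_32_bits()/upper_32_bits(). */
static uint32_t mock_lower_32_bits(uint64_t v)
{
	return (uint32_t)v;
}

static uint32_t mock_upper_32_bits(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Emit a 64-bit address as two dwords, low half first. */
static void mock_ring_write_addr(struct mock_ring *ring, uint64_t addr)
{
	mock_ring_write(ring, mock_lower_32_bits(addr));
	mock_ring_write(ring, mock_upper_32_bits(addr));
}

/*
 * Same as above, but OR (2 << 0) into the low dword when big_endian is
 * set, mirroring the #ifdef __BIG_ENDIAN branch in the quoted diff.
 */
static void mock_ring_write_addr_xbe(struct mock_ring *ring, uint64_t addr,
				     bool big_endian)
{
	uint32_t lo = mock_lower_32_bits(addr);

	if (big_endian)
		lo |= (2u << 0);

	mock_ring_write(ring, lo);
	mock_ring_write(ring, mock_upper_32_bits(addr));
}
```

A call site then shrinks from two mock_ring_write() calls to one helper call, which is the readability trade-off the cover letter asks about.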
> Anyway, both patterns have a lot of users, so reductions in source code and
> binary size aside, the main question is whether these kinds of helpers improve
> readability or make it worse.
>
> (Note that the _xbe name in the last patch is just a placeholder.)
>
> Cc: Christian König <ckoenig.leichtzumerken at gmail.com>
> Cc: Sunil Khatri <sunil.khatri at amd.com>
>
> Tvrtko Ursulin (4):
> drm/amdgpu: More efficient ring padding
> drm/amdgpu: More more efficient ring padding
> drm/amdgpu: Add and use amdgpu_ring_write_addr() helper
> drm/amdgpu: Document the magic big endian bit
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 19 ++++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 101 +++++++++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 6 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c | 25 +++---
> drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 27 +++---
> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 66 +++++----------
> drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 60 +++++---------
> drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 45 ++++------
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 63 +++++---------
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 48 ++++-------
> drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c | 8 +-
> drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 8 +-
> drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 8 +-
> drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 16 ++--
> drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 7 +-
> drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 7 +-
> drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c | 7 +-
> drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 7 +-
> drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 9 +-
> drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 8 +-
> drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 7 +-
> 28 files changed, 319 insertions(+), 345 deletions(-)
>