[PATCH 5/6] drm/amdgpu: always enable move threshold for BOs
Tvrtko Ursulin
tursulin at ursulin.net
Tue Jun 11 16:24:00 UTC 2024
Hi Christian,
On 04/06/2024 17:05, Christian König wrote:
> This should prevent buffer moves when the threshold is reached during
> CS.
>
> Signed-off-by: Christian König <christian.koenig at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 36 ++++++++--------------
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 22 +++++++++----
> 2 files changed, 29 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index ec888fc6ead8..9a217932a4fc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -784,7 +784,6 @@ static int amdgpu_cs_bo_validate(void *param, struct amdgpu_bo *bo)
> .no_wait_gpu = false,
> .resv = bo->tbo.base.resv
> };
> - uint32_t domain;
> int r;
>
> if (bo->tbo.pin_count)
> @@ -796,37 +795,28 @@ static int amdgpu_cs_bo_validate(void *param, struct amdgpu_bo *bo)
> if (p->bytes_moved < p->bytes_moved_threshold &&
> (!bo->tbo.base.dma_buf ||
> list_empty(&bo->tbo.base.dma_buf->attachments))) {
> +
> + /* And don't move a CPU_ACCESS_REQUIRED BO to limited
> + * visible VRAM if we've depleted our allowance to do
> + * that.
> + */
> if (!amdgpu_gmc_vram_full_visible(&adev->gmc) &&
> - (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) {
> - /* And don't move a CPU_ACCESS_REQUIRED BO to limited
> - * visible VRAM if we've depleted our allowance to do
> - * that.
> - */
> - if (p->bytes_moved_vis < p->bytes_moved_vis_threshold)
> - domain = bo->preferred_domains;
> - else
> - domain = bo->allowed_domains;
> - } else {
> - domain = bo->preferred_domains;
> - }
> - } else {
> - domain = bo->allowed_domains;
> + (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) &&
> + p->bytes_moved_vis < p->bytes_moved_vis_threshold)
> + ctx.move_threshold = p->bytes_moved_vis_threshold -
> + p->bytes_moved_vis;
> + else
> + ctx.move_threshold = p->bytes_moved_vis_threshold -
> + p->bytes_moved;
> }
>
> -retry:
> - amdgpu_bo_placement_from_domain(bo, domain);
> + amdgpu_bo_placement_from_domain(bo, bo->allowed_domains);
> r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>
> p->bytes_moved += ctx.bytes_moved;
> if (!amdgpu_gmc_vram_full_visible(&adev->gmc) &&
> amdgpu_res_cpu_visible(adev, bo->tbo.resource))
> p->bytes_moved_vis += ctx.bytes_moved;
> -
> - if (unlikely(r == -ENOMEM) && domain != bo->allowed_domains) {
> - domain = bo->allowed_domains;
> - goto retry;
> - }
> -
> return r;
> }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 8c92065c2d52..cae1a5420c58 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -168,13 +168,23 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain)
> abo->flags & AMDGPU_GEM_CREATE_PREEMPTIBLE ?
> AMDGPU_PL_PREEMPT : TTM_PL_TT;
> places[c].flags = 0;
> - /*
> - * When GTT is just an alternative to VRAM make sure that we
> - * only use it as fallback and still try to fill up VRAM first.
> - */
> +
> if (domain & abo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM &&
> - !(adev->flags & AMD_IS_APU))
> - places[c].flags |= TTM_PL_FLAG_FALLBACK;
> + !(adev->flags & AMD_IS_APU)) {
> + /*
> + * When GTT is just an alternative to VRAM make sure that we
> + * only use it as fallback and still try to fill up VRAM first.
> + */
> + if (abo->preferred_domains & AMDGPU_GEM_DOMAIN_GTT)
> + places[c].flags |= TTM_PL_FLAG_FALLBACK;
> +
> + /*
> + * Enable GTT when the threshold of moved bytes is
> + * reached. This prevents any non essential buffer move
> + * when the links are already saturated.
> + */
> + places[c].flags |= TTM_PL_FLAG_MOVE_THRESHOLD;
> + }
For the APU case I *think* this works, but for discrete I am not sure yet.
As a side note and disclaimer, the TTM "resource compatible" logic has a
half-life of about one week in my brain until I need to almost re-figure
it all out. I don't know if it just me, but I find it really
non-intuitive and almost like double, triple, or even quadruple negation
way of thinking about things.
It is not helping that with this proposal you set threshold on just one
of the possible object placements which further increases the asymmetry.
For me intuitive thing would be that thresholds apply to the act of
changing the current placement directly. Not indirectly via playing with
one of the placement flags dynamically.
Anyway, lets see.. So you set TTM_PL_FLAG_MOVE_THRESHOLD and
TTM_PL_FLAG_FALLBACK on the GTT placement, with the logic that it will
be considered compatible while under the migration budget?
(Side note, the fact both flags are set I also find very difficult to
mentally model.)
Say a buffer was evicted to GTT already. What then brings it back to VRAM?
The first subsequent ttm_bo_validate pass (!evicting) says GTT is fine
(applicable) while ctx->bytes_moved < ctx->move_threshold, no? Isn't
that the opposite of what would be required and causes nothing to be
migrated back in? What am I missing?
Regards,
Tvrtko
More information about the amd-gfx
mailing list