[PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS

Wed Jun 28 13:05:32 UTC 2017

Am 28.06.2017 um 04:33 schrieb John Brooks:
> When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace,
> it should only be treated as a hint to initially place a BO somewhere CPU
> accessible, rather than having a permanent effect on BO placement.
>
> Instead of the flag being set in stone at BO creation, set the flag when a
> page fault occurs so that it goes somewhere CPU-visible, and clear it when
> the BO is requested by the GPU.
>
> However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to
> invisible VRAM, where they may promptly generate another page fault. When
> BOs are constantly moved back and forth like this, it is highly detrimental
> to performance. Only clear the flag on CS if:
>
> - The BO wasn't page faulted for a certain amount of time (currently 10
>    seconds), and
> - its last page fault didn't occur too soon (currently 500ms) after its
>    last CS request, or vice versa.
>
> Setting the flag in amdgpu_fault_reserve_notify() also means that we can
> remove the loop to restrict lpfn to the end of visible VRAM, because
> amdgpu_ttm_placement_init() will do it for us.

I'm fine with the general approach, but I'm still absolutely not keen 
about clearing the flag when userspace has originally specified it.

Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT 
or something like this.

Regards,
Christian.

>
> Signed-off-by: John Brooks <john at fastquake.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 45 +++++++++++++++++++++++-------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  1 +
>   3 files changed, 38 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 071b592..1ac3c2e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -341,6 +341,8 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
>   	if (bo->pin_count)
>   		return 0;
>   
> +	amdgpu_bo_clear_cpu_access_required(bo);
> +
>   	/* Don't move this buffer if we have depleted our allowance
>   	 * to move it. Don't move anything if the threshold is zero.
>   	 */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index b71775c..658d7b1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -943,8 +943,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>   {
>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
>   	struct amdgpu_bo *abo;
> -	unsigned long offset, size, lpfn;
> -	int i, r;
> +	unsigned long offset, size;
> +	int r;
>   
>   	if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
>   		return 0;
> @@ -967,15 +967,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
>   
>   	/* hurrah the memory is not visible ! */
>   	atomic64_inc(&adev->num_vram_cpu_page_faults);
> +	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>   	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
> -	lpfn =	adev->mc.visible_vram_size >> PAGE_SHIFT;
> -	for (i = 0; i < abo->placement.num_placement; i++) {
> -		/* Force into visible VRAM */
> -		if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
> -		    (!abo->placements[i].lpfn ||
> -		     abo->placements[i].lpfn > lpfn))
> -			abo->placements[i].lpfn = lpfn;
> -	}
> +
>   	r = ttm_bo_validate(bo, &abo->placement, false, false);
>   	if (unlikely(r == -ENOMEM)) {
>   		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
> @@ -1033,3 +1027,34 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
>   
>   	return bo->tbo.offset;
>   }
> +
> +/**
> + * amdgpu_bo_clear_cpu_access_required
> + * @bo: BO to update
> + *
> + * Clears CPU_ACCESS_REQUIRED flag if the BO hasn't had a page fault in a while
> + * and it didn't have a page fault too soon after the last time it was moved to
> + * VRAM.
> + *
> + * Caller should have bo reserved.
> + *
> + */
> +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo)
> +{
> +	const unsigned int page_fault_timeout_ms = 10000;
> +	const unsigned int min_period_ms = 500;
> +	unsigned int ms_since_pf, period_ms;
> +
> +	ms_since_pf = jiffies_to_msecs(jiffies - bo->last_page_fault_jiffies);
> +	period_ms = jiffies_to_msecs(abs(bo->last_page_fault_jiffies -
> +					 bo->last_cs_move_jiffies));
> +
> +	/*
> +	 * Try to avoid a revolving door between GTT and VRAM. Clearing the
> +	 * flag may move this BO back to VRAM, so don't clear it if it's likely
> +	 * to page fault and go right back to GTT.
> +	 */
> +	if ((!bo->last_page_fault_jiffies || !bo->last_cs_move_jiffies) ||
> +	    (ms_since_pf > page_fault_timeout_ms && period_ms > min_period_ms))
> +		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 3824851..b0cb137 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -182,6 +182,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
>   				  struct reservation_object *resv,
>   				  struct dma_fence **fence,
>   				  bool direct);
> +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo);
>   
>   
>   /*