[PATCH] drm/xe: Flush delayed frees and retry on user object allocation failure

Mon Mar 17 13:02:23 UTC 2025

Hi, Tvrtko,

On Fri, 2025-03-14 at 14:57 +0000, Tvrtko Ursulin wrote:
> If userspace workload is operating near the limit of available memory
> and
> it moves from one large working set to another, in other words free
> some
> large buffers and immediately allocate some new ones, the TTM
> eviction
> attempt during new resource allocation may not be effective due the
> combination of relying on trylock and the fact released buffer might
> have
> ended on the delayed release path.
> 
> From userspace point of view this reflects as sporadic
> VK_ERROR_OUT_OF_DEVICE_MEMORY ie. sometimes the application will
> work,
> sometimes will fail, even if it does exactly the same thing on an
> otherwise idle system.
> 
> Good examples are two tests from the VK CTS suite:
> 
>  - dEQP-VK.pipeline.monolithic.render_to_image.core.*
>  - dEQP-VK.memory.allocation.random*
> 
> To improve this we can flush the TTM delayed object free workqueue
> and
> retry when encountering ENOMEM, which so far looks like a significant
> improvement in mean-time-to-failure.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> ---
>  drivers/gpu/drm/xe/xe_bo.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 64f9c936eea0..53f45c766e59 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -1900,9 +1900,25 @@ struct xe_bo *xe_bo_create_user(struct
> xe_device *xe, struct xe_tile *tile,
>  				u16 cpu_caching,
>  				u32 flags)
>  {
> -	struct xe_bo *bo = __xe_bo_create_locked(xe, tile, vm, size,
> 0, ~0ULL,
> -						 cpu_caching,
> ttm_bo_type_device,
> -						 flags |
> XE_BO_FLAG_USER, 0);
> +	unsigned int retry = 3;
> +	struct xe_bo *bo;
> +
> +	while (retry--) {
> +		bo = __xe_bo_create_locked(xe, tile, vm, size, 0,
> ~0ULL,
> +					   cpu_caching,
> ttm_bo_type_device,
> +					   flags | XE_BO_FLAG_USER,
> 0);
> +		if (!IS_ERR(bo) || PTR_ERR(bo) != -ENOMEM)
> +			break;
> +
> +		/*
> +		 * TTM eviction may sporadically fail due reliance
> on trylock
> +		 * and delayed ttm_bo deletion causing trylock to
> fail. Work
> +		 * around it by retrying after currently pending
> delayed
> +		 * releases have been processed.
> +		 */
> +		flush_workqueue(xe->ttm.wq);
> +	}

We should use xe_vm_validate_should_retry() for this. (I think actually
I saw a patch for that floating around somewhere). IIRC the bos are
lockable also when in the delayed object free workqueue?

So could you try with that instead? If you think that doesn't help, we
should perhaps consider also flushing the we ttm wq from within that
function instead.

Thanks,
Thomas

> +
>  	if (!IS_ERR(bo))
>  		xe_bo_unlock_vm_held(bo);
>