[Intel-gfx] [PATCH 3/3] drm/i915/gem: Try an alternate engine for relocations

Fri May 1 12:47:36 UTC 2020

On 01/05/2020 11:19, Chris Wilson wrote:
> If at first we don't succeed, try try again.
> 
> No all engines may support the MI ops we need to perform asynchronous
> relocation patching, and so we end up failing back to a synchronous
> operation that has a liability of blocking. However, Tvrtko pointed out
> we don't need to use the same engine to perform the relocations as we
> are planning to execute the execbuf on, and so if we switch over to a
> working engine, we can perform the relocation asynchronously. The user
> execbuf will be queued after the relocations by virtue of fencing.
> 
> This patch creates a new context per execbuf requiring asynchronous
> relocations on an unusable engines. This is perhaps a bit excessive and
> can be amoriated by a small context cache, but for the moment we only
> need it for working around a little used engine on Sandybridge, and only
> if relocations are actually required.
> 
> Now we just need to teach the relocation code to handle physical
> addressing for gen2/3, and we should then have universal support!
> 
> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Testcase: igt/gem_exec_reloc/basic-spin # snb
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 32 ++++++++++++++++---
>   1 file changed, 27 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index b224a453e2a3..6d649de3a796 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -1280,6 +1280,7 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
>   }
>   
>   static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
> +			     struct intel_engine_cs *engine,
>   			     unsigned int len)
>   {
>   	struct reloc_cache *cache = &eb->reloc_cache;
> @@ -1289,7 +1290,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
>   	u32 *cmd;
>   	int err;
>   
> -	pool = intel_gt_get_buffer_pool(eb->engine->gt, PAGE_SIZE);
> +	pool = intel_gt_get_buffer_pool(engine->gt, PAGE_SIZE);
>   	if (IS_ERR(pool))
>   		return PTR_ERR(pool);
>   
> @@ -1312,7 +1313,23 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
>   	if (err)
>   		goto err_unmap;
>   
> -	rq = i915_request_create(eb->context);
> +	if (engine == eb->context->engine) {
> +		rq = i915_request_create(eb->context);
> +	} else {
> +		struct intel_context *ce;
> +
> +		ce = intel_context_create(engine);
> +		if (IS_ERR(ce)) {
> +			err = PTR_ERR(rq);
> +			goto err_unpin;
> +		}
> +
> +		i915_vm_put(ce->vm);
> +		ce->vm = i915_vm_get(eb->context->vm);
> +
> +		rq = intel_context_create_request(ce);
> +		intel_context_put(ce);
> +	}
>   	if (IS_ERR(rq)) {
>   		err = PTR_ERR(rq);
>   		goto err_unpin;
> @@ -1363,10 +1380,15 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
>   	int err;
>   
>   	if (unlikely(!cache->rq)) {
> -		if (!intel_engine_can_store_dword(eb->engine))
> -			return ERR_PTR(-ENODEV);
> +		struct intel_engine_cs *engine = eb->engine;
> +
> +		if (!intel_engine_can_store_dword(engine)) {
> +			engine = engine->gt->engine_class[COPY_ENGINE_CLASS][0];
> +			if (!engine || !intel_engine_can_store_dword(engine))
> +				return ERR_PTR(-ENODEV);
> +		}
>   
> -		err = __reloc_gpu_alloc(eb, len);
> +		err = __reloc_gpu_alloc(eb, engine, len);
>   		if (unlikely(err))
>   			return ERR_PTR(err);
>   	}
> 

If you are not worried about the context create dance on SNB, and it is 
limited to VCS, then neither am I.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko