[Intel-gfx] [PATCH 1/2] drm/i915/lrc: Only enable per-context and per-bb buffers if set

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Sep 21 15:10:51 UTC 2017


On 21/09/2017 14:54, Chris Wilson wrote:
> The per-context and per-batch workaround buffers are optional, yet we
> tell the GPU to execute them even if they contain no instructions. Doing
> so incurs the dispatch latency, which we can avoid if we don't ask the
> GPU to execute the no-op buffers. Allow ourselves to skip setup of empty
> buffer, and then to only enable non-empty buffers in the context image.
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 15 ++++++++++-----
>   1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 86fed9f1f1ae..297c9c1564e5 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1265,7 +1265,8 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
>   			ret = -EINVAL;
>   			break;
>   		}
> -		batch_ptr = wa_bb_fn[i](engine, batch_ptr);
> +		if (wa_bb_fn[i])
> +			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
>   		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
>   	}
>   
> @@ -1994,13 +1995,12 @@ static void execlists_init_reg_state(u32 *regs,
>   	CTX_REG(regs, CTX_SECOND_BB_HEAD_L, RING_SBBADDR(base), 0);
>   	CTX_REG(regs, CTX_SECOND_BB_STATE, RING_SBBSTATE(base), 0);
>   	if (rcs) {
> -		CTX_REG(regs, CTX_BB_PER_CTX_PTR, RING_BB_PER_CTX_PTR(base), 0);
> +		struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
> +
>   		CTX_REG(regs, CTX_RCS_INDIRECT_CTX, RING_INDIRECT_CTX(base), 0);
>   		CTX_REG(regs, CTX_RCS_INDIRECT_CTX_OFFSET,
>   			RING_INDIRECT_CTX_OFFSET(base), 0);
> -
> -		if (engine->wa_ctx.vma) {
> -			struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
> +		if (wa_ctx->indirect_ctx.size) {
>   			u32 ggtt_offset = i915_ggtt_offset(wa_ctx->vma);
>   
>   			regs[CTX_RCS_INDIRECT_CTX + 1] =
> @@ -2009,6 +2009,11 @@ static void execlists_init_reg_state(u32 *regs,
>   
>   			regs[CTX_RCS_INDIRECT_CTX_OFFSET + 1] =
>   				intel_lr_indirect_ctx_offset(engine) << 6;
> +		}
> +
> +		CTX_REG(regs, CTX_BB_PER_CTX_PTR, RING_BB_PER_CTX_PTR(base), 0);
> +		if (wa_ctx->per_ctx.size) {
> +			u32 ggtt_offset = i915_ggtt_offset(wa_ctx->vma);
>   
>   			regs[CTX_BB_PER_CTX_PTR + 1] =
>   				(ggtt_offset + wa_ctx->per_ctx.offset) | 0x01;
> 

Looks OK. Does it make sense to fetch the ggtt offset only once ("if 
(size || size) ggtt_offset = ...")? Meh.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko


More information about the Intel-gfx mailing list