[Intel-gfx] [PATCH 2/2] drm/i915/execlists: Skip forcewake for ELSP submission

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Jan 22 09:50:57 UTC 2018


On 20/01/2018 09:31, Chris Wilson wrote:
> Now that we can read the CSB from the HWSP, we may avoid having to
> perform mmio reads entirely and so forgo the rigmarole of the forcewake
> dance.
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ff25f209d0a5..7d2df72e68d3 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -778,6 +778,7 @@ static void execlists_submission_tasklet(unsigned long data)
>   	struct intel_engine_execlists * const execlists = &engine->execlists;
>   	struct execlist_port * const port = execlists->port;
>   	struct drm_i915_private *dev_priv = engine->i915;
> +	bool fw = false;
>   
>   	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
>   	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
> @@ -788,8 +789,6 @@ static void execlists_submission_tasklet(unsigned long data)
>   	 */
>   	GEM_BUG_ON(!dev_priv->gt.awake);
>   
> -	intel_uncore_forcewake_get(dev_priv, execlists->fw_domains);
> -
>   	/* Prefer doing test_and_clear_bit() as a two stage operation to avoid
>   	 * imposing the cost of a locked atomic transaction when submitting a
>   	 * new request (outside of the context-switch interrupt).
> @@ -818,6 +817,12 @@ static void execlists_submission_tasklet(unsigned long data)
>   		 */
>   		__clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
>   		if (unlikely(execlists->csb_head == -1)) { /* following a reset */
> +			if (!fw) {
> +				intel_uncore_forcewake_get(dev_priv,
> +							   execlists->fw_domains);
> +				fw = true;
> +			}
> +
>   			head = readl(dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine)));
>   			tail = GEN8_CSB_WRITE_PTR(head);
>   			head = GEN8_CSB_READ_PTR(head);

There's a GEM_TRACE a line below, outside the if, which also does a readl, so that read would run without forcewake whenever fw is still false.

Then there's a writel further down, under "if (head != execlists->csb_head)", which is likewise not covered by the new forcewake_get.
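
One way to cover every access would be to route each mmio path through a small helper that takes forcewake on first use. Below is a minimal user-space sketch of that idea, not driver code: every mock_* name, the lazy_fw() helper, and the simplified pointer layout (write pointer in bits 0-2, read pointer in bits 8-10) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the engine state; not real i915 symbols. */
struct mock_engine {
	uint32_t status_ptr; /* plays the role of RING_CONTEXT_STATUS_PTR */
	int csb_head;        /* -1 means "following a reset" */
	int fw_count;        /* forcewake references currently held */
	int fw_gets;         /* number of forcewake acquisitions so far */
	int unsafe_mmio;     /* mmio accesses made without forcewake held */
};

static void mock_fw_get(struct mock_engine *e)
{
	e->fw_count++;
	e->fw_gets++;
}

static void mock_fw_put(struct mock_engine *e)
{
	assert(e->fw_count > 0); /* every put must pair with a get */
	e->fw_count--;
}

/* Take forcewake only on first use and remember that in *fw. */
static void lazy_fw(struct mock_engine *e, bool *fw)
{
	if (!*fw) {
		mock_fw_get(e);
		*fw = true;
	}
}

static uint32_t mock_readl(struct mock_engine *e)
{
	if (!e->fw_count)
		e->unsafe_mmio++; /* record a read done while not awake */
	return e->status_ptr;
}

static void mock_writel(struct mock_engine *e, uint32_t val)
{
	if (!e->fw_count)
		e->unsafe_mmio++; /* record a write done while not awake */
	e->status_ptr = val;
}

/* hwsp_tail stands in for the write pointer read from the HWSP (no mmio). */
static void mock_tasklet(struct mock_engine *e, int hwsp_tail)
{
	bool fw = false;
	uint32_t ptr;
	int head, tail;

	if (e->csb_head == -1) { /* following a reset: must use mmio */
		lazy_fw(e, &fw);
		ptr = mock_readl(e);
		tail = ptr & 0x7;
		head = (ptr >> 8) & 0x7;
	} else {
		head = e->csb_head;
		tail = hwsp_tail;
	}

	head = tail; /* pretend all pending CSB events were consumed */

	if (head != e->csb_head) { /* the write-back path is mmio too */
		lazy_fw(e, &fw);
		mock_writel(e, ((uint32_t)head << 8) | (uint32_t)tail);
		e->csb_head = head;
	}

	if (fw)
		mock_fw_put(e);
}
```

In this sketch a call that finds no new events never touches forcewake, while the post-reset read and the read-pointer write-back both acquire it before touching the register. The trace readl would need the same treatment, or would have to be dropped from the no-forcewake path.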

> @@ -943,7 +948,8 @@ static void execlists_submission_tasklet(unsigned long data)
>   	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
>   		execlists_dequeue(engine);
>   
> -	intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
> +	if (fw)
> +		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
>   }
>   
>   static void insert_request(struct intel_engine_cs *engine,
> 

I had a similar patch that also made some other tweaks, some of which I
think you have recently sent as well. So I support this in principle.

Regards,

Tvrtko


