[Intel-gfx] [PATCH] drm/i915: Only grab correct forcewake for the engine with execlists

Chris Wilson chris at chris-wilson.co.uk
Wed Apr 6 15:12:34 UTC 2016


On Wed, Apr 06, 2016 at 03:49:55PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> Rather than blindly waking up all forcewake domains on command
> submission, we can teach each engine what is (or are) the correct
> one to take.
> 
> On platforms with multiple forcewake domains like VLV, CHV, SKL
> and BXT, this has the potential of lowering the GPU and CPU power
> use and submission latency.
> 
> To implement it, we extract the code which holds the built in
> knowledge on which forcewake domains are applicable for each
> registers from the MMIO accessors into separate macros, and
> implement two new functions named intel_reg_read_fw_domains and
> intel_reg_read_fw_domains.
> 
> These enables callers, in this case the execlists engine setup
> code, to query which forcewake domains are relevant per engine
> on the currently running platform.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |   6 +
>  drivers/gpu/drm/i915/intel_lrc.c        |  18 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
>  drivers/gpu/drm/i915/intel_uncore.c     | 306 +++++++++++++++++++++-----------
>  4 files changed, 226 insertions(+), 105 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a1f78f275c55..311217e7f917 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -633,6 +633,12 @@ enum forcewake_domains {
>  			 FORCEWAKE_MEDIA)
>  };
>  
> +enum forcewake_domains
> +intel_reg_read_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg);
> +
> +enum forcewake_domains
> +intel_reg_write_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg);
> +
>  struct intel_uncore_funcs {
>  	void (*force_wake_get)(struct drm_i915_private *dev_priv,
>  							enum forcewake_domains domains);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index a1db6a02cf23..cac387f38cf6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -418,6 +418,7 @@ static void execlists_submit_requests(struct drm_i915_gem_request *rq0,
>  				      struct drm_i915_gem_request *rq1)
>  {
>  	struct drm_i915_private *dev_priv = rq0->i915;
> +	unsigned int fw_domains = rq0->engine->fw_domains_elsp;

So I was thinking this was overly specific and that no hw engineer would
ever split the block of ring registers into separate power domains, and
then I realised they already had with the shadow_regs.

The only other place where we could make semantic use of this is in
intel_ringbuffer.c:init_ring_common. We don't make use of the block fw
put/get anywhere else - the only other candidate I can think of right
now would be gen6 bsd legacy tail writes. That is a block of register
writes were marking the intent to stay awake through the entire sequence
would be worth it.

> +#define __gen6_reg_read_fw_domains(offset) \
> +({ \
> +	enum forcewake_domains __fwd; \
> +	if (NEEDS_FORCE_WAKE(offset)) \
> +		__fwd = FORCEWAKE_RENDER; \
> +	else \
> +		__fwd = 0; \
> +	__fwd; \
> +})

[snip]
This split out looks right, but it is probably worth doing as a seperate
step and checking for changes in objdump. Just to be paranoid that we
didn't make an unexpected modification.

> +enum forcewake_domains
> +intel_reg_read_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg)
> +{
> +	if (intel_vgpu_active(dev_priv->dev))
> +		return 0;
> +
> +	switch (INTEL_INFO(dev_priv)->gen) {
> +	case 9:
> +		return __gen9_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> +	case 8:
> +		if (IS_CHERRYVIEW(dev_priv))
> +			return __chv_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> +		else
> +			return __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> +	case 7:
> +	case 6:
> +		if (IS_VALLEYVIEW(dev_priv))
> +			return __vlv_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> +		else
> +			return __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> +	default:
> +		/* It is a bug to think about forcewake on older platforms. */

Yes, but we might as well report them correctly as 0 and not throw a
warning. You are calling this an intel_() function, and not gen6_()
after all (that is inviting everybody to use it).

default:
> +		MISSING_CASE(INTEL_INFO(dev_priv)->gen);

case 5: /* forcewake was introduced with gen6 */
case 4:
case 3:
case 2:

> +		return 0;

A nice touch here would be to
WARN_ON(fw_domains & ~dev_priv->uncore.forcewake_domains);

R-b on the engine->fw_domains_*, a-b on the split, but let's try to use
the compiler to our advantage on that step.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list