[Intel-gfx] [PATCH] drm/i915: Only grab correct forcewake for the engine with execlists
Chris Wilson
chris at chris-wilson.co.uk
Wed Apr 6 15:12:34 UTC 2016
On Wed, Apr 06, 2016 at 03:49:55PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> Rather than blindly waking up all forcewake domains on command
> submission, we can teach each engine what is (or are) the correct
> one to take.
>
> On platforms with multiple forcewake domains like VLV, CHV, SKL
> and BXT, this has the potential of lowering the GPU and CPU power
> use and submission latency.
>
> To implement it, we extract the code which holds the built in
> knowledge on which forcewake domains are applicable for each
> registers from the MMIO accessors into separate macros, and
> implement two new functions named intel_reg_read_fw_domains and
> intel_reg_read_fw_domains.
>
> These enables callers, in this case the execlists engine setup
> code, to query which forcewake domains are relevant per engine
> on the currently running platform.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 6 +
> drivers/gpu/drm/i915/intel_lrc.c | 18 +-
> drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
> drivers/gpu/drm/i915/intel_uncore.c | 306 +++++++++++++++++++++-----------
> 4 files changed, 226 insertions(+), 105 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a1f78f275c55..311217e7f917 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -633,6 +633,12 @@ enum forcewake_domains {
> FORCEWAKE_MEDIA)
> };
>
> +enum forcewake_domains
> +intel_reg_read_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg);
> +
> +enum forcewake_domains
> +intel_reg_write_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg);
> +
> struct intel_uncore_funcs {
> void (*force_wake_get)(struct drm_i915_private *dev_priv,
> enum forcewake_domains domains);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index a1db6a02cf23..cac387f38cf6 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -418,6 +418,7 @@ static void execlists_submit_requests(struct drm_i915_gem_request *rq0,
> struct drm_i915_gem_request *rq1)
> {
> struct drm_i915_private *dev_priv = rq0->i915;
> + unsigned int fw_domains = rq0->engine->fw_domains_elsp;
So I was thinking this was overly specific and that no hw engineer would
ever split the block of ring registers into separate power domains, and
then I realised they already had with the shadow_regs.
The only other place where we could make semantic use of this is in
intel_ringbuffer.c:init_ring_common. We don't make use of the block fw
put/get anywhere else - the only other candidate I can think of right
now would be gen6 bsd legacy tail writes. That is a block of register
writes were marking the intent to stay awake through the entire sequence
would be worth it.
> +#define __gen6_reg_read_fw_domains(offset) \
> +({ \
> + enum forcewake_domains __fwd; \
> + if (NEEDS_FORCE_WAKE(offset)) \
> + __fwd = FORCEWAKE_RENDER; \
> + else \
> + __fwd = 0; \
> + __fwd; \
> +})
[snip]
This split out looks right, but it is probably worth doing as a seperate
step and checking for changes in objdump. Just to be paranoid that we
didn't make an unexpected modification.
> +enum forcewake_domains
> +intel_reg_read_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg)
> +{
> + if (intel_vgpu_active(dev_priv->dev))
> + return 0;
> +
> + switch (INTEL_INFO(dev_priv)->gen) {
> + case 9:
> + return __gen9_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> + case 8:
> + if (IS_CHERRYVIEW(dev_priv))
> + return __chv_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> + else
> + return __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> + case 7:
> + case 6:
> + if (IS_VALLEYVIEW(dev_priv))
> + return __vlv_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> + else
> + return __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg));
> + default:
> + /* It is a bug to think about forcewake on older platforms. */
Yes, but we might as well report them correctly as 0 and not throw a
warning. You are calling this an intel_() function, and not gen6_()
after all (that is inviting everybody to use it).
default:
> + MISSING_CASE(INTEL_INFO(dev_priv)->gen);
case 5: /* forcewake was introduced with gen6 */
case 4:
case 3:
case 2:
> + return 0;
A nice touch here would be to
WARN_ON(fw_domains & ~dev_priv->uncore.forcewake_domains);
R-b on the engine->fw_domains_*, a-b on the split, but let's try to use
the compiler to our advantage on that step.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list