[Intel-gfx] [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded

Chris Wilson chris at chris-wilson.co.uk
Thu Nov 30 09:45:25 UTC 2017


Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> It seems that the DMC likes to transition between the DC states a lot when
> there are no connected displays (no active power domains) during command
> submission.
> 
> This activity on DC states has a negative impact on the performance of the
> chip with huge latencies observed in the interrupt handlers and elsewhere.
> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> eight.
> 
> Work around it by introducing a new power domain named,
> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> held for the duration of command submission activity.
> 
> v2:
>  * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
>  * Protect macro body with braces. (Jani Nikula)
> 
> v3:
>  * Add dedicated power domain for clarity. (Chris, Imre)
>  * Commit message and comment text updates.
>  * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
>    firmware release.
> 
> v4:
>  * Power domain should be inner to device runtime pm. (Chris)
>  * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
>  * Handle async DMC loading by moving the GT_IRQ power domain logic into
>    intel_runtime_pm. (Daniel, Chris)
>  * Include small core GEN9 as well. (Imre)
> 
> v5
>  * Special handling for async DMC load is not needed since on failure the
>    power domain reference is kept permanently taken. (Imre)
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak at intel.com>
> Acked-by: Chris Wilson <chris at chris-wilson.co.uk> (v2)
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin at intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  4 ++++
>  drivers/gpu/drm/i915/i915_gem.c         |  3 +++
>  drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
>  4 files changed, 35 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bddd65839f60..59cf11dd5d3b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
>         POWER_DOMAIN_AUX_D,
>         POWER_DOMAIN_GMBUS,
>         POWER_DOMAIN_MODESET,
> +       POWER_DOMAIN_GT_IRQ,
>         POWER_DOMAIN_INIT,
>  
>         POWER_DOMAIN_NUM,
> @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
>  #define GT_FREQUENCY_MULTIPLIER 50
>  #define GEN9_FREQ_SCALER 3
>  
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> +       (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
> +
>  #include "i915_trace.h"
>  
>  static inline bool intel_vtd_active(void)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 354b0546a191..c6067cba1dca 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  
>         if (INTEL_GEN(dev_priv) >= 6)
>                 gen6_rps_idle(dev_priv);
> +
> +       intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> +
>         intel_runtime_pm_put(dev_priv);
>  out_unlock:
>         mutex_unlock(&dev_priv->drm.struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index a90bdd26571f..c28a4ceb016d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
>         GEM_BUG_ON(!i915->gt.active_requests);
>  
>         intel_runtime_pm_get_noresume(i915);
> +
> +       /*
> +        * It seems that the DMC likes to transition between the DC states a lot
> +        * when there are no connected displays (no active power domains) during
> +        * command submission.
> +        *
> +        * This activity has negative impact on the performance of the chip with
> +        * huge latencies observed in the interrupt handler and elsewhere.
> +        *
> +        * Work around it by grabbing a GT IRQ power domain whilst there is any
> +        * GT activity, preventing any DC state transitions.
> +        */
> +       intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +
>         i915->gt.awake = true;
>  
>         intel_enable_gt_powersave(i915);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 8315499452dc..48ad0828ace7 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
>                 return "INIT";
>         case POWER_DOMAIN_MODESET:
>                 return "MODESET";
> +       case POWER_DOMAIN_GT_IRQ:
> +               return "GT_IRQ";
>         default:
>                 MISSING_CASE(domain);
>                 return "?";
> @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
>  {
>         struct i915_power_domains *power_domains = &dev_priv->power_domains;
>  
> +       if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> +               return;

I was hoping that in the magic of the powerdomains, this would just
evaporate (or at least find its own place within the powerwell enabling).

If Imre is happy with plonking this here,
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>

Or at least good enough for now and can be improved upon?
-Chris


More information about the Intel-gfx mailing list