[Intel-gfx] [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
Imre Deak
imre.deak at intel.com
Thu Nov 30 11:19:55 UTC 2017
On Thu, Nov 30, 2017 at 09:45:25AM +0000, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
> > From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >
> > It seems that the DMC likes to transition between the DC states a lot when
> > there are no connected displays (no active power domains) during command
> > submission.
> >
> > This activity on DC states has a negative impact on the performance of the
> > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > eight.
> >
> > Work around it by introducing a new power domain named,
> > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > held for the duration of command submission activity.
> >
> > v2:
> > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > * Protect macro body with braces. (Jani Nikula)
> >
> > v3:
> > * Add dedicated power domain for clarity. (Chris, Imre)
> > * Commit message and comment text updates.
> > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > firmware release.
> >
> > v4:
> > * Power domain should be inner to device runtime pm. (Chris)
> > * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> > * Handle async DMC loading by moving the GT_IRQ power domain logic into
> > intel_runtime_pm. (Daniel, Chris)
> > * Include small core GEN9 as well. (Imre)
> >
> > v5
> > * Special handling for async DMC load is not needed since on failure the
> > power domain reference is kept permanently taken. (Imre)
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > Testcase: igt/gem_exec_nop/headless
> > Cc: Imre Deak <imre.deak at intel.com>
> > Acked-by: Chris Wilson <chris at chris-wilson.co.uk> (v2)
> > Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin at intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 4 ++++
> > drivers/gpu/drm/i915/i915_gem.c | 3 +++
> > drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> > drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
> > 4 files changed, 35 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index bddd65839f60..59cf11dd5d3b 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > POWER_DOMAIN_AUX_D,
> > POWER_DOMAIN_GMBUS,
> > POWER_DOMAIN_MODESET,
> > + POWER_DOMAIN_GT_IRQ,
> > POWER_DOMAIN_INIT,
> >
> > POWER_DOMAIN_NUM,
> > @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
> > #define GT_FREQUENCY_MULTIPLIER 50
> > #define GEN9_FREQ_SCALER 3
> >
> > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
For all other platforms the GT_IRQ domain won't be mapped making
display_power_get/put on those just a domain ref inc/dec, otherwise a
nop.
> > +
> > #include "i915_trace.h"
> >
> > static inline bool intel_vtd_active(void)
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 354b0546a191..c6067cba1dca 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >
> > if (INTEL_GEN(dev_priv) >= 6)
> > gen6_rps_idle(dev_priv);
> > +
> > + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> > +
> > intel_runtime_pm_put(dev_priv);
> > out_unlock:
> > mutex_unlock(&dev_priv->drm.struct_mutex);
> > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > index a90bdd26571f..c28a4ceb016d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> > GEM_BUG_ON(!i915->gt.active_requests);
> >
> > intel_runtime_pm_get_noresume(i915);
> > +
> > + /*
> > + * It seems that the DMC likes to transition between the DC states a lot
> > + * when there are no connected displays (no active power domains) during
> > + * command submission.
> > + *
> > + * This activity has negative impact on the performance of the chip with
> > + * huge latencies observed in the interrupt handler and elsewhere.
> > + *
> > + * Work around it by grabbing a GT IRQ power domain whilst there is any
> > + * GT activity, preventing any DC state transitions.
> > + */
> > + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> > +
> > i915->gt.awake = true;
> >
> > intel_enable_gt_powersave(i915);
> > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > index 8315499452dc..48ad0828ace7 100644
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> > return "INIT";
> > case POWER_DOMAIN_MODESET:
> > return "MODESET";
> > + case POWER_DOMAIN_GT_IRQ:
> > + return "GT_IRQ";
> > default:
> > MISSING_CASE(domain);
> > return "?";
> > @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
> > {
> > struct i915_power_domains *power_domains = &dev_priv->power_domains;
> >
> > + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> > + return;
>
> I was hoping that in the magic of the powerdomains, this would just
> evaporate (or at least find its own place within the powerwell enabling).
Yep, that would be "normally" the case. But since we have only a single
set of domains->powerwells mapping for all GEN9_BC we can't isolate SKL
that way w/o duplicating the mapping for SKL. Although all GEN9_BC are
the same from power well handling POV for some reason there are still
separate DMC firmwares for KBL/CFL and SKL. The corruption problem is
only fixed in the KBL/CFL firmware the SKL one still to be added to
linux-firmware.git .. At that point we can also remove these checks.
>
> If Imre is happy with plonking this here,
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> Or at least good enough for now and can be improved upon?
Yes, looks ok and can adjust for SKL later.
> -Chris
More information about the Intel-gfx
mailing list