[Intel-gfx] [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Nov 30 09:55:30 UTC 2017
On 30/11/2017 09:45, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> It seems that the DMC likes to transition between the DC states a lot when
>> there are no connected displays (no active power domains) during command
>> submission.
>>
>> This activity on DC states has a negative impact on the performance of the
>> chip with huge latencies observed in the interrupt handlers and elsewhere.
>> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
>> eight.
>>
>> Work around it by introducing a new power domain named,
>> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
>> held for the duration of command submission activity.
>>
>> v2:
>> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
>> * Protect macro body with braces. (Jani Nikula)
>>
>> v3:
>> * Add dedicated power domain for clarity. (Chris, Imre)
>> * Commit message and comment text updates.
>> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
>> firmware release.
>>
>> v4:
>> * Power domain should be inner to device runtime pm. (Chris)
>> * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
>> * Handle async DMC loading by moving the GT_IRQ power domain logic into
>> intel_runtime_pm. (Daniel, Chris)
>> * Include small core GEN9 as well. (Imre)
>>
>> v5
>> * Special handling for async DMC load is not needed since on failure the
>> power domain reference is kept permanently taken. (Imre)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
>> Testcase: igt/gem_exec_nop/headless
>> Cc: Imre Deak <imre.deak at intel.com>
>> Acked-by: Chris Wilson <chris at chris-wilson.co.uk> (v2)
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin at intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 4 ++++
>> drivers/gpu/drm/i915/i915_gem.c | 3 +++
>> drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
>> drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
>> 4 files changed, 35 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index bddd65839f60..59cf11dd5d3b 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
>> POWER_DOMAIN_AUX_D,
>> POWER_DOMAIN_GMBUS,
>> POWER_DOMAIN_MODESET,
>> + POWER_DOMAIN_GT_IRQ,
>> POWER_DOMAIN_INIT,
>>
>> POWER_DOMAIN_NUM,
>> @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
>> #define GT_FREQUENCY_MULTIPLIER 50
>> #define GEN9_FREQ_SCALER 3
>>
>> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
>> + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
>> +
>> #include "i915_trace.h"
>>
>> static inline bool intel_vtd_active(void)
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 354b0546a191..c6067cba1dca 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>
>> if (INTEL_GEN(dev_priv) >= 6)
>> gen6_rps_idle(dev_priv);
>> +
>> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
>> +
>> intel_runtime_pm_put(dev_priv);
>> out_unlock:
>> mutex_unlock(&dev_priv->drm.struct_mutex);
>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>> index a90bdd26571f..c28a4ceb016d 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>> @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
>> GEM_BUG_ON(!i915->gt.active_requests);
>>
>> intel_runtime_pm_get_noresume(i915);
>> +
>> + /*
>> + * It seems that the DMC likes to transition between the DC states a lot
>> + * when there are no connected displays (no active power domains) during
>> + * command submission.
>> + *
>> + * This activity has negative impact on the performance of the chip with
>> + * huge latencies observed in the interrupt handler and elsewhere.
>> + *
>> + * Work around it by grabbing a GT IRQ power domain whilst there is any
>> + * GT activity, preventing any DC state transitions.
>> + */
>> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
>> +
>> i915->gt.awake = true;
>>
>> intel_enable_gt_powersave(i915);
>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> index 8315499452dc..48ad0828ace7 100644
>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
>> return "INIT";
>> case POWER_DOMAIN_MODESET:
>> return "MODESET";
>> + case POWER_DOMAIN_GT_IRQ:
>> + return "GT_IRQ";
>> default:
>> MISSING_CASE(domain);
>> return "?";
>> @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
>> {
>> struct i915_power_domains *power_domains = &dev_priv->power_domains;
>>
>> + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
>> + return;
>
> I was hoping that in the magic of the powerdomains, this would just
> evaporate (or at least find its own place within the powerwell enabling).
>
> If Imre is happy with plonking this here,
> Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
>
> Or at least good enough for now and can be improved upon?
One of my earlier approaches was to remove the domain from the power
well bitmask when the DMC load fails, but I wasn't sure if the power
well arrays should (could?) really be const, and how ugly it would be to
expose some API to allow external callers to do that. Not sure, Imre
indeed to the rescue please. :)
Regards,
Tvrtko
More information about the Intel-gfx
mailing list