[Intel-gfx] [PATCH] i915: pm: Be less agressive with clockfreq changes on Bay Trail

Hans de Goede hdegoede at redhat.com
Fri Nov 10 14:59:24 UTC 2017


Hi,

On 10-11-17 15:56, Chris Wilson wrote:
> Quoting Hans de Goede (2017-11-09 18:51:01)
>> Bay Trail devices are known to hang when changing the frequency often,
>> this is discussed in great length in:
>> https://bugzilla.kernel.org/show_bug.cgi?id=109051
>>
>> Commit 6067a27d1f01 ("drm/i915: Avoid tweaking evaluation thresholds
>> on Baytrail v3") is an attempt to workaround this. Several users in
>> bko109051 report that an earlier version of this patch, v1:
>> https://bugzilla.kernel.org/attachment.cgi?id=251471
>>
>> Works better for them and they still see hangs with the merged v3.
>>
>> Comparing the 2 versions shows that they are indeed not equivalent,
>> v1 not only skips writing the GEN6_RP* registers from valleyview_set_rps,
>> as v3 does. It also contained these modifications to i915_irq.c:
>>
>>       if (pm_iir & GEN6_PM_RP_DOWN_EI_EXPIRED) {
>>           if (!vlv_c0_above(dev_priv,
>>                     &dev_priv->rps.down_ei, &now,
>> -                  dev_priv->rps.down_threshold))
>> +                  VLV_RP_DOWN_EI_THRESHOLD))
>>               events |= GEN6_PM_RP_DOWN_THRESHOLD;
>>           dev_priv->rps.down_ei = now;
>>       }
>>
>>       if (pm_iir & GEN6_PM_RP_UP_EI_EXPIRED) {
>>           if (vlv_c0_above(dev_priv,
>>                    &dev_priv->rps.up_ei, &now,
>> -                 dev_priv->rps.up_threshold))
>> +                 VLV_RP_UP_EI_THRESHOLD))
>>               events |= GEN6_PM_RP_UP_THRESHOLD;
>>           dev_priv->rps.up_ei = now;
>>       }
>>
>> Which use less aggressive up/down thresholds, which results in less
>> GEN6_PM_RP_*_THRESHOLD events and thus in less calls to intel_set_rps() ->
>> valleyview_set_rps() -> vlv_punit_write(PUNIT_REG_GPU_FREQ_REQ).
>> With the last call being the likely cause of the hang.
>>
>> This commit hardcodes the threshold_up and _down values for Bay Trail to
>> less aggressive values, reducing the amount of clock frequency changes,
>> thus avoiding the hangs some people are still seeing with the merged fix.
>>
>> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=109051
>> Signed-off-by: Hans de Goede <hdegoede at redhat.com>
>> ---
>>   drivers/gpu/drm/i915/i915_reg.h | 3 +++
>>   drivers/gpu/drm/i915/intel_pm.c | 5 ++++-
>>   2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 68a58cce6ab1..2561af075ebb 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1355,6 +1355,9 @@ enum i915_power_well_id {
>>   #define        VLV_BIAS_CPU_125_SOC_875 (6 << 2)
>>   #define        CHV_BIAS_CPU_50_SOC_50 (3 << 2)
>>   
>> +#define VLV_RP_UP_EI_THRESHOLD                 90
>> +#define VLV_RP_DOWN_EI_THRESHOLD               70
>> +
>>   /* vlv2 north clock has */
>>   #define CCK_FUSE_REG                           0x8
>>   #define  CCK_FUSE_HPLL_FREQ_MASK               0x3
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 01966b89be14..177b6caa0a38 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -6096,8 +6096,11 @@ static void gen6_set_rps_thresholds(struct drm_i915_private *dev_priv, u8 val)
>>          /* When byt can survive without system hang with dynamic
>>           * sw freq adjustments, this restriction can be lifted.
>>           */
>> -       if (IS_VALLEYVIEW(dev_priv))
>> +       if (IS_VALLEYVIEW(dev_priv)) {
>> +               threshold_up = VLV_RP_UP_EI_THRESHOLD;
>> +               threshold_down = VLV_RP_DOWN_EI_THRESHOLD;
> 
> Basically you want a limit on the frequency of ... pcode access?
> As presented, you ultimately do not trust any write and the only
> solution is to disable all writes. No RPS whatsoever, run at max and
> hope rc6 works (maybe even decrease the rc6 threshold).
> 
> One of the ideas we floated was moving the pcode access to a worker and
> ratelimiting the updates.

Yes ratelimiting the amount of vlv_punit_write(PUNIT_REG_GPU_FREQ_REQ)
calls is also something which came to my mind as an alternative solution.

As I tried to explain in the commit message my main inspiration for this
patch was that people were reporting different success rates with v1
and v3 of the Commit 6067a27d1f01 ("drm/i915: Avoid tweaking evaluation thresholds
on Baytrail v3") patch.

Regards,

Hans



> -Chris
> 


More information about the Intel-gfx mailing list