[Intel-gfx] [PATCH 1/2] drm/i915: Stop using DC states if firmware disagrees on the state

Mika Kuoppala mika.kuoppala at linux.intel.com
Wed Jan 20 01:49:51 PST 2016


Chris Wilson <chris at chris-wilson.co.uk> writes:

> On Tue, Jan 19, 2016 at 09:50:08PM +0200, Mika Kuoppala wrote:
>> Sometimes we get dmesg warnings claiming that DC6 was already
>> enabled prior to our enabling. Investigations using readback of
>> the written dc state value indicate that sometimes when we disable
>> the dc6, the write doesn't stick, or it does but for a very short time.
>> 
>> This leads to state keeping confusion between firmware and driver,
>> driver thinking that dc6 is off and proceeds with modesets
>> while firmware still keeps/thinks dc6 is on. Evidence from logs
>> suggests that the dc6 is still on as flips start to timeout,
>> leading to eventual gpu hang.
>> 
>> Further complication is that to recover the hang, we reset while
>> the DC6 is enabled. With this state on the reinit of the gpu side
>> won't come out clean and rings rehang right after the init.
>> 
>> Harden the detection of this situation and immediately disable
>> DC6 if we disagree on state with the firmware. This should make it
>> less probable for driver to do end up in a wrong conclusion and
>> let things like modeset and gpu reset proceed with dc6 still
>> enabled.
>> 
>> References: https://bugs.freedesktop.org/show_bug.cgi?id=93768
>> Suggested-by: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Patrik Jakobsson <patrik.jakobsson at linux.intel.com>
>> Cc: Imre Deak <imre.deak at intel.com>
>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>> ---
>>  drivers/gpu/drm/i915/intel_runtime_pm.c | 31 +++++++++++++++++++++++++++----
>>  1 file changed, 27 insertions(+), 4 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> index bbca527184d0..5a21453e38e1 100644
>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> @@ -470,9 +470,23 @@ static void gen9_set_dc_state_debugmask_memory_up(
>>  	}
>>  }
>>  
>> +static int gen9_dc_state_verified(struct drm_i915_private *dev_priv,
>> +				  u32 mask, u32 state)
>> +{
>> +	int i;
>> +
>> +	/* Observe the dc state with handful of reads */
>> +	for (i = 0; i < 3; i++) {
>> +		if (wait_for((I915_READ(DC_STATE_EN) & mask) == state, 100))
>> +			return false;
>> +	}
>
> Why two loops of reads?
>

I once saw the write stick but for a very short time, so with one wait_for we
might miss it.

> Since this is equivalent to
> 	return wait_for((I915_READ(DC_STATE_EN) & mask) == state, 300) == 0;
>
> In this case
>
> WRITE(DC_STATE_EN, val & ~mask | state);
> if (wait_for((READ(DC_STATE_END) & mask) = state, 300) {
> 	/* oh noes */
> }
>
> is more obvious (since we verify with a read of the same register, not
> some magic ack sequence).

I could open code the 2 checks here, would be more obvious yes.

Btw what led you to think that the state doesn't stick? Looking
at the code for correctness and then deducting that if that warn
happens, there is no other explanation?

This is all duct tape until we understand why this
interface doesn't work as advertized.

-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list