[Intel-gfx] [PATCH 1/2] drm/i915: Stop using DC states if firmware disagrees on the state
Chris Wilson
chris at chris-wilson.co.uk
Wed Jan 20 02:17:27 PST 2016
On Wed, Jan 20, 2016 at 11:49:51AM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
> > On Tue, Jan 19, 2016 at 09:50:08PM +0200, Mika Kuoppala wrote:
> >> Sometimes we get dmesg warnings claiming that DC6 was already
> >> enabled prior to our enabling. Investigations using readback of
> >> the written dc state value indicate that sometimes when we disable
> >> the dc6, the write doesn't stick, or it does but for a very short time.
> >>
> >> This leads to state keeping confusion between firmware and driver,
> >> driver thinking that dc6 is off and proceeds with modesets
> >> while firmware still keeps/thinks dc6 is on. Evidence from logs
> >> suggests that the dc6 is still on as flips start to timeout,
> >> leading to eventual gpu hang.
> >>
> >> Further complication is that to recover the hang, we reset while
> >> the DC6 is enabled. With this state on the reinit of the gpu side
> >> won't come out clean and rings rehang right after the init.
> >>
> >> Harden the detection of this situation and immediately disable
> >> DC6 if we disagree on state with the firmware. This should make it
> >> less probable for driver to do end up in a wrong conclusion and
> >> let things like modeset and gpu reset proceed with dc6 still
> >> enabled.
> >>
> >> References: https://bugs.freedesktop.org/show_bug.cgi?id=93768
> >> Suggested-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Patrik Jakobsson <patrik.jakobsson at linux.intel.com>
> >> Cc: Imre Deak <imre.deak at intel.com>
> >> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
> >> ---
> >> drivers/gpu/drm/i915/intel_runtime_pm.c | 31 +++++++++++++++++++++++++++----
> >> 1 file changed, 27 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> >> index bbca527184d0..5a21453e38e1 100644
> >> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> >> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> >> @@ -470,9 +470,23 @@ static void gen9_set_dc_state_debugmask_memory_up(
> >> }
> >> }
> >>
> >> +static int gen9_dc_state_verified(struct drm_i915_private *dev_priv,
> >> + u32 mask, u32 state)
> >> +{
> >> + int i;
> >> +
> >> + /* Observe the dc state with handful of reads */
> >> + for (i = 0; i < 3; i++) {
> >> + if (wait_for((I915_READ(DC_STATE_EN) & mask) == state, 100))
> >> + return false;
> >> + }
> >
> > Why two loops of reads?
> >
>
> I once saw the write stick but for a very short time, so with one wait_for we
> might miss it.
>
> > Since this is equivalent to
> > return wait_for((I915_READ(DC_STATE_EN) & mask) == state, 300) == 0;
> >
> > In this case
> >
> > WRITE(DC_STATE_EN, val & ~mask | state);
> > if (wait_for((READ(DC_STATE_END) & mask) = state, 300) {
> > /* oh noes */
> > }
> >
> > is more obvious (since we verify with a read of the same register, not
> > some magic ack sequence).
>
> I could open code the 2 checks here, would be more obvious yes.
>
> Btw what led you to think that the state doesn't stick? Looking
> at the code for correctness and then deducting that if that warn
> happens, there is no other explanation?
Yes. It has also been used in the past as a form of acknowledging state
changes, though usually it is a separate bit (i.e. a request bit and a
current state bit).
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list