[Intel-gfx] [PATCH v5 2/2] drm/i915: Acquire PUNIT->PMIC bus for intel_uncore_forcewake_reset()

Ingo Molnar mingo at kernel.org
Tue Oct 31 11:54:02 UTC 2017


* Daniel Vetter <daniel at ffwll.ch> wrote:

> On Tue, Oct 31, 2017 at 10:50:06AM +0100, Ingo Molnar wrote:
> > 
> > * Hans de Goede <j.w.r.degoede at gmail.com> wrote:
> > 
> > > intel_uncore_forcewake_reset() does forcewake puts and gets as such
> > > we need to make sure that no-one tries to access the PUNIT->PMIC bus
> > > (on systems where this bus is shared) while it runs, otherwise bad
> > > things happen.
> > > 
> > > Normally this is taken care of by the i915_pmic_bus_access_notifier()
> > > which does an intel_uncore_forcewake_get(FORCEWAKE_ALL) when some other
> > > driver tries to access the PMIC bus, so that later forcewake gets are
> > > no-ops (for the duration of the bus access).
> > > 
> > > But intel_uncore_forcewake_reset gets called in 3 cases:
> > > 1) Before registering the pmic_bus_access_notifier
> > > 2) After unregistering the pmic_bus_access_notifier
> > > 3) To reset forcewake state on a GPU reset
> > > 
> > > In all 3 cases the i915_pmic_bus_access_notifier() protection is
> > > insufficient.
> > > 
> > > This commit fixes this race by calling iosf_mbi_punit_acquire() before
> > > calling intel_uncore_forcewake_reset(). In the case where it is called
> > > directly after unregistering the pmic_bus_access_notifier, we need to
> > > hold the punit-lock over both calls to avoid a race where
> > > intel_uncore_fw_release_timer() may execute between the 2 calls.
> > > 
> > > Signed-off-by: Hans de Goede <hdegoede at redhat.com>
> > > Reviewed-by: Imre Deak <imre.deak at intel.com>
> > > ---
> > > Changes in v2:
> > > -Rebase on current (July 6th 2017) drm-next
> > > 
> > > Changes in v3:
> > > -Keep punit acquired / locked over the unregister + forcewake_reset
> > >  call combo to avoid a race hitting between the 2 calls
> > > -This requires modifying iosf_mbi_unregister_pmic_bus_access_notifier
> > >  to not take the lock itself, since we are the only users this is done
> > >  in this same commit
> > > 
> > > Changes in v4:
> > > -Fix missing rename in doc-comment
> > > -Add and use iosf_mbi_assert_punit_acquired() helper
> > > -Add missing acquire surrounding intel_uncore_forcewake_reset() inside
> > >  intel_uncore_check_forcewake_domains()
> > > -Add Imre's Reviewed-by
> > > 
> > > Changes in v5:
> > > -Separate out arch/x86 iosf_mbi changes into a separate patch
> > > ---
> > >  drivers/gpu/drm/i915/intel_uncore.c           | 17 +++++++++++++----
> > >  drivers/gpu/drm/i915/selftests/intel_uncore.c |  3 +++
> > >  2 files changed, 16 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > > index 8c2ce81f01c2..0da81faf3981 100644
> > > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > > @@ -229,6 +229,7 @@ intel_uncore_fw_release_timer(struct hrtimer *timer)
> > >  	return HRTIMER_NORESTART;
> > >  }
> > >  
> > > +/* Note callers must have acquired the PUNIT->PMIC bus, before calling this. */
> > >  static void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv,
> > >  					 bool restore)
> > >  {
> > > @@ -237,6 +238,8 @@ static void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv,
> > >  	int retry_count = 100;
> > >  	enum forcewake_domains fw, active_domains;
> > >  
> > > +	iosf_mbi_assert_punit_acquired();
> > > +
> > >  	/* Hold uncore.lock across reset to prevent any register access
> > >  	 * with forcewake not set correctly. Wait until all pending
> > >  	 * timers are run before holding.
> > > @@ -416,14 +419,18 @@ static void __intel_uncore_early_sanitize(struct drm_i915_private *dev_priv,
> > >  				   GT_FIFO_CTL_RC6_POLICY_STALL);
> > >  	}
> > >  
> > > +	iosf_mbi_punit_acquire();
> > >  	intel_uncore_forcewake_reset(dev_priv, restore_forcewake);
> > > +	iosf_mbi_punit_release();
> > >  }
> > >  
> > >  void intel_uncore_suspend(struct drm_i915_private *dev_priv)
> > >  {
> > > -	iosf_mbi_unregister_pmic_bus_access_notifier(
> > > +	iosf_mbi_punit_acquire();
> > > +	iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(
> > >  		&dev_priv->uncore.pmic_bus_access_nb);
> > >  	intel_uncore_forcewake_reset(dev_priv, false);
> > > +	iosf_mbi_punit_release();
> > >  }
> > >  
> > >  void intel_uncore_resume_early(struct drm_i915_private *dev_priv)
> > > @@ -1315,12 +1322,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
> > >  
> > >  void intel_uncore_fini(struct drm_i915_private *dev_priv)
> > >  {
> > > -	iosf_mbi_unregister_pmic_bus_access_notifier(
> > > -		&dev_priv->uncore.pmic_bus_access_nb);
> > > -
> > >  	/* Paranoia: make sure we have disabled everything before we exit. */
> > >  	intel_uncore_sanitize(dev_priv);
> > > +
> > > +	iosf_mbi_punit_acquire();
> > > +	iosf_mbi_unregister_pmic_bus_access_notifier_unlocked(
> > > +		&dev_priv->uncore.pmic_bus_access_nb);
> > >  	intel_uncore_forcewake_reset(dev_priv, false);
> > > +	iosf_mbi_punit_release();
> > >  }
> > >  
> > >  static const struct reg_whitelist {
> > > diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> > > index 3cac22eb47ce..733d87fe7737 100644
> > > --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
> > > +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> > > @@ -148,7 +148,10 @@ static int intel_uncore_check_forcewake_domains(struct drm_i915_private *dev_pri
> > >  	for_each_set_bit(offset, valid, FW_RANGE) {
> > >  		i915_reg_t reg = { offset };
> > >  
> > > +		iosf_mbi_punit_acquire();
> > >  		intel_uncore_forcewake_reset(dev_priv, false);
> > > +		iosf_mbi_punit_release();
> > > +
> > >  		check_for_unclaimed_mmio(dev_priv);
> > >  
> > >  		(void)I915_READ(reg);
> > 
> > This patch looks like one massive layering violation. Why does the GPU code muck 
> > with the uncore hardware?
> 
> Because the hw is an even worse layering violation. There's way too much
> "magic" stuff going on in the background, which then sometimes doesn't
> work and we end up implementing hacks in drivers to paper over it.

Ok, fair enough I guess:

  Acked-by: Ingo Molnar <mingo at kernel.org>

Thanks,

	Ingo


More information about the dri-devel mailing list