[Intel-gfx] [PATCH 18/49] drm/i915: Reduce frequency of unspecific HSW reg debugging
Chris Wilson
chris at chris-wilson.co.uk
Fri Mar 27 09:12:16 PDT 2015
On Fri, Mar 27, 2015 at 12:34:05PM -0300, Paulo Zanoni wrote:
> 2015-03-27 8:01 GMT-03:00 Chris Wilson <chris at chris-wilson.co.uk>:
> > Delay the expensive read on the FPGA_DBG register from once per mmio to
> > once per forcewake section when we are doing the general wellbeing
> > check rather than the targetted error detection. This almost reduces
> > the overhead of the debug facility (for example when submitting execlists)
> > to zero whilst keeping the debug checks around.
> >
> > v2: Enable one-shot mmio debugging from the interrupt check as well as a
> > safeguard to catch invalid display writes from outside the powerwell.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > Cc: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_uncore.c | 56 ++++++++++++++++++++-----------------
> > 1 file changed, 30 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index ab5cc94588e1..0e32bbbcada8 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -149,6 +149,30 @@ fw_domains_put(struct drm_i915_private *dev_priv, enum forcewake_domains fw_doma
> > }
> >
> > static void
> > +hsw_unclaimed_reg_detect(struct drm_i915_private *dev_priv)
> > +{
> > + static bool mmio_debug_once = true;
> > +
> > + if (i915.mmio_debug || !mmio_debug_once)
> > + return;
> > +
> > + if (__raw_i915_read32(dev_priv, FPGA_DBG) & FPGA_DBG_RM_NOCLAIM) {
> > + DRM_DEBUG("Unclaimed register detected, "
> > + "enabling oneshot unclaimed register reporting. "
> > + "Please use i915.mmio_debug=N for more information.\n");
> > + __raw_i915_write32(dev_priv, FPGA_DBG, FPGA_DBG_RM_NOCLAIM);
> > + i915.mmio_debug = mmio_debug_once--;
> > + }
> > +}
> > +
> > +static void
> > +fw_domains_put_debug(struct drm_i915_private *dev_priv, enum forcewake_domains fw_domains)
> > +{
> > + hsw_unclaimed_reg_detect(dev_priv);
> > + fw_domains_put(dev_priv, fw_domains);
> > +}
>
> This means we won't check during the forcewake puts that are on the
> register read/write macros. Is this intentional?
Not really. But the check still catches any mistakes there even though
we know they are by safe.
> I tried checking the
> FW code calls, and it seems to me that we're not really going to run
> hsw_unclaimed_reg_detect very frequently anymore. I wonder if there's
> the risk of staying a long time without running it. But maybe I'm just
> wrong.
It gets run after every set of register writes (where set is defined as
activity on a single cpu within 1ms). It gets run before the powerwell
is disabled. Look at the profiles, you will see that hsw detect is still
called quite frequently. And by virtue it does not need to be run very
often to catch issues anyway.
> > + if (HAS_FPGA_DBG_UNCLAIMED(dev) &&
> > + dev_priv->uncore.funcs.force_wake_put == fw_domains_put)
>
> My fear here is that simple changes to the FW code by a future
> programmer could unintentionally kill the unclaimed register detection
> feature, and we probably wouldn't notice for a looong time. Why not
> just omit this fw_domains_put check, since it is true for all
> platforms where HAS_FPG_DBG_UNCLAIMED is also true? The side effect of
> calling fw_domains_put() when we shouldn't is probably more noticeable
> than having unclaimed register detection gone.
Pardon?
> > + dev_priv->uncore.funcs.force_wake_put = fw_domains_put_debug;
> > +
> > /* All future platforms are expected to require complex power gating */
> > WARN_ON(dev_priv->uncore.fw_domains == 0);
> > }
> > @@ -1411,11 +1420,6 @@ int intel_gpu_reset(struct drm_device *dev)
> >
> > void intel_uncore_check_errors(struct drm_device *dev)
> > {
> > - struct drm_i915_private *dev_priv = dev->dev_private;
> > -
> > - if (HAS_FPGA_DBG_UNCLAIMED(dev) &&
> > - (__raw_i915_read32(dev_priv, FPGA_DBG) & FPGA_DBG_RM_NOCLAIM)) {
> > - DRM_ERROR("Unclaimed register before interrupt\n");
> > - __raw_i915_write32(dev_priv, FPGA_DBG, FPGA_DBG_RM_NOCLAIM);
> > - }
> > + if (HAS_FPGA_DBG_UNCLAIMED(dev))
> > + hsw_unclaimed_reg_detect(to_i915(dev));
>
> This means we won't check for unclaimed registers during interrupts if
> i915.mmio_debug is being used, which is probably not what we want.
It's exactly what you want. It still does the debug check if you have
mmio_debug unset. If you have mmio_debug set, it means you are debugging
i915 register mmio, since that is all we can reliably debug.
> One of the things that worries me a little is that now we'll be
> running a mostly-display-related feature only when certain pieces of
> non-display-related code run. Instead of doing the checks at the
> forcewake puts, maybe we could tie the frequency of the checks to
> something in the display code, or just do the check every X seconds. I
> don't really know what would be ideal here, I'm just throwing the
> ideas. I'm also not blocking this patch, just pointing things that
> could maybe be improved.
Sure, all you would need to do is add the check to every rpm_put() if you
feel paranoid (it will be run before the powerwell is dropped by
design).
> Since it's impacting performance, perhaps we could even completely
> kill unclaimed register detection from the normal use case, hiding it
> behind i915.mmio_debug=1 (and maybe a kconfig option)? We would then
> instruct QA and developers to always have the option enabled. Just
> like QA needs to have lockdep enabled, we could ask it to have
> mmio_debug enabled all the time too.
Whilst I like the idea, having debug code running in the wild (at a
frequency high enough to catch bugs, but low enough not to be noticed)
is invaluable.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list