[Intel-gfx] [PATCH 4/4] drm/i915: Reduce frequency of unspecific HSW reg debugging

Fri Sep 4 08:16:19 PDT 2015

On Fri, Sep 04, 2015 at 04:53:28PM +0200, Daniel Vetter wrote:
> On Fri, Sep 04, 2015 at 03:18:14PM +0300, Mika Kuoppala wrote:
> > Mika Kuoppala <mika.kuoppala at linux.intel.com> writes:
> > 
> > > Daniel Vetter <daniel at ffwll.ch> writes:
> > >
> > >> On Fri, Sep 04, 2015 at 11:40:26AM +0300, Mika Kuoppala wrote:
> > >>> Daniel Vetter <daniel at ffwll.ch> writes:
> > >>> 
> > >>> > On Thu, Sep 03, 2015 at 04:51:45PM -0300, Paulo Zanoni wrote:
> > >>> >> From: Chris Wilson <chris at chris-wilson.co.uk>
> > >>> >> 
> > >>> >> Delay the expensive read on the FPGA_DBG register from once per mmio to
> > >>> >> once per forcewake section when we are doing the general wellbeing
> > >>> >> check rather than the targetted error detection. This almost reduces
> > >>> >> the overhead of the debug facility (for example when submitting execlists)
> > >>> >> to zero whilst keeping the debug checks around.
> > >>> >> 
> > >>> >> v2: Enable one-shot mmio debugging from the interrupt check as well as a
> > >>> >>     safeguard to catch invalid display writes from outside the powerwell.
> > >>> >> v3 (from Paulo): rebase since gen9 addition and intel_uncore_check_errors
> > >>> >>     removal
> > >>> >> 
> > >>> >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > >>> >> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > >>> >> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> > >>> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > >>> >
> > >>> > I'm unclear how this interacts (or how it sould interact) with patch 2:
> > >>> > Forcwake is mostly for GT registers, but patch 2 also tries to optimize
> > >>> > forcwake for GT registers. Do we really need both?
> > >>> 
> > >>> Assuming the hardware detects access to unpowered domains and
> > >>> to unregistered ranges by setting this bit, I would say that patch 2
> > >>> is not needed. One could argue that patch 2 is somewhat harmful as
> > >>> current register access pattern affects the detection.
> > >>> 
> > >>> Also the commit message in patch 2 is not valid wrt the code.
> > >>> 
> > >>> With skl, the debug bit seems to decay with time, instead of being
> > >>> sticky. So in there we could argue that in patch 4/4, the reading
> > >>> should be done before (and after) the forcewake scope.
> > >>
> > >> Do we know where the bits decay? Could it be that the firmware (dmc) does
> > >> something with it, or maybe it gets reset when we change display power
> > >> wells?
> > >
> > > Now when trying to actually measure the decay time, I can't reproduce
> > > the same behaviour anymore. Now it is sticky. Up until the display
> > > power off clears the register without explicit clearing write.
> > >
> > > Now I just wonder why I saw decay in just less than millisecond,
> > > few weeks back. I am willing to blame my imperfect test setup on this,
> > > and something just cleared it behind my back.
> > >
> > 
> > Well, apparently this is in display power well, like Daniel
> > noted.  Which means that if display is off, the write won't
> > stick. (but stays in some pci write buffer for 0.5usecs?
> > until it vanishes, thus the decay)
> > 
> > Test setup was otherwise water proof, but if one does
> > skl debugging from console and bdw debugging through
> > ssh, the display power well states are quite different...
> 
> Hm, doesn't that mean that we can't do delayed checking since what we
> really want is spot unclaimed register writes when the power well is off?
> At least it was really useful to catch the oddball runtime pm management
> bug. So perhaps we indeed want the range-based filter and not this patch
> here?

Would it also make sense to enable/disable the checks from the power
well enable/disable hooks, so that we would minimize the overhead when
the power well is actually enabled?

-- 
Ville Syrjälä
Intel OTC