[Intel-gfx] [PATCH 4/4] drm/i915: Reduce frequency of unspecific HSW reg debugging

Fri Sep 4 04:45:48 PDT 2015

Daniel Vetter <daniel at ffwll.ch> writes:

> On Fri, Sep 04, 2015 at 11:40:26AM +0300, Mika Kuoppala wrote:
>> Daniel Vetter <daniel at ffwll.ch> writes:
>> 
>> > On Thu, Sep 03, 2015 at 04:51:45PM -0300, Paulo Zanoni wrote:
>> >> From: Chris Wilson <chris at chris-wilson.co.uk>
>> >> 
>> >> Delay the expensive read on the FPGA_DBG register from once per mmio to
>> >> once per forcewake section when we are doing the general wellbeing
>> >> check rather than the targeted error detection. This reduces the
>> >> overhead of the debug facility (for example when submitting execlists)
>> >> to almost zero whilst keeping the debug checks around.
>> >> 
>> >> v2: Enable one-shot mmio debugging from the interrupt check as well as a
>> >>     safeguard to catch invalid display writes from outside the powerwell.
>> >> v3 (from Paulo): rebase since gen9 addition and intel_uncore_check_errors
>> >>     removal
>> >> 
>> >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> >> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>> >> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
>> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
>> >
>> > I'm unclear how this interacts (or how it should interact) with patch 2:
>> > Forcewake is mostly for GT registers, but patch 2 also tries to optimize
>> > forcewake for GT registers. Do we really need both?
>> 
>> Assuming the hardware detects access to unpowered domains and
>> to unregistered ranges by setting this bit, I would say that patch 2
>> is not needed. One could argue that patch 2 is somewhat harmful, as
>> the current register access pattern affects the detection.
>> 
>> Also the commit message in patch 2 is not valid wrt the code.
>> 
>> With skl, the debug bit seems to decay with time, instead of being
>> sticky. So in there we could argue that in patch 4/4, the reading
>> should be done before (and after) the forcewake scope.
>
> Do we know where the bits decay? Could it be that the firmware (dmc) does
> something with it, or maybe it gets reset when we change display power
> wells?

Now that I am actually trying to measure the decay time, I can't reproduce
the same behaviour anymore. The bit is now sticky, up until display
power off clears the register without an explicit clearing write.

Now I just wonder why I saw it decay in less than a millisecond a
few weeks back. I am willing to blame my imperfect test setup for this:
something probably just cleared it behind my back.

-Mika

>
>> Paulo, have you tested whether this bit detects access to an
>> unpowered domain with hsw/bdw?
>
> Yup, that's the main reason we have this, it's great for catching
> rpm/power wells bugs.
> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
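
The trade-off described in the quoted commit message (checking the debug
register once per forcewake section instead of once per mmio access) can be
illustrated with a small simulation. This is only a sketch under stated
assumptions, not the i915 code: struct sim_dev and every function below are
hypothetical names, the "register" is a plain boolean, and the bit is assumed
to be sticky until explicitly cleared, as reported in the reply above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated device; none of this is the real i915 API. */
struct sim_dev {
    bool powered;          /* is the target register domain powered? */
    bool dbg_sticky;       /* simulated sticky "unclaimed access" bit */
    unsigned dbg_reads;    /* how often the expensive debug read was done */
    unsigned errors_seen;  /* how many times the bit was found set */
    unsigned fw_count;     /* forcewake reference count */
};

/* Raw write: touching an unpowered domain latches the sticky debug bit. */
void sim_write(struct sim_dev *d, uint32_t reg, uint32_t val)
{
    (void)reg;
    (void)val;
    if (!d->powered)
        d->dbg_sticky = true;
}

/* The expensive part: read the debug bit and clear it if it was set. */
bool sim_check_and_clear(struct sim_dev *d)
{
    d->dbg_reads++;
    if (!d->dbg_sticky)
        return false;
    d->dbg_sticky = false;
    d->errors_seen++;
    return true;
}

/* Targeted detection: check before and after every single access. */
void write_checked_per_access(struct sim_dev *d, uint32_t reg, uint32_t val)
{
    sim_check_and_clear(d);
    sim_write(d, reg, val);
    sim_check_and_clear(d);
}

/* General wellbeing check: take a reference, do raw writes, and check
 * only once when the last reference of the section is dropped. */
void fw_get(struct sim_dev *d)
{
    d->fw_count++;
}

void fw_put(struct sim_dev *d)
{
    assert(d->fw_count > 0);
    if (--d->fw_count == 0)
        sim_check_and_clear(d);
}
```

With ten writes to an unpowered domain, the per-access variant performs
twenty debug reads and attributes an error to each write, while the
per-section variant performs a single debug read and only reports that
something in the section went wrong. This relies on the bit staying set
until the end of the section; if it could decay or be cleared behind the
driver's back, a read before the section would also be needed.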