[Intel-gfx] [PATCH 4/4] drm/i915: Reduce frequency of unspecific HSW reg debugging

Fri Sep 4 05:18:14 PDT 2015

Mika Kuoppala <mika.kuoppala at linux.intel.com> writes:

> Daniel Vetter <daniel at ffwll.ch> writes:
>
>> On Fri, Sep 04, 2015 at 11:40:26AM +0300, Mika Kuoppala wrote:
>>> Daniel Vetter <daniel at ffwll.ch> writes:
>>> 
>>> > On Thu, Sep 03, 2015 at 04:51:45PM -0300, Paulo Zanoni wrote:
>>> >> From: Chris Wilson <chris at chris-wilson.co.uk>
>>> >> 
>>> >> Delay the expensive read on the FPGA_DBG register from once per mmio to
>>> >> once per forcewake section when we are doing the general wellbeing
>>> >> check rather than the targetted error detection. This almost reduces
>>> >> the overhead of the debug facility (for example when submitting execlists)
>>> >> to zero whilst keeping the debug checks around.
>>> >> 
>>> >> v2: Enable one-shot mmio debugging from the interrupt check as well as a
>>> >>     safeguard to catch invalid display writes from outside the powerwell.
>>> >> v3 (from Paulo): rebase since gen9 addition and intel_uncore_check_errors
>>> >>     removal
>>> >> 
>>> >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> >> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>>> >> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
>>> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com>
>>> >
>>> > I'm unclear how this interacts (or how it sould interact) with patch 2:
>>> > Forcwake is mostly for GT registers, but patch 2 also tries to optimize
>>> > forcwake for GT registers. Do we really need both?
>>> 
>>> Assuming the hardware detects access to unpowered domains and
>>> to unregistered ranges by setting this bit, I would say that patch 2
>>> is not needed. One could argue that patch 2 is somewhat harmful as
>>> current register access pattern affects the detection.
>>> 
>>> Also the commit message in patch 2 is not valid wrt the code.
>>> 
>>> With skl, the debug bit seems to decay with time, instead of being
>>> sticky. So in there we could argue that in patch 4/4, the reading
>>> should be done before (and after) the forcewake scope.
>>
>> Do we know where the bits decay? Could it be that the firmware (dmc) does
>> something with it, or maybe it gets reset when we change display power
>> wells?
>
> Now when trying to actually measure the decay time, I can't reproduce
> the same behaviour anymore. Now it is sticky. Up until the display
> power off clears the register without explicit clearing write.
>
> Now I just wonder why I saw decay in just less than millisecond,
> few weeks back. I am willing to blame my imperfect test setup on this,
> and something just cleared it behind my back.
>

Well, apparently this is in display power well, like Daniel
noted.  Which means that if display is off, the write won't
stick. (but stays in some pci write buffer for 0.5usecs?
until it vanishes, thus the decay)

Test setup was otherwise water proof, but if one does
skl debugging from console and bdw debugging through
ssh, the display power well states are quite different...

<blinks>

Perks of this job is that you get to laugh a alot,
to yourself :)

-Mika

> -Mika
>
>
>
>>
>>> Paulo, have you tried if this bit detects access to unpowered
>>> domain with hsw/bdw?
>>
>> Yup, that's the main reason we have this, it's great for catching
>> rpm/power wells bugs.
>> -Daniel
>> -- 
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx