[Intel-gfx] [PATCH 29/34] drm/i915: Drop fake breadcrumb irq

Chris Wilson chris at chris-wilson.co.uk
Thu Jan 24 18:18:44 UTC 2019


Quoting Tvrtko Ursulin (2019-01-24 17:55:20)
> 
> On 21/01/2019 22:21, Chris Wilson wrote:
> > Missed breadcrumb detection is defunct due to the tight coupling with
> 
> How it is defunct.. oh because there is no irq_count any more... could 
> you describe the transition from irq_count to irq_fired and then to 
> nothing briefly?

We don't have an independent intel_wait to distinguish irq completions
from dma_fence_signals. Everytime we call dma_fence_signal() we think we
saw an interrupt, and we complete fences very often before we see
interrupts... And then our test completely fails to setup the machine to
detect the missed breadcrumb as the requests get completed by anything
but the missed breadcrumb timer.

> > dma_fence signaling and the myriad ways we may signal fences from
> > everywhere but from an interrupt, i.e. we frequently signal a fence
> > before we even see its interrupt. This means that even if we miss an
> > interrupt for a fence, it still is signaled before our breadcrumb
> > hangcheck fires, so simplify the breadcrumb hangchecking by moving it
> > into the GPU hangcheck and forgo fake interrupts.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_debugfs.c           |  93 -----------
> >   drivers/gpu/drm/i915/i915_gpu_error.c         |   2 -
> >   drivers/gpu/drm/i915/i915_gpu_error.h         |   5 -
> >   drivers/gpu/drm/i915/intel_breadcrumbs.c      | 147 +-----------------
> >   drivers/gpu/drm/i915/intel_hangcheck.c        |   2 +
> >   drivers/gpu/drm/i915/intel_ringbuffer.h       |   5 -
> >   .../gpu/drm/i915/selftests/igt_live_test.c    |   7 -
> >   7 files changed, 5 insertions(+), 256 deletions(-)
> 
> With this balance of insertions and deletions I cannot decide if this is 
> easy or hard to review.
> 
> IGT uses intel_detect_and_clear_missed_interrupts a lot. What is the 
> plan there?

They are defunct. They no longer detect anything useful from the
previous patch, this just makes it official. igt has been prepped for
the loss of the debugfs.

Without this patch we get false positives from i915_missed_interrupt
instead.

I've tried and failed to replace the detection in an acceptable manner,
without a separate irq completion tracker it seems hopeless.

> > diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
> > index 5662d6fed523..a219c796e56d 100644
> > --- a/drivers/gpu/drm/i915/intel_hangcheck.c
> > +++ b/drivers/gpu/drm/i915/intel_hangcheck.c
> > @@ -275,6 +275,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
> >       for_each_engine(engine, dev_priv, id) {
> >               struct hangcheck hc;
> >   
> > +             intel_engine_signal_breadcrumbs(engine);
> > +
> 
> Sounds fine. So only downside is detecting missed interrupts gets 
> slower. But in practice they don't happen often?

In practice, no missed interrupts. *touch wood*
That was why fixing gen5-gen7 beforehand was so important.

Having a signal here and in retire_work, means that in the worst case
everything updates once a second. Enough for users to be able to
complain. But more than likely every frame update will flush the earlier
requests anyway, hence why we couldn't detect missed breadcrumbs in igt
in the first place.
-Chris


More information about the Intel-gfx mailing list