[Intel-gfx] [PATCH] drm/i915: Reduce number of register access during IVB+ interrupt handling

Paulo Zanoni przanoni at gmail.com
Mon Sep 23 16:39:11 CEST 2013


2013/9/23 Chris Wilson <chris at chris-wilson.co.uk>:
> On Mon, Sep 23, 2013 at 09:57:37AM -0300, Paulo Zanoni wrote:
>> 2013/9/23 Chris Wilson <chris at chris-wilson.co.uk>:
>> > Register access is particularly obnoxious on Sandybridge and later due to
>> > the extra work we must do around every read or write. The effect is
>> > magnified on Haswell, as we have per-operation sanity checking
>> > magnifying the number of reads and writes.
>> >
>> > Interrupt handling is supposed to be fast, yet due to the sanity checks
>> > around the register accesss it is not as fast as it could be. If we look
>> > closer, most of the common register operations are reading values we
>> > already know, and redundant flushes. Eliminate these by storing the
>> > desired values rather than reading them back during the interrupt
>> > routine.
>> >
>> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> > ---
>> >  drivers/gpu/drm/i915/i915_drv.h |  4 ++--
>> >  drivers/gpu/drm/i915/i915_irq.c | 22 ++++++++--------------
>> >  2 files changed, 10 insertions(+), 16 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> > index db8e4d0..f182a23 100644
>> > --- a/drivers/gpu/drm/i915/i915_drv.h
>> > +++ b/drivers/gpu/drm/i915/i915_drv.h
>> > @@ -1259,8 +1259,8 @@ typedef struct drm_i915_private {
>> >         /* DPIO indirect register protection */
>> >         struct mutex dpio_lock;
>> >
>> > -       /** Cached value of IMR to avoid reads in updating the bitfield */
>> > -       u32 irq_mask;
>> > +       /** Cached value of IMR/IER to avoid reads in updating the bitfield */
>> > +       u32 irq_mask, irq_enable;
>> >         u32 gt_irq_mask;
>> >         u32 pm_irq_mask;
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> > index d9eebca..76725c6 100644
>> > --- a/drivers/gpu/drm/i915/i915_irq.c
>> > +++ b/drivers/gpu/drm/i915/i915_irq.c
>> > @@ -1428,7 +1428,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>> >  {
>> >         struct drm_device *dev = (struct drm_device *) arg;
>> >         drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>> > -       u32 de_iir, gt_iir, de_ier, sde_ier = 0;
>> > +       u32 de_iir, gt_iir;
>> >         irqreturn_t ret = IRQ_NONE;
>> >
>> >         atomic_inc(&dev_priv->irq_received);
>> > @@ -1438,20 +1438,15 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>> >         intel_uncore_check_errors(dev);
>> >
>> >         /* disable master interrupt before clearing iir  */
>> > -       de_ier = I915_READ(DEIER);
>> > -       I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
>> > -       POSTING_READ(DEIER);
>> > +       I915_WRITE(DEIER, dev_priv->irq_enable & ~DE_MASTER_IRQ_CONTROL);
>> >
>> >         /* Disable south interrupts. We'll only write to SDEIIR once, so further
>> >          * interrupts will will be stored on its back queue, and then we'll be
>> >          * able to process them after we restore SDEIER (as soon as we restore
>> >          * it, we'll get an interrupt if SDEIIR still has something to process
>> >          * due to its back queue). */
>> > -       if (!HAS_PCH_NOP(dev)) {
>> > -               sde_ier = I915_READ(SDEIER);
>> > +       if (!HAS_PCH_NOP(dev))
>> >                 I915_WRITE(SDEIER, 0);
>> > -               POSTING_READ(SDEIER);
>> > -       }
>> >
>> >         gt_iir = I915_READ(GTIIR);
>> >         if (gt_iir) {
>> > @@ -1482,12 +1477,10 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>> >                 }
>> >         }
>> >
>> > -       I915_WRITE(DEIER, de_ier);
>> > +       if (!HAS_PCH_NOP(dev))
>> > +               I915_WRITE(SDEIER, ~0);
>>
>> Please don't change the relative order of DEIER and SDEIER. As far as
>> I remember, this order is required to prevent the complete stop of
>> SDEIER interrupts when we're getting too many of them (e.g.,
>> underruns).
>
> The comments refer to the ordering of DEIIR vs SDEIIR, not IER and in
> particular the master-irq enable.

Maybe we need better comments. I did these things in the beginning of
the year, so my memory is not exactly precise. But as far as I
remember, the idea of restoring SDEIER after we restore DEIER is that
with this, any pending south interrupts would propagate to the north
DEIMR bit, follow the interrupt route, then trigger a new interrupt,
which would call the interrupt handler again. If we restore SDEIER
first we won't trigger an interrupt (since the master bit in DEIER is
disabled) and we'll get stuck because the DEIER bit responsible for
the south interrupts only triggers an interrupt when it flips to 1,
but on this case it's already 1.

There's this email where I talk about this:
http://lists.freedesktop.org/archives/intel-gfx/2013-February/024962.html


But maybe I was wrong. Still, inverting the order should probably be
on its own patch due to the risks.


And since we're trying to save reads/writes, perhaps another idea
would be to only clear SDEIER after we read DEIIR and confirm that we
do have a south interrupt pending.


Also, just out of curiosity: do we get any noticeable performance
difference on any workload? Depending on the impact I'd even vote to
disable-by-default some of the debugging features.

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre



-- 
Paulo Zanoni



More information about the Intel-gfx mailing list