[Intel-gfx] [PATCH] drm/i915/execlists: Verify context register state before execution

Mika Kuoppala mika.kuoppala at linux.intel.com
Thu Oct 31 14:52:33 UTC 2019


Chris Wilson <chris at chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2019-10-31 14:32:05)
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>> 
>> > Check that the context's ring register state still matches our
>> > expectations prior to execution.
>> >
>> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/gt/intel_lrc.c     | 73 ++++++++++++++++++++-----
>> >  drivers/gpu/drm/i915/gt/intel_lrc_reg.h |  4 +-
>> >  drivers/gpu/drm/i915/gt/selftest_lrc.c  |  4 +-
>> >  3 files changed, 63 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > index 5f61cd128d9c..666e70931524 100644
>> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> > @@ -1146,6 +1146,60 @@ execlists_schedule_out(struct i915_request *rq)
>> >       i915_request_put(rq);
>> >  }
>> >  
>> > +static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
>> > +{
>> > +     if (INTEL_GEN(engine->i915) >= 12)
>> > +             return 0x60;
>> > +     else if (INTEL_GEN(engine->i915) >= 9)
>> > +             return 0x54;
>> > +     else if (engine->class == RENDER_CLASS)
>> > +             return 0x58;
>> > +     else
>> > +             return -1;
>> > +}
>> > +
>> > +static void
>> > +execlists_check_context(const struct intel_context *ce,
>> > +                     const struct intel_engine_cs *engine)
>> > +{
>> > +     const struct intel_ring *ring = ce->ring;
>> > +     u32 *regs = ce->lrc_reg_state;
>> > +     int x;
>> > +
>> > +     if (regs[CTX_RING_START] != i915_ggtt_offset(ring->vma)) {
>> > +             pr_err_once("%s: context submitted with incorrect RING_BUFFER_START [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_RING_START],
>> > +                         i915_ggtt_offset(ring->vma));
>> > +             regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
>> > +     }
>> > +
>> > +     if ((regs[CTX_RING_CTL] & ~(RING_WAIT | RING_WAIT_SEMAPHORE)) !=
>> > +         (RING_CTL_SIZE(ring->size) | RING_VALID)) {
>> > +             pr_err_once("%s: context submitted with incorrect RING_BUFFER_CONTROL [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_RING_CTL],
>> > +                         (u32)(RING_CTL_SIZE(ring->size) | RING_VALID));
>> > +             regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
>> 
>> We are going to submit by clearing the waits. First I thought this will
>> lead to disaster but the hardware works so that we clear on writing '1'.
>
> Afaik, they are only indicator bits. So the HW ignores those bits when
> we write to the register.
>

I did check on reviewing. For Gen12 it states that: writing 0 == no
effect, writing 1 == clear.

-Mika


>> > +     if (regs[CTX_BB_STATE] != RING_BB_PPGTT) {
>> > +             pr_err_once("%s: context submitted with incorrect BB_STATE [%08x], expected %08x\n",
>> > +                         engine->name,
>> > +                         regs[CTX_BB_STATE],
>> > +                         RING_BB_PPGTT);
>> > +             regs[CTX_BB_STATE] = RING_BB_PPGTT;
>> > +     }
>> > +
>> > +     x = lrc_ring_mi_mode(engine);
>> > +     if (x != -1 && regs[x + 1] & STOP_RING) {
>> > +             pr_err_once("%s: context submitted with STOP_RING [%08x] in RING_MI_MODE\n",
>> > +                         engine->name, regs[x + 1]);
>> > +             regs[x + 1] &= ~STOP_RING;
>> > +             regs[x + 1] |= STOP_RING << 16;
>> 
>> Ok you want only to care about this one. Was pondering of restoring rest
>> of the state from previous.
>
> It was mostly in the spirit of "now that we are here, we might as well
> fix it up". It's mainly the paranoia in trying to ascertain if we are
> feeding garbage into the GPU. There's probably a lot more we can check,
> but the ring registers are essential :)
>
>> Will the hardware even care on masks at this point or is the saving part
>> setting them accordingly?
>
> Aiui, it uses the masks on the implicit LRI in the context restore to
> control where or not to actually apply the bits. Not that I have checked
> to see what the HW is doing (we will see if it ever flags an error), but
> memory says from elsewere it uses 0xffffxxxx.
> -Chris


More information about the Intel-gfx mailing list