[Intel-gfx] [PATCH 12/20] drm/i915/selftests: Verify LRC isolation

Fri Feb 28 12:13:03 UTC 2020

Chris Wilson <chris at chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2020-02-28 11:30:21)
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>> > +     x = 0;
>> > +     dw = 0;
>> > +     hw = engine->pinned_default_state;
>> > +     hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw);
>> > +     do {
>> > +             u32 lri = hw[dw];
>> > +
>> > +             if (lri == 0) {
>> > +                     dw++;
>> > +                     continue;
>> > +             }
>> > +
>> > +             if ((lri & GENMASK(31, 23)) != MI_INSTR(0x22, 0)) {
>> > +                     lri &= 0x7f;
>> > +                     dw += lri + 2;
>> > +                     continue;
>> > +             }
>> > +
>> > +             lri &= 0x7f;
>> > +             lri++;
>> > +             dw++;
>> 
>> The pattern for weeding out everything except lris starts
>> to stand out as a boilerplate. 
>
> Ssh. Just do a visual replacement with for_each_register()
>
>> 
>> > +
>> > +             while (lri) {
>> > +                     if (!is_moving(A[0][x], A[1][x]) &&
>> > +                         (A[0][x] != B[0][x] || A[1][x] != B[1][x])) {
>> > +                             switch (hw[dw] & 4095) {
>> > +                             case 0x30: /* RING_HEAD */
>> > +                             case 0x34: /* RING_TAIL */
>> > +                                     break;
>> > +
>> > +                             default:
>> > +                                     pr_err("%s[%d]: Mismatch for register %4x, default %08x, reference %08x, result (%08x, %08x), poison %08x, context %08x\n",
>> 
>> 0x left out on all for compactness or by accident?
>
> I tend not to use 0x, unless it looks odd or we need machine parsing :) 
> Here the compactness helps with a very long line.
>
>> > +                                            engine->name, x,
>> > +                                            hw[dw], hw[dw + 1],
>> > +                                            A[0][x], B[0][x], B[1][x],
>> > +                                            poison, lrc[dw + 1]);
>> > +                                     err = -EINVAL;
>> > +                                     break;
>> > +                             }
>> > +                     }
>> > +                     dw += 2;
>> > +                     lri -= 2;
>> > +                     x++;
>> > +             }
>> > +     } while (dw < PAGE_SIZE / sizeof(u32) &&
>> > +              (hw[dw] & ~BIT(0)) != MI_BATCH_BUFFER_END);
>> > +
>> > +     i915_gem_object_unpin_map(ce->state->obj);
>> > +err_B1:
>> > +     i915_gem_object_unpin_map(result[1]->obj);
>> > +err_B0:
>> > +     i915_gem_object_unpin_map(result[0]->obj);
>> > +err_A1:
>> > +     i915_gem_object_unpin_map(ref[1]->obj);
>> > +err_A0:
>> > +     i915_gem_object_unpin_map(ref[0]->obj);
>> > +     return err;
>> > +}
>> > +
>> > +static int __lrc_isolation(struct intel_engine_cs *engine, u32 poison)
>> > +{
>> > +     u32 *sema = memset32(engine->status_page.addr + 1000, 0, 1);
>> > +     struct i915_vma *ref[2], *result[2];
>> > +     struct intel_context *A, *B;
>> > +     struct i915_request *rq;
>> > +     int err;
>> > +
>> > +     A = intel_context_create(engine);
>> > +     if (IS_ERR(A))
>> > +             return PTR_ERR(A);
>> > +
>> > +     B = intel_context_create(engine);
>> > +     if (IS_ERR(B)) {
>> > +             err = PTR_ERR(B);
>> > +             goto err_A;
>> > +     }
>> > +
>> > +     ref[0] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(ref[0])) {
>> > +             err = PTR_ERR(ref[0]);
>> > +             goto err_B;
>> > +     }
>> > +
>> > +     ref[1] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(ref[1])) {
>> > +             err = PTR_ERR(ref[1]);
>> > +             goto err_ref0;
>> > +     }
>> > +
>> > +     rq = record_registers(A, ref[0], ref[1], sema);
>> > +     if (IS_ERR(rq)) {
>> > +             err = PTR_ERR(rq);
>> > +             goto err_ref1;
>> > +     }
>> > +
>> > +     WRITE_ONCE(*sema, 1);
>> > +     wmb();
>> 
>> So with this you get reference base for before and after...
>> 
>> > +
>> > +     if (i915_request_wait(rq, 0, HZ / 2) < 0) {
>> > +             i915_request_put(rq);
>> > +             err = -ETIME;
>> > +             goto err_ref1;
>> > +     }
>> > +     i915_request_put(rq);
>> > +
>> > +     result[0] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(result[0])) {
>> > +             err = PTR_ERR(result[0]);
>> > +             goto err_ref1;
>> > +     }
>> > +
>> > +     result[1] = create_user_vma(A->vm, SZ_64K);
>> > +     if (IS_ERR(result[1])) {
>> > +             err = PTR_ERR(result[1]);
>> > +             goto err_result0;
>> > +     }
>> > +
>> > +     rq = record_registers(A, result[0], result[1], sema);
>> > +     if (IS_ERR(rq)) {
>> > +             err = PTR_ERR(rq);
>> > +             goto err_result1;
>> > +     }
>> > +
>> > +     err = poison_registers(B, poison, sema);
>> 
>> ..and apparently the poisoning releases the semaphore
>> triggering the preemption?
>
> Correct.
>
>> How can you make preempt happen deterministically with this?
>
> We use MI_ARB_ONOFF to ensure that we can only be preempted within the
> semaphore.

Ah yes this give you the resolution you want.

>
>> Released semaphore generates interrupt and you have it ready
>> with maximum prio?
>
> On submission of the I915_PRIORITY_BARRIER request, we will immediately
> submit it to ELSP, and so queue the preemption request to HW. Since we
> disable preemption in record_registers() until it hits the semaphore, we
> know that the second batch must run only when the first is waiting on
> that semaphore.

Thanks for the explanation.

>  
>> I am also puzzled why there is a need for two set of reference
>> values?
>
> One is without preemption from a second context (so just one context
> running on the HW). This reference should match the context image
> before/after. The other is after having been preempted between the
> two reads. This /should/ also match its context image...
>
> The first checks our mechanism for reading back the registers and
> comparing them to the context, the second is the actual test.

I remember yesterday pondering in irc that it would be nice to have
an machanism to test the machinery. You must have been confused
that what do I really mean as this was the mechanism I was looking
for but didn't undestand =)

I like that it adapts to context image content and the
relative mmio trick should work for gens to come.

Reviewed-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>

> -Chris