[Intel-gfx] [PATCH] drm/i915/selftests: Exercise CS TLB invalidation
Mika Kuoppala
mika.kuoppala at linux.intel.com
Thu Sep 19 13:39:59 UTC 2019
Chris Wilson <chris at chris-wilson.co.uk> writes:
> Quoting Mika Kuoppala (2019-09-19 13:57:45)
>> Chris Wilson <chris at chris-wilson.co.uk> writes:
>>
>> > Check that we are correctly invalidating the TLB at the start of a
>> > batch after updating the GTT.
>> >
>> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>> > ---
>> > drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 227 ++++++++++++++++++
>> > 1 file changed, 227 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
>> > index 598c18d10640..f8709b332bd7 100644
>> > --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
>> > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
>> > @@ -25,13 +25,16 @@
>> > #include <linux/list_sort.h>
>> > #include <linux/prime_numbers.h>
>> >
>> > +#include "gem/i915_gem_context.h"
>> > #include "gem/selftests/mock_context.h"
>> > +#include "gt/intel_context.h"
>> >
>> > #include "i915_random.h"
>> > #include "i915_selftest.h"
>> >
>> > #include "mock_drm.h"
>> > #include "mock_gem_device.h"
>> > +#include "igt_flush_test.h"
>> >
>> > static void cleanup_freed_objects(struct drm_i915_private *i915)
>> > {
>> > @@ -1705,6 +1708,229 @@ int i915_gem_gtt_mock_selftests(void)
>> > return err;
>> > }
>> >
>> > +static int context_sync(struct intel_context *ce)
>> > +{
>> > + struct i915_request *rq;
>> > + long timeout;
>> > +
>> > + rq = intel_context_create_request(ce);
>> > + if (IS_ERR(rq))
>> > + return PTR_ERR(rq);
>> > +
>> > + i915_request_get(rq);
>> > + i915_request_add(rq);
>> > +
>> > + timeout = i915_request_wait(rq, 0, HZ / 5);
>> > + i915_request_put(rq);
>> > +
>> > + return timeout < 0 ? -EIO : 0;
>> > +}
>> > +
>> > +static int submit_batch(struct intel_context *ce, u64 addr)
>> > +{
>> > + struct i915_request *rq;
>> > + int err;
>> > +
>> > + rq = intel_context_create_request(ce);
>> > + if (IS_ERR(rq))
>> > + return PTR_ERR(rq);
>> > +
>> > + err = 0;
>> > + if (rq->engine->emit_init_breadcrumb) /* detect a hang */
>>
>> for seqno write?
>
> If we expect an initial breadcrumb, we use it during reset to identify
> if we are inside a payload (as opposed to being inside the semaphore
> wait). So we need to emit the breadcrumb if we may hang in our batch.
>
>> > + err = rq->engine->emit_init_breadcrumb(rq);
>> > + if (err == 0)
>> > + err = rq->engine->emit_bb_start(rq, addr, 0, 0);
>> > +
>>
>> In context_sync part we grabbed a reference. In here we
>> don't. I don't see how we can get away without it even
>> if we don't wait in here.
>
> Hmm, I suppose you can argue that we do now have a later deref in the
> spinner. That wasn't there before...
>
>> > + vma = i915_vma_instance(out, vm, NULL);
>> > + if (IS_ERR(vma)) {
>> > + err = PTR_ERR(vma);
>> > + goto out_batch;
>> > + }
>> > +
>> > + err = i915_vma_pin(vma, 0, 0,
>> > + PIN_USER |
>> > + PIN_OFFSET_FIXED |
>> > + (vm->total - PAGE_SIZE));
>> > + if (err)
>> > + goto out_out;
>>
>> out_put?
>
> Joonas likes out: for normal exit paths that double for error handling.
Ok. Fine with me. I just like the last part to describe what the first part
of onion out does. Don't have to so much scroll back and forth
in editor.
>
>> Oh and we don't have to do anything with the instance on
>> error paths?
>
> No. The vma does not yet have an independent lifetime from the object
> (they are all owned by objects currently).
>
>> > + /* Replace the TLB with target batches */
>> > + for (i = 0; i < count; i++) {
>> > + u32 *cs = batch + i * 64 / sizeof(*cs);
>> > + u64 addr;
>> > +
>> > + vma->node.start = offset + i * PAGE_SIZE;
>>
>> on previous loop, we have now primed the pte to tlb in here?
>
> We're using the same PTE as before, in the expectation that the TLB
> still contains that lookup.
>
>> > + vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
>>
>> ..now changing in it here...
>>
>> > +
>> > + addr = vma->node.start + i * 64;
>> > + cs[4] = MI_NOOP;
>> > + cs[6] = lower_32_bits(addr);
>> > + cs[7] = upper_32_bits(addr);
>> > + wmb();
>> > +
>> > + err = submit_batch(ce, addr);
>>
>> in hope that with this submission hardware will see the old one and
>> completely miss the store+bb start?
>
> The hope is we see the right page of course! The test is to detect when
> it sees the old page instead.
Right, I guess it just depends who is hoping wrt to outcome =)
>
>> Perhaps rewiring a more dummy emit only to prove this case
>> is pushing it.
>>
>> But I am curious to know if you did try it out by removing
>> the invalidate on the emits and managed to bring
>> out the missing writes?
>
> We can demonstrate it on Tigerlake :)
>
> Indeed it detects the remove of MI_INVALIDATE_TLB elsewhere.
Neat addition to our triaging kit this will be.
-Mika
More information about the Intel-gfx
mailing list