[Intel-gfx] [PATCH] drm/i915/selftests: Exercise CS TLB invalidation
Chris Wilson
chris at chris-wilson.co.uk
Thu Sep 19 13:08:38 UTC 2019
Quoting Mika Kuoppala (2019-09-19 13:57:45)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
> > Check that we are correctly invalidating the TLB at the start of a
> > batch after updating the GTT.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> > ---
> > drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 227 ++++++++++++++++++
> > 1 file changed, 227 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> > index 598c18d10640..f8709b332bd7 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> > @@ -25,13 +25,16 @@
> > #include <linux/list_sort.h>
> > #include <linux/prime_numbers.h>
> >
> > +#include "gem/i915_gem_context.h"
> > #include "gem/selftests/mock_context.h"
> > +#include "gt/intel_context.h"
> >
> > #include "i915_random.h"
> > #include "i915_selftest.h"
> >
> > #include "mock_drm.h"
> > #include "mock_gem_device.h"
> > +#include "igt_flush_test.h"
> >
> > static void cleanup_freed_objects(struct drm_i915_private *i915)
> > {
> > @@ -1705,6 +1708,229 @@ int i915_gem_gtt_mock_selftests(void)
> > return err;
> > }
> >
> > +static int context_sync(struct intel_context *ce)
> > +{
> > + struct i915_request *rq;
> > + long timeout;
> > +
> > + rq = intel_context_create_request(ce);
> > + if (IS_ERR(rq))
> > + return PTR_ERR(rq);
> > +
> > + i915_request_get(rq);
> > + i915_request_add(rq);
> > +
> > + timeout = i915_request_wait(rq, 0, HZ / 5);
> > + i915_request_put(rq);
> > +
> > + return timeout < 0 ? -EIO : 0;
> > +}
> > +
> > +static int submit_batch(struct intel_context *ce, u64 addr)
> > +{
> > + struct i915_request *rq;
> > + int err;
> > +
> > + rq = intel_context_create_request(ce);
> > + if (IS_ERR(rq))
> > + return PTR_ERR(rq);
> > +
> > + err = 0;
> > + if (rq->engine->emit_init_breadcrumb) /* detect a hang */
>
> for seqno write?
If we expect an initial breadcrumb, we use it during reset to identify
if we are inside a payload (as opposed to being inside the semaphore
wait). So we need to emit the breadcrumb if we may hang in our batch.
> > + err = rq->engine->emit_init_breadcrumb(rq);
> > + if (err == 0)
> > + err = rq->engine->emit_bb_start(rq, addr, 0, 0);
> > +
>
> In context_sync part we grabbed a reference. In here we
> don't. I don't see how we can get away without it even
> if we don't wait in here.
Hmm, I suppose you can argue that we do now have a later deref in the
spinner. That wasn't there before...
> > + vma = i915_vma_instance(out, vm, NULL);
> > + if (IS_ERR(vma)) {
> > + err = PTR_ERR(vma);
> > + goto out_batch;
> > + }
> > +
> > + err = i915_vma_pin(vma, 0, 0,
> > + PIN_USER |
> > + PIN_OFFSET_FIXED |
> > + (vm->total - PAGE_SIZE));
> > + if (err)
> > + goto out_out;
>
> out_put?
Joonas likes out: for normal exit paths that double for error handling.
> Oh and we don't have to do anything with the instance on
> error paths?
No. The vma does not yet have an independent lifetime from the object
(they are all owned by objects currently).
> > + /* Replace the TLB with target batches */
> > + for (i = 0; i < count; i++) {
> > + u32 *cs = batch + i * 64 / sizeof(*cs);
> > + u64 addr;
> > +
> > + vma->node.start = offset + i * PAGE_SIZE;
>
> on previous loop, we have now primed the pte to tlb in here?
We're using the same PTE as before, in the expectation that the TLB
still contains that lookup.
> > + vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
>
> ..now changing in it here...
>
> > +
> > + addr = vma->node.start + i * 64;
> > + cs[4] = MI_NOOP;
> > + cs[6] = lower_32_bits(addr);
> > + cs[7] = upper_32_bits(addr);
> > + wmb();
> > +
> > + err = submit_batch(ce, addr);
>
> in hope that with this submission hardware will see the old one and
> completely miss the store+bb start?
The hope is we see the right page of course! The test is to detect when
it sees the old page instead.
> Perhaps rewiring a more dummy emit only to prove this case
> is pushing it.
>
> But I am curious to know if you did try it out by removing
> the invalidate on the emits and managed to bring
> out the missing writes?
We can demonstrate it on Tigerlake :)
Indeed it detects the remove of MI_INVALIDATE_TLB elsewhere.
-Chris
More information about the Intel-gfx
mailing list