[Intel-gfx] [PATCH v2 3/3] drm/i915/selftests: Exercise adding requests to a full GGTT
Chris Wilson
chris at chris-wilson.co.uk
Wed Oct 11 12:45:05 UTC 2017
Quoting Tvrtko Ursulin (2017-10-11 13:33:28)
>
> On 10/10/2017 22:38, Chris Wilson wrote:
> > A bug recently encountered involved the issue where are we were
> > submitting requests to different ppGTT, each would pin a segment of the
> > GGTT for its logical context and ring. However, this is invisible to
> > eviction as we do not tie the context/ring VMA to a request and so do
> > not automatically wait upon it them (instead they are marked as pinned,
> > prevent eviction entirely). Instead the eviction code must flush those
>
> preventing?
>
> > contexts by switching to the kernel context. This selftest tries to
> > fill the GGTT with contexts to exercise a path where the
> > switch-to-kernel-context failed to make forward progress and we fail
> > with ENOSPC.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > ---
> > drivers/gpu/drm/i915/selftests/i915_gem_evict.c | 121 +++++++++++++++++++++
> > .../gpu/drm/i915/selftests/i915_live_selftests.h | 1 +
> > drivers/gpu/drm/i915/selftests/mock_context.c | 6 +-
> > 3 files changed, 123 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> > index 5ea373221f49..53df8926be7f 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
> > @@ -24,6 +24,8 @@
> >
> > #include "../i915_selftest.h"
> >
> > +#include "mock_context.h"
> > +#include "mock_drm.h"
> > #include "mock_gem_device.h"
> >
> > static int populate_ggtt(struct drm_i915_private *i915)
> > @@ -325,6 +327,116 @@ static int igt_evict_vm(void *arg)
> > return err;
> > }
> >
> > +static int igt_evict_contexts(void *arg)
> > +{
> > + struct drm_i915_private *i915 = arg;
> > + struct intel_engine_cs *engine;
> > + enum intel_engine_id id;
> > + struct reserved {
> > + struct drm_mm_node node;
> > + struct reserved *next;
> > + } *reserved = NULL;
> > + unsigned long count;
> > + int err = 0;
> > +
> > + /* Make the GGTT appear small (but leave just enough to function) */
>
> How does it do that?
>
> > + count = 0;
> > + mutex_lock(&i915->drm.struct_mutex);
> > + do {
> > + struct reserved *r;
> > +
> > + r = kcalloc(1, sizeof(*r), GFP_KERNEL);
> > + if (!r) {
> > + err = -ENOMEM;
> > + goto out_locked;
> > + }
> > +
> > + if (i915_gem_gtt_insert(&i915->ggtt.base, &r->node,
> > + 1ul << 20, 0, I915_COLOR_UNEVICTABLE,
> > + 16ul << 20, i915->ggtt.base.total,
> > + PIN_NOEVICT)) {
> > + kfree(r);
> > + break;
> > + }
> > +
> > + r->next = reserved;
> > + reserved = r;
> > +
> > + count++;
> > + } while (1);
> > + mutex_unlock(&i915->drm.struct_mutex);
> > + pr_info("Filled GGTT with %lu 1MiB nodes\n", count);
>
> Oh right, this helps. :)
>
> > +
> > + /* Overfill the GGTT with context objects and so try to evict one. */
>
> Can GGTT be divisible by 1MiB in which case the above filling would fill
> everything and make any eviction impossible? Do you need an assert that
> there is a little bit of space left for below to work?
There's a subtle restriction that we only try to fill above 16MiB in the
GGTT. So the assumption is that leaves us enough space to fit at least
one request/context (and its preliminary setup, such as golden render
state). Failure to have enough GGTT space to issue that first request
leads to fun. Might have to make that a little more tolerant in future.
> > + for_each_engine(engine, i915, id) {
> > + struct i915_sw_fence *fence;
> > + struct drm_file *file;
> > + unsigned long count = 0;
>
> No shadows variable warnings here?
W=1, brave man! My bad forgot to remove as I reused it for the counter
above.
> > + mutex_lock(&i915->drm.struct_mutex);
> > + do {
> > + struct drm_i915_gem_request *rq;
> > + struct i915_gem_context *ctx;
> > +
> > + ctx = live_context(i915, file);
> > + if (!ctx)
> > + break;
> > +
> > + rq = i915_gem_request_alloc(engine, ctx);
> > + if (IS_ERR(rq)) {
> > + if (PTR_ERR(rq) != -ENOMEM) {
> > + pr_err("Unexpected error from request alloc (ctx hw id %u, on %s): %d\n",
> > + ctx->hw_id, engine->name,
> > + (int)PTR_ERR(rq));
> > + err = PTR_ERR(rq);
> > + }
> > + break;
>
> Comment on why it is OK to stop the test on ENOMEM would be good.
Because the first versions ran out of memory before filling the GGTT :)
Now that there is a reasonable cap on the GGTT size, we can just let it
ENOMEM and fixup later if required.
-Chris
More information about the Intel-gfx
mailing list