[Intel-gfx] [PATCH 20/27] drm/i915: Remove logical HW ID
Chris Wilson
chris at chris-wilson.co.uk
Wed Sep 25 12:51:38 UTC 2019
Quoting Tvrtko Ursulin (2019-09-25 13:41:10)
> [+ Daniele, I think he might want to have a look at this.]
>
> On 25/09/2019 11:01, Chris Wilson wrote:
> > With the introduction of ctx->engines[] we allow multiple logical
> > contexts to be used on the same engine (e.g. with virtual engines). Each
> > logical context requires a unique tag in order for context-switching to
> > occur correctly between them.
> >
> > We only need to keep a unique tag for the active lifetime of the
> > context, and for as long as we need to identify that context. The HW
> > uses the tag to determine if it should use a lite-restore (why not the
> > LRCA?) and passes the tag back for various status identifies. The only
> > status we need to track is for OA, so when using perf, we assign the
> > specific context a unique tag.
> >
> > Fixes: 976b55f0e1db ("drm/i915: Allow a context to define its set of engines")
>
> Fixes? Why?
The above paragraph explains that as we give two distinct contexts the
same tag then it will only perform (according to bspec) a lite-restore.
-Chris
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > index da4b45aa84b1..7bce176e696c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> > @@ -296,10 +296,12 @@ struct intel_engine_cs {
> > u8 uabi_class;
> > u8 uabi_instance;
> >
> > + u32 uabi_capabilities;
> > u32 context_size;
> > u32 mmio_base;
> >
> > - u32 uabi_capabilities;
> > + unsigned int context_tag;
> > +#define NUM_CONTEXT_TAG (256)
>
> AFAICT this define now defines how deep ELSP we can have before we start
> getting hard to debug problems. GuC angle needs considering here.
The guc is immaterial here, since they use their own mechanism. It just
needs to be a value larger than 2 * EXECLIST_MAX_PORTS and less then
BIT(10).
> > @@ -978,6 +965,15 @@ __execlists_schedule_in(struct i915_request *rq)
> >
> > intel_context_get(ce);
> >
> > + if (ce->tag) {
> > + ce->lrc_desc |= (u64)ce->tag << 32;
> > + } else {
> > + ce->lrc_desc &= ~GENMASK_ULL(47, 37);
> > + ce->lrc_desc |=
> > + (u64)(engine->context_tag++ % NUM_CONTEXT_TAG) <<
> > + GEN11_SW_CTX_ID_SHIFT;
> > + }
>
> So hw id is valid only while context is in ELSP and it changes with
> every submission except in the OA case?
Aye. OA being a pita.
> Shifts and masks look dodgy. For >= gen11 the current code has the hw_id
> in [37, 42], otherwise [32, 52]. This patch does not seem to handle the
> differences between gens.
Because the values we supply do not matter (they are cookies), just the
lifetime.
> > +
> > intel_gt_pm_get(engine->gt);
> > execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
> > intel_engine_context_in(engine);
> > @@ -2224,7 +2220,6 @@ static void execlists_context_unpin(struct intel_context *ce)
> > check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE,
> > ce->engine);
> >
> > - i915_gem_context_unpin_hw_id(ce->gem_context);
> > i915_gem_object_unpin_map(ce->state->obj);
> > intel_ring_reset(ce->ring, ce->ring->tail);
> > }
> > @@ -2274,18 +2269,12 @@ __execlists_context_pin(struct intel_context *ce,
> > goto unpin_active;
> > }
> >
> > - ret = i915_gem_context_pin_hw_id(ce->gem_context);
> > - if (ret)
> > - goto unpin_map;
> > -
> > ce->lrc_desc = lrc_descriptor(ce, engine);
> > ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
> > __execlists_update_reg_state(ce, engine);
> >
> > return 0;
> >
> > -unpin_map:
> > - i915_gem_object_unpin_map(ce->state->obj);
> > unpin_active:
> > intel_context_active_release(ce);
> > err:
> > diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
> > index 343d79c1cb7e..04a5a0d90823 100644
> > --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> > +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> > @@ -1564,27 +1564,10 @@ vgpu_id_show(struct device *dev, struct device_attribute *attr,
> > return sprintf(buf, "\n");
> > }
> >
> > -static ssize_t
> > -hw_id_show(struct device *dev, struct device_attribute *attr,
> > - char *buf)
> > -{
> > - struct mdev_device *mdev = mdev_from_dev(dev);
> > -
> > - if (mdev) {
> > - struct intel_vgpu *vgpu = (struct intel_vgpu *)
> > - mdev_get_drvdata(mdev);
> > - return sprintf(buf, "%u\n",
> > - vgpu->submission.shadow[0]->gem_context->hw_id);
> > - }
> > - return sprintf(buf, "\n");
> > -}
> > -
> > static DEVICE_ATTR_RO(vgpu_id);
> > -static DEVICE_ATTR_RO(hw_id);
> >
> > static struct attribute *intel_vgpu_attrs[] = {
> > &dev_attr_vgpu_id.attr,
> > - &dev_attr_hw_id.attr,
> > NULL
> > };
> >
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index fd41505d52ec..8caaa446490f 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -1504,9 +1504,6 @@ static int i915_context_status(struct seq_file *m, void *unused)
> > struct intel_context *ce;
> >
> > seq_puts(m, "HW context ");
> > - if (!list_empty(&ctx->hw_id_link))
> > - seq_printf(m, "%x [pin %u]", ctx->hw_id,
> > - atomic_read(&ctx->hw_id_pin_count));
> > if (ctx->pid) {
> > struct task_struct *task;
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index a86d28bda6dd..47239df653f2 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -471,9 +471,9 @@ static void error_print_context(struct drm_i915_error_state_buf *m,
> > const char *header,
> > const struct drm_i915_error_context *ctx)
> > {
> > - err_printf(m, "%s%s[%d] hw_id %d, prio %d, guilty %d active %d\n",
> > - header, ctx->comm, ctx->pid, ctx->hw_id,
> > - ctx->sched_attr.priority, ctx->guilty, ctx->active);
> > + err_printf(m, "%s%s[%d] prio %d, guilty %d active %d\n",
> > + header, ctx->comm, ctx->pid, ctx->sched_attr.priority,
> > + ctx->guilty, ctx->active);
> > }
> >
> > static void error_print_engine(struct drm_i915_error_state_buf *m,
> > @@ -1262,7 +1262,6 @@ static bool record_context(struct drm_i915_error_context *e,
> > rcu_read_unlock();
> > }
> >
> > - e->hw_id = ctx->hw_id;
> > e->sched_attr = ctx->sched;
> > e->guilty = atomic_read(&ctx->guilty_count);
> > e->active = atomic_read(&ctx->active_count);
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
> > index caaed5093d95..4dc36d6ee3a2 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.h
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.h
> > @@ -117,7 +117,6 @@ struct i915_gpu_state {
> > struct drm_i915_error_context {
> > char comm[TASK_COMM_LEN];
> > pid_t pid;
> > - u32 hw_id;
> > int active;
> > int guilty;
> > struct i915_sched_attr sched_attr;
> > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> > index 80055501eccb..d36ba248d438 100644
> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -1283,22 +1283,15 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
> > } else {
> > stream->specific_ctx_id_mask =
> > (1U << GEN8_CTX_ID_WIDTH) - 1;
> > - stream->specific_ctx_id =
> > - upper_32_bits(ce->lrc_desc);
> > - stream->specific_ctx_id &=
> > - stream->specific_ctx_id_mask;
> > + stream->specific_ctx_id = stream->specific_ctx_id_mask;
> > }
> > break;
> >
> > case 11:
> > case 12: {
> > stream->specific_ctx_id_mask =
> > - ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32) |
> > - ((1U << GEN11_ENGINE_INSTANCE_WIDTH) - 1) << (GEN11_ENGINE_INSTANCE_SHIFT - 32) |
> > - ((1 << GEN11_ENGINE_CLASS_WIDTH) - 1) << (GEN11_ENGINE_CLASS_SHIFT - 32);
> > - stream->specific_ctx_id = upper_32_bits(ce->lrc_desc);
> > - stream->specific_ctx_id &=
> > - stream->specific_ctx_id_mask;
> > + ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32);
> > + stream->specific_ctx_id = stream->specific_ctx_id_mask;
> > break;
> > }
> >
> > @@ -1306,6 +1299,8 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
> > MISSING_CASE(INTEL_GEN(i915));
> > }
> >
> > + ce->tag = stream->specific_ctx_id_mask;
> > +
> > DRM_DEBUG_DRIVER("filtering on ctx_id=0x%x ctx_id_mask=0x%x\n",
> > stream->specific_ctx_id,
> > stream->specific_ctx_id_mask);
> > @@ -1324,12 +1319,14 @@ static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
> > {
> > struct intel_context *ce;
> >
> > - stream->specific_ctx_id = INVALID_CTX_ID;
> > - stream->specific_ctx_id_mask = 0;
> > -
> > ce = fetch_and_zero(&stream->pinned_ctx);
> > - if (ce)
> > + if (ce) {
> > + ce->tag = 0;
> > intel_context_unpin(ce);
> > + }
> > +
> > + stream->specific_ctx_id = INVALID_CTX_ID;
> > + stream->specific_ctx_id_mask = 0;
>
> OA hunks looks dodgy as well. ce->tag is set to
> stream->specific_ctx_id_mask while I think it should be id. Furthermore
> id is assigned from mask which is fixed and not unique for different
> contexts.
It is singular in the entire system. The mask is a guaranteed unique
value.
> > diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
> > index 1c9db08f7c28..ac1ff558eb90 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_vma.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
> > @@ -170,7 +170,7 @@ static int igt_vma_create(void *arg)
> > }
> >
> > nc = 0;
> > - for_each_prime_number(num_ctx, MAX_CONTEXT_HW_ID) {
> > + for_each_prime_number(num_ctx, 2 * NUM_CONTEXT_TAG) {
> > for (; nc < num_ctx; nc++) {
> > ctx = mock_context(i915, "mock");
> > if (!ctx)
> >
>
> Unless I am missing something I see this patch as simplification and not
> a fix. If we can get away with it in the GuC world then it's quite a big
> simplification so it's fine by me.
Bspec says that for lrc_desc with matching tags (at least for gen8-gen11
I believe), it performs a lite restore (regardless of lrca). Ergo it is
quite a major fix.
-Chris
More information about the Intel-gfx
mailing list