[PATCH] drm/xe/oa: Disallow OA from being enabled on active exec_queue's

Dixit, Ashutosh ashutosh.dixit at intel.com
Thu Nov 21 04:22:05 UTC 2024


On Tue, 19 Nov 2024 14:09:07 -0800, Matthew Brost wrote:
>

Hi Matt,

> On Tue, Nov 19, 2024 at 01:08:49PM -0800, Dixit, Ashutosh wrote:
> > On Tue, 19 Nov 2024 06:44:51 -0800, Matthew Brost wrote:
> > >
> > > On Mon, Nov 18, 2024 at 05:32:56PM -0800, Ashutosh Dixit wrote:
> > > > Enabling OA on an exec_queue toggles the OAC_CONTEXT_ENABLE bit in
> > > > CTXT_SR_CTL register. Toggling this bit changes the size and layout of the
> > > > underlying HW context image. Therefore, enabling OA on an already active
> > > > exec_queue (as currently implemented in xe) is an invalid operation and can
> > > > cause hangs. Therefore, disallow OA from being enabled on active
> > > > exec_queue's (here, by active we mean a context on which submissions have
> > > > previously happened).
> > > >
> > >
> > > This is something we will need to keep an eye on then, because in various
> > > experimental code I've played around with enabling exec queues upon creation,
> > > e.g., if we want to allocate a doorbell. I seem to recall Habana wanting
> > > to enable exec queues upon creation too.
> >
> > The real requirement here is that HW context image should not have been
> > loaded before OA is enabled on the exec queue. That is what happens today
> > in the ENABLED state, correct, when user space submissions start?
> >
>
> Yea, that is the current flow. If we change this to enable an exec queue
> upon allocation we probably could check for ring head & tail == 0.
>
> Also note LR queues with preemption fences do toggle the enabled state
> too. Does OA apply to those queues? If so, a registered check may be
> better than enabled or maybe just start with ring head & tail == 0.

Yes, I could check 'ring head & tail == 0', but I am not sure it works, since
the ring head and tail could have wrapped around back to 0 (giving the false
impression that the exec_queue was never active when it was).

I am assuming you mean the ring is "0 to N" rather than "gtt_addr to
gtt_addr + N", correct?
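
To illustrate the wrap-around concern, here is a minimal userspace sketch,
not xe code. The struct ring_model, its fields and the helper names are all
hypothetical; it assumes head and tail are offsets in [0, size) with a
power-of-two size, and that HW has consumed everything submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: head/tail are offsets in [0, size), not GTT addresses. */
struct ring_model {
	uint32_t size;      /* ring size in bytes, assumed power of two */
	uint32_t head;      /* consumer offset */
	uint32_t tail;      /* producer offset */
	uint64_t submitted; /* total bytes ever written; NOT derivable from head/tail */
};

/* Write 'bytes' into the ring and assume HW consumes all of it. */
void ring_submit_and_consume(struct ring_model *r, uint32_t bytes)
{
	r->tail = (r->tail + bytes) & (r->size - 1); /* wraps modulo size */
	r->head = r->tail;
	r->submitted += bytes;
}

/* The proposed heuristic: head == tail == 0 means "never used". */
bool ring_looks_unused(const struct ring_model *r)
{
	return r->head == 0 && r->tail == 0;
}

/* Returns true if the heuristic reports "unused" on a ring that was used. */
bool ring_check_misreports_after_wrap(void)
{
	struct ring_model r = { .size = 4096 };

	ring_submit_and_consume(&r, 1024); /* tail = 1024 */
	ring_submit_and_consume(&r, 3072); /* tail wraps back to 0 */

	return ring_looks_unused(&r) && r.submitted != 0;
}
```

In this model the heuristic is only reliable if something else (here the
'submitted' counter) records that a submission ever happened.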

Thanks.
--
Ashutosh

> > If operations such as doorbell are only management requests to GuC (which
> > don't cause HW context image to be loaded) and if we can name a new state
> > when the exec queue is handed off to userspace for starting submissions, we
> > should be able to stay with this approach.
> >
> > > Just curious whether an exec queue creation extension which enables OA
> > > was ever explored? It seems like this is something we may need at some
> > > point if our exec queue creation semantics change, of course being
> > > careful to not break existing flows.
> >
> > Yeah I did think of it but didn't want to change the uapi.
> >
>
> Makes sense.
>
> > Also, a different implementation is possible which avoids this resizing of
> > the context image altogether. It requires the kernel OA code to submit its
> > work on the user exec queue (and use that exec queue's VM; currently the
> > OA code uses a kernel exec queue). There are some reasons I don't want to
> > implement that just yet, but worst case, we can do that if absolutely
> > needed.
> >
> > Thanks.
> > --
> > Ashutosh
> >
> > > > Transition from 1 -> 0 for this bit was disallowed in
> > > > '0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream
> > > > close")'. Here we disallow the 0 -> 1 transition on active contexts.
> > > >
> > > > v2: Don't export exec_queue_enabled, define new xe_exec_queue_op (M Brost)
> > > >     Directly check OAC_CONTEXT_ENABLE bit from context image (J Cavitt)
> > > >
> > > > Bspec: 60314
> > > > Fixes: 2f4a730fcd2d ("drm/xe/oa: Add OAR support")
> > > > Cc: stable at vger.kernel.org
> > > > Signed-off-by: Ashutosh Dixit <ashutosh.dixit at intel.com>
> > > > ---
> > > >  drivers/gpu/drm/xe/xe_exec_queue_types.h |  2 ++
> > > >  drivers/gpu/drm/xe/xe_guc_submit.c       |  1 +
> > > >  drivers/gpu/drm/xe/xe_oa.c               | 13 +++++++++++++
> > > >  3 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > > index 1158b6062a6cd..b88d617c37b33 100644
> > > > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > > @@ -184,6 +184,8 @@ struct xe_exec_queue_ops {
> > > >	void (*resume)(struct xe_exec_queue *q);
> > > >	/** @reset_status: check exec queue reset status */
> > > >	bool (*reset_status)(struct xe_exec_queue *q);
> > > > +	/** @enabled: check if exec queue is in enabled state */
> > > > +	bool (*enabled)(struct xe_exec_queue *q);
> > > >  };
> > > >
> > > >  #endif
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > index f9ecee5364d82..b9b9cdb6f768b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > @@ -1660,6 +1660,7 @@ static const struct xe_exec_queue_ops guc_exec_queue_ops = {
> > > >	.suspend_wait = guc_exec_queue_suspend_wait,
> > > >	.resume = guc_exec_queue_resume,
> > > >	.reset_status = guc_exec_queue_reset_status,
> > > > +	.enabled = exec_queue_enabled,
> > > >  };
> > > >
> > > >  static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> > > > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> > > > index 8dd55798ab312..4a7440c40978c 100644
> > > > --- a/drivers/gpu/drm/xe/xe_oa.c
> > > > +++ b/drivers/gpu/drm/xe/xe_oa.c
> > > > @@ -2066,6 +2066,19 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f
> > > >		if (XE_IOCTL_DBG(oa->xe, !param.exec_q))
> > > >			return -ENOENT;
> > > >
> > > > +		/*
> > > > +		 * Disallow OA from being enabled on active exec_queue's. Enabling OA sets the
> > > > +		 * OAC_CONTEXT_ENABLE bit in CTXT_SR_CTL register. Toggling the bit changes
> > > > +		 * the size and layout of the underlying HW context image and can cause hangs.
> > > > +		 */
> > > > +		if (XE_IOCTL_DBG(oa->xe,
> > > > +				 !(xe_lrc_read_ctx_reg(param.exec_q->lrc[0],
> > > > +				       CTX_CONTEXT_CONTROL) & CTX_CTRL_OAC_CONTEXT_ENABLE) &&
> > > > +				 param.exec_q->ops->enabled(param.exec_q))) {
> > > > +			ret = -EADDRINUSE;
> > > > +			goto err_exec_q;
> > > > +		}
> > > > +
> > > >		if (param.exec_q->width > 1)
> > > >			drm_dbg(&oa->xe->drm, "exec_q->width > 1, programming only exec_q->lrc[0]\n");
> > > >	}
> > > > --
> > > > 2.41.0
> > > >

