[PATCH] drm/xe/oa: Disallow OA from being enabled on active exec_queue's

Ashutosh Dixit ashutosh.dixit at intel.com
Tue Nov 19 01:32:56 UTC 2024


Enabling OA on an exec_queue sets the OAC_CONTEXT_ENABLE bit in the
CTXT_SR_CTL register. Toggling this bit changes the size and layout of the
underlying HW context image, so enabling OA on an already active exec_queue
(as currently implemented in xe) is an invalid operation and can cause
hangs. Disallow OA from being enabled on active exec_queues, where "active"
means a context on which submissions have previously happened.

The 1 -> 0 transition of this bit was already disallowed in
'0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream
close")'. This patch disallows the 0 -> 1 transition on active contexts.
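
For illustration only, the combined policy (no 1 -> 0 reset on stream close, no 0 -> 1 set on an active context) can be modelled in a few lines of userspace C. This is a sketch, not driver code: struct model_queue, MODEL_OAC_CONTEXT_ENABLE (including its bit position) and the model_oa_* helpers are hypothetical names invented here, and the "active" flag stands in for whatever the submission backend reports.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, chosen for illustration; not the real register layout. */
#define MODEL_OAC_CONTEXT_ENABLE (1u << 8)

/*
 * Hypothetical stand-in for an exec queue: one context-control word plus a
 * flag recording whether any submission has already happened on the context.
 */
struct model_queue {
	uint32_t ctx_control;
	bool active;
};

/*
 * Return 0 if OA may be enabled on q, or -EADDRINUSE if that would require
 * a 0 -> 1 transition of the enable bit on an already active context.
 */
static int model_oa_enable_check(const struct model_queue *q)
{
	assert(q != NULL);

	bool already_set = (q->ctx_control & MODEL_OAC_CONTEXT_ENABLE) != 0;

	if (!already_set && q->active)
		return -EADDRINUSE;
	return 0;
}

/* Set the enable bit only when the check above allows it. */
static int model_oa_enable(struct model_queue *q)
{
	int ret = model_oa_enable_check(q);

	if (ret)
		return ret;
	q->ctx_control |= MODEL_OAC_CONTEXT_ENABLE;
	return 0;
}

/* Closing a stream deliberately leaves the bit set: no 1 -> 0 transition. */
static void model_oa_disable(struct model_queue *q)
{
	(void)q;
}
```

In this model a queue that had OA enabled before its first submission can have a stream reopened later, because the bit is already set and no transition is needed; only a queue that became active with the bit clear is rejected.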

v2: Don't export exec_queue_enabled, define new xe_exec_queue_op (M Brost)
    Directly check OAC_CONTEXT_ENABLE bit from context image (J Cavitt)

Bspec: 60314
Fixes: 2f4a730fcd2d ("drm/xe/oa: Add OAR support")
Cc: stable at vger.kernel.org
Signed-off-by: Ashutosh Dixit <ashutosh.dixit at intel.com>
---
 drivers/gpu/drm/xe/xe_exec_queue_types.h |  2 ++
 drivers/gpu/drm/xe/xe_guc_submit.c       |  1 +
 drivers/gpu/drm/xe/xe_oa.c               | 13 +++++++++++++
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
index 1158b6062a6cd..b88d617c37b33 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
@@ -184,6 +184,8 @@ struct xe_exec_queue_ops {
 	void (*resume)(struct xe_exec_queue *q);
 	/** @reset_status: check exec queue reset status */
 	bool (*reset_status)(struct xe_exec_queue *q);
+	/** @enabled: check if exec queue is in enabled state */
+	bool (*enabled)(struct xe_exec_queue *q);
 };
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index f9ecee5364d82..b9b9cdb6f768b 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -1660,6 +1660,7 @@ static const struct xe_exec_queue_ops guc_exec_queue_ops = {
 	.suspend_wait = guc_exec_queue_suspend_wait,
 	.resume = guc_exec_queue_resume,
 	.reset_status = guc_exec_queue_reset_status,
+	.enabled = exec_queue_enabled,
 };
 
 static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
index 8dd55798ab312..4a7440c40978c 100644
--- a/drivers/gpu/drm/xe/xe_oa.c
+++ b/drivers/gpu/drm/xe/xe_oa.c
@@ -2066,6 +2066,19 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f
 		if (XE_IOCTL_DBG(oa->xe, !param.exec_q))
 			return -ENOENT;
 
+		/*
+		 * Disallow OA from being enabled on active exec_queue's. Enabling OA sets the
+		 * OAC_CONTEXT_ENABLE bit in CTXT_SR_CTL register. Toggling the bit changes
+		 * the size and layout of the underlying HW context image and can cause hangs.
+		 */
+		if (XE_IOCTL_DBG(oa->xe,
+				 !(xe_lrc_read_ctx_reg(param.exec_q->lrc[0],
+				       CTX_CONTEXT_CONTROL) & CTX_CTRL_OAC_CONTEXT_ENABLE) &&
+				 param.exec_q->ops->enabled(param.exec_q))) {
+			ret = -EADDRINUSE;
+			goto err_exec_q;
+		}
+
 		if (param.exec_q->width > 1)
 			drm_dbg(&oa->xe->drm, "exec_q->width > 1, programming only exec_q->lrc[0]\n");
 	}
-- 
2.41.0


