[PATCH v7 0/7] drm/xe/oa: xe_syncs for OA
Ashutosh Dixit
ashutosh.dixit at intel.com
Tue Oct 22 20:03:45 UTC 2024
OA stream configuration submits batches which can be queued behind other
(say workload) batches. Also, in some cases, additional delay is needed for
an OA configuration to take effect, even after programming batches have
completed executing on HW.
Mesa has use cases where a single workload is replayed repeatedly on the
GPU, each time with a different OA configuration (or metric set), in order
to capture different aspects of workload performance. This requires that an
OA configuration take effect at precisely the correct input batch, and that
userspace be correctly informed when a new configuration has been activated
(at batch granularity).
The previous implementation handled this by introducing a delay in the
stream open and reconfiguration ioctls. This works, except that it
introduces a bubble in the userspace pipeline (the pipeline stalls during
the delays in calls into these ioctls). Mesa prefers that such pipeline
stalls don't happen.
In this series this problem is solved using xe_sync arrays, similar to
xe_exec and vm_bind. Here OA re-configuration can be made to wait till
input fences signal and OA will signal output fences after a new
configuration has been activated. This can of course be done without
stalling the userspace pipeline.
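The ordering described above can be sketched in plain C. This is only a
toy model under stated assumptions, not the driver code or the uapi added
by this series: every sketch_* name below is hypothetical, fences are
reduced to a boolean, and a caller that finds the inputs unsignaled simply
gets false back, whereas the series lets the reconfiguration wait on the
fences without stalling userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real xe uapi or kernel types. */
enum sketch_sync_dir { SKETCH_SYNC_IN, SKETCH_SYNC_OUT };

struct sketch_fence {
	bool signaled;
};

struct sketch_sync {
	enum sketch_sync_dir dir;
	struct sketch_fence *fence;
};

struct sketch_oa_stream {
	int active_config;
};

/* Returns true once every input fence in the array has signaled. */
static bool sketch_inputs_ready(const struct sketch_sync *syncs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (syncs[i].dir == SKETCH_SYNC_IN && !syncs[i].fence->signaled)
			return false;
	return true;
}

/*
 * Apply new_config only after all input fences have signaled, then signal
 * every output fence. Returns false and leaves the stream untouched if any
 * input fence is still pending (assumption: the toy model polls instead of
 * deferring the work).
 */
static bool sketch_oa_try_reconfig(struct sketch_oa_stream *stream, int new_config,
				   struct sketch_sync *syncs, size_t n)
{
	assert(stream && (syncs || n == 0));

	if (!sketch_inputs_ready(syncs, n))
		return false;

	stream->active_config = new_config;

	for (size_t i = 0; i < n; i++)
		if (syncs[i].dir == SKETCH_SYNC_OUT)
			syncs[i].fence->signaled = true;

	return true;
}
```

In this model userspace can tell from the output fence alone that the new
configuration is active, which is the property the replay use case needs.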
Reviewed Mesa MR which consumes new uapi introduced here:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31283
v2: Address review comments from Matt Brost, Jonathan Cavitt and Jose Souza
v3: Changes to Patch 4 and Patch 7 to address review comments from Matt
Brost and Jonathan Cavitt
v4: Change to Patch 6 in response to Jose Souza
v5: Change to Patch 4 to fix potential UAF
v6: Changes to Patch 1 (v3) and Patch 4 (v5)
v7: Rebase and add reference to Mesa MR above
Test-with: 20241022171658.1181667-1-ashutosh.dixit at intel.com
Ashutosh Dixit (7):
  drm/xe/oa: Separate batch submission from waiting for completion
  drm/xe/oa/uapi: Define and parse OA sync properties
  drm/xe/oa: Add input fence dependencies
  drm/xe/oa: Signal output fences
  drm/xe/oa: Move functions up so they can be reused for config ioctl
  drm/xe/oa: Add syncs support to OA config ioctl
  drm/xe/oa: Allow only certain property changes from config
 drivers/gpu/drm/xe/xe_oa.c       | 667 +++++++++++++++++++++----------
 drivers/gpu/drm/xe/xe_oa_types.h |  12 +
 drivers/gpu/drm/xe/xe_query.c    |   2 +-
 include/uapi/drm/xe_drm.h        |  17 +
 4 files changed, 489 insertions(+), 209 deletions(-)
--
2.41.0