[PATCH v3 5/6] drm/xe: Don't compare GT ID to GT count when determining valid GTs
Cavitt, Jonathan
jonathan.cavitt at intel.com
Mon Jun 30 22:08:38 UTC 2025
-----Original Message-----
From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Matt Roper
Sent: Monday, June 30, 2025 10:35 AM
To: intel-xe at lists.freedesktop.org
Cc: Roper, Matthew D <matthew.d.roper at intel.com>
Subject: [PATCH v3 5/6] drm/xe: Don't compare GT ID to GT count when determining valid GTs
>
> On current platforms with multiple GTs, all of the GT IDs are
> consecutive; as a result we know that the GT IDs range from 0 to
> gt_count-1 and can determine if a GT ID is valid by comparing against
> the count. The consecutive nature of GT IDs may not hold true on future
> platforms if/when we have platforms that are both multi-tile and have
> multiple GTs within each tile. Once such platforms exist, it's quite
> possible that we could wind up with something like a GT list composed of
> IDs 0, 2, and 3 with no GT 1 (which would be a 2-tile platform with
> media only on the second tile).
>
> To future-proof the code we should stop comparing against the GT count
> to determine whether a GT ID is valid or not. Instead we should do an
> actual lookup of the ID to determine whether the GT exists. This also
> means that our GT loop macro should not end at the GT count, but should
> rather examine the entire space up to (# of tiles) * (max GT per tile)
> to ensure it doesn't stop prematurely.
>
> Signed-off-by: Matt Roper <matthew.d.roper at intel.com>
Maybe it would be good to have a macro to generate the
(# of tiles) * (max GT per tile) value (or just save that value to xe->info),
as well as a mask of all the active GTs. So, in the above example, we'd have
4 max GTs with an active mask of 1011.
It would mean we wouldn't need to call into xe_device_get_gt in
xe_exec_queue_create_ioctl or xe_hw_engine_lookup (just
compare eci[0].gt_id to the active GT mask instead).
This isn't a blocking ask, mind. Just a potential optimization to consider.
LGTM otherwise.
Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
-Jonathan Cavitt
> ---
> drivers/gpu/drm/xe/xe_device.h | 6 +-----
> drivers/gpu/drm/xe/xe_eu_stall.c | 6 ++++--
> drivers/gpu/drm/xe/xe_exec_queue.c | 2 +-
> drivers/gpu/drm/xe/xe_hw_engine.c | 3 ++-
> 4 files changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 4e719d398c88..f0eb8150f185 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -127,12 +127,8 @@ static inline bool xe_device_uc_enabled(struct xe_device *xe)
> for ((id__) = 1; (id__) < (xe__)->info.tile_count; (id__)++) \
> for_each_if((tile__) = &(xe__)->tiles[(id__)])
>
> -/*
> - * FIXME: This only works for now since multi-tile and standalone media
> - * happen to be mutually exclusive. Future platforms may change this...
> - */
> #define for_each_gt(gt__, xe__, id__) \
> - for ((id__) = 0; (id__) < (xe__)->info.gt_count; (id__)++) \
> + for ((id__) = 0; (id__) < (xe__)->info.tile_count * (xe__)->info.max_gt_per_tile; (id__)++) \
> for_each_if((gt__) = xe_device_get_gt((xe__), (id__)))
>
> static inline struct xe_force_wake *gt_to_fw(struct xe_gt *gt)
> diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> index 96732613b4b7..af7916315ac6 100644
> --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> @@ -258,11 +258,13 @@ static int set_prop_eu_stall_wait_num_reports(struct xe_device *xe, u64 value,
> static int set_prop_eu_stall_gt_id(struct xe_device *xe, u64 value,
> struct eu_stall_open_properties *props)
> {
> - if (value >= xe->info.gt_count) {
> + struct xe_gt *gt = xe_device_get_gt(xe, value);
> +
> + if (!gt) {
> drm_dbg(&xe->drm, "Invalid GT ID %llu for EU stall sampling\n", value);
> return -EINVAL;
> }
> - props->gt = xe_device_get_gt(xe, value);
> + props->gt = gt;
> return 0;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index fee22358cc09..8991b4aed440 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -610,7 +610,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
> if (XE_IOCTL_DBG(xe, err))
> return -EFAULT;
>
> - if (XE_IOCTL_DBG(xe, eci[0].gt_id >= xe->info.gt_count))
> + if (XE_IOCTL_DBG(xe, !xe_device_get_gt(xe, eci[0].gt_id)))
> return -EINVAL;
>
> if (args->flags & DRM_XE_EXEC_QUEUE_LOW_LATENCY_HINT)
> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> index 3439c8522d01..796ba8c34a16 100644
> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> @@ -1059,12 +1059,13 @@ struct xe_hw_engine *
> xe_hw_engine_lookup(struct xe_device *xe,
> struct drm_xe_engine_class_instance eci)
> {
> + struct xe_gt *gt = xe_device_get_gt(xe, eci.gt_id);
> unsigned int idx;
>
> if (eci.engine_class >= ARRAY_SIZE(user_to_xe_engine_class))
> return NULL;
>
> - if (eci.gt_id >= xe->info.gt_count)
> + if (!gt)
> return NULL;
>
> idx = array_index_nospec(eci.engine_class,
> --
> 2.49.0
>
>
More information about the Intel-xe
mailing list