[PATCH v3 5/6] drm/xe: Don't compare GT ID to GT count when determining valid GTs

Cavitt, Jonathan jonathan.cavitt at intel.com
Mon Jun 30 22:08:38 UTC 2025


-----Original Message-----
From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of Matt Roper
Sent: Monday, June 30, 2025 10:35 AM
To: intel-xe at lists.freedesktop.org
Cc: Roper, Matthew D <matthew.d.roper at intel.com>
Subject: [PATCH v3 5/6] drm/xe: Don't compare GT ID to GT count when determining valid GTs
> 
> On current platforms with multiple GTs, all of the GT IDs are
> consecutive; as a result we know that the GT IDs range from 0 to
> gt_count-1 and can determine if a GT ID is valid by comparing against
> the count.  The consecutive nature of GT IDs may not hold true on future
> platforms if/when we have platforms that are both multi-tile and have
> multiple GTs within each tile.  Once such platforms exist, it's quite
> possible that we could wind up with something like a GT list composed of
> IDs 0, 2, and 3 with no GT 1 (which would be a 2-tile platform with
> media only on the second tile).
> 
> To future-proof the code we should stop comparing against the GT count
> to determine whether a GT ID is valid or not.  Instead we should do an
> actual lookup of the ID to determine whether the GT exists.  This also
> means that our GT loop macro should not end at the GT count, but should
> rather examine the entire space up to (# of tiles) * (max GT per tile)
> to ensure it doesn't stop prematurely.
> 
> Signed-off-by: Matt Roper <matthew.d.roper at intel.com>

Maybe it would be good to have a macro to generate the
(# of tiles) * (max GT per tile) value (or just save that value to xe->info),
as well as a mask of all the active GTs.  So, in the above example, we'd have
4 max GTs with an active mask of 1011.

It would mean we wouldn't need to call into xe_device_get_gt in
xe_exec_queue_create_ioctl or xe_hw_engine_lookup (just
compare eci[0].gt_id to the active GT mask instead).

This isn't a blocking ask, mind.  Just a potential optimization to consider.
LGTM otherwise.
Reviewed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
-Jonathan Cavitt

> ---
>  drivers/gpu/drm/xe/xe_device.h     | 6 +-----
>  drivers/gpu/drm/xe/xe_eu_stall.c   | 6 ++++--
>  drivers/gpu/drm/xe/xe_exec_queue.c | 2 +-
>  drivers/gpu/drm/xe/xe_hw_engine.c  | 3 ++-
>  4 files changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 4e719d398c88..f0eb8150f185 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -127,12 +127,8 @@ static inline bool xe_device_uc_enabled(struct xe_device *xe)
>  	for ((id__) = 1; (id__) < (xe__)->info.tile_count; (id__)++) \
>  		for_each_if((tile__) = &(xe__)->tiles[(id__)])
>  
> -/*
> - * FIXME: This only works for now since multi-tile and standalone media
> - * happen to be mutually exclusive.  Future platforms may change this...
> - */
>  #define for_each_gt(gt__, xe__, id__) \
> -	for ((id__) = 0; (id__) < (xe__)->info.gt_count; (id__)++) \
> +	for ((id__) = 0; (id__) < (xe__)->info.tile_count * (xe__)->info.max_gt_per_tile; (id__)++) \
>  		for_each_if((gt__) = xe_device_get_gt((xe__), (id__)))
>  
>  static inline struct xe_force_wake *gt_to_fw(struct xe_gt *gt)
> diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
> index 96732613b4b7..af7916315ac6 100644
> --- a/drivers/gpu/drm/xe/xe_eu_stall.c
> +++ b/drivers/gpu/drm/xe/xe_eu_stall.c
> @@ -258,11 +258,13 @@ static int set_prop_eu_stall_wait_num_reports(struct xe_device *xe, u64 value,
>  static int set_prop_eu_stall_gt_id(struct xe_device *xe, u64 value,
>  				   struct eu_stall_open_properties *props)
>  {
> -	if (value >= xe->info.gt_count) {
> +	struct xe_gt *gt = xe_device_get_gt(xe, value);
> +
> +	if (!gt) {
>  		drm_dbg(&xe->drm, "Invalid GT ID %llu for EU stall sampling\n", value);
>  		return -EINVAL;
>  	}
> -	props->gt = xe_device_get_gt(xe, value);
> +	props->gt = gt;
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index fee22358cc09..8991b4aed440 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -610,7 +610,7 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>  	if (XE_IOCTL_DBG(xe, err))
>  		return -EFAULT;
>  
> -	if (XE_IOCTL_DBG(xe, eci[0].gt_id >= xe->info.gt_count))
> +	if (XE_IOCTL_DBG(xe, !xe_device_get_gt(xe, eci[0].gt_id)))
>  		return -EINVAL;
>  
>  	if (args->flags & DRM_XE_EXEC_QUEUE_LOW_LATENCY_HINT)
> diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> index 3439c8522d01..796ba8c34a16 100644
> --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> @@ -1059,12 +1059,13 @@ struct xe_hw_engine *
>  xe_hw_engine_lookup(struct xe_device *xe,
>  		    struct drm_xe_engine_class_instance eci)
>  {
> +	struct xe_gt *gt = xe_device_get_gt(xe, eci.gt_id);
>  	unsigned int idx;
>  
>  	if (eci.engine_class >= ARRAY_SIZE(user_to_xe_engine_class))
>  		return NULL;
>  
> -	if (eci.gt_id >= xe->info.gt_count)
> +	if (!gt)
>  		return NULL;
>  
>  	idx = array_index_nospec(eci.engine_class,
> -- 
> 2.49.0
> 
> 


More information about the Intel-xe mailing list