[Intel-xe] [PATCH v2 34/50] drm/xe/uapi: Move memory_region masks from GT to engine

Rodrigo Vivi rodrigo.vivi at intel.com
Thu Nov 16 03:31:41 UTC 2023


On Thu, Nov 09, 2023 at 04:04:33PM -0500, Rodrigo Vivi wrote:
> On Thu, Nov 09, 2023 at 02:50:04PM -0500, Souza, Jose wrote:
> > On Thu, 2023-11-09 at 13:46 -0500, Rodrigo Vivi wrote:
> > > On Thu, Nov 09, 2023 at 04:35:19PM +0000, Souza, Jose wrote:
> > > > On Thu, 2023-11-09 at 08:29 -0800, José Roberto de Souza wrote:
> > > > > On Fri, 2023-11-03 at 14:34 +0000, Francois Dugast wrote:
> > > > > > From: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > > > 
> > > > > > In the Tiled platforms, the memory is more tied to the Tile
> > > > > > than to the GT.
> > > > > > The distance (near vs far) makes more sense from the Engine
> > > > > > perspective than from the GT perspective.
> > > > > 
> > > > > why not add a uAPI to query tile information?
> > > > > this is duplicating tile information onto every engine of that tile.
> > > > > we could leave reserved fields in the tile uAPI to include additional information that might be relevant in the future.
> > > 
> > > This is not necessarily tile information. In PVC the mem_region truly is tied to the tile,
> > > but we don't want to fix the uapi around only one platform.
> > > Like in the previous scheme, where the mem_region was per GT. Who knows the future?!
> > 
> > Older platforms had one GT and one tile, MTL has 1 tile and 2 GTs, and in both cases it matches having a tile query.
> 
> But we can have a platform with multiple tiles and no memory attached directly to any tile.
> By adding the API there you are limiting your future, or sentencing yourself to a
> version 2 of the uapi.
> 
> > 
> > > 
> > > But the engine needs the information on which mem_region would be better,
> > > regardless of whether it lives in the same GT, the same tile, or outside.
> > > So, near and far sounded like the most generic and future-proof way.
> > 
> > Memory regions are also used here in drm_xe_vm_bind_op.tile_mask/pt_placement_hint.
> > To me it looks odd that I need to go through all engines to know which tiles are available.
> 
> That won't be the case: we are going to change that to the sched_group_mask.
> 
> > 
> > Another way to make it future-proof is to keep this information in the GT query: each engine will always belong to one GT and each GT will always
> > belong to one tile.
> > This way it does not matter whether the memory_region is tile-specific or GT-specific information; the uAPI will be consistent with less
> > duplication than putting it in the hw engine.
> 
> Okay, with this I can agree. It might not necessarily be the most future-proof
> option, but since all engines live in the GT and the 'distance' to any memory
> would likely be the same for every engine, we could do it this way.
> 
> Oh, I had thought about that. That was another reason why I kept the 2 patches separated ;)
> So we can drop the second and keep the renaming.
> It would be okay by me.

Jose, I was trying to drop this, but then it cascades into the next patches in
a very bad way.

I was planning to have engine info as the informational part of the engine:
tile, gt, memory "distance", and to convert the eci's gt_id to a generic number
called sched_group.

In the current platforms and schema, sched_group == gt_id, but I didn't want
to make that a hard tie.

A sched group could be a group of engines that can be used in parallel execution
or virtual balancing, and it is a number 0..n regardless of the tile where it lives.

gt_id could be per tile and not global, for instance. Or we could even have
sched_groups that go across different GTs.
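
Just to make the idea concrete, a very rough sketch of how that could look in
the uapi header (purely illustrative; names, types and layout are not final):

struct drm_xe_engine_class_instance {
	__u16 engine_class;
	__u16 engine_instance;
	/* was gt_id: a generic 0..n scheduling group, not tied to tile/GT */
	__u16 sched_group;
	__u16 pad;
};

struct drm_xe_query_engine_info {
	/** @instance: The @drm_xe_engine_class_instance */
	struct drm_xe_engine_class_instance instance;

	/* informational only, not used to address the engine */
	__u16 tile_id;
	__u16 gt_id;
	__u32 pad;

	/* bit masks of instances from drm_xe_query_mem_regions */
	__u64 near_mem_regions;
	__u64 far_mem_regions;

	/** @reserved: Reserved */
	__u64 reserved[2];
};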

So, I kept the patch that adds tile_id and gt_id to the info and
converts eci's gt_id to sched_group. However, the IGT patch is becoming an
ugly monster, with some places using engine_info, other places using the eci,
and even mixed cases.

If we drop this patch here, I believe it is good to also drop the sched_group
one, but then engine_info stops making sense at all, and
I would also drop it and simply add all the information inside the eci.

What are your thoughts on this, since you were the first to envision the
engine_info anyway?
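
As an aside, this is roughly how I would expect userspace to consume the
per-engine masks from the patch below. Just a sketch: everything except
near_mem_regions/far_mem_regions is a made-up helper, and the field names
follow this series, so they may still change.

/*
 * Hypothetical helper: a memory region instance is "near" an engine if its
 * bit is set in that engine's near_mem_regions mask. By construction in the
 * patch, near_mem_regions | far_mem_regions covers all region instances and
 * the two masks never overlap.
 */
static bool engine_is_near_region(const struct drm_xe_query_engine_info *info,
				  unsigned int region_instance)
{
	return info->near_mem_regions & (1ull << region_instance);
}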

Thanks,
Rodrigo.

> 
> > 
> > > 
> > > > 
> > > > other issue here and in other patches of this huge patch series.
> > > > 
> > > > a previous patch in this series renamed near_mem_regions, then this one moves it to another struct... please drop the first patch and do the rename and move
> > > > in a single patch.
> > > 
> > > okay, rename and move in the same patch kind of makes sense, and the patches
> > > could be squashed together. But when doing the IGT on the side, I felt
> > > that small changes were better, so the renames go with sed commands and
> > > the patch was small and clear. But I don't mind if they get squashed in
> > > the end.
> > > 
> > > > 
> > > > a series as big as this one will cause reviews in KMD and UMD to take a while...
> > > 
> > > that's unfortunate indeed. But big patches also don't help much to speed up reviews.
> > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > So, let's move this out from the GT and into the engine info.
> > > > > > 
> > > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/xe/xe_query.c | 14 +++++++-------
> > > > > >  include/uapi/drm/xe_drm.h     | 27 ++++++++++++++-------------
> > > > > >  2 files changed, 21 insertions(+), 20 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c
> > > > > > index aa5743e2e4d0..49a9b36f1193 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_query.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_query.c
> > > > > > @@ -217,6 +217,13 @@ static int query_engines(struct xe_device *xe,
> > > > > >  				hwe->logical_instance;
> > > > > >  			hw_engine_info[i].instance.gt_id = gt->info.id;
> > > > > >  			hw_engine_info[i].instance.pad = 0;
> > > > > > +			if (!IS_DGFX(xe))
> > > > > > +				hw_engine_info[i].near_mem_regions = 0x1;
> > > > > > +			else
> > > > > > +				hw_engine_info[i].near_mem_regions =
> > > > > > +					BIT(gt_to_tile(gt)->id) << 1;
> > > > > > +			hw_engine_info[i].far_mem_regions = xe->info.mem_region_mask ^
> > > > > > +				hw_engine_info[i].near_mem_regions;
> > > > > >  			memset(hw_engine_info->reserved, 0, sizeof(hw_engine_info->reserved));
> > > > > >  
> > > > > >  			i++;
> > > > > > @@ -377,13 +384,6 @@ static int query_gt_list(struct xe_device *xe, struct drm_xe_device_query *query
> > > > > >  			gt_list->gt_list[id].type = DRM_XE_QUERY_GT_TYPE_MAIN;
> > > > > >  		gt_list->gt_list[id].gt_id = gt->info.id;
> > > > > >  		gt_list->gt_list[id].clock_freq = gt->info.clock_freq;
> > > > > > -		if (!IS_DGFX(xe))
> > > > > > -			gt_list->gt_list[id].near_mem_regions = 0x1;
> > > > > > -		else
> > > > > > -			gt_list->gt_list[id].near_mem_regions =
> > > > > > -				BIT(gt_to_tile(gt)->id) << 1;
> > > > > > -		gt_list->gt_list[id].far_mem_regions = xe->info.mem_region_mask ^
> > > > > > -			gt_list->gt_list[id].near_mem_regions;
> > > > > >  	}
> > > > > >  
> > > > > >  	if (copy_to_user(query_ptr, gt_list, size)) {
> > > > > > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > > > > > index 5164ed150a2e..8e84ef6fd46e 100644
> > > > > > --- a/include/uapi/drm/xe_drm.h
> > > > > > +++ b/include/uapi/drm/xe_drm.h
> > > > > > @@ -228,6 +228,20 @@ struct drm_xe_query_engine_info {
> > > > > >  	/** @instance: The @drm_xe_engine_class_instance */
> > > > > >  	struct drm_xe_engine_class_instance instance;
> > > > > >  
> > > > > > +	/**
> > > > > > +	 * @near_mem_regions: Bit mask of instances from
> > > > > > +	 * drm_xe_query_mem_regions that is near this engine.
> > > > > > +	 */
> > > > > > +	__u64 near_mem_regions;
> > > > > > +	/**
> > > > > > +	 * @far_mem_regions: Bit mask of instances from
> > > > > > +	 * drm_xe_query_mem_regions that is far from this engine.
> > > > > > +	 * In general, it has extra indirections when compared to the
> > > > > > +	 * @near_mem_regions. For a discrete device this could mean system
> > > > > > +	 * memory and memory living in a different Tile.
> > > > > > +	 */
> > > > > > +	__u64 far_mem_regions;
> > > > > > +
> > > > > >  	/** @reserved: Reserved */
> > > > > >  	__u64 reserved[3];
> > > > > >  };
> > > > > > @@ -401,19 +415,6 @@ struct drm_xe_query_gt {
> > > > > >  	__u16 gt_id;
> > > > > >  	/** @clock_freq: A clock frequency for timestamp */
> > > > > >  	__u32 clock_freq;
> > > > > > -	/**
> > > > > > -	 * @near_mem_regions: Bit mask of instances from
> > > > > > -	 * drm_xe_query_mem_regions that is near the current engines of this GT.
> > > > > > -	 */
> > > > > > -	__u64 near_mem_regions;
> > > > > > -	/**
> > > > > > -	 * @far_mem_regions: Bit mask of instances from
> > > > > > -	 * drm_xe_query_mem_regions that is far from the engines of this GT.
> > > > > > -	 * In general, it has extra indirections when compared to the
> > > > > > -	 * @near_mem_regions. For a discrete device this could mean system
> > > > > > -	 * memory and memory living in a different Tile.
> > > > > > -	 */
> > > > > > -	__u64 far_mem_regions;
> > > > > >  	/** @reserved: Reserved */
> > > > > >  	__u64 reserved[8];
> > > > > >  };
> > > > > 
> > > > 
> > 

