[Intel-xe] [RFC 2/5] drm/xe/uapi: Document drm_xe_query_gt.

Rodrigo Vivi rodrigo.vivi at intel.com
Thu Sep 14 14:07:46 UTC 2023


On Thu, Sep 14, 2023 at 09:57:15AM -0400, Souza, Jose wrote:
> On Wed, 2023-09-13 at 17:55 -0400, Rodrigo Vivi wrote:
> > On Wed, Sep 13, 2023 at 02:46:30PM -0700, Matt Roper wrote:
> > > On Wed, Sep 13, 2023 at 05:12:57PM -0400, Rodrigo Vivi wrote:
> > > > On Wed, Sep 13, 2023 at 12:55:06PM -0700, Matt Roper wrote:
> > > > > On Fri, Sep 08, 2023 at 04:32:59PM -0400, Rodrigo Vivi wrote:
> > > > > > Split drm_xe_query_gt out of the gt list one in order to better
> > > > > > document it. No functional change at this point.
> > > > > > 
> > > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > > > ---
> > > > > >  include/uapi/drm/xe_drm.h | 65 ++++++++++++++++++++++++++-------------
> > > > > >  1 file changed, 43 insertions(+), 22 deletions(-)
> > > > > > 
> > > > > > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > > > > > index 51c4ef5dee6d..4bcee709bef9 100644
> > > > > > --- a/include/uapi/drm/xe_drm.h
> > > > > > +++ b/include/uapi/drm/xe_drm.h
> > > > > > @@ -262,6 +262,47 @@ struct drm_xe_query_config {
> > > > > >  	__u64 info[];
> > > > > >  };
> > > > > >  
> > > > > > +/**
> > > > > > + * struct drm_xe_query_gt - describe an individual GT.
> > > > > > + *
> > > > > > + * To be used with drm_xe_query_gts, which will return a list with all the
> > > > > > + * existing GT individual descriptions.
> > > > > > + * Graphics Technology (GT) is a subset of a GPU/tile that is responsible for
> > > > > > + * implementing graphics and/or media operations.
> > > > > > + */
> > > > > > +struct drm_xe_query_gt {
> > > > > > +#define XE_QUERY_GT_TYPE_MAIN		0
> > > > > > +#define XE_QUERY_GT_TYPE_REMOTE		1
> > > > > 
> > > > > I know the goal of this patch is just to add kerneldoc, but while we're
> > > > > looking at this, one other thing we'll need to decide before finalizing
> > > > > the uapi is whether there's any value in keeping these two separate.
> > > > > I.e., does any userspace actually care about the difference between
> > > > > "render GT on first tile" vs "render GT on other tiles?"  Given that GT
> > > > > at the uapi level is pretty much only concerned with power management, I
> > > > > can't really think of anywhere that the distinction would matter (but we
> > > > > should still check with the userspace teams).
> > > > 
> > > > Cc: Jose.
> > > > Mesa is using this, but I believe a XE_QUERY_GT_TYPE_RENDER could be
> > > > enough. And in case we have multiple GTs then they just get the data
> > > > from whatever RENDER if finds first and stop?
> 
> What about PVC? XE_QUERY_GT_TYPE_RENDER may confuse compute folks, maybe a better name to include both render and compute...
> 
> Also could we have a guarantee that the main tile will always be instance = 0 for XE_QUERY_GT_TYPE_RENDER(or whatever name)?
> Just in case we need to do some special handling based on a main tile parameter.

yeap, I was also wondering the same.
Let's keep as main, secondary and media then.

> 
> > > > 
> > > > > 
> > > > > > +#define XE_QUERY_GT_TYPE_MEDIA		2
> > > > > > +	/** @type: GT type: Main, Remote, or Media */
> > > > > > +	__u16 type;
> > > > > > +	/** @instance: Instance of this GT in the GT list */
> > > > > > +	__u16 instance;
> > > > > > +	/** @clock_freq: A clock frequency for timestamp */
> > > > > > +	__u32 clock_freq;
> > > > > 
> > > > > We may want to be more clear about what frequency this is.  I.e., is it
> > > > > the current frequency when the query is issued?  Is it the maximum
> > > > > frequency the GT supports?  Does this always give a meaningful value or
> > > > > does it report 0 when nothing is executing (and we're in RC6)?
> > > > > 
> > > > > BTW, is this redundant with what we're exposing through sysfs?  Does
> > > > > userspace get any benefit from grabbing frequency here rather than
> > > > > through the sysfs interface?
> > > > 
> > > > oh my! When I was adding this documentation here I hated this API
> > > > exactly because I knew it would cause confusion with the GT operating
> > > > frequencies. This has nothing to do with those frequencies. It is a
> > > > timestamp thing.
> > > > 
> > > > Check get_crystal_clock_freq() at xe_gt_clock.c.
> > > > 
> > > > Do you have more suggestions on how to make this documentation
> > > > better and people not confusing this with the GT frequencies?
> > > > 
> > > > maybe remove the freq term and use crystal_clock? :/
> > > 
> > > Ah, okay.  Yeah, I'm not really sure how that winds up getting used, so
> > > probably another place for Jose to weigh in on what kind of description
> > > makes sense.
> > > 
> > > I know there's other uapi work for CPU/GPU timestamp correlation going
> > > on, although I haven't really followed what's happening there.  Does
> > > this relate to that in any way?  And if it does, should this information
> > > be moving over to that query rather than staying in the general GT
> > > query?
> > 
> > Good questions...
> > Cc: Jose, since I forgot to do so on the previous email.
> 
> I believe we should keep it in GTS query as DRM_XE_QUERY_CS_CYCLES uAPI will fail when running in SRIOV modes but Mesa would still query timestamp
> from engine in some cases...

okay, let's keep it there.

> 
> About the name, what about timestamp_frequency?

well, this at least align with mesa variable name... although
the 'frequency' word is what will always cause confusion... :/

> 
> > 
> > > 
> > > 
> > > Matt
> > > 
> > > > 
> > > > > 
> > > > > > +	/** @features: Reserved for future information about GT features */
> > > > > > +	__u64 features;
> > > > > > +	/**
> > > > > > +	 * @native_mem_regions: Bit maks of instances from
> > > > > > +	 * drm_xe_query_mem_usage that lives on the same GPU/Tile and have
> > > > > > +	 * direct access.
> > > > > > +	 */
> > > > > > +	__u64 native_mem_regions;
> > > > > 
> > > > > Was the plan to eventually move these memory region masks to a
> > > > > tile-based uapi?  Leaving them in a GT query seems awkward since the GT
> > > > > isn't directly related to memory regions.  It seems like what userspace
> > > > > cares about would either be tile<->mem_region location or
> > > > > hwe<->mem_region distance.  How is userspace utilizing this information
> > > > > today (and are they needing to jump through extra hoops due to the
> > > > > current placement?)?
> > > > 
> > > > That's a good question. They get this information, but it doesn't
> > > > look they are making a good real usage of this anyway...
> > > > 
> > > > Maybe we should just replace this entirely per a tile_id?
> > > > 
> > > > The vm_bind operation has a tile_mask there. So on the query gt here
> > > > we maybe only inform what tile that belongs to, so the userspace
> > > > can call the vm_bind to the right tile?
> > > > 
> > > > And later if they really need the distance we introduce something
> > > > like that i915 dii for query distance?
> > > > 
> > > > Another thing that confuses me here is that mem_region query
> > > > doesn't have any information about the tile-placement but
> > > > just the vram vs system types...
> > > > 
> > > > Should we add the tile_id there as well?
> > > > 
> > > > > 
> > > > > Also s/maks/mask/ in the kerneldoc you added here (and in the ones
> > > > > below).
> > > > 
> > > > Thanks,
> > > > Rodrigo.
> > > > 
> > > > > 
> > > > > 
> > > > > Matt
> > > > > 
> > > > > > +	/**
> > > > > > +	 * @slow_mem_regions: Bit maks of instances from
> > > > > > +	 * drm_xe_query_mem_usage that this GT can indirectly access, although
> > > > > > +	 * they live on a different GPU/Tile.
> > > > > > +	 */
> > > > > > +	__u64 slow_mem_regions;
> > > > > > +	/**
> > > > > > +	 * @inaccessible_mem_regions: Bit maks of instances from
> > > > > > +	 * drm_xe_query_mem_usage that is not accessible by this GT at all.
> > > > > > +	 */
> > > > > > +	__u64 inaccessible_mem_regions;
> > > > > > +	/** @reserved: Reserved */
> > > > > > +	__u64 reserved[8];
> > > > > > +};
> > > > > > +
> > > > > >  /**
> > > > > >   * struct drm_xe_query_gts - describe GTs
> > > > > >   *
> > > > > > @@ -272,30 +313,10 @@ struct drm_xe_query_config {
> > > > > >  struct drm_xe_query_gts {
> > > > > >  	/** @num_gt: number of GTs returned in gts */
> > > > > >  	__u32 num_gt;
> > > > > > -
> > > > > >  	/** @pad: MBZ */
> > > > > >  	__u32 pad;
> > > > > > -
> > > > > > -	/**
> > > > > > -	 * @gts: The GTs returned for this device
> > > > > > -	 *
> > > > > > -	 * TODO: convert drm_xe_query_gt to proper kernel-doc.
> > > > > > -	 * TODO: Perhaps info about every mem region relative to this GT? e.g.
> > > > > > -	 * bandwidth between this GT and remote region?
> > > > > > -	 */
> > > > > > -	struct drm_xe_query_gt {
> > > > > > -#define XE_QUERY_GT_TYPE_MAIN		0
> > > > > > -#define XE_QUERY_GT_TYPE_REMOTE		1
> > > > > > -#define XE_QUERY_GT_TYPE_MEDIA		2
> > > > > > -		__u16 type;
> > > > > > -		__u16 instance;
> > > > > > -		__u32 clock_freq;
> > > > > > -		__u64 features;
> > > > > > -		__u64 native_mem_regions;	/* bit mask of instances from drm_xe_query_mem_usage */
> > > > > > -		__u64 slow_mem_regions;		/* bit mask of instances from drm_xe_query_mem_usage */
> > > > > > -		__u64 inaccessible_mem_regions;	/* bit mask of instances from drm_xe_query_mem_usage */
> > > > > > -		__u64 reserved[8];
> > > > > > -	} gts[];
> > > > > > +	/** @gts: The GT list returned for this device */
> > > > > > +	struct drm_xe_query_gt gts[];
> > > > > >  };
> > > > > >  
> > > > > >  /**
> > > > > > -- 
> > > > > > 2.41.0
> > > > > > 
> > > > > 
> > > > > -- 
> > > > > Matt Roper
> > > > > Graphics Software Engineer
> > > > > Linux GPU Platform Enablement
> > > > > Intel Corporation
> > > 
> > > -- 
> > > Matt Roper
> > > Graphics Software Engineer
> > > Linux GPU Platform Enablement
> > > Intel Corporation
> 


More information about the Intel-xe mailing list