[Intel-xe] [RFC 2/5] drm/xe/uapi: Document drm_xe_query_gt.
Souza, Jose
jose.souza at intel.com
Thu Sep 14 13:57:15 UTC 2023
On Wed, 2023-09-13 at 17:55 -0400, Rodrigo Vivi wrote:
> On Wed, Sep 13, 2023 at 02:46:30PM -0700, Matt Roper wrote:
> > On Wed, Sep 13, 2023 at 05:12:57PM -0400, Rodrigo Vivi wrote:
> > > On Wed, Sep 13, 2023 at 12:55:06PM -0700, Matt Roper wrote:
> > > > On Fri, Sep 08, 2023 at 04:32:59PM -0400, Rodrigo Vivi wrote:
> > > > > Split drm_xe_query_gt out of the gt list one in order to better
> > > > > document it. No functional change at this point.
> > > > >
> > > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > > > ---
> > > > > include/uapi/drm/xe_drm.h | 65 ++++++++++++++++++++++++++-------------
> > > > > 1 file changed, 43 insertions(+), 22 deletions(-)
> > > > >
> > > > > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h
> > > > > index 51c4ef5dee6d..4bcee709bef9 100644
> > > > > --- a/include/uapi/drm/xe_drm.h
> > > > > +++ b/include/uapi/drm/xe_drm.h
> > > > > @@ -262,6 +262,47 @@ struct drm_xe_query_config {
> > > > > __u64 info[];
> > > > > };
> > > > >
> > > > > +/**
> > > > > + * struct drm_xe_query_gt - describe an individual GT.
> > > > > + *
> > > > > + * To be used with drm_xe_query_gts, which will return a list with all the
> > > > > + * existing GT individual descriptions.
> > > > > + * Graphics Technology (GT) is a subset of a GPU/tile that is responsible for
> > > > > + * implementing graphics and/or media operations.
> > > > > + */
> > > > > +struct drm_xe_query_gt {
> > > > > +#define XE_QUERY_GT_TYPE_MAIN 0
> > > > > +#define XE_QUERY_GT_TYPE_REMOTE 1
> > > >
> > > > I know the goal of this patch is just to add kerneldoc, but while we're
> > > > looking at this, one other thing we'll need to decide before finalizing
> > > > the uapi is whether there's any value in keeping these two separate.
> > > > I.e., does any userspace actually care about the difference between
> > > > "render GT on first tile" vs "render GT on other tiles?" Given that GT
> > > > at the uapi level is pretty much only concerned with power management, I
> > > > can't really think of anywhere that the distinction would matter (but we
> > > > should still check with the userspace teams).
> > >
> > > Cc: Jose.
> > > Mesa is using this, but I believe a XE_QUERY_GT_TYPE_RENDER could be
> > > enough. And in case we have multiple GTs then they just get the data
> > > from whatever RENDER if finds first and stop?
What about PVC? XE_QUERY_GT_TYPE_RENDER may confuse compute folks, maybe a better name to include both render and compute...
Also could we have a guarantee that the main tile will always be instance = 0 for XE_QUERY_GT_TYPE_RENDER(or whatever name)?
Just in case we need to do some special handling based on a main tile parameter.
> > >
> > > >
> > > > > +#define XE_QUERY_GT_TYPE_MEDIA 2
> > > > > + /** @type: GT type: Main, Remote, or Media */
> > > > > + __u16 type;
> > > > > + /** @instance: Instance of this GT in the GT list */
> > > > > + __u16 instance;
> > > > > + /** @clock_freq: A clock frequency for timestamp */
> > > > > + __u32 clock_freq;
> > > >
> > > > We may want to be more clear about what frequency this is. I.e., is it
> > > > the current frequency when the query is issued? Is it the maximum
> > > > frequency the GT supports? Does this always give a meaningful value or
> > > > does it report 0 when nothing is executing (and we're in RC6)?
> > > >
> > > > BTW, is this redundant with what we're exposing through sysfs? Does
> > > > userspace get any benefit from grabbing frequency here rather than
> > > > through the sysfs interface?
> > >
> > > oh my! When I was adding this documentation here I hated this API
> > > exactly because I knew it would cause confusion with the GT operating
> > > frequencies. This has nothing to do with those frequencies. It is a
> > > timestamp thing.
> > >
> > > Check get_crystal_clock_freq() at xe_gt_clock.c.
> > >
> > > Do you have more suggestions on how to make this documentation
> > > better and people not confusing this with the GT frequencies?
> > >
> > > maybe remove the freq term and use crystal_clock? :/
> >
> > Ah, okay. Yeah, I'm not really sure how that winds up getting used, so
> > probably another place for Jose to weigh in on what kind of description
> > makes sense.
> >
> > I know there's other uapi work for CPU/GPU timestamp correlation going
> > on, although I haven't really followed what's happening there. Does
> > this relate to that in any way? And if it does, should this information
> > be moving over to that query rather than staying in the general GT
> > query?
>
> Good questions...
> Cc: Jose, since I forgot to do so on the previous email.
I believe we should keep it in GTS query as DRM_XE_QUERY_CS_CYCLES uAPI will fail when running in SRIOV modes but Mesa would still query timestamp
from engine in some cases...
About the name, what about timestamp_frequency?
>
> >
> >
> > Matt
> >
> > >
> > > >
> > > > > + /** @features: Reserved for future information about GT features */
> > > > > + __u64 features;
> > > > > + /**
> > > > > + * @native_mem_regions: Bit maks of instances from
> > > > > + * drm_xe_query_mem_usage that lives on the same GPU/Tile and have
> > > > > + * direct access.
> > > > > + */
> > > > > + __u64 native_mem_regions;
> > > >
> > > > Was the plan to eventually move these memory region masks to a
> > > > tile-based uapi? Leaving them in a GT query seems awkward since the GT
> > > > isn't directly related to memory regions. It seems like what userspace
> > > > cares about would either be tile<->mem_region location or
> > > > hwe<->mem_region distance. How is userspace utilizing this information
> > > > today (and are they needing to jump through extra hoops due to the
> > > > current placement?)?
> > >
> > > That's a good question. They get this information, but it doesn't
> > > look they are making a good real usage of this anyway...
> > >
> > > Maybe we should just replace this entirely per a tile_id?
> > >
> > > The vm_bind operation has a tile_mask there. So on the query gt here
> > > we maybe only inform what tile that belongs to, so the userspace
> > > can call the vm_bind to the right tile?
> > >
> > > And later if they really need the distance we introduce something
> > > like that i915 dii for query distance?
> > >
> > > Another thing that confuses me here is that mem_region query
> > > doesn't have any information about the tile-placement but
> > > just the vram vs system types...
> > >
> > > Should we add the tile_id there as well?
> > >
> > > >
> > > > Also s/maks/mask/ in the kerneldoc you added here (and in the ones
> > > > below).
> > >
> > > Thanks,
> > > Rodrigo.
> > >
> > > >
> > > >
> > > > Matt
> > > >
> > > > > + /**
> > > > > + * @slow_mem_regions: Bit maks of instances from
> > > > > + * drm_xe_query_mem_usage that this GT can indirectly access, although
> > > > > + * they live on a different GPU/Tile.
> > > > > + */
> > > > > + __u64 slow_mem_regions;
> > > > > + /**
> > > > > + * @inaccessible_mem_regions: Bit maks of instances from
> > > > > + * drm_xe_query_mem_usage that is not accessible by this GT at all.
> > > > > + */
> > > > > + __u64 inaccessible_mem_regions;
> > > > > + /** @reserved: Reserved */
> > > > > + __u64 reserved[8];
> > > > > +};
> > > > > +
> > > > > /**
> > > > > * struct drm_xe_query_gts - describe GTs
> > > > > *
> > > > > @@ -272,30 +313,10 @@ struct drm_xe_query_config {
> > > > > struct drm_xe_query_gts {
> > > > > /** @num_gt: number of GTs returned in gts */
> > > > > __u32 num_gt;
> > > > > -
> > > > > /** @pad: MBZ */
> > > > > __u32 pad;
> > > > > -
> > > > > - /**
> > > > > - * @gts: The GTs returned for this device
> > > > > - *
> > > > > - * TODO: convert drm_xe_query_gt to proper kernel-doc.
> > > > > - * TODO: Perhaps info about every mem region relative to this GT? e.g.
> > > > > - * bandwidth between this GT and remote region?
> > > > > - */
> > > > > - struct drm_xe_query_gt {
> > > > > -#define XE_QUERY_GT_TYPE_MAIN 0
> > > > > -#define XE_QUERY_GT_TYPE_REMOTE 1
> > > > > -#define XE_QUERY_GT_TYPE_MEDIA 2
> > > > > - __u16 type;
> > > > > - __u16 instance;
> > > > > - __u32 clock_freq;
> > > > > - __u64 features;
> > > > > - __u64 native_mem_regions; /* bit mask of instances from drm_xe_query_mem_usage */
> > > > > - __u64 slow_mem_regions; /* bit mask of instances from drm_xe_query_mem_usage */
> > > > > - __u64 inaccessible_mem_regions; /* bit mask of instances from drm_xe_query_mem_usage */
> > > > > - __u64 reserved[8];
> > > > > - } gts[];
> > > > > + /** @gts: The GT list returned for this device */
> > > > > + struct drm_xe_query_gt gts[];
> > > > > };
> > > > >
> > > > > /**
> > > > > --
> > > > > 2.41.0
> > > > >
> > > >
> > > > --
> > > > Matt Roper
> > > > Graphics Software Engineer
> > > > Linux GPU Platform Enablement
> > > > Intel Corporation
> >
> > --
> > Matt Roper
> > Graphics Software Engineer
> > Linux GPU Platform Enablement
> > Intel Corporation
More information about the Intel-xe
mailing list