[Intel-xe] [PATCH DONTMERGE] drm/xe: uapi review submission

Matt Roper matthew.d.roper at intel.com
Wed Jul 5 21:19:23 UTC 2023


On Fri, Jun 30, 2023 at 04:40:54PM -0700, Dixit, Ashutosh wrote:
> On Fri, 30 Jun 2023 03:00:59 -0700, Thomas Hellström wrote:
> >
> 
> I have a question about the toplogy query below. I am not hugely familiar
> with why/how this particular struct was chosen nor the history here, but
> anyway.
> 
> > +/**
> > + * struct drm_xe_query_topology_mask - describe the topology mask of a GT
> > + *
> > + * This is the hardware topology which reflects the internal physical
> > + * structure of the GPU.
> > + *
> > + * If a query is made with a struct drm_xe_device_query where .query
> > + * is equal to DRM_XE_DEVICE_QUERY_GT_TOPOLOGY, then the reply uses
> > + * struct drm_xe_query_topology_mask in .data.
> > + */
> > +struct drm_xe_query_topology_mask {
> > +	/** @gt_id: GT ID the mask is associated with */
> > +	__u16 gt_id;
> > +
> > +	/*
> > +	 * To query the mask of Dual Sub Slices (DSS) available for geometry
> > +	 * operations. For example a query response containing the following
> > +	 * in mask:
> > +	 *   DSS_GEOMETRY    ff ff ff ff 00 00 00 00
> > +	 * means 32 DSS are available for geometry.
> > +	 */
> > +#define XE_TOPO_DSS_GEOMETRY	(1 << 0)
> > +	/*
> > +	 * To query the mask of Dual Sub Slices (DSS) available for compute
> > +	 * operations. For example a query response containing the following
> > +	 * in mask:
> > +	 *   DSS_COMPUTE    ff ff ff ff 00 00 00 00
> > +	 * means 32 DSS are available for compute.
> > +	 */
> > +#define XE_TOPO_DSS_COMPUTE	(1 << 1)
> > +	/*
> > +	 * To query the mask of Execution Units (EU) available per Dual Sub
> > +	 * Slices (DSS). For example a query response containing the following
> > +	 * in mask:
> > +	 *   EU_PER_DSS    ff ff 00 00 00 00 00 00
> > +	 * means each DSS has 16 EU.
> > +	 */
> > +#define XE_TOPO_EU_PER_DSS	(1 << 2)
> > +	/** @type: type of mask */
> > +	__u16 type;
> > +
> > +	/** @num_bytes: number of bytes in requested mask */
> > +	__u32 num_bytes;
> > +
> > +	/** @mask: little-endian mask of @num_bytes */
> > +	__u8 mask[];
> > +};
> 
> So typically to consume the above struct, userspace needs additional
> information, specifically 'max_subslices' and 'max_eus_per_subslice' which
> was included in i915 'struct drm_i915_query_topology_info'.
> 
> For example to consume 'struct drm_xe_query_topology_mask' I had recently
> to write the following code in IGT because this information was not
> available through 'struct drm_xe_query_topology_mask':
> 
> 	/* Fixed fields, see fill_topology_info() and intel_sseu_set_info() in i915 */
> 	i915_topinfo.max_slices = 1;			/* always 1 */
> 	if (IS_PONTEVECCHIO(xe_dev_id(drm_fd))) {
> 		i915_topinfo.max_subslices = 64;
> 		i915_topinfo.max_eus_per_subslice = 8;
> 	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 50)) {
> 		i915_topinfo.max_subslices = 32;
> 		i915_topinfo.max_eus_per_subslice = 16;
> 	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 0)) {
> 		i915_topinfo.max_subslices = 6;
> 		i915_topinfo.max_eus_per_subslice = 16;
> 	} else {
> 		igt_assert(0);
> 	}
> 
> So, if we are going to expose 'struct drm_xe_query_topology_mask' as it is
> above through xe uapi, are we assuming that userspace has out of band
> knowledge of 'max_subslices' and 'max_eus_per_subslice'? Or should this
> information be (somehow) added to the above struct?
> 
> Another option (which sort of works but only approximately) would be to set
> 'num_bytes' field in 'struct drm_xe_query_topology_mask' to actual number
> of bytes in the mask. At present it is set unconditionally to 8 in
> query_gt_topology(), irrespective of the actual num bytes in the masks
> which is basically (again in i915 parlance):

I'm not sure what you're saying here.  num_bytes is set to the actual
size of the mask data returned; it's not hardcoded as a constant value,
although in practice it won't change very often since it's not tied to
the specific platform you're running on.

The size of the mask may be larger than the true set of possible DSS/EUs
on a given platform, but that's expected.  For example, if some platform
can only ever have a maximum of 4 DSS, but we tell userspace they're
getting 8 bytes (64 bits), that's fine since the upper bits will always
be 0 and can just be ignored.

Remember --- this API is not intended to be give information about
platform theoretical maximums (e.g., "Foobar Lake will never have more
than 23 DSS") since if userspace drivers actually need that information
it must come from the GuC's hwconfig, not from the driver code.  This
interface is exclusively about conveying the fusing information of which
units are physically present on your specific chip.


Matt

> 
> 	i915_topinfo.subslice_stride = DIV_ROUND_UP(i915_topinfo.max_subslices, 8);
> 	i915_topinfo.eu_stride = DIV_ROUND_UP(i915_topinfo.max_eus_per_subslice, 8);
> 
> If we go the 'num_bytes' route, that would be just an implementation change
> not a uapi change.
> 
> Thanks.
> --
> Ashutosh

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


More information about the Intel-xe mailing list