[Intel-xe] [PATCH DONTMERGE] drm/xe: uapi review submission

Dixit, Ashutosh ashutosh.dixit at intel.com
Wed Jul 5 22:33:20 UTC 2023


On Wed, 05 Jul 2023 14:19:23 -0700, Matt Roper wrote:
>
> On Fri, Jun 30, 2023 at 04:40:54PM -0700, Dixit, Ashutosh wrote:
> > On Fri, 30 Jun 2023 03:00:59 -0700, Thomas Hellström wrote:
> > >
> >
> > I have a question about the toplogy query below. I am not hugely familiar
> > with why/how this particular struct was chosen nor the history here, but
> > anyway.
> >
> > > +/**
> > > + * struct drm_xe_query_topology_mask - describe the topology mask of a GT
> > > + *
> > > + * This is the hardware topology which reflects the internal physical
> > > + * structure of the GPU.
> > > + *
> > > + * If a query is made with a struct drm_xe_device_query where .query
> > > + * is equal to DRM_XE_DEVICE_QUERY_GT_TOPOLOGY, then the reply uses
> > > + * struct drm_xe_query_topology_mask in .data.
> > > + */
> > > +struct drm_xe_query_topology_mask {
> > > +	/** @gt_id: GT ID the mask is associated with */
> > > +	__u16 gt_id;
> > > +
> > > +	/*
> > > +	 * To query the mask of Dual Sub Slices (DSS) available for geometry
> > > +	 * operations. For example a query response containing the following
> > > +	 * in mask:
> > > +	 *   DSS_GEOMETRY    ff ff ff ff 00 00 00 00
> > > +	 * means 32 DSS are available for geometry.
> > > +	 */
> > > +#define XE_TOPO_DSS_GEOMETRY	(1 << 0)
> > > +	/*
> > > +	 * To query the mask of Dual Sub Slices (DSS) available for compute
> > > +	 * operations. For example a query response containing the following
> > > +	 * in mask:
> > > +	 *   DSS_COMPUTE    ff ff ff ff 00 00 00 00
> > > +	 * means 32 DSS are available for compute.
> > > +	 */
> > > +#define XE_TOPO_DSS_COMPUTE	(1 << 1)
> > > +	/*
> > > +	 * To query the mask of Execution Units (EU) available per Dual Sub
> > > +	 * Slices (DSS). For example a query response containing the following
> > > +	 * in mask:
> > > +	 *   EU_PER_DSS    ff ff 00 00 00 00 00 00
> > > +	 * means each DSS has 16 EU.
> > > +	 */
> > > +#define XE_TOPO_EU_PER_DSS	(1 << 2)
> > > +	/** @type: type of mask */
> > > +	__u16 type;
> > > +
> > > +	/** @num_bytes: number of bytes in requested mask */
> > > +	__u32 num_bytes;
> > > +
> > > +	/** @mask: little-endian mask of @num_bytes */
> > > +	__u8 mask[];
> > > +};
> >
> > So typically to consume the above struct, userspace needs additional
> > information, specifically 'max_subslices' and 'max_eus_per_subslice' which
> > was included in i915 'struct drm_i915_query_topology_info'.
> >
> > For example to consume 'struct drm_xe_query_topology_mask' I had recently
> > to write the following code in IGT because this information was not
> > available through 'struct drm_xe_query_topology_mask':
> >
> >	/* Fixed fields, see fill_topology_info() and intel_sseu_set_info() in i915 */
> >	i915_topinfo.max_slices = 1;			/* always 1 */
> >	if (IS_PONTEVECCHIO(xe_dev_id(drm_fd))) {
> >		i915_topinfo.max_subslices = 64;
> >		i915_topinfo.max_eus_per_subslice = 8;
> >	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 50)) {
> >		i915_topinfo.max_subslices = 32;
> >		i915_topinfo.max_eus_per_subslice = 16;
> >	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 0)) {
> >		i915_topinfo.max_subslices = 6;
> >		i915_topinfo.max_eus_per_subslice = 16;
> >	} else {
> >		igt_assert(0);
> >	}
> >
> > So, if we are going to expose 'struct drm_xe_query_topology_mask' as it is
> > above through xe uapi, are we assuming that userspace has out of band
> > knowledge of 'max_subslices' and 'max_eus_per_subslice'? Or should this
> > information be (somehow) added to the above struct?
> >
> > Another option (which sort of works but only approximately) would be to set
> > 'num_bytes' field in 'struct drm_xe_query_topology_mask' to actual number
> > of bytes in the mask. At present it is set unconditionally to 8 in
> > query_gt_topology(), irrespective of the actual num bytes in the masks
> > which is basically (again in i915 parlance):
>
> I'm not sure what you're saying here.  num_bytes is set to the actual
> size of the mask data returned; it's not hardcoded as a constant value,
> although in practice it won't change very often since it's not tied to
> the specific platform you're running on.
>
> The size of the mask may be larger than the true set of possible DSS/EUs
> on a given platform, but that's expected.  For example, if some platform
> can only ever have a maximum of 4 DSS, but we tell userspace they're
> getting 8 bytes (64 bits), that's fine since the upper bits will always
> be 0 and can just be ignored.
>
> Remember --- this API is not intended to be give information about
> platform theoretical maximums (e.g., "Foobar Lake will never have more
> than 23 DSS") since if userspace drivers actually need that information
> it must come from the GuC's hwconfig, not from the driver code.  This
> interface is exclusively about conveying the fusing information of which
> units are physically present on your specific chip.

Ah, thanks Matt, what I missed was the GuC hwconfig part. That should solve
my problem.

Thanks.
--
Ashutosh

> >	i915_topinfo.subslice_stride = DIV_ROUND_UP(i915_topinfo.max_subslices, 8);
> >	i915_topinfo.eu_stride = DIV_ROUND_UP(i915_topinfo.max_eus_per_subslice, 8);
> >
> > If we go the 'num_bytes' route, that would be just an implementation change
> > not a uapi change.
> >
> > Thanks.
> > --
> > Ashutosh
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation


More information about the Intel-xe mailing list