[Intel-xe] [PATCH DONTMERGE] drm/xe: uapi review submission
Dixit, Ashutosh
ashutosh.dixit at intel.com
Fri Jun 30 23:40:54 UTC 2023
On Fri, 30 Jun 2023 03:00:59 -0700, Thomas Hellström wrote:
>
I have a question about the topology query below. I am not very familiar
with why or how this particular struct was chosen, or with the history
here, but anyway.
> +/**
> + * struct drm_xe_query_topology_mask - describe the topology mask of a GT
> + *
> + * This is the hardware topology which reflects the internal physical
> + * structure of the GPU.
> + *
> + * If a query is made with a struct drm_xe_device_query where .query
> + * is equal to DRM_XE_DEVICE_QUERY_GT_TOPOLOGY, then the reply uses
> + * struct drm_xe_query_topology_mask in .data.
> + */
> +struct drm_xe_query_topology_mask {
> + /** @gt_id: GT ID the mask is associated with */
> + __u16 gt_id;
> +
> + /*
> + * To query the mask of Dual Sub Slices (DSS) available for geometry
> + * operations. For example a query response containing the following
> + * in mask:
> + * DSS_GEOMETRY ff ff ff ff 00 00 00 00
> + * means 32 DSS are available for geometry.
> + */
> +#define XE_TOPO_DSS_GEOMETRY (1 << 0)
> + /*
> + * To query the mask of Dual Sub Slices (DSS) available for compute
> + * operations. For example a query response containing the following
> + * in mask:
> + * DSS_COMPUTE ff ff ff ff 00 00 00 00
> + * means 32 DSS are available for compute.
> + */
> +#define XE_TOPO_DSS_COMPUTE (1 << 1)
> + /*
> + * To query the mask of Execution Units (EU) available per Dual Sub
> + * Slices (DSS). For example a query response containing the following
> + * in mask:
> + * EU_PER_DSS ff ff 00 00 00 00 00 00
> + * means each DSS has 16 EU.
> + */
> +#define XE_TOPO_EU_PER_DSS (1 << 2)
> + /** @type: type of mask */
> + __u16 type;
> +
> + /** @num_bytes: number of bytes in requested mask */
> + __u32 num_bytes;
> +
> + /** @mask: little-endian mask of @num_bytes */
> + __u8 mask[];
> +};
So typically, to consume the above struct, userspace needs additional
information, specifically 'max_subslices' and 'max_eus_per_subslice', which
were included in i915 'struct drm_i915_query_topology_info'.
For example, to consume 'struct drm_xe_query_topology_mask' I recently had
to write the following code in IGT, because this information was not
available through 'struct drm_xe_query_topology_mask':
	/* Fixed fields, see fill_topology_info() and intel_sseu_set_info() in i915 */
	i915_topinfo.max_slices = 1;	/* always 1 */
	if (IS_PONTEVECCHIO(xe_dev_id(drm_fd))) {
		i915_topinfo.max_subslices = 64;
		i915_topinfo.max_eus_per_subslice = 8;
	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 50)) {
		i915_topinfo.max_subslices = 32;
		i915_topinfo.max_eus_per_subslice = 16;
	} else if (intel_graphics_ver(xe_dev_id(drm_fd)) >= IP_VER(12, 0)) {
		i915_topinfo.max_subslices = 6;
		i915_topinfo.max_eus_per_subslice = 16;
	} else {
		igt_assert(0);
	}
So, if we are going to expose 'struct drm_xe_query_topology_mask' as it is
above through xe uapi, are we assuming that userspace has out of band
knowledge of 'max_subslices' and 'max_eus_per_subslice'? Or should this
information be (somehow) added to the above struct?
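
To make the gap concrete, here is a minimal sketch of what userspace can
derive from the mask alone. topo_mask_weight() is a hypothetical helper
written for this illustration, not existing IGT or xe code. It yields the
number of units that are set in the mask, but as far as I can tell it cannot
recover the maximums, because with a fixed 8-byte mask a zero bit does not
say whether the unit is disabled or simply beyond the maximum:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the bits set in a little-endian byte mask,
 * such as the 'mask' array of 'struct drm_xe_query_topology_mask'.
 */
static unsigned int topo_mask_weight(const uint8_t *mask, size_t num_bytes)
{
	unsigned int weight = 0;

	assert(mask != NULL);

	for (size_t i = 0; i < num_bytes; i++) {
		/* Clear the lowest set bit until the byte is empty. */
		for (uint8_t b = mask[i]; b; b &= (uint8_t)(b - 1))
			weight++;
	}

	return weight;
}
```

With the DSS_GEOMETRY example from the patch comment (ff ff ff ff 00 00 00
00) this returns 32, and with the EU_PER_DSS example (ff ff 00 00 00 00 00
00) it returns 16.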
Another option (which sort of works, but only approximately) would be to
set the 'num_bytes' field in 'struct drm_xe_query_topology_mask' to the
actual number of bytes in the mask. At present it is set unconditionally to
8 in query_gt_topology(), irrespective of the actual number of bytes in the
masks, which is basically (again in i915 parlance):
i915_topinfo.subslice_stride = DIV_ROUND_UP(i915_topinfo.max_subslices, 8);
i915_topinfo.eu_stride = DIV_ROUND_UP(i915_topinfo.max_eus_per_subslice, 8);
If we go the 'num_bytes' route, that would be just an implementation
change, not a uapi change.
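
As a rough sketch of why the 'num_bytes' route is only approximate: the
stride rounds up to whole bytes, so userspace could recover only an upper
bound on the maximum, in multiples of 8. Both helpers below are
hypothetical names used for illustration, and DIV_ROUND_UP is defined
locally so the snippet stands alone:

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical: bytes needed for a mask that covers max_bits bits. */
static uint32_t topo_mask_stride(uint32_t max_bits)
{
	assert(max_bits > 0);
	return DIV_ROUND_UP(max_bits, 8);
}

/* Hypothetical: the upper bound userspace could infer from num_bytes. */
static uint32_t topo_mask_max_bits(uint32_t num_bytes)
{
	return num_bytes * 8;
}
```

For max_subslices of 64 and 32 the round trip is exact (8 and 4 bytes), but
for max_subslices = 6 the stride is 1 byte and userspace would infer 8
rather than 6.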
Thanks.
--
Ashutosh