[PATCH i-g-t] PVC: tests/intel/xe_multi_tile: Added Multi tile verification in available platform
Matt Roper
matthew.d.roper at intel.com
Tue Jul 1 21:56:53 UTC 2025
On Tue, Jul 01, 2025 at 01:03:09AM -0700, Sharma, Nishit wrote:
>
> This multi-tile test verifies whether platform supports multi-tile or not. If multi-tile supported then how many gt belongs to single tile%d and type of each gt.
I think I mentioned on a previous review, but the important thing is to
clearly explain what kind of behavior you're trying to verify/test.
Have we encountered a specific bug that we're trying to write a test
case for to ensure we don't reintroduce it in the future? Or is there
some specific invariant about the uapi that we're trying to guarantee
isn't violated?
>
> Signed-off-by: Nishit Sharma <nishit.sharma at intel.com>
> ---
> tests/intel/xe_multi_tile.c | 274 ++++++++++++++++++++++++++++++++++++
> tests/meson.build | 1 +
> 2 files changed, 275 insertions(+)
> create mode 100644 tests/intel/xe_multi_tile.c
>
> diff --git a/tests/intel/xe_multi_tile.c b/tests/intel/xe_multi_tile.c new file mode 100644 index 000000000..d302dce28
> --- /dev/null
> +++ b/tests/intel/xe_multi_tile.c
> @@ -0,0 +1,274 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + *
> + * Authors:
> + * Nishit Sharma <nishit.sharma at intel.com>
> + */
> +
> +#include <dirent.h>
> +#include <fcntl.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <unistd.h>
> +
> +#include "igt.h"
> +#include "igt_sysfs.h"
> +
> +#include "xe_drm.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +
> +/**
> + * TEST: Test to verify if multi-tile support available in platform
> + * Category: Core
> + * Mega feature: General Core features
> + * Functionality: Tile/GT operations
> + */
> +
> +/**
> + * SUBTEST: multi-tile-info
> + * Description: Test gathers Tile_ID/s, GT_ID/s, GT Type info
> + * Test category: functionality test
> + *
> + */
> +
> +/**
> + * struct xe_gt_info - Holds ID and Type info for GT
> + * This structure required to check gt_id belongs to specific
> + * tile%d and gt_type which should differ for each gt
> + * Can be optimized in future
> + */
> +struct xe_gt_info {
> + /* GT ID */
> + int gt_id;
> +
> + /* GT Type */
> + int gt_type;
> +};
> +
> +/**
> + * struct xe_tile - Holds gt_count and gt_info per tile%d
> + * This structure required to store tile%d related info like
> + * number of gt/s belongs to single tile
> + * Can be optimized in future
> + */
> +struct xe_tile {
> +#define XE_MAX_TILES_PER_DEVICE 2
> +#define XE_MAX_GT_PER_DEVICE 4
It would be best if we not hardcode things like this inside IGT since it
will just lead to problems down the road. Part of the point of the
xe_query interface is that we can support any number of tiles and GTs
and userspace (including IGT) shouldn't need to make assumptions like
this.
Also note that the XE_MAX_GT_PER_DEVICE value isn't correct for any
platforms we have today. We only have:
- single-tile, single GT platforms: 1 primary GT
- multi-tile, single GT per tile: 2 primary GTs
- single-tile, multi-tile platforms: 1 primary GT, 1 media GT
It seems likely that we probably will have multi-tile + multi-GT
platforms at some point in the future, but no such platform exists yet.
And even if/when those platforms do show up, there's no guarantee that
the "4" here will be accurate; we could wind up more than two tiles, or
more than two GT types on those platforms.
There's a lot of other refactoring going on in both the kernel and IGT
right now to break assumptions like this, so we should avoid adding new
ones here. The xe_query interface can give us a GT list of potentially
any size if that's what the hardware supports (e.g., 6 tiles and 5 GTs
per tile for some hypothetical future platform) and userspace like IGT
should be able to just walk through that list and not make assumptions
about the upper limits.
> +
> + /* GT count */
> + int gt_count;
> +
> + /* GT info */
> + struct xe_gt_info gt_info[XE_MAX_GT_PER_DEVICE];
> +}tile[XE_MAX_TILES_PER_DEVICE];
> +
> +/*
> + * Get gt_list using XE_DEVICE_QUERY UAPI returned by Driver
> + * gt_list holds tile ID/s, gt ID/s, type and other information
> + *
> + */
> +static struct drm_xe_query_gt_list *xe_query_gt_list(int fd) {
> + struct drm_xe_query_gt_list *gt_list;
> + struct drm_xe_device_query query = {
> + .extensions = 0,
> + .query = DRM_XE_DEVICE_QUERY_GT_LIST,
> + .size = 0,
> + .data = 0,
> + };
> +
> + igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> + igt_assert_neq(query.size, 0);
> +
> + gt_list = malloc(query.size);
> + igt_assert(gt_list);
> +
> + query.data = to_user_pointer(gt_list);
> + igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> +
> + return gt_list;
> +}
Do we really need this function? IGT already queries all of this
information during xe_device_get() and stores it into xe_dev->gt_list.
I don't see a specific need for a redundant copy of the information that
IGT is already fetching automatically.
> +
> +/*
> + * Dipslay gt_list information to user
> + * gt_list holds tile ID/s, gt ID/s, type and other information
> + *
> + */
> +static void
> +xe_show_gt_info(int fd, struct drm_xe_query_gt_list *gt_list) {
> + uint16_t dev_id;
> + dev_id = intel_get_drm_devid(fd);
> + igt_assert(gt_list);
> +
> + igt_info("Displaying GT Info\n");
> + for (int i = 0; i < gt_list->num_gt; i++) {
> + int verx100 = 100 * gt_list->gt_list[i].ip_ver_major +
> + gt_list->gt_list[i].ip_ver_minor;
> +
> + igt_info("type: %d\n", gt_list->gt_list[i].type);
> + igt_info("gt_id: %d\n", gt_list->gt_list[i].gt_id);
> + igt_info("IP version: %d.%02d, stepping %d\n",
> + gt_list->gt_list[i].ip_ver_major,
> + gt_list->gt_list[i].ip_ver_minor,
> + gt_list->gt_list[i].ip_ver_rev);
> + igt_info("reference_clock: %u\n", gt_list->gt_list[i].reference_clock);
> + igt_info("near_mem_regions: 0x%016llx\n",
> + gt_list->gt_list[i].near_mem_regions);
> + igt_info("far_mem_regions: 0x%016llx\n",
> + gt_list->gt_list[i].far_mem_regions);
> + igt_info("type of gt: %d\n",
> + gt_list->gt_list[i].type);
> + igt_info("tile_id: %d\n",
> + gt_list->gt_list[i].tile_id);
I think I mentioned in a previous review, but IGT tests are generally
intended to be executed by automated CI systems and such, so we don't
really want a bunch of output look this. It would be better to have a
separate program in the tools/ directory that displays information about
the device in a human-readable manner. It's okay to have a little bit
of debug output if it will help debug failures in the test, but in this
case you're dumping a bunch of information that isn't actually used by
the test, so that's not helpful.
> +
> + /* Sanity check IP version. */
> + if (verx100) {
> + /*
> + * First GMD_ID platforms had graphics 12.70 and media
> + * 13.00 so we should never see non-zero values lower
> + * than those.
> + */
> + if (gt_list->gt_list[i].type == DRM_XE_QUERY_GT_TYPE_MEDIA)
> + igt_assert_lte(1300, verx100);
> + else
> + igt_assert_lte(1270, verx100);
> +
> + /*
> + * Aside from MTL/ARL and media on BMG, all version
> + * numbers should be 20.00 or higher.
> + */
> + if (IS_METEORLAKE(dev_id))
> + continue;
> + if (gt_list->gt_list[i].type == DRM_XE_QUERY_GT_TYPE_MEDIA &&
> + IS_BATTLEMAGE(dev_id))
> + continue;
> +
> + igt_assert_lte(20, gt_list->gt_list[i].ip_ver_major);
> + }
This seems to be a copy-paste from the xe_query gt-list subtest. Since
these checks are already been handled by that test, we don't really want
to duplicate it here. If this is xe_multi_tile, then this test is
expected to be testing something specific to multi-tile platforms, not
general logic that applies to all platforms.
> + }
> +}
> +
> +/*
> + * To check whether tiles are in order or not or if any tile order is
> + * skipped/missing as returned through UAPI
> + * Function returns 0 if all tiles in order or sequenctial otherwise
> + * returns 1
> + */
> +static uint8_t
> +xe_check_tile_order(int fd, struct drm_xe_query_gt_list *gt_list) {
Do we actually guarantee any specific _order_ of the GTs returned by the
query interface today? In practice, we'll return all the GTs for tile
0, then tile 1, etc., but I don't think there's any mandate that we have
to continue to do that on future platforms.
There's probably an expectation that tile IDs don't have holes in them
(although GT IDs might), but the GTs returned by the GT list query don't
have to be ordered according to their tile.
> + int prev_tile = -1, tile_id;
> + uint8_t tile_mis_count = -1;
> +
> + igt_info("Verifying tile order/sequence available in platform\n");
> + for(int index = 0; index < gt_list->num_gt; index++) {
> + tile_id = gt_list->gt_list[index].tile_id;
> + if(prev_tile != tile_id)
> + {
> + if(++tile_mis_count != tile_id) {
> + return 1;
> + }
> + prev_tile = tile_id;
> + }
> + }
> +
> + igt_info("Tiles available in platform are in order/sequential\n");
> + return 0;
> +}
> +
> +/*
> + * To get tile count. UAPI returns tile ID based on GT ID
> + * tile count is calculated and rturned for further processing */
> +static int xe_get_tile_count(int fd, struct drm_xe_query_gt_list
> +*gt_list) {
It doesn't seem like anything in this test actually cares about the
exact tile count, only that there's more than one tile. So this could
be replaced with a simpler function along the lines of
bool is_multitile(int fd) {
int gt_id;
xe_for_each_gt(fd, gt_id)
if (xe_dev->gt_list->gt_list[gt_id].tile_id > 0)
return true;
return false;
}
Note that the specifics of that loop will change a bit if the
refactoring series at https://patchwork.freedesktop.org/series/150975/
lands (we won't want to be doing a direct lookup of the gt_id in the
gt_list at that point).
> + int prev_tile = -1, tile_id, tile_index = 0;
> + int tile_count = 0;
> +
> + for(int index = 0; index < gt_list->num_gt; index++) {
> + tile_id = gt_list->gt_list[index].tile_id;
> + if(prev_tile != tile_id)
> + {
> + prev_tile = tile_id;
> + tile[tile_id].gt_count++;
> + tile[tile_id].gt_info[index].gt_id = index;
> + tile[tile_id].gt_info[index].gt_type = gt_list->gt_list[index].type;
> + tile_count++;
> + tile_index = tile_id;
> + }
> + else
> + {
> + tile[tile_index].gt_count++;
> + tile[tile_index].gt_info[index].gt_id = index;
> + tile[tile_index].gt_info[index].gt_type = gt_list->gt_list[index].type;
> + }
> + }
> +
> + tile_index = 0;
> + tile_id = 0;
> +
> + return tile_count;
> +}
> +
> +/*
> + * To check gt type. UAPI returns gt type based on gt ID
> + * The gt/s belonging to same tile%d must have dfferent types
> + * If found same gt/s within tile%d program will halt */ static void
> +xe_check_gt_type(int fd, struct drm_xe_query_gt_list *gt_list,
> + int num_tiles)
> +{
> + igt_info("Verifying gt type belongs to each tile in platform\n");
This explanation doesn't seem to match what the test below is doing.
But we also don't need these igt_info() lines at all, so we can probably
just drop it.
> +
> + for(int tile_num = 0; tile_num < num_tiles; tile_num++) {
> + for(int gt_num = 0; gt_num < tile[tile_num].gt_count - 1; gt_num++) {
> + igt_assert_neq(tile[tile_num].gt_info[gt_num].gt_type,
> + tile[tile_num].gt_info[gt_num + 1].gt_type);
> + }
> + }
> +}
> +
> +igt_main
> +{
> + int fd;
> + struct drm_xe_query_gt_list *gt_list;
> + struct xe_device *xe_dev;
> + int num_tiles = 0;
> +
> + gt_list = malloc(sizeof(*gt_list));
> + igt_assert(gt_list);
> +
> + igt_fixture {
> + fd = drm_open_driver(DRIVER_XE);
> + xe_dev = xe_device_get(fd);
> + }
> +
> + igt_subtest("multi-tile-info") {
> + /** get gt information from driver **/
> + gt_list = xe_query_gt_list(fd);
> + igt_assert(gt_list);
As noted above, we already have a copy of the GT list stashed in
xe_dev->gt_list thanks to the xe_device_get() call above. There doesn't
seem to be a benefit to doing a redundant query and storing a parallel
copy.
> +
> + /** display gt info to user **/
> + xe_show_gt_info(fd, gt_list);
We don't need to be display information to the user in a test. That's
something that belongs in a tool instead.
> +
> + /** check platform has multi tile **/
> + num_tiles = xe_get_tile_count(fd, gt_list);
> + igt_assert_f(num_tiles > 1, "Tiles available %d Multi-Tile not supported.\n",
> + num_tiles);
We don't expect all platforms to support multi-tile, so this assertion
is going to cause a bunch of failures (it will fail on anything that
isn't PVC today). If the subsequent tests are things that truly rely on
multi-tile platforms, then this should be an igt_require() or
igt_skip_on() so that the test just skips if you run it on a platform
that it isn't relevant for.
> +
> + /** check tile order **/
> + igt_assert_eq(xe_check_tile_order(fd, gt_list), 0);
> +
> + /** check type of gt in tile%d **/
> + xe_check_gt_type(fd, gt_list, num_tiles);
The actual checks here are what we'd generally turn into individual
subtests. Although it's still a bit unclear whether these need to live
in a test by themselves, or whether we could have just added these
somewhere like xe_query where we're already doing sanity checks of the
information returned by the GT list query.
Matt
> + }
> +
> + igt_fixture {
> + drm_close_driver(fd);
> + }
> +}
> diff --git a/tests/meson.build b/tests/meson.build index 2f5406523..cdc68498d 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -328,6 +328,7 @@ intel_xe_progs = [
> 'xe_sysfs_scheduler',
> 'xe_sysfs_timeslice_duration',
> 'xe_tlb',
> + 'xe_multi_tile',
> ]
>
> intel_xe_eudebug_progs = [
> --
> 2.43.0
>
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
More information about the igt-dev
mailing list