[Intel-xe] [PATCH v2] drm/xe/mmio: update gt_count when probing multi-tile
Ofir Bitton
obitton at habana.ai
Wed Jul 5 09:50:26 UTC 2023
On 26/06/2023 20:20, Matthew Auld wrote:
> It looks like the single-tile PVC in CI dies during module load when doing
> the pcode init. From the logs we try to access the address
> 0000000000138124 which doesn't map to anything, however 0x138124 also
> looks to be the PCODE_MAILBOX register. So looks like the per-tile
> mmio register mapping is NULL.
>
> During probe the tile count is potentially trimmed, since we don't know
> the real count until we actually probe the device. This seems to be
> the case for single-tile PVC or similar devices. However it looks like
> the gt_count is never adjusted to respect this updated tile count. As a
> result when later doing some for_each_gt() loop, like we do for the
> pcode, we can get back some GT that maps to some non-existent tile
> which hasn't been properly set up, leading to crashes.
>
> Try to fix this by adjusting the gt_count after probing the tiles for
> real.
>
> v2: Fix typo so it actually builds
>
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/383
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> Cc: Matt Roper <matthew.d.roper at intel.com>
> ---
> drivers/gpu/drm/xe/xe_mmio.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> index f1336803b915..8150bf6f3578 100644
> --- a/drivers/gpu/drm/xe/xe_mmio.c
> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> @@ -334,6 +334,12 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
> adj_tile_count = xe->info.tile_count =
> REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
>
> + /*
> + * FIXME: Needs some work for standalone media, but should be impossible
> + * with multi-tile for now.
> + */
> + xe->info.gt_count = xe->info.tile_count;
> +
> drm_info(&xe->drm, "tile_count: %d, adj_tile_count %d\n",
> xe->info.tile_count, adj_tile_count);
>
'xe->info.gt_count' is already incremented in 'xe_info_init' during
probe. It seems you need to remove that increment, or else 'gt_count'
will end up at twice the desired value.
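
To make the concern concrete, here is a minimal standalone sketch. It is
not driver code: 'struct mock_info' and every 'mock_*' / 'gt_count_*'
helper are hypothetical stand-ins, and it assumes the init step bumps
gt_count once per tile. Whether the increment really runs before or
after the probe-time assignment is not shown in this thread, so both
orderings are modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for xe->info; only the two counters discussed here. */
struct mock_info {
	unsigned int tile_count;
	unsigned int gt_count;
};

/* Models an init step that bumps gt_count once per tile (assumed behaviour). */
static void mock_count_gts(struct mock_info *info)
{
	unsigned int id;

	for (id = 0; id < info->tile_count; id++)
		info->gt_count++;
}

/* Models the hunk in the patch: trim tile_count, then overwrite gt_count. */
static void mock_probe_tiles(struct mock_info *info, unsigned int hw_tiles)
{
	assert(hw_tiles >= 1);
	info->tile_count = hw_tiles;
	info->gt_count = info->tile_count;
}

/* Increment first, assign later: the assignment overwrites the old count. */
static unsigned int gt_count_increment_then_assign(unsigned int max_tiles,
						   unsigned int hw_tiles)
{
	struct mock_info info = { .tile_count = max_tiles, .gt_count = 0 };

	mock_count_gts(&info);
	mock_probe_tiles(&info, hw_tiles);
	return info.gt_count;
}

/* Assign first, increment later: the increments stack on top of the assignment. */
static unsigned int gt_count_assign_then_increment(unsigned int max_tiles,
						   unsigned int hw_tiles)
{
	struct mock_info info = { .tile_count = max_tiles, .gt_count = 0 };

	mock_probe_tiles(&info, hw_tiles);
	mock_count_gts(&info);
	return info.gt_count;
}
```

In this model, with a descriptor maximum of 2 tiles and 1 tile found on
the hardware, the first ordering leaves gt_count at 1, while the second
leaves it at 2, i.e. double the tile count. Only the second ordering
reproduces the doubling described above.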
--
Ofir