[Intel-xe] [PATCH 1/4] drm/xe: Fix MTL+ stolen memory mapping

Matt Roper matthew.d.roper at intel.com
Tue Jul 25 22:47:18 UTC 2023


On Tue, Jul 25, 2023 at 01:28:33PM -0700, Lucas De Marchi wrote:
> Based on commit 8d8d062be6b9 ("drm/i915/mtl: Fix MTL stolen memory GGTT
> mapping"). For stolen on MTL and beyond, the address in the PTE is the
> offset from DSM base. While at it, update the comments explaining each
> part of the calculation.
> 
> v2: Also change resource_is_stolen_vram() so the right PTE bit is set

With a bit more testing locally I noticed that I'm seeing some warnings
and crashes on driver unload when display is enabled.  I'm not 100% sure
they're tied to this, but it seems like they could be.

The first warning is

    WARNING: CPU: 10 PID: 4710 at drivers/gpu/drm/xe/xe_ttm_vram_mgr.c:324 ttm_vram_mgr_fini+0x15e/0x170 [xe]

which is

        WARN_ON_ONCE(mgr->visible_avail != mgr->visible_size);

>From there things go downhill --- circular locking detected, slab cache
problems, and KASAN use-after-free reports.

Without digging through the code in depth, my worry is that by just
marking this stolen memory as vram everywhere in the driver, we try to
return the framebuffer to the otherwise empty/uninitialized VRAM pool,
even though it isn't truly VRAM.

Maybe it would be better to make a change with less side effects
elsewhere in the code.  E.g., add a "resource_is_stolen_devmem" that
only gets called from the pagetable encoding functions to set the DM=1
bit and not anywhere else?

Or maybe these warnings are unrelated; I haven't tried unloading with
display on any other platforms recently, so maybe it's a pre-existing
problem.


Matt

> 
> Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
>  drivers/gpu/drm/xe/xe_bo.c             |  3 ++-
>  drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c | 15 +++++++++++++--
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 10858fcf4f62..5379d8e5affb 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -61,7 +61,8 @@ bool mem_type_is_vram(u32 mem_type)
>  
>  static bool resource_is_stolen_vram(struct xe_device *xe, struct ttm_resource *res)
>  {
> -	return res->mem_type == XE_PL_STOLEN && IS_DGFX(xe);
> +	return res->mem_type == XE_PL_STOLEN &&
> +		(IS_DGFX(xe) || GRAPHICS_VERx100(xe) >= 1270);
>  }
>  
>  static bool resource_is_vram(struct ttm_resource *res)
> diff --git a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> index 21ecc734f10a..271b3fba4129 100644
> --- a/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> +++ b/drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c
> @@ -94,11 +94,22 @@ static u32 detect_bar2_integrated(struct xe_device *xe, struct xe_ttm_stolen_mgr
>  
>  	ggc = xe_mmio_read32(xe_root_mmio_gt(xe), GGC);
>  
> -	/* check GGMS, should be fixed 0x3 (8MB) */
> +	/*
> +	 * Check GGMS: it should be fixed 0x3 (8MB), which corresponds to the
> +	 * GTT size
> +	 */
>  	if (drm_WARN_ON(&xe->drm, (ggc & GGMS_MASK) != GGMS_MASK))
>  		return 0;
>  
> -	mgr->stolen_base = mgr->io_base = pci_resource_start(pdev, 2) + SZ_8M;
> +	/*
> +	 * Graphics >= 1270 uses the offset to the GSMBASE as address in the
> +	 * PTEs, together with the DM flag being set. Previously there was no
> +	 * such flag so the address was the io_base.
> +	 *
> +	 * DSMBASE = GSMBASE + 8MB
> +	 */
> +	mgr->stolen_base = SZ_8M;
> +	mgr->io_base = pci_resource_start(pdev, 2) + mgr->stolen_base;
>  
>  	/* return valid GMS value, -EIO if invalid */
>  	gms = REG_FIELD_GET(GMS_MASK, ggc);
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


More information about the Intel-xe mailing list