BO alignment for kernel page size > 4kB
Lionel Landwerlin
lionel.g.landwerlin at intel.com
Tue Aug 5 09:24:08 UTC 2025
Hi Simon,
It's probably best to open an issue on GitLab, since this is all Anv-specific stuff.
Let's not bother the entire project with it.
-Lionel
On 05/08/2025 11:13, Simon Richter wrote:
> Hi,
>
> there is a proposed patch[1] to the xe driver to make it work for
> larger kernel page sizes. Part of this patch is to return the CPU page
> size as DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT, so Mesa will pad size
> requests accordingly.
>
> However, that is necessary only for CPU-visible BOs; local-only BOs
> are still fine with 4kB. The query parameter has no context of whether
> the allocation will be CPU visible, so I think it's the wrong place for it.
>
> We can (and do) also fix up the size inside the kernel with the
> detected alignment, but that means that Mesa doesn't know about it.
>
> I've looked into the xe_gem_create function, and inserting an extra
> alignment requirement there seems somewhat doable, but I'm not
> entirely sure if that is sufficient, and I don't entirely follow the
> meaning of all the relevant flags here.
>
> My proposed strategy:
>
> 1. the xe driver will report 4kB as DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT,
> even if CPU page size is larger, because that is the requirement from
> the GPU.
>
> 2. the xe driver will silently fix up the size if
> DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM is set, to allow older
> versions of Mesa to work.
>
> 3. xe_gem_create will align the size to the result of
> sysconf(_SC_PAGESIZE) if ANV_BO_ALLOC_MAPPED or
> ANV_BO_ALLOC_LOCAL_MEM_CPU_VISIBLE is set
>
> However: DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM is not set in the
> ioctl if device->physical->vram_non_mappable.size is zero, or
> ANV_BO_ALLOC_NO_LOCAL_MEM is set.
>
> So if we found an aperture for all of VRAM (which is quite likely on
> platforms that have larger kernel page sizes), then Mesa will not tell
> us that the memory must be CPU visible -- so we need to fix up the
> size of all allocations, and we're achieving the same result as just
> reporting a larger DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT.
>
> Does it make sense to set vram_non_mappable.size to 1 here, to force
> Mesa to tell us if the mapping is meant to be CPU accessible?
>
> What is the role of ANV_BO_ALLOC_NO_LOCAL_MEM? To me, the logic looks
> reversed -- if we're told not to use (device) local memory, we *don't*
> tell the kernel that the memory should be CPU visible. Is that a bug,
> am I misinterpreting the function of this flag, or is there some other
> mechanism I'm unaware of that makes this work (I see this flag is used
> for fences, which most certainly are CPU visible)?
>
> Or should we just not care to optimize device local allocations, and
> pad everything both in the kernel and in userspace?
>
> Simon
>
> [1]
> https://lore.kernel.org/all/20250604-upstream-xe-non-4k-v2-v2-0-ce7905da7b08@aosc.io/
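
For reference, step 3 of the quoted strategy could be sketched roughly as
follows. This is only an illustration under stated assumptions, not Mesa
code: the SKETCH_* flag values and the helper names (align_up,
sketch_cpu_page_size, sketch_bo_size) are hypothetical stand-ins, and the
real ANV_BO_ALLOC_MAPPED / ANV_BO_ALLOC_LOCAL_MEM_CPU_VISIBLE flags and
xe_gem_create have their own definitions in the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-ins for the ANV allocation flags named in the thread.
 * The bit values are illustrative only. */
#define SKETCH_BO_ALLOC_MAPPED                (1u << 0)
#define SKETCH_BO_ALLOC_LOCAL_MEM_CPU_VISIBLE (1u << 1)

/* Round value up to the next multiple of alignment (must be a power of two). */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* CPU page size as reported by sysconf(), falling back to 4kB on error. */
static uint64_t sketch_cpu_page_size(void)
{
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 ? (uint64_t)page : 4096;
}

/* Pad a BO size request: always honour the GPU minimum alignment, and
 * additionally honour the CPU page size when the BO may be CPU mapped. */
static uint64_t sketch_bo_size(uint64_t size, uint32_t alloc_flags,
                               uint64_t gpu_min_alignment,
                               uint64_t cpu_page_size)
{
    uint64_t alignment = gpu_min_alignment;
    uint32_t cpu_flags = SKETCH_BO_ALLOC_MAPPED |
                         SKETCH_BO_ALLOC_LOCAL_MEM_CPU_VISIBLE;

    if ((alloc_flags & cpu_flags) && cpu_page_size > alignment)
        alignment = cpu_page_size;

    return align_up(size, alignment);
}
```

A caller would pass sketch_cpu_page_size() as the last argument. With a 64kB
CPU page size and a 4kB GPU minimum, a 5000-byte local-only request is padded
to 8kB, while the same request with a CPU-visible flag is padded to 64kB.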