Separating xe_vma- and page-table state

Zeng, Oak oak.zeng at intel.com
Tue Mar 12 23:02:20 UTC 2024


Hi Thomas,

> -----Original Message-----
> From: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Sent: Tuesday, March 12, 2024 3:43 AM
> To: intel-xe at lists.freedesktop.org
> Cc: Brost, Matthew <matthew.brost at intel.com>; Zeng, Oak
> <oak.zeng at intel.com>
> Subject: Separating xe_vma- and page-table state
> 
> Hi,
> 
> It's IMO become apparent both in the system allocator discussion and in
> the patch that enables the presence of invalid vmas that we need to be
> better at separating xe_vma and page-table state, so that xe_vma state
> would contain things that are mostly immutable and that the user
> requested: PAT index, memory attributes, requested tile presence etc,
> whereas the page-table state would contain mutable state like actual
> tile presence, invalidation state and MMU notifier.

That reasoning seems valid to me. If we want to do what the community wants us to do with the system allocator,
and if we want to meet our UMD's requirement of "free w/o vm_unbind", then yes, we need this "invalid" vma concept.

The strange thing is that Matt still seems able to achieve the goal without introducing the invalid vma concept. It doesn't look like he has
this concept in his patch.


> 
> So far we have had no reason to separate the two, but with hmmptr we
> would likely end up with multiple page-table regions per xe-vma, 

Can we still maintain a 1 xe-vma : 1 page-table region relationship for simplicity?

This requires vma splitting. For example, if a hmmptr vma covers the range [0~100]
and a fault happens in the range [40~60], we end up with 3 vmas:
[0~40], a dummy hmmptr vma, not mapped to gpu
[40~60], a hmmptr vma, mapped to gpu
[60~100], a dummy hmmptr vma, not mapped to gpu

Does this work for you? Or do you see a benefit in not splitting the vma?
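
To make the three-way split above concrete, here is a rough sketch in plain C. It is not driver code: struct demo_hmm_vma and demo_hmm_vma_split() are hypothetical names made up for illustration, and the ranges are assumed to be half-open, i.e. [start, end), so that [0~40] and [40~60] do not overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a hmmptr vma; not the real struct xe_vma. */
struct demo_hmm_vma {
	uint64_t start;   /* first address covered (inclusive) */
	uint64_t end;     /* end of the range (exclusive, an assumption) */
	bool gpu_mapped;  /* true if this piece is mapped to the gpu */
};

/*
 * Split vma around the faulting range [fault_start, fault_end).
 * Writes up to three pieces to out[] and returns how many were written.
 * An empty head or tail piece is skipped, so a fault that covers the
 * whole vma yields a single mapped piece.
 */
static size_t demo_hmm_vma_split(const struct demo_hmm_vma *vma,
				 uint64_t fault_start, uint64_t fault_end,
				 struct demo_hmm_vma out[3])
{
	size_t n = 0;

	/* The fault range must be non-empty and lie inside the vma. */
	assert(vma->start <= fault_start && fault_start < fault_end &&
	       fault_end <= vma->end);

	/* Head: dummy piece, not mapped to the gpu. */
	if (vma->start < fault_start)
		out[n++] = (struct demo_hmm_vma){ vma->start, fault_start, false };

	/* Middle: the faulting range, mapped to the gpu. */
	out[n++] = (struct demo_hmm_vma){ fault_start, fault_end, true };

	/* Tail: dummy piece, not mapped to the gpu. */
	if (fault_end < vma->end)
		out[n++] = (struct demo_hmm_vma){ fault_end, vma->end, false };

	return n;
}
```

Calling demo_hmm_vma_split() on a vma covering [0~100] with a fault at [40~60] gives the three pieces listed above. Each piece then has exactly one page-table region, which is the 1:1 relationship I am asking about, at the cost of more vmas to track.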



> and
> with the patch discussed earlier we could've easily reused
> xe_vm_unbind_vma() that only touches the mutable page-table state and
> does the correct locking.
> 
> The page table code would then typically take a const xe_vma *, and
> an xe_pt_state *, or whatever we choose to call it. All xe_vmas except
> hmmptr ones would have a 1:1 xe_vma <-> xe_pt_state relationship.
> 
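
For reference, this is how I read the proposed split, as a minimal sketch in plain C. All names here (struct demo_vma, struct demo_pt_state, the demo_pt_*() helpers and their fields) are hypothetical and only illustrate the idea; they are not the existing xe structures, and locking and the MMU notifier are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: mostly immutable, user-requested state. Not the real struct xe_vma. */
struct demo_vma {
	uint64_t start;      /* start of the requested range */
	uint64_t end;        /* end of the requested range */
	uint16_t pat_index;  /* requested PAT index */
	uint8_t tile_mask;   /* requested tile presence, one bit per tile */
};

/* Hypothetical: mutable page-table state ("xe_pt_state, or whatever we call it"). */
struct demo_pt_state {
	uint8_t tile_present;      /* tiles that actually have a binding */
	uint8_t tile_invalidated;  /* tiles whose binding has been invalidated */
};

/* Bind: the vma is only read (const); only the page-table state changes. */
static void demo_pt_bind(const struct demo_vma *vma, struct demo_pt_state *pt)
{
	pt->tile_present |= vma->tile_mask;
	pt->tile_invalidated &= (uint8_t)~vma->tile_mask;
}

/* Invalidate: mark currently bound tiles of this vma as invalidated. */
static void demo_pt_invalidate(const struct demo_vma *vma, struct demo_pt_state *pt)
{
	pt->tile_invalidated |= (uint8_t)(pt->tile_present & vma->tile_mask);
}

/* Unbind: drop the bindings without touching what the user requested. */
static void demo_pt_unbind(const struct demo_vma *vma, struct demo_pt_state *pt)
{
	pt->tile_present &= (uint8_t)~vma->tile_mask;
	pt->tile_invalidated &= (uint8_t)~vma->tile_mask;
}
```

With this shape, an unbind helper can be reused on an otherwise valid vma because the const qualifier guarantees that the requested attributes survive. A hmmptr vma would simply own several demo_pt_state instances instead of one.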

Matt has POC code that works for me for the system allocator without this vma/pt_state splitting: https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-system-allocator/-/commits/system_allocator_poc?ref_type=heads

Could you take a look at it as well?

When I worked on the system allocator HLD, I also pictured the invalid vma concept. But somehow Matt was able to make it work without such a concept.

Oak

> Thoughts, comments?
> 
> Thanks,
> Thomas

