[PATCH v2 18/32] drm/xe/vm: Add attributes struct as member of vma

Ghimiray, Himal Prasad himal.prasad.ghimiray at intel.com
Wed May 28 05:33:31 UTC 2025



On 27-05-2025 23:07, Matthew Brost wrote:
> On Tue, May 20, 2025 at 02:57:45PM +0530, Ghimiray, Himal Prasad wrote:
>>
>>
>> On 15-05-2025 00:06, Matthew Brost wrote:
>>> On Mon, Apr 07, 2025 at 03:47:05PM +0530, Himal Prasad Ghimiray wrote:
>>>> The attribute of xe_vma will determine the migration policy and the
>>>> encoding of the page table entries (PTEs) for that vma.
>>>> This attribute helps manage how memory pages are moved and how their
>>>> addresses are translated. It will be used by madvise to set the
>>>> behavior of the vma.
>>>>
>>>> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_vm.c       |  6 ++++++
>>>>    drivers/gpu/drm/xe/xe_vm_types.h | 20 ++++++++++++++++++++
>>>>    2 files changed, 26 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>>>> index 27a8dbe709c2..1ff9e477e061 100644
>>>> --- a/drivers/gpu/drm/xe/xe_vm.c
>>>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>>>> @@ -2470,6 +2470,12 @@ static struct xe_vma *new_vma(struct xe_vm *vm, struct drm_gpuva_op_map *op,
>>>>    		vma = ERR_PTR(err);
>>>>    	}
>>>> +	/*TODO: assign devmem_fd of local vram once multi device
>>>> +	 * support is added.
>>>> +	 */
>>>> +	vma->attr.preferred_loc.devmem_fd = 1;
>>>
>>> Assigning a value of '1' is a bit odd... I'd prefer using a define or
>>> something similar to indicate the intended behavior. I noticed a few
>>> other assignments to '1' in the final result—same comment applies to
>>> those.
>>
>> Sure
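
For illustration only: the plan for the next revision is to replace the
bare '1' with a named default, along the lines of the sketch below. The
define name is a placeholder, and per the devmem_fd scheme discussed
further down, the default value would likely end up being 0 rather than 1.

	/* Placeholder name: 0 = default behavior, i.e. VRAM of the
	 * faulting tile, per the devmem_fd scheme discussed below.
	 */
	#define DEVMEM_FD_DEFAULT	0

	vma->attr.preferred_loc.devmem_fd = DEVMEM_FD_DEFAULT;
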
>>
>>>
>>>> +	vma->attr.atomic_access = DRM_XE_VMA_ATOMIC_UNDEFINED;
>>>> +
>>>>    	return vma;
>>>>    }
>>>> diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
>>>> index d3c1209348e9..5f5feffecb82 100644
>>>> --- a/drivers/gpu/drm/xe/xe_vm_types.h
>>>> +++ b/drivers/gpu/drm/xe/xe_vm_types.h
>>>> @@ -77,6 +77,19 @@ struct xe_userptr {
>>>>    #endif
>>>>    };
>>>> +/**
>>>> + * struct xe_vma_mem_attr - memory attributes associated with vma
>>>> + */
>>>> +struct xe_vma_mem_attr {
>>>> +	/** @preferred_loc: preferred memory location */
>>>> +	struct {
>>>> +		u32 migration_policy; /* represents migration policies */
>>>> +		u32 devmem_fd; /* devmem_fd used for determining pagemap_fd requested by user */
>>>> +	} preferred_loc;
>>>
>>> I'm a little unclear on how these variables work.
>>>
>>> In the uAPI for migration_policy, I see MIGRATE_ALL_PAGES and
>>> MIGRATE_ONLY_SYSTEM_PAGES (these should probably be normalized with a
>>> DRM_XE_* prefix, by the way), but it's unclear to me what exactly these
>>> mean or how they're used based on the final result—could you clarify?
>>
>> With multi-device support, the idea was to have the flexibility to
>> either move only system pages to the preferred location, or to also
>> move pages from other VRAM locations to the preferred location.
>>
> 
> Ok, I think having bits set aside for this makes sense.
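
As a rough sketch of the normalized uAPI names suggested above (exact
names and values are placeholders here and may differ in the next
revision):

	/* Migrate only system pages to the preferred location */
	#define DRM_XE_MIGRATE_ONLY_SYSTEM_PAGES	0
	/* Also migrate pages resident in other VRAM locations */
	#define DRM_XE_MIGRATE_ALL_PAGES		1
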
> 
>>>
>>> Likewise, I'm confused about the devmem_fd usage. It can be assigned
>>> a devmem_fd from the uAPI, but in some cases it's interpreted as a
>>> region. I assume this is anticipating multi-GPU support, but again,
>>> the plan isn't clear to me. Could you explain?
>>
>> The devmem_fd is intended to be used to determine the struct drm_pagemap *,
>> which in turn will be used to identify the tile associated with VRAM for
>> allocation and binding. The changes that introduce the
>> devmem_fd->drm_pagemap->tile [1] linkage will be part of the upcoming
>> multi-GPU support.
>>
>> To ensure that the current changes are easily scalable and can be extended
>> for multi-GPU support, I am defining devmem_fd in the UAPI and using it in
>> the KMD as a region placeholder until multi-GPU support is integrated.
>>
> 
> Hmm, will this break the uAPI though? e.g., how does the UMD choose
> between region and FD at the uAPI level? If the answer is that, once
> multi-GPU lands, it is always an FD rather than a region, then we really
> need to land some of the multi-GPU patches at the same time as madvise -
> at least the ones which export memory regions as FDs.

I have tried to streamline these changes in the latest revision; let's
see if they make sense. This is under the assumption that in a
multi-device setup a devmem_fd <= 0 is not a valid fd and does not point
to any remote/local tile.

A devmem_fd of 0 means the default behavior, i.e. VRAM of the faulting
tile.

A negative value indicates that SMEM should be used.

Any other positive value corresponds to a valid devmem_fd. Once the
multi-device changes land, if a positive devmem_fd turns out to be
invalid, we can fall back to the faulting tile or SMEM.
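
In pseudo-code, the interpretation I have in mind is roughly the
following (the helper names are placeholders only, not existing KMD
functions):

	if (devmem_fd == 0) {
		/* Default behavior: VRAM of the faulting tile */
		place_in_faulting_tile_vram();
	} else if (devmem_fd < 0) {
		/* Negative value: place the pages in SMEM */
		place_in_smem();
	} else {
		/* Positive value: resolve devmem_fd -> drm_pagemap -> tile
		 * once the multi-device series lands; fall back to the
		 * faulting tile or SMEM if the fd turns out to be invalid.
		 */
		pagemap = devmem_fd_to_drm_pagemap(devmem_fd);
		if (!pagemap)
			place_in_faulting_tile_vram_or_smem();
	}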

> 
> Let's loop in Thomas on the multi-GPU assumptions too to ensure
> correctness.
> 
> Matt
> 
>> [1] https://patchwork.freedesktop.org/patch/642773/?series=146227&rev=1
>>
>>
>>>
>>> In general I agree with the idea of xe_vma_mem_attr though.
>>>
>>> Matt
>>>
>>>> +	/** @atomic_access: The atomic access type for the vma */
>>>> +	u32 atomic_access;
>>>> +};
>>>> +
>>>>    struct xe_vma {
>>>>    	/** @gpuva: Base GPUVA object */
>>>>    	struct drm_gpuva gpuva;
>>>> @@ -128,6 +141,13 @@ struct xe_vma {
>>>>    	 * Needs to be signalled before UNMAP can be processed.
>>>>    	 */
>>>>    	struct xe_user_fence *ufence;
>>>> +
>>>> +	/**
>>>> +	 * @attr: The attributes of vma which determines the migration policy
>>>> +	 * and encoding of the PTEs for this vma.
>>>> +	 */
>>>> +	struct xe_vma_mem_attr attr;
>>>> +
>>>>    };
>>>>    /**
>>>> -- 
>>>> 2.34.1
>>>>
>>


