[PATCH v3 2/7] drm/xe: Consolidate setting PTE_AE into one place
Nirmoy Das
nirmoy.das at linux.intel.com
Mon Apr 22 08:18:15 UTC 2024
Hi Oak,
On 4/19/2024 8:35 PM, Zeng, Oak wrote:
>
>> -----Original Message-----
>> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of
>> Nirmoy Das
>> Sent: Monday, April 15, 2024 10:52 AM
>> To: intel-xe at lists.freedesktop.org
>> Cc: Das, Nirmoy <nirmoy.das at intel.com>
>> Subject: [PATCH v3 2/7] drm/xe: Consolidate setting PTE_AE into one place
>>
>> Currently the decision to set PTE_AE is spread between the xe_pt
>> and xe_vm files and there is no reason to keep it that
>> way. Consolidate the logic for better maintainability.
>>
>> Atomics are not expected on userptr memory, so this patch
>> also makes sure PTE_AE is only applied when a buffer object
>> exists.
>>
>> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_pt.c | 4 +---
>> drivers/gpu/drm/xe/xe_vm.c | 7 ++++---
>> 2 files changed, 5 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>> index 5b7930f46cf3..7dc13a8bb44f 100644
>> --- a/drivers/gpu/drm/xe/xe_pt.c
>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>> @@ -597,7 +597,6 @@ static int
>> xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
>> struct xe_vm_pgtable_update *entries, u32 *num_entries)
>> {
>> - struct xe_device *xe = tile_to_xe(tile);
>> struct xe_bo *bo = xe_vma_bo(vma);
>> bool is_devmem = !xe_vma_is_userptr(vma) && bo &&
>> (xe_bo_is_vram(bo) || xe_bo_is_stolen_devmem(bo));
>> @@ -619,8 +618,7 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
>> struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id];
>> int ret;
>>
>> - if ((vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) &&
>> - (is_devmem || !IS_DGFX(xe)))
>> + if (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT)
>> xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
>>
>> if (is_devmem) {
>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> index 2dbba55e7785..b1dcaa35b6cc 100644
>> --- a/drivers/gpu/drm/xe/xe_vm.c
>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>> @@ -806,9 +806,6 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
>> for_each_tile(tile, vm->xe, id)
>> vma->tile_mask |= 0x1 << id;
>>
>> - if (GRAPHICS_VER(vm->xe) >= 20 || vm->xe->info.platform == XE_PVC)
>> - vma->gpuva.flags |= XE_VMA_ATOMIC_PTE_BIT;
>> -
>> vma->pat_index = pat_index;
>>
>> if (bo) {
>> @@ -816,6 +813,10 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm,
>>
>> xe_bo_assert_held(bo);
>>
>> + if (vm->xe->info.has_atomic_enable_pte_bit &&
>> + (xe_bo_is_vram(bo) || !IS_DGFX(vm->xe)))
> This is vma creation time. xe_bo_is_vram() works for device or host allocations, but for shared allocations we can't decide the bo placement at vma creation time, as the bo can be migrated after vma creation. So I think this should be determined somewhere right before the page table programming, at least after bo migration is done.
Thanks for raising this. In that case, I will drop this patch.
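For reference, a rough, untested sketch of what deciding this at page table
programming time could look like inside xe_pt_stage_bind() (assuming
xe = tile_to_xe(tile) stays available there and that the bo placement is
final by the time we stage the bind; the condition shape just mirrors the
check this patch proposed in xe_vma_create()):

	/*
	 * Untested sketch: only set PTE_AE once the bo has reached its
	 * final placement, i.e. at page table programming time rather
	 * than at vma creation.
	 */
	if ((vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) &&
	    bo && (xe_bo_is_vram(bo) || !IS_DGFX(xe)))
		xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;

Just to illustrate the idea; the real condition would need the same care
as the is_devmem check above.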
>
>
> Also regarding the IS_DGFX check... I think on some dgfx platforms we can support device atomics to host memory, for example when the dgpu is connected to the host through CXL. I need to double-check this with HW.
I think CXL should be able to handle atomics even without migration. We
can look into that once we have a mechanism to detect such a platform.
Thanks,
Nirmoy
>
>
> Oak
>
>
>
>
>> + vma->gpuva.flags |= XE_VMA_ATOMIC_PTE_BIT;
>> +
>> vm_bo = drm_gpuvm_bo_obtain(vma->gpuva.vm, &bo->ttm.base);
>> if (IS_ERR(vm_bo)) {
>> xe_vma_free(vma);
>> --
>> 2.42.0