<!DOCTYPE html><html><head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body> Hi Brian, <div class="moz-cite-prefix">On 4/29/2024 10:14 PM, Welty, Brian wrote: </div> <blockquote type="cite" cite="mid:41eb1fab-d9a5-45be-9c95-05a34fc7685e@intel.com"> On 4/25/2024 3:23 PM, Nirmoy Das wrote: <blockquote type="cite">The default behavior of device atomics depends on the VM type and buffer allocation types. Device atomics are expected to function with all types of allocations for traditional applications/APIs. Additionally, in compute/SVM API scenarios with fault mode or LR mode VMs, device atomics must work with single-region allocations. </blockquote> I think additional patch may be needed..... for the Compute mode handling. I'm not sure correct thing will happened for 'shared' (multi-placement) allocations. HW will raise an atomic access page fault. </blockquote> Yes, I think the change should be then if (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) { if (xe_vm_in_fault_mode(xe_vma_vm(vma)) || xe_vm_in_lr_mode(xe_vma_vm(vma))) { if (bo && (xe_bo_has_single_placement(bo) || is_devmem) ) xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE; } else { xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE; } } So when the handle after migration state. Though currently shared allocation is broken as cpu access will not trigger any cpu migration. This needs future uAPI. <blockquote type="cite" cite="mid:41eb1fab-d9a5-45be-9c95-05a34fc7685e@intel.com">Our page fault handler only checks this was atomic fault and proceeds to migrate the BO to VRAM. I think PTE_AE bit won't be set with your changes, so seems are in infinite loop of atomic access faults on same address. I think we need to fail the page fault in this case, so HW raise CAT error here? As atomic access not supposed to be allowed for above case until future uAPI is added? And need IGT test for this. </blockquote> IGT for the patch ? Currently we have a test(xe_exec_atomic) for atomics access which will test the traditional one but yes, we need some IGT for compute case. Thanks, Nirmoy <blockquote type="cite" cite="mid:41eb1fab-d9a5-45be-9c95-05a34fc7685e@intel.com"> -Brian <blockquote type="cite">In all other cases device atomics should be disabled by default. Signed-off-by: Nirmoy Das <a class="moz-txt-link-rfc2396E" href="mailto:nirmoy.das@intel.com"><nirmoy.das@intel.com></a> --- drivers/gpu/drm/xe/xe_pt.c | 24 ++++++++++++++++++++---- drivers/gpu/drm/xe/xe_vm.c | 3 ++- 2 files changed, 22 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c index 5b7930f46cf3..a8e9e8592c43 100644 --- a/drivers/gpu/drm/xe/xe_pt.c +++ b/drivers/gpu/drm/xe/xe_pt.c @@ -597,7 +597,6 @@ static int xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma, struct xe_vm_pgtable_update *entries, u32 *num_entries) { - struct xe_device *xe = tile_to_xe(tile); struct xe_bo *bo = xe_vma_bo(vma); bool is_devmem = !xe_vma_is_userptr(vma) && bo && (xe_bo_is_vram(bo) || xe_bo_is_stolen_devmem(bo)); @@ -619,9 +618,26 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma, struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id]; int ret; - if ((vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) && - (is_devmem || !IS_DGFX(xe))) - xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE; + /** + * Default atomic expectations for different allocation scenarios are as follows: + * + * 1. Traditional API: When the VM is not in fault mode or LR mode: + * - Device atomics are expected to function with all allocations. + * + * 2. Compute/SVM API: When the VM is either in fault mode or LR mode: + * - Device atomics are the default behavior when the bo is placed in a single region. + * - In all other cases device atomics will be disabled with AE=0 until an application + * request differently using a ioctl like madvise. + */ + if (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) { + if (xe_vm_in_fault_mode(xe_vma_vm(vma)) || + xe_vm_in_lr_mode(xe_vma_vm(vma))) { + if (bo && xe_bo_has_single_placement(bo)) + xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE; + } else { + xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE; + } + } if (is_devmem) { xe_walk.default_pte |= XE_PPGTT_PTE_DM; diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c index e41345c1627d..ac08b6fd537e 100644 --- a/drivers/gpu/drm/xe/xe_vm.c +++ b/drivers/gpu/drm/xe/xe_vm.c @@ -805,7 +805,8 @@ static struct xe_vma *xe_vma_create(struct xe_vm *vm, for_each_tile(tile, vm->xe, id) vma->tile_mask |= 0x1 << id; - if (GRAPHICS_VER(vm->xe) >= 20 || vm->xe->info.platform == XE_PVC) + if (vm->xe->info.has_atomic_enable_pte_bit && + vm->xe->info.has_device_atomics_on_smem) vma->gpuva.flags |= XE_VMA_ATOMIC_PTE_BIT; vma->pat_index = pat_index; </blockquote> </blockquote> </body> </html>