[PATCH v6 5/5] drm/xe: Refactor default device atomic settings
Zeng, Oak
oak.zeng at intel.com
Fri May 3 16:11:32 UTC 2024
> -----Original Message-----
> From: Das, Nirmoy <nirmoy.das at intel.com>
> Sent: Friday, May 3, 2024 12:01 PM
> To: Zeng, Oak <oak.zeng at intel.com>; intel-xe at lists.freedesktop.org
> Cc: Mrozek, Michal <michal.mrozek at intel.com>
> Subject: Re: [PATCH v6 5/5] drm/xe: Refactor default device atomic settings
>
> Hi Oak,
>
> On 5/3/2024 5:39 PM, Zeng, Oak wrote:
> > Hi, Nirmoy,
> >
> >> -----Original Message-----
> >> From: Intel-xe <intel-xe-bounces at lists.freedesktop.org> On Behalf Of
> >> Nirmoy Das
> >> Sent: Tuesday, April 30, 2024 12:25 PM
> >> To: intel-xe at lists.freedesktop.org
> >> Cc: Das, Nirmoy <nirmoy.das at intel.com>; Mrozek, Michal
> >> <michal.mrozek at intel.com>
> >> Subject: [PATCH v6 5/5] drm/xe: Refactor default device atomic settings
> >>
> >> The default behavior of device atomics depends on the
> >> VM type and buffer allocation types. Device atomics are
> >> expected to function with all types of allocations for
> >> traditional applications/APIs. Additionally, in compute/SVM
> >> API scenarios with fault mode or LR mode VMs, device atomics
> >> must work with single-region allocations. In all other cases
> >> device atomics should be disabled by default also on platforms
> >> where we know device atomics doesn't on work on particular
> >> allocations types.
> >>
> >> v3: fault mode requires LR mode so only check for LR mode
> >> to determine compute API(Jose).
> >> Handle SMEM+LMEM BO's migration to LMEM where device
> >> atomics is expected to work. (Brian).
> >> v2: Fix platform checks to correct atomics behaviour on PVC.
> >>
> >> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> >> Acked-by: Michal Mrozek <michal.mrozek at intel.com>
> >> ---
> >> drivers/gpu/drm/xe/xe_pt.c | 37
> >> ++++++++++++++++++++++++++++++++++---
> >> drivers/gpu/drm/xe/xe_vm.c | 2 +-
> >> 2 files changed, 35 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> >> index 8d3765d3351e..87975e45622a 100644
> >> --- a/drivers/gpu/drm/xe/xe_pt.c
> >> +++ b/drivers/gpu/drm/xe/xe_pt.c
> >> @@ -619,9 +619,40 @@ xe_pt_stage_bind(struct xe_tile *tile, struct
> xe_vma
> >> *vma,
> >> struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id];
> >> int ret;
> >>
> >> - if ((vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) &&
> >> - (is_devmem || !IS_DGFX(xe)))
> >> - xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
> > I think below logic can be moved to a separate function? I am also good if
> you leave it as below. But I think a separate function is better
> I was thinking about but I think we should do that, once it gets more
> complicated with madvise-like options.
> >
> >
> >> + /**
> >> + * Default atomic expectations for different allocation scenarios are
> >> as follows:
> >> + *
> >> + * 1. Traditional API: When the VM is not in LR mode:
> >> + * - Device atomics are expected to function with all allocations.
> >> + *
> >> + * 2. Compute/SVM API: When the VM is in LR mode:
> >> + * - Device atomics are the default behavior when the bo is placed
> >> in a single region.
> >> + * - In all other cases device atomics will be disabled with AE=0 until
> >> an application
> >> + * request differently using a ioctl like madvise.
> >> + */
> >> + if (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) {
> >> + if (xe_vm_in_lr_mode(xe_vma_vm(vma))) {
> >> + if (bo && xe_bo_has_single_placement(bo))
> >> + xe_walk.default_pte |=
> >> XE_USM_PPGTT_PTE_AE;
> >> + /**
> >> + * If a SMEM+LMEM allocation is backed by SMEM, a
> >> device
> >> + * atomics will cause a gpu page fault and which then
> >> + * gets migrated to LMEM, bind such allocations with
> >> + * device atomics enabled.
> >> + */
> >> + else if (is_devmem
> >> && !xe_bo_has_single_placement(bo))
> >
> > Note bo could be NULL here...
>
> is_devmem checks bo that so I left it out.
>
> >
> > So userptr and system allocator don't have a bo
> >
> > Userptr can't run into this case because userptr can't be is_devmem
> >
> > System allocator allocated memory can have devmem backing store... so
> seems the logic can be:
> >
> > (Is_devmem && ((bo && !single_placement) || !bo)
> >
> > But we don't have system allocator for now, so your logic should work for
> the current code.
>
> Yes, default for userptr is AE=0 hence I left it so and planning to
> change that when we have madvise-like ioctl.
Not only madvise
What is in my mind for system allocator is here: https://patchwork.freedesktop.org/patch/588534/?series=132229&rev=1
As you can see, there is a case where xe_vma_is_devmem returns true while !bo.
Anyway, the code seems good as of now.
Oak
>
>
> > If you write as above, then it is future proofed. I am okay with your
> current writing also - I just need to change it a little when system allocator
> come into picture....
> >
> > I do feel the logic here is getting complicated, hard for understanding and
> maintenance. Let's follow up the other email thread to see whether we can
> simply default all allocation to be NO_ATOMIC, and depends on compiler and
> UMD to set atomics up.
>
> I will keep track of that.
>
>
> >
> > As of now, Patch is:
> >
> > Reviewed-by: Oak Zeng <oak.zeng at intel.com>
>
>
> Thanks a lot.
>
> Nirmoy
>
> >
> >> + xe_walk.default_pte |=
> >> XE_USM_PPGTT_PTE_AE;
> >> + } else {
> >> + xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
> >> + }
> >> +
> >> + /**
> >> + * Unset AE if the platform(PVC) doesn't support it on an
> >> + * allocation
> >> + */
> >> + if (!xe->info.has_device_atomics_on_smem && !is_devmem)
> >> + xe_walk.default_pte &= ~XE_USM_PPGTT_PTE_AE;
> >> + }
> >>
> >> if (is_devmem) {
> >> xe_walk.default_pte |= XE_PPGTT_PTE_DM;
> >> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> >> index f1357e2a3b10..d17192c8b7de 100644
> >> --- a/drivers/gpu/drm/xe/xe_vm.c
> >> +++ b/drivers/gpu/drm/xe/xe_vm.c
> >> @@ -888,7 +888,7 @@ static struct xe_vma *xe_vma_create(struct
> xe_vm
> >> *vm,
> >> for_each_tile(tile, vm->xe, id)
> >> vma->tile_mask |= 0x1 << id;
> >>
> >> - if (GRAPHICS_VER(vm->xe) >= 20 || vm->xe->info.platform ==
> >> XE_PVC)
> >> + if (vm->xe->info.has_atomic_enable_pte_bit)
> >> vma->gpuva.flags |= XE_VMA_ATOMIC_PTE_BIT;
> >>
> >> vma->pat_index = pat_index;
> >> --
> >> 2.42.0
More information about the Intel-xe
mailing list