[PATCH] drm/xe: Set PTE_AE for xe2 dgfx platforms
Nirmoy Das
nirmoy.das at intel.com
Fri Jan 19 10:18:40 UTC 2024
Hi Matt,
On 1/18/2024 5:08 PM, Matt Roper wrote:
> On Thu, Jan 18, 2024 at 03:42:56PM +0100, Nirmoy Das wrote:
>> Hi Matt,
>>
>> On 1/18/2024 1:49 AM, Matt Roper wrote:
>>> On Wed, Jan 17, 2024 at 03:48:51PM +0100, Nirmoy Das wrote:
>>>> Atomics on Xe2 work for both types of memory, so
>>>> extend setting PTE_AE to dgfx platforms as well.
>>> There are no Xe2 dgfx platforms yet, so it's kind of hard to review this
>>> fully,
>> I should have sent it internally.
>>> but my understanding from bspec 71539 is that in theory a dGPU
>>> atomic operation against system memory would probably only be atomic in
>>> device scope, not global scope. I.e., it's atomic with other GPU
>>> operations,
>> This is my understanding as well.
>>> assuming the CPU isn't also accessing the buffer. But if
>>> the buffer is shared between the CPU and GPU, then you'd want to set
>>> AE=0 to ensure that we get a page fault and can migrate the object into
>>> lmem first.
>> I think on system memory, AE=1 should be the default, with the UMD
>> opting out through a uAPI. Basic operations like MI_ATOMIC will fail
>> otherwise. If I understand correctly, MI_ATOMIC should work on system
>> memory except on PVC, because of a known bug. I added a small test to
>> check MI_ATOMIC, which works on dg2 system memory
>>
>> https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_10547/bat-dg2-oem2/igt@xe_exec_atomic@basic-dec-all@engine-drm_xe_engine_class_compute-instance-0-tile-0-system-memory.html
>>
>> and also on discrete Xe2, but that requires this patch.
> I think it depends what you mean by "work." My understanding is that
> barring hardware defects, MI_ATOMIC can always execute and perform the
> requested operation on Xe2. However in some situations the operation
> will not actually behave atomically from the CPU's point of view. The
> AE bit in the page table doesn't make MI_ATOMIC work/not work, it just
> determines whether we want MI_ATOMIC to trigger a page fault or not. In
> cases like this where a buffer can be accessed from both the CPU and
> GPU, and where we need true global scope atomicity, then we'd want to
> make sure AE=0 so that a page fault is generated and we get the
> opportunity to migrate the buffer to LMEM.
>
> I think the expectation of userspace would be that MI_ATOMIC is always
> truly atomic. Wouldn't it be better to make sure we trigger a page
> fault by default and only turn that off if userspace explicitly tells us
> they're okay with relaxing to device scope atomicity?
Yes, if the default expectation is global scope then it makes sense to
have AE=0.
Had a chat with Oak and he confirmed that for L0 the expectation is
global scope.
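
To make sure we mean the same thing, here is a rough sketch of the policy
as I understand it from this thread. It is not a proposed patch. Every name
in it (sketch_pte_ae, SKETCH_PTE_AE, the scope enum) is made up for
illustration, the bit position is a placeholder, and the "device scope"
opt-in stands for a uAPI that does not exist yet. It also leaves out the
XE_VMA_ATOMIC_PTE_BIT check that the real code does first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for illustration only; not taken from the xe headers. */
#define SKETCH_PTE_AE (1ull << 10)

/* Hypothetical userspace hint; no such uAPI exists in this thread. */
enum sketch_atomic_scope {
	SKETCH_SCOPE_GLOBAL,	/* default: atomic w.r.t. CPU and GPU */
	SKETCH_SCOPE_DEVICE,	/* relaxed: atomic w.r.t. the GPU only */
};

/*
 * Return the AE bit to OR into the default PTE, or 0 to leave it clear
 * so that a GPU atomic faults and the buffer can be migrated to LMEM.
 */
static uint64_t sketch_pte_ae(bool is_dgfx, bool is_devmem,
			      enum sketch_atomic_scope scope)
{
	/* Device memory keeps AE set, as the existing check already does. */
	if (is_devmem)
		return SKETCH_PTE_AE;

	/* Integrated parts keep AE set for system memory, as today. */
	if (!is_dgfx)
		return SKETCH_PTE_AE;

	/*
	 * Discrete part, system memory: only set AE when userspace has
	 * said that device scope atomicity is good enough.
	 */
	return scope == SKETCH_SCOPE_DEVICE ? SKETCH_PTE_AE : 0;
}
```

With this, a dgfx buffer in system memory gets AE=0 unless userspace asks
for device scope, which matches the global scope expectation for L0.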
>
> Either way, we should probably wait until there actually are Xe2
> discrete GPUs; at the moment this is all theoretical since the only
> Xe2 platform we have today is an igpu.
Sounds good.
Thanks,
Nirmoy
>
>
> Matt
>
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>>> Matt
>>>
>>>> Cc: Fei Yang<fei.yang at intel.com>
>>>> Cc: Jose Souza<jose.souza at intel.com>
>>>> Cc: Matt Roper<matthew.d.roper at intel.com>
>>>> Cc: Brian Welty<brian.welty at intel.com>
>>>> Signed-off-by: Nirmoy Das<nirmoy.das at intel.com>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_pt.c | 6 +++++-
>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>>>> index de1030a47588..3ace4b401369 100644
>>>> --- a/drivers/gpu/drm/xe/xe_pt.c
>>>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>>>> @@ -602,8 +602,12 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
>>>> struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id];
>>>> int ret;
>>>> + /*
>>>> + * XE_USM_PPGTT_PTE_AE is available for igfx and dgfx from xe2 onwards
>>>> + * and also for PVC, but atomics only work for PVC on device memory.
>>>> + */
>>>> if (vma && (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) &&
>>>> - (is_devmem || !IS_DGFX(xe)))
>>>> + (is_devmem || GRAPHICS_VER(xe) >= 20))
>>>> xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
>>>> if (is_devmem) {
>>>> --
>>>> 2.42.0
>>>>