[PATCH] drm/xe: Set PTE_AE for xe2 dgfx platforms

Nirmoy Das nirmoy.das at intel.com
Fri Jan 19 10:18:40 UTC 2024


Hi Matt,

On 1/18/2024 5:08 PM, Matt Roper wrote:
> On Thu, Jan 18, 2024 at 03:42:56PM +0100, Nirmoy Das wrote:
>> Hi Matt,
>>
>> On 1/18/2024 1:49 AM, Matt Roper wrote:
>>> On Wed, Jan 17, 2024 at 03:48:51PM +0100, Nirmoy Das wrote:
>>>> Atomics on Xe2 work for both types of memory, so
>>>> extend setting PTE_AE to dgfx platforms as well.
>>> There are no Xe2 dgfx platforms yet, so it's kind of hard to review this
>>> fully,
>> I should have sent it internally.
>>>    but my understanding from bspec 71539 is that in theory a dGPU
>>> atomic operation against system memory would probably only be atomic in
>>> device scope, not global scope.  I.e., it's atomic with other GPU
>>> operations,
>> This is my understanding as well.
>>>    assuming the CPU isn't also accessing the buffer.  But if
>>> the buffer is shared between the CPU and GPU, then you'd want to set
>>> AE=0 to ensure that we get a page fault and can migrate the object into
>>> lmem first.
>> I think on system memory, AE=1 should be the default, with an opt-out by
>> the UMD through a uAPI. Otherwise basic operations like MI_ATOMIC will
>> fail, and if I understand correctly MI_ATOMIC should work on system
>> memory except on PVC because of a known bug. I added a small test to
>> check MI_ATOMIC, which works on dg2 system memory
>>
>> https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_10547/bat-dg2-oem2/igt@xe_exec_atomic@basic-dec-all@engine-drm_xe_engine_class_compute-instance-0-tile-0-system-memory.html
>>
>> and also on discrete Xe2, but that requires this patch.
> I think it depends what you mean by "work."  My understanding is that
> barring hardware defects, MI_ATOMIC can always execute and perform the
> requested operation on Xe2.  However in some situations the operation
> will not actually behave atomically from the CPU's point of view.  The
> AE bit in the page table doesn't make MI_ATOMIC work/not work, it just
> determines whether we want MI_ATOMIC to trigger a page fault or not.  In
> cases like this where a buffer can be accessed from both the CPU and
> GPU, and where we need true global scope atomicity, then we'd want to
> make sure AE=0 so that a page fault is generated and we get the
> opportunity to migrate the buffer to LMEM.
>
> I think the expectation of userspace would be that MI_ATOMIC is always
> truly atomic.  Wouldn't it be better to make sure we trigger a page
> fault by default and only turn that off if userspace explicitly tells us
> they're okay with relaxing to device scope atomicity?

Yes, if the default expectation is global scope then it makes sense to 
have AE=0.

Had a chat with Oak and he confirmed that for L0 the expectation is 
global scope.
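
To make that default concrete, here is a rough sketch of the policy as I
read it from this thread. It is only an illustration: struct sketch_bind,
sketch_default_pte(), the device_scope_ok opt-in and the bit position are
made-up names and values, not existing xe code or uAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for XE_USM_PPGTT_PTE_AE; the bit position is an assumption. */
#define SKETCH_PTE_AE (UINT64_C(1) << 10)

/* Hypothetical description of one bind request; not a real xe structure. */
struct sketch_bind {
	bool is_dgfx;         /* discrete GPU with its own local memory */
	bool is_devmem;       /* backing store is device-local memory */
	bool wants_atomic;    /* stands in for XE_VMA_ATOMIC_PTE_BIT */
	bool device_scope_ok; /* assumed opt-in: userspace accepts device scope */
};

/* Return the AE bit (or 0) to OR into the default PTE for this bind. */
static uint64_t sketch_default_pte(const struct sketch_bind *b)
{
	assert(b != NULL);

	/* No atomic access requested: never set AE. */
	if (!b->wants_atomic)
		return 0;

	/* Device memory, or an igfx where system memory is all there is. */
	if (b->is_devmem || !b->is_dgfx)
		return SKETCH_PTE_AE;

	/*
	 * dgfx + system memory: keep AE=0 by default so the atomic faults and
	 * the buffer can be migrated to LMEM; set AE only on explicit opt-in.
	 */
	return b->device_scope_ok ? SKETCH_PTE_AE : 0;
}
```

With this, a dgfx bind to system memory gets AE=0 unless userspace has
said that device scope atomicity is enough.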

>
> Either way, we should probably wait until there actually are Xe2
> discrete GPUs; at the moment this is all theoretical since the only
> Xe2 platform we have today is an igpu.

Sounds good.


Thanks,

Nirmoy

>
>
> Matt
>
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>>> Matt
>>>
>>>> Cc: Fei Yang<fei.yang at intel.com>
>>>> Cc: Jose Souza<jose.souza at intel.com>
>>>> Cc: Matt Roper<matthew.d.roper at intel.com>
>>>> Cc: Brian Welty<brian.welty at intel.com>
>>>> Signed-off-by: Nirmoy Das<nirmoy.das at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_pt.c | 6 +++++-
>>>>    1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
>>>> index de1030a47588..3ace4b401369 100644
>>>> --- a/drivers/gpu/drm/xe/xe_pt.c
>>>> +++ b/drivers/gpu/drm/xe/xe_pt.c
>>>> @@ -602,8 +602,12 @@ xe_pt_stage_bind(struct xe_tile *tile, struct xe_vma *vma,
>>>>    	struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id];
>>>>    	int ret;
>>>> +	/**
>>>> +	 * XE_USM_PPGTT_PTE_AE is available for igfx and dgfx from xe2 onwards
>>>> +	 * and also for PVC but atomics only works for PVC on device memory.
>>>> +	 */
>>>>    	if (vma && (vma->gpuva.flags & XE_VMA_ATOMIC_PTE_BIT) &&
>>>> -	    (is_devmem || !IS_DGFX(xe)))
>>>> +	    (is_devmem || GRAPHICS_VER(xe) >= 20))
>>>>    		xe_walk.default_pte |= XE_USM_PPGTT_PTE_AE;
>>>>    	if (is_devmem) {
>>>> -- 
>>>> 2.42.0
>>>>

