xe vs amdgpu userptr handling

Christian König christian.koenig at amd.com
Thu Feb 8 11:08:49 UTC 2024


Am 08.02.24 um 10:43 schrieb Thomas Hellström:
> On Thu, 2024-02-08 at 10:38 +0100, Thomas Hellström wrote:
>> Hi,
>>
>> On Thu, 2024-02-08 at 07:30 +0100, Christian König wrote:
>>> Am 08.02.24 um 01:36 schrieb Dave Airlie:
>>>> Just cc'ing some folks. I've also added another question.
>>>>
>>>>> On Wed, 7 Feb 2024 at 21:08, Maíra Canal <mcanal at igalia.com> wrote:
>>>>> Adding another point to this discussion, would it make sense to
>>>>> somehow create a generic structure that all drivers, including
>>>>> shmem drivers, could use?
>>>>>
>>>>> Best Regards,
>>>>> - Maíra
>>>>>
>>>>> On 2/7/24 03:56, Dave Airlie wrote:
>>>>>> I'm just looking over the userptr handling in both drivers, and of
>>>>>> course they've chosen different ways to represent things. Again,
>>>>>> this is a divergence that is just going to get more annoying over
>>>>>> time, and eventually I'd like to make hmm/userptr driver independent
>>>>>> as much as possible, so we get consistent semantics in userspace.
>>>>>>
>>>>>> AFAICS the main difference is that amdgpu builds the userptr
>>>>>> handling inside a GEM object in the kernel, whereas xe doesn't
>>>>>> bother creating a holding object and just handles things directly
>>>>>> in the VM binding code.
>>>>>>
>>>>>> Is this just different thinking at different times here? Like,
>>>>>> since we have VM BIND in xe, it made sense not to bother creating
>>>>>> a gem object for userptrs? Or are there other advantages to going
>>>>>> one way or the other?
>>>>>>
>>>> So the current AMD code uses hmm to do userptr work, but xe doesn't.
>>>> Again, why isn't xe using hmm here? I thought I remembered scenarios
>>>> where plain mmu_notifiers weren't sufficient.
>>> Well, amdgpu is using hmm_range_fault because that was made mandatory
>>> for the userptr handling.
>>>
>>> And yes, pure mmu_notifiers are not sufficient; you need the sequence
>>> number + retry mechanism hmm_range_fault provides.
>>>
>>> Otherwise you open up security holes you can push an elephant through.
>>>
>>> Regards,
>>> Christian.
>> Xe also uses a retry mechanism, so no security hole. The main
>> difference is that we use get_user_pages() + retry instead of
>> hmm_range_fault() + retry, with the net effect that we're probably
>> holding the page refcounts a little longer, but we drop them
>> immediately after obtaining the page pointers and dirtying the pages.
>>
>> That said, we should move over to hmm_range_fault() to align. I think
>> this was only a result of limited hmm knowledge when the xe code was
>> written.

Yeah, that makes sense. In this case it's just a missing cleanup.
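
For reference, the sequence number + retry pattern that hmm_range_fault()
is built around looks roughly like the sketch below. This is only a
minimal outline following Documentation/mm/hmm.rst; the my_userptr
structure, the notifier_lock and the page table update step are
placeholders for illustration, not any driver's actual code.

#include <linux/hmm.h>
#include <linux/mmu_notifier.h>
#include <linux/mm.h>

/* Hypothetical per-binding state, kept minimal for the example. */
struct my_userptr {
        struct mmu_interval_notifier notifier;  /* invalidation + seqno */
        struct mutex notifier_lock;             /* also taken by invalidate() */
        unsigned long start, end;
};

static int my_userptr_populate(struct my_userptr *up, struct mm_struct *mm,
                               unsigned long *pfns)
{
        struct hmm_range range = {
                .notifier = &up->notifier,
                .start = up->start,
                .end = up->end,
                .hmm_pfns = pfns,
                .default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE,
        };
        int ret;

again:
        range.notifier_seq = mmu_interval_read_begin(range.notifier);

        mmap_read_lock(mm);
        ret = hmm_range_fault(&range);
        mmap_read_unlock(mm);
        if (ret) {
                if (ret == -EBUSY)
                        goto again;     /* an invalidation raced the fault */
                return ret;
        }

        /* Must take the same lock the invalidate callback takes. */
        mutex_lock(&up->notifier_lock);
        if (mmu_interval_read_retry(range.notifier, range.notifier_seq)) {
                mutex_unlock(&up->notifier_lock);
                goto again;             /* pages changed under us, redo */
        }

        /* pfns[] is now stable; program the GPU page tables here. */
        mutex_unlock(&up->notifier_lock);
        return 0;
}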

>> As for the userptr not using a backing object in Xe, AFAIK that was
>> because a backing object was deemed unnecessary with VM_BIND. Joonas or
>> Matt can probably provide more detail, but if we're going to do an
>> hmmptr, and have userptr only be a special case of that, then the
>> object is ofc out of the question.

Well how do you then store the dma_fence of the operation?

At least my long-term plan was to move some of the logic necessary for
hmm_range_fault-based userptr handling into common GEM code.

One puzzle piece of that is the drm_exec object, and for that to work
userptrs would need to be based on GEM objects as well.
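
To illustrate the drm_exec point: drm_exec iterates over and locks
drm_gem_objects, so a userptr binding would need to be (or at least
reference) a GEM object to take part in the common locking. A rough
sketch, assuming an array of bound objects and glossing over the
drm_exec_init() signature, which has changed slightly between kernel
versions:

#include <drm/drm_exec.h>
#include <drm/drm_gem.h>

/* Lock every object bound into a VM before submission/eviction work. */
static int lock_vm_objects(struct drm_gem_object **objs, unsigned int count)
{
        struct drm_exec exec;
        unsigned int i;
        int ret = 0;

        drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, count);
        drm_exec_until_all_locked(&exec) {
                for (i = 0; i < count; i++) {
                        ret = drm_exec_lock_obj(&exec, objs[i]);
                        drm_exec_retry_on_contention(&exec);
                        if (ret)
                                goto out;
                }
        }

        /* Submit work and attach the resulting dma_fence to the objects
         * here; with a GEM-less userptr there is no obvious place to
         * store that fence. */
out:
        drm_exec_fini(&exec);
        return ret;
}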

>> FWIW i915 also keeps an object, but it also pins the pages and relies
>> on the shrinker to release that pinning.

Well, what exactly do you pin? The pages backing the userptr?

Because that would be a no-go as well, I think, since it badly clashes
with NUMA migrations and transparent huge pages.

Regards,
Christian.

>>
>> So yes, some common code would come in handy. From looking at all the
>> implementations I'd:
>>
>> - Use hmm_range_fault(). We probably want to temporarily get and lock
>> the pages to dirty them at fault time, though, if the gpu mapping is
>> write-enabled.
>> - Don't use a backing object, to be able to unify hmmptr and userptr.
> Oh, and to clarify for people that haven't been following the gpuvm
> discussions and the xe SVM discussions:
>
> hmmptr is sparsely populated, allocated on demand (fault aware), and
> can do migration; userptr is allocated upfront and can't do migration.
>
> Idea has been to create helpers for these in drm_gpuvm().
>
> /Thomas
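
Just so we mean the same thing with those helpers: a purely hypothetical
sketch of per-mapping state in such a drm_gpuvm helper, with userptr as
the populate-upfront, non-migrating special case of hmmptr. None of these
names exist in the kernel today; it's only meant to show the shape of the
idea.

#include <drm/drm_gpuvm.h>
#include <linux/mmu_notifier.h>

/* Hypothetical drm_gpuvm helper state, names invented for illustration. */
struct gpuvm_hmmptr_example {
        struct drm_gpuva va;                    /* the VM_BIND mapping */
        struct mmu_interval_notifier notifier;  /* invalidation + seqno */
        unsigned long *pfns;                    /* filled by hmm_range_fault() */
        bool populate_upfront;                  /* userptr: all pages at bind */
        bool allow_migration;                   /* hmmptr: true, userptr: false */
};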
>
>> Thanks,
>> Thomas
>>>> Dave.


