Need to support mixed memory mappings with HMM

Christian König ckoenig.leichtzumerken at gmail.com
Fri Mar 26 15:03:16 UTC 2021


On 26.03.21 at 15:49, Felix Kuehling wrote:
> On 2021-03-26 at 4:50 a.m., Christian König wrote:
>>
>> On 25.03.21 at 17:28, Felix Kuehling wrote:
>>> On 2021-03-25 at 12:22 p.m., Christian König wrote:
>>>>>>>>> My idea is to change the amdgpu_vm_update_mapping interface to
>>>>>>>>> use some high bits in the pages_addr array to indicate the page
>>>>>>>>> type. For the default page type (0) nothing really changes for
>>>>>>>>> the callers. The "flags" parameter needs to become a pointer to
>>>>>>>>> an array that gets indexed by the high bits from the pages_addr
>>>>>>>>> array. For existing callers it's as easy as changing flags to
>>>>>>>>> &flags (array of size 1). For HMM we would pass a pointer to a
>>>>>>>>> real array.
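A minimal sketch of the encoding proposed above; the macro and helper
names here are purely illustrative, not the actual amdgpu interface:

    #include <stdint.h>

    /* Illustrative only: reserve four high bits of each pages_addr entry
     * as an index into a caller-supplied array of PTE flags. */
    #define PTYPE_SHIFT 60
    #define PTYPE_MASK  (0xfULL << PTYPE_SHIFT)

    static inline unsigned int entry_ptype(uint64_t entry)
    {
            return (unsigned int)((entry & PTYPE_MASK) >> PTYPE_SHIFT);
    }

    static inline uint64_t entry_dma_addr(uint64_t entry)
    {
            return entry & ~PTYPE_MASK;
    }

An existing caller with a single flags value would pass &flags, i.e. an
array of size 1 indexed by page type 0; HMM would pass a real array with
one flags entry per page type.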
>>>>>>>> Yeah, I've thought about stuff like that as well for a while.
>>>>>>>>
>>>>>>>> The problem is that this won't work that easily. We assume in
>>>>>>>> many places that the flags don't change for the range in question.
>>>>>>> I think some lower-level functions assume that the flags stay the
>>>>>>> same for physically contiguous ranges. But if you use the high bits
>>>>>>> to encode the page type, the ranges won't be contiguous any more.
>>>>>>> So you can use different page flags for different contiguous ranges.
>>>>>> Yeah, but then you also get absolutely zero THP and fragment flags
>>>>>> support.
>>>>> As long as you have a contiguous 2MB page with the same page type, I
>>>>> think you can still get a THP mapping in the GPU page table. But if
>>>>> one page in the middle of a 2MB page has a different page type, that
>>>>> will break the THP mapping, as it should.
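To make the 2MB case concrete, a hedged sketch assuming 4K pages (512
per 2MB range) and the hypothetical high-bit encoding sketched earlier;
can_map_as_thp() is an illustrative name, not a real amdgpu function:

    #include <stdbool.h>
    #include <stdint.h>

    #define SMALL_PAGE 4096ULL

    /* With the page type encoded in the high bits of each entry, a type
     * change in the middle of the 2MB range makes the entries look
     * non-contiguous, so the THP mapping is refused, as it should be. */
    static bool can_map_as_thp(const uint64_t *pages_addr, uint64_t pfn)
    {
            for (int i = 1; i < 512; ++i)
                    if (pages_addr[pfn + i] !=
                        pages_addr[pfn] + (uint64_t)i * SMALL_PAGE)
                            return false;
            return true;
    }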
>>>> Yeah, but currently we detect that before we call down into the
>>>> functions to update the tables.
>>>>
>>>> When you give it a mixed list, the chance to detect that is just
>>>> completely gone.
>>> Currently the detection of contiguous ranges is in
>>> amdgpu_vm_update_mapping:
>>>
>>>> if (num_entries > AMDGPU_GPU_PAGES_IN_CPU_PAGE) {
>>>>         uint64_t pfn = cursor.start >> PAGE_SHIFT;
>>>>         uint64_t count;
>>>>
>>>>         contiguous = pages_addr[pfn + 1] == pages_addr[pfn] + PAGE_SIZE;
>>>>         tmp = num_entries / AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>>>         for (count = 2; count < tmp; ++count) {
>>>>                 uint64_t idx = pfn + count;
>>>>
>>>>                 if (contiguous != (pages_addr[idx] ==
>>>>                     pages_addr[idx - 1] + PAGE_SIZE))
>>>>                         break;
>>>>         }
>>>>         num_entries = count * AMDGPU_GPU_PAGES_IN_CPU_PAGE;
>>>> }
>>> If a page type changes from one page to the next, it will end a
>>> contiguous range and call into the lower level (amdgpu_vm_update_ptes).
>>> I don't think anything needs to change in amdgpu_vm_update_ptes, because
>>> all contiguous pages passed to it use the same page type, and thus the
>>> same flags. And the existing code in amdgpu_vm_update_mapping already does
>>> the right thing. I honestly don't see the problem.
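A tiny self-contained illustration of why the check above splits at a
type change; the addresses and the bit position are made up:

    #include <assert.h>
    #include <stdint.h>

    #define PAGE_SIZE 4096ULL

    int main(void)
    {
            /* Two physically adjacent 4K pages whose entries carry
             * different page types in the high bits. */
            uint64_t first  = 0x100000ULL | (0ULL << 60);   /* type 0 */
            uint64_t second = 0x101000ULL | (1ULL << 60);   /* type 1 */

            /* The contiguity check sees them as non-contiguous, so the
             * range ends exactly where the flags would have to change. */
            assert(second != first + PAGE_SIZE);
            return 0;
    }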
>> But then you could also just call amdgpu_vm_update_mapping() multiple
>> times feeding it from whatever data structure you use in the HMM code.
>>
>> Using the page array sounds to me like an intermediate data structure
>> that you only created to feed into amdgpu_vm_update_mapping().
> You're right. It could be done. One new call to amdgpu_vm_update_mapping
> every time the flags change as we iterate over virtual addresses. It
> should work OK as long as the mappings of different memory types aren't
> too finely interleaved.
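A rough sketch of that iteration; page_type(), update_mapping() and
flags_by_type are hypothetical stand-ins for the HMM data structures
and for amdgpu_vm_update_mapping():

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-ins for the real interfaces. */
    extern unsigned int page_type(uint64_t entry);
    extern void update_mapping(const uint64_t *pages, size_t count,
                               uint64_t flags);

    /* Issue one update call per run of pages that share a page type, so
     * each call sees a single, constant flags value. */
    static void update_mixed(const uint64_t *pages, size_t npages,
                             const uint64_t *flags_by_type)
    {
            size_t run = 0;

            for (size_t i = 1; i <= npages; ++i) {
                    if (i < npages &&
                        page_type(pages[i]) == page_type(pages[run]))
                            continue;
                    update_mapping(pages + run, i - run,
                                   flags_by_type[page_type(pages[run])]);
                    run = i;
            }
    }

As noted, this stays cheap as long as the runs are long; heavily
interleaved page types degenerate into one call per page.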

I suggest that we stick to this approach for now.

The only alternative I see is that we have a unified address space for
DMA addresses, similar to what you have outlined.

This way we can easily figure out which device each address belongs to
and then calculate the best path to that device.

E.g. a direct address for system memory, clearing the S bit for local
memory, using XGMI for remote memory, etc.

That is similar to what you have in mind, just a bit more generalized.
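A very rough sketch of that generalization; every name here is
hypothetical and only illustrates the lookup-then-route idea:

    #include <stdint.h>

    enum mem_path {
            PATH_SYSTEM_DIRECT,     /* use the DMA address as-is */
            PATH_LOCAL_VRAM,        /* e.g. clear the S bit */
            PATH_XGMI_REMOTE,       /* route over an XGMI link */
    };

    struct addr_range {
            uint64_t start, end;    /* range in the unified DMA space */
            int owner;              /* owning device, -1 = system memory */
    };

    /* Given the owning device of an address range, pick the best path
     * from the point of view of the device doing the mapping. */
    static enum mem_path path_for(const struct addr_range *r, int self)
    {
            if (r->owner < 0)
                    return PATH_SYSTEM_DIRECT;
            if (r->owner == self)
                    return PATH_LOCAL_VRAM;
            return PATH_XGMI_REMOTE;
    }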

Regards,
Christian.

>
> Regards,
>    Felix
>
>
>> Regards,
>> Christian.
>>
>>> Regards,
>>>     Felix
>>>
>>>
>>>> Regards,
>>>> Christian.


