[PATCH 17/35] drm/amdkfd: register HMM device private zone

Felix Kuehling felix.kuehling at amd.com
Thu Mar 4 17:58:00 UTC 2021


On 2021-03-01 at 3:46 a.m., Thomas Hellström (Intel) wrote:
>
> On 3/1/21 9:32 AM, Daniel Vetter wrote:
>> On Wed, Jan 06, 2021 at 10:01:09PM -0500, Felix Kuehling wrote:
>>> From: Philip Yang <Philip.Yang at amd.com>
>>>
>>> Register vram memory as MEMORY_DEVICE_PRIVATE type resource, to
>>> allocate vram backing pages for page migration.
>>>
>>> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>> So maybe I'm getting this all wrong, but I think that the current ttm
>> fault code relies on devmap pte entries (especially for hugepte entries)
>> to stop get_user_pages. But this only works if the pte happens to not
>> point at a range with devmap pages.
>
> I don't think that's in TTM yet, but the proposed fix, yes (see email
> I just sent in another thread),
> but only for huge ptes.
>
>>
>> This patch here changes that, and so probably breaks this devmap pte
>> hack ttm is using?
>>
>> If I'm not wrong here then I think we need to first fix up the ttm
>> code to not use the devmap hack anymore, before a ttm based driver can
>> register a dev_pagemap. Also adding Thomas since that just came up in
>> another discussion.
>
> It doesn't break the ttm devmap hack per se, but it does allow gup
> on the registered range. Here's where my understanding falls short:
> why can't we allow gup-ing TTM ptes if there is indeed a backing
> struct page? Registering MEMORY_DEVICE_PRIVATE implies that,
> right?

I wasn't aware that TTM used devmap at all. If it does, what type of
memory does it use?

MEMORY_DEVICE_PRIVATE is like swapped out memory. It cannot be mapped in
the CPU page table. GUP would cause a page fault to swap it back into
system memory. We are looking into using MEMORY_DEVICE_GENERIC for a
future coherent memory architecture, where device memory can be
coherently accessed by the CPU and GPU.

As I understand it, our DEVICE_PRIVATE registration is not tied to an
actual physical address. Thus your devmap registration and our devmap
registration could probably coexist without any conflict. You'll just
have the overhead of two sets of struct pages for the same memory.

Regards,
  Felix


>
> /Thomas
>
>> -Daniel
>>
>

