[PATCH 2/2] mm: remove device private page support from hmm_range_fault

Ralph Campbell rcampbell at nvidia.com
Tue Mar 17 22:46:22 UTC 2020


On 3/17/20 4:56 AM, Jason Gunthorpe wrote:
> On Mon, Mar 16, 2020 at 01:24:09PM -0700, Ralph Campbell wrote:
> 
>> The reason for it being backwards is that "normally" a device doesn't want
>> the device private page to be faulted back to system memory; it wants to
>> get the device private struct page so it can update its page table to point
>> to the memory already in the device.
> 
> The "backwards" is you set the flag on input and never get it on
> output, clear the flag in input and maybe get it on output.
> 
> Compare with valid or write, which don't work that way.
> 
>> Also, a device like an Nvidia GPU may have an alternate path (nvlink) for
>> copying one GPU's memory to another without going through system memory,
>> so getting a device private struct page/PFN from hmm_range_fault() that
>> isn't "owned" by the faulting GPU is useful.
>> I agree that the current code isn't well tested or thought out for multiple
>> devices (rdma, NVMe drives, GPUs, etc.), but it also ties in with
>> peer-to-peer access via PCIe.
> 
> I think the series here is a big improvement. The GPU driver can set
> owners that match the hidden cluster interconnect.
> 
> Jason
> 

I agree this is an improvement. I was just thinking about possible future use cases.

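Just to make the intended usage concrete, here is a rough sketch of how I'd
expect a driver to call hmm_range_fault() with the new dev_private_owner
field after this series. This is not code from the series: the helper name
my_gpu_fault_range, the my_owner argument, and details such as the
HMM_PFN_REQ_FAULT / hmm_pfn_to_page() names, the hmm_pfns field, and the
mmap locking calls are from my reading of the interface and may not match
the tree this applies to exactly.

/* Sketch only: names and error handling are illustrative, not from the
 * series. */
#include <linux/hmm.h>
#include <linux/memremap.h>
#include <linux/mm.h>

static int my_gpu_fault_range(struct mm_struct *mm,
			      struct mmu_interval_notifier *notifier,
			      void *my_owner,	/* same pointer stored in pgmap->owner */
			      unsigned long start, unsigned long npages,
			      unsigned long *pfns)
{
	struct hmm_range range = {
		.notifier = notifier,
		.start = start,
		.end = start + (npages << PAGE_SHIFT),
		.hmm_pfns = pfns,
		.default_flags = HMM_PFN_REQ_FAULT,
		/* Device private pages whose pgmap->owner matches this are
		 * returned as-is instead of being migrated back to system
		 * memory. */
		.dev_private_owner = my_owner,
	};
	unsigned long i;
	int ret;

	range.notifier_seq = mmu_interval_read_begin(notifier);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret)
		return ret;	/* e.g. -EBUSY means retry with a new notifier_seq */

	for (i = 0; i < npages; i++) {
		struct page *page;

		if (!(pfns[i] & HMM_PFN_VALID))
			continue;
		page = hmm_pfn_to_page(pfns[i]);
		if (is_device_private_page(page)) {
			/* Memory is already in this device; point the GPU
			 * page tables at it directly. */
		} else {
			/* Normal system memory page. */
		}
	}
	/* The caller still has to check mmu_interval_read_retry() against
	 * range.notifier_seq under its page table lock before using pfns. */
	return 0;
}

The key point is that only pages whose pgmap->owner matches dev_private_owner
come back as device private pages; everything else gets faulted into system
memory, which is why the multi-GPU/nvlink case above would need something
beyond a single owner match.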
