[PATCH v2] nouveau/hmm: map pages after migration

Ralph Campbell rcampbell at nvidia.com
Tue Mar 3 21:15:21 UTC 2020


On 3/3/20 4:42 AM, Jason Gunthorpe wrote:
> On Mon, Mar 02, 2020 at 05:00:23PM -0800, Ralph Campbell wrote:
>> When memory is migrated to the GPU, it is likely to be accessed by GPU
>> code soon afterwards. Instead of waiting for a GPU fault, map the
>> migrated memory into the GPU page tables with the same access permissions
>> as the source CPU page table entries. This preserves copy on write
>> semantics.
>>
>> Signed-off-by: Ralph Campbell <rcampbell at nvidia.com>
>> Cc: Christoph Hellwig <hch at lst.de>
>> Cc: Jason Gunthorpe <jgg at mellanox.com>
>> Cc: "Jérôme Glisse" <jglisse at redhat.com>
>> Cc: Ben Skeggs <bskeggs at redhat.com>
>> ---
>>
>> Originally this patch was targeted for Jason's rdma tree since other HMM
>> related changes were queued there. Now that those have been merged, this
>> patch just contains changes to nouveau so it could go through any tree.
>> I guess Ben Skeggs' tree would be appropriate.
> 
> Yep
> 
>> +static inline struct nouveau_pfnmap_args *
>> +nouveau_pfns_to_args(void *pfns)
> 
> don't use static inline inside C files

OK.

>> +{
>> +	struct nvif_vmm_pfnmap_v0 *p =
>> +		container_of(pfns, struct nvif_vmm_pfnmap_v0, phys);
>> +
>> +	return container_of(p, struct nouveau_pfnmap_args, p);
> 
> And this should just be
> 
>     return container_of(pfns, struct nouveau_pfnmap_args, p.phys);

Much simpler, thanks.
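
For reference, a sketch of the helper with both comments applied (plain
static, single container_of()); this is just how I'd expect the revised
version to look, not the tested code:

static struct nouveau_pfnmap_args *
nouveau_pfns_to_args(void *pfns)
{
	/*
	 * Reach through the nested member directly instead of chaining
	 * two container_of() calls, and drop "inline" since this lives
	 * in a .c file.
	 */
	return container_of(pfns, struct nouveau_pfnmap_args, p.phys);
}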

>> +static struct nouveau_svmm *
>> +nouveau_find_svmm(struct nouveau_svm *svm, struct mm_struct *mm)
>> +{
>> +	struct nouveau_ivmm *ivmm;
>> +
>> +	list_for_each_entry(ivmm, &svm->inst, head) {
>> +		if (ivmm->svmm->notifier.mm == mm)
>> +			return ivmm->svmm;
>> +	}
>> +	return NULL;
>> +}
> 
> Is this re-implementing mmu_notifier_get() ?
> 
> Jason

Not quite. This is being called from an ioctl() on the GPU device file.
That path calls nouveau_svmm_bind(), which locks mmap_sem for reading,
walks the VMAs for the address range given in the ioctl() data, and
migrates the pages to GPU memory.
mmu_notifier_get() would try to lock mmap_sem for writing, so that would deadlock.
But it is similar in that the GPU-specific process context (nouveau_svmm) needs
to be found for the given ioctl caller.
If find_get_mmu_notifier() were exported, I think that could work.
Now that I look at this again, there is an easier way to find the svmm, and I see
some other bugs that need fixing. I'll post a v3 as soon as I have those written
and tested.
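
To make the locking concrete, here is a rough sketch of the call path
described above; everything other than nouveau_svmm_bind(),
nouveau_find_svmm(), and mmap_sem is illustrative, not the actual
nouveau code:

static int nouveau_svmm_bind_sketch(struct nouveau_svm *svm,
				    unsigned long start, unsigned long end)
{
	struct mm_struct *mm = current->mm;
	struct nouveau_svmm *svmm;

	down_read(&mm->mmap_sem);	/* ioctl path takes the read lock */

	/*
	 * Look up the per-process GPU context. Calling mmu_notifier_get()
	 * here instead would try to take mmap_sem for write and deadlock
	 * against the read lock held above.
	 */
	svmm = nouveau_find_svmm(svm, mm);
	if (!svmm) {
		up_read(&mm->mmap_sem);
		return -ENOENT;
	}

	/* ... walk the VMAs in [start, end) and migrate pages to the GPU ... */

	up_read(&mm->mmap_sem);
	return 0;
}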

Thanks for the review.
