[Intel-gfx] [RFC 06/13] drm/i915/svm: Page table mirroring support

Sat Nov 23 23:53:52 UTC 2019

On Fri, Nov 22, 2019 at 08:44:18PM -0800, Niranjan Vishwanathapura wrote:
> On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
> > On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
> > 
> > > +static inline bool
> > > +i915_range_done(struct hmm_range *range)
> > > +{
> > > +	bool ret = hmm_range_valid(range);
> > > +
> > > +	hmm_range_unregister(range);
> > > +	return ret;
> > > +}
> > 
> > This needs to be updated to follow the new API, but this pattern is
> > generally wrong, failure if hmm_range_valid should retry the
> > range_fault and so forth. See the hmm.rst.
> > 
> 
> Thanks Jason for the feedback.
> Yah, will update as per new API in the next revision.
> The caller of this function is indeed retrying if the range is not valid.
> But I got the point. I made changes in my local build as per hmm.rst
> and it is working fine. Will include the change in next revision.

Generally speaking this locking approach is a live-lockable
collision-retry scheme. 

For maintainability it is best if the retry loop is explicit and
doesn't unregister the range during the retry. I also encourage adding
an absolute time bound to the retry as it *could* live lock.

> > > +static int
> > > +i915_range_fault(struct i915_svm *svm, struct hmm_range *range)
> > > +{
> > > +	long ret;
> > > +
> > > +	range->default_flags = 0;
> > > +	range->pfn_flags_mask = -1UL;
> > > +
> > > +	ret = hmm_range_register(range, &svm->mirror);
> > > +	if (ret) {
> > > +		up_read(&svm->mm->mmap_sem);
> > > +		return (int)ret;
> > > +	}
> > 
> > 
> > Using a temporary range is the pattern from nouveau, is it really
> > necessary in this driver?
> 
> Yah, not required. In my local build I tried with proper default_flags
> and set pfn_flags_mask to 0 and it is working fine.

Sorry, I ment calling hmm_range_register during fault processing.

If your driver works around user space objects that cover a VA then
the range should be created when the object is created.

> > > +	/*
> > > +	 * No needd to dma map the host pages and later unmap it, as
> > > +	 * GPU is not allowed to access it with SVM. Hence, no need
> > > +	 * of any intermediate data strucutre to hold the mappings.
> > > +	 */
> > > +	for (i = 0; i < npages; i++) {
> > > +		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
> > > +
> > > +		if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
> > > +			sg->length += PAGE_SIZE;
> > > +			sg_dma_len(sg) += PAGE_SIZE;
> > > +			continue;
> > > +		}
> > > +
> > > +		if (sg)
> > > +			sg_page_sizes |= sg->length;
> > > +
> > > +		sg =  sg ? __sg_next(sg) : st->sgl;
> > > +		sg_dma_address(sg) = addr;
> > > +		sg_dma_len(sg) = PAGE_SIZE;
> > > +		sg->length = PAGE_SIZE;
> > > +		st->nents++;
> > 
> > It is odd to build the range into a sgl.
> > 
> > IMHO it is not a good idea to use the sg_dma_address like this, that
> > should only be filled in by a dma map. Where does it end up being
> > used?
> 
> The sgl is used to plug into the page table update function in i915.
> 
> For the device memory in discrete card, we don't need dma map which
> is the case here.

How did we get to device memory on a card? Isn't range->pfns a CPU PFN
at this point?

I'm confused.

> > > +int i915_svm_bind_mm(struct i915_address_space *vm)
> > > +{
> > > +	struct i915_svm *svm;
> > > +	struct mm_struct *mm;
> > > +	int ret = 0;
> > > +
> > > +	mm = get_task_mm(current);
> > 
> > I don't think the get_task_mm(current) is needed, the mmget is already
> > done for current->mm ?
> 
> No, I don't think mmget is already done for current->mm here.

I'm not certain, but I thought it is already done because it is
'current' and current cannot begin to destroy the mm while a syscall
is currently running.

Jason