[Intel-gfx] [PATCH 3/3] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl

Chris Wilson chris at chris-wilson.co.uk
Fri May 16 12:07:09 CEST 2014


On Thu, Apr 17, 2014 at 03:18:48PM -0700, Volkin, Bradley D wrote:
> > +	union {
> > +		struct i915_gem_userptr {
> 
> Out of curiosity, what's the point of using both a union and a
> struct here, given that everything is within the struct?

Because I stuff other things into this union in other patches, and
compacting our object using a union has been on the cards since
introducing subclasses of objects.
 
> > +	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
> > +
> > +	if (obj->userptr.mm && obj->userptr.mn == NULL)
> > +		return ERR_PTR(-EINVAL);
> > +
> 
> So this is basically saying we won't export an unsynchronized
> userptr? I don't think we've documented this or the other
> restrictions on unsynchronized userptrs (e.g. root only).

Yes. We cannot control the endpoint of a dmabuf and so we do not know if
the client would be able to control access to the bo accordingly (it
should after all appear to be a regular bo, except for the caveats about
snooped access and lack of CPU mmap support atm etc).

In fact, later I changed to this to obj->ops->export_dmabuf() callback
so that we can demote unsync'ed userptr for exporting.

> > +static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > +						       struct mm_struct *mm,
> > +						       unsigned long start,
> > +						       unsigned long end)
> > +{
> > +	struct i915_mmu_notifier *mn = container_of(_mn, struct i915_mmu_notifier, mn);
> > +	struct interval_tree_node *it = NULL;
> > +	unsigned long serial = 0;
> > +
> > +	end--; /* interval ranges are inclusive, but invalidate range is exclusive */
> > +	while (start < end) {
> > +		struct drm_i915_gem_object *obj;
> > +
> > +		obj = NULL;
> > +		spin_lock(&mn->lock);
> > +		if (serial == mn->serial)
> > +			it = interval_tree_iter_next(it, start, end);
> > +		else
> > +			it = interval_tree_iter_first(&mn->objects, start, end);
> 
> Is the issue here just that items being removed from the tree could make 'it'
> invalid on the next call to interval_tree_iter_next()? Or are we also worried
> about items being added to the tree? If items can be added, we've been advancing
> 'start', so I think there would be the potential for adding an item to the
> portion of the invalidate range that we've already processed. Whether that
> could happen given the locking or if the invalidation should apply to such an
> object, I'm not sure.

Both. We have to guard against modifications to the interval-tree, both
by ourselves (through freeing a no longer active bo) and other threads.
Only if the interval-tree was untouched whilst unlocked can we jump
ahead. I presume that the mm will handle serialisation of invalidate
against new users, so that we don't have to continuously walk over the
same range dropping user pages. (i.e. someone won't be able to mmap the
same arena until the earlier munmap is complete)

> > +static int
> > +i915_mmu_notifier_add(struct i915_mmu_notifier *mmu,
> > +		      struct i915_mmu_object *mn)
> > +{
> > +	struct interval_tree_node *it;
> > +	int ret;
> > +
> > +	/* Make sure we drop the final active reference (and thereby
> > +	 * remove the objects from the interval tree) before we do
> > +	 * the check for overlapping objects.
> > +	 */
> 
> I assume this comment refers to the retire_requests call, in which
> case I would move it down.

Once upon a time it was the comment for the function.

> > +	ret = i915_mutex_lock_interruptible(mmu->dev);
> > +	if (ret)
> > +		return ret;
> > +
> > +	i915_gem_retire_requests(mmu->dev);
> > +
> > +	/* Disallow overlapping userptr objects */
> > +	spin_lock(&mmu->lock);
> > +	it = interval_tree_iter_first(&mmu->objects,
> > +				      mn->it.start, mn->it.last);
> > +	if (it) {
> > +		struct drm_i915_gem_object *obj;
> > +
> > +		/* We only need to check the first object as it either
> > +		 * is idle (and in use elsewhere) or we try again in order
> > +		 * to give time for the gup-worker to run and flush its
> > +		 * object references. Afterwards if we find another
> > +		 * object that is idle (and so referenced elsewhere)
> > +		 * we know that the overlap with an pinned object is
> > +		 * genuine.
> > +		 */
> 
> I found this a bit confusing because it refers to the object being idle when
> it's really a question of the worker executing or not. In my mind, the object
> has the opposite idle/active state as the worker e.g. if the worker is active,
> the object is idle w.r.t. being used by the GPU. Hence the earlier suggestion
> r.e. renaming userptr.active.

/* We only need to check the first object in the range as it either
 * has cancelled gup work queued and we need to return back to the user
 * to give time for the gup-workers to flush their object references
 * upon which the object will be removed from the interval-tree, or the
 * the range is still in use by another client and the overlap is
 * invalid.
 */

> > +			uintptr_t ptr;
> > +			unsigned read_only :1;
> > +			unsigned active :4;
> > +#define I915_GEM_USERPTR_MAX_ACTIVE 15
> 
> Maybe s/active/active_workers/ or just 'workers'

Having rewritten the comment, I would go with workers.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre



More information about the Intel-gfx mailing list