[Intel-gfx] [PATCH] drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)

Wed Nov 16 14:55:59 UTC 2016

On Wed, Nov 16, 2016 at 03:41:53PM +0100, Daniel Vetter wrote:
> On Wed, Nov 16, 2016 at 08:58:30AM +0000, Chris Wilson wrote:
> > Soft-pinning depends upon being able to check for availabilty of an
> > interval and evict overlapping object from a drm_mm range manager very
> > quickly. Currently it uses a linear list, and so performance is dire and
> > not suitable as a general replacement. Worse, the current code will oops
> > if it tries to evict an active buffer.
> > 
> > It also helps if the routine reports the correct error codes as expected
> > by its callers and emits a tracepoint upon use.
> > 
> > For posterity since the wrong patch was pushed (i.e. that missed these
> > key points and had known bugs), this is the changelog that should have
> > been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for
> > execbuffer"):
> > 
> > Userspace can pass in an offset that it presumes the object is located
> > at. The kernel will then do its utmost to fit the object into that
> > location. The assumption is that userspace is handling its own object
> > locations (for example along with full-ppgtt) and that the kernel will
> > rarely have to make space for the user's requests.
> > 
> > This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
> > * if the user supplies a virtual address via the execobject->offset
> >   *and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
> >   that object is placed at that offset in the address space selected
> >   by the context specifier in execbuffer.
> > * the location must be aligned to the GTT page size, 4096 bytes
> > * as the object is placed exactly as specified, it may be used by this
> >   execbuffer call without relocations pointing to it
> > 
> > It may fail to do so if:
> > * EINVAL is returned if the object does not have a 4096 byte aligned
> >   address
> > * the object conflicts with another pinned object (either pinned by
> >   hardware in that address space, e.g. scanouts in the aliasing ppgtt)
> >   or within the same batch.
> >   EBUSY is returned if the location is pinned by hardware
> >   EINVAL is returned if the location is already in use by the batch
> > * EINVAL is returned if the object conflicts with its own alignment (as meets
> >   the hardware requirements) or if the placement of the object does not fit
> >   within the address space
> > 
> > All other execbuffer errors apply.
> > 
> > Presence of this execbuf extension may be queried by passing
> > I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
> > a reported value of 1 (or greater).
> > 
> > Fixes: 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/drm_mm.c                   |  9 ++++
> >  drivers/gpu/drm/i915/gvt/aperture_gm.c     |  4 +-
> >  drivers/gpu/drm/i915/i915_drv.h            |  3 +-
> >  drivers/gpu/drm/i915/i915_gem_evict.c      | 87 +++++++++++++++++++++---------
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  3 +-
> >  drivers/gpu/drm/i915/i915_gem_gtt.c        | 10 ++--
> >  drivers/gpu/drm/i915/i915_trace.h          | 23 ++++++++
> >  drivers/gpu/drm/i915/i915_vma.c            |  8 +--
> >  8 files changed, 105 insertions(+), 42 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> > index 632473beb40c..06f319e54dc1 100644
> > --- a/drivers/gpu/drm/drm_mm.c
> > +++ b/drivers/gpu/drm/drm_mm.c
> > @@ -339,6 +339,15 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node)
> >  	if (hole_start > node->start || hole_end < end)
> >  		return -ENOSPC;
> >  
> > +	if (mm->color_adjust) {
> > +		u64 adj_start = hole_start, adj_end = hole_end;
> > +
> > +		mm->color_adjust(hole, node->color, &adj_start, &adj_end);
> > +		if (adj_start > node->start ||
> > +		    adj_end < node->start + node->size)
> > +			return -ENOSPC;
> > +	}
> 
> Can't we move that up and combine with the earlier check like in other
> places, i.e.

Unlike the other places we don't use adj_start, adj_end.

> > --- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
> > +++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
> > @@ -73,12 +73,12 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm)
> >  	mutex_lock(&dev_priv->drm.struct_mutex);
> >  search_again:
> >  	ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm,
> > -						  node, size, 4096, 0,
> > +						  node, size, 4096, -1,
> >  						  start, end, search_flag,
> >  						  alloc_flag);
> >  	if (ret) {
> >  		ret = i915_gem_evict_something(&dev_priv->ggtt.base,
> > -					       size, 4096, 0, start, end, 0);
> > +					       size, 4096, -1, start, end, 0);
> 
> Bugfix for gvt coloring? Should we have a GVT_COLOR? Feels like separate
> patch once more.

No, copy paste of broken code. The pattern was broken by evict_for_vma.

> > -int
> > -i915_gem_evict_for_vma(struct i915_vma *target)
> > +int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
> 
> Kerneldoc for this one here is missing.

There should be exactly one caller. It is not an interface that should
be used.

> >  {
> > -	struct drm_mm_node *node, *next;
> > +	struct list_head eviction_list;
> > +	struct drm_mm_node *node;
> > +	u64 start = target->node.start;
> > +	u64 end = start + target->node.size - 1;
> > +	struct i915_vma *vma, *next;
> > +	bool check_snoop;
> > +	int ret = 0;
> >  
> >  	lockdep_assert_held(&target->vm->dev->struct_mutex);
> > +	trace_i915_gem_evict_vma(target, flags);
> > +
> > +	check_snoop = target->vm->mm.color_adjust;
> > +	if (check_snoop) {
> > +		if (start > target->vm->start)
> > +			start -= 4096;
> > +		if (end < target->vm->start + target->vm->total)
> > +			end += 4096;
> > +	}
> >  
> > -	list_for_each_entry_safe(node, next,
> > -			&target->vm->mm.head_node.node_list,
> > -			node_list) {
> > -		struct i915_vma *vma;
> > -		int ret;
> > +	node = drm_mm_interval_first(&target->vm->mm, start, end);
> 
> drm_mm_interval_first is lacking some pretty kernel-doc, should be fixed!
> 
> > +	if (!node)
> > +		return 0;
> >  
> > -		if (node->start + node->size <= target->node.start)
> > -			continue;
> > -		if (node->start >= target->node.start + target->node.size)
> > +	INIT_LIST_HEAD(&eviction_list);
> > +	vma = container_of(node, typeof(*vma), node);
> > +	list_for_each_entry_from(vma,
> > +				 &target->vm->mm.head_node.node_list,
> > +				 node.node_list) {
> > +		if (vma->node.start > end)
> >  			break;
> 
> We have this nice drm_mm_interval_next helper, imo should use it here
> instead of open-coding.

It's not the same. This is a linear walk which is fewer pointer dances.

> >  
> > -		vma = container_of(node, typeof(*vma), node);
> > +		if (check_snoop) {
> > +			if (vma->node.start + vma->node.size == target->node.start) {
> > +				if (vma->node.color == target->node.color)
> > +					continue;
> > +			}
> > +			if (vma->node.start == target->node.start + target->node.size) {
> > +				if (vma->node.color == target->node.color)
> > +					continue;
> > +			}
> > +		}
> 
> Not feeling too happy with open-coding our color_adjust here.

Good, because it doesn't.

> > -		if (i915_vma_is_pinned(vma)) {
> > -			if (!vma->exec_entry || i915_vma_pin_count(vma) > 1)
> > -				/* Object is pinned for some other use */
> > -				return -EBUSY;
> > +		if (vma->node.color == -1) {
> 
> The magic -1!

And now you see why it is absolutely required. And not any more magic
than any other color value.

> > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> > index 738ff3a5cd6e..827eaeb75524 100644
> > --- a/drivers/gpu/drm/i915/i915_vma.c
> > +++ b/drivers/gpu/drm/i915/i915_vma.c
> > @@ -325,12 +325,6 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma,
> >  	if (vma->vm->mm.color_adjust == NULL)
> >  		return true;
> >  
> > -	if (!drm_mm_node_allocated(gtt_space))
> > -		return true;
> > -
> > -	if (list_empty(&gtt_space->node_list))
> > -		return true;
> 
> Not clear to me why we need this: set_cache_level should never see
> pre-filled vmas, and vma_insert only has this check in the success case
> where the same should hold. Why remove these 2 cases? This seems to change
> set_cache_level behaviour ...

Because they're irrelevant. We only get called when bound and the second
is not our bug.

There's no change in caller behaviour.

-- 
Chris Wilson, Intel Open Source Technology Centre