[Intel-gfx] [PATCH 5/6] drm/i915: Use the new vm [un]bind functions

Ben Widawsky ben at bwidawsk.net
Wed Sep 25 19:19:57 CEST 2013


On Wed, Sep 25, 2013 at 01:42:21PM +0200, Daniel Vetter wrote:
> On Tue, Sep 24, 2013 at 09:58:00AM -0700, Ben Widawsky wrote:
> > From: Ben Widawsky <ben at bwidawsk.net>
> > 
> > Building on the last patch which created the new function pointers in
> > the VM for bind/unbind, here we actually put those new function pointers
> > to use.
> > 
> > Split out as a separate patch to aid in review. I'm fine with squashing
> > into the previous patch if people request it.
> > 
> > v2: Updated to address the smart ggtt which can do aliasing as needed
> > Make sure we bind to global gtt when mappable and fenceable. I thought
> > we could get away without this initialy, but we cannot.
> > 
> > v3: Make the global GTT binding explicitly use the ggtt VM for
> > bind_vma(). While at it, use the new ggtt_vma helper (Chris)
> > 
> > v4: Make the code support the secure dispatch flag, which requires
> > special handling during execbuf. This was fixed (incorrectly) later in
> > the series, but having it here earlier in the series should be perfectly
> > acceptable. (Chris)
> > Move do_switch over to the new, special ggtt_vma interface.
> > 
> > v5: Don't use a local variable (or assertion) when setting the batch
> > object to the global GTT during secure dispatch (Chris)
> > 
> > v6: Caclulate the exec offset for the secure case (Bug fix missed on
> > v4). (Chris)
> > Remove redundant check for has_global_gtt_mapping, since it is done in
> > bind_vma.
> > 
> > v7: Remove now unused dev_priv in do_switch
> > Don't pass the vm to ggtt_offset (error from v6 which I should have
> > caught before sending).
> > 
> > v8: Assert, and rework the SNB workaround (to make it a bit clearer)
> > code to make sure the VM can't be anything but the GGTT.
> > 
> > v9: Fixing more bugs which can't exist yet on the behest of Chris. Make
> > sure that the batch object is properly bound, and added to the global
> > VM's active list - for when we use non-global VMs. (Chris) Note that
> > there is an ongoing confusion about what bugs existed, but the "known"
> > bugs are fixed here.
> > 
> > v10: Nitpicks on whitespacing etc. introduced in v9 (Chris)
> > 
> > Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> 
 
> Second is the death of the ->insert_entries/->clear_range vfuncs. We
> _need_ them in the internal tree and you really can't just kill them. If
> you want to, you need to follow three steps:
> 
> 1. Create new interfaces in the public tree, use the on public platforms
> but keeep the old interfaces working.
> 
> 2. Convert the -internal platforms over.
> 
> 3. Remove old interface.
> 
> Doing 3. before 2. is bonghits and will result in the internal tree being
> broken a bit in-between. Since I'm supposed to maintain that I'll undo the
> damage here to be able to do a rebase.
> 
> Cheers, Daniel

As I've now stated multiple times over the course of the last 5 months -
I am fine with you not merging this. It's the right thing to do, but
it seems neither you or I have time to do it in the right way. Sometimes
reality gets in the way what's desirable.

> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h            |  9 -----
> >  drivers/gpu/drm/i915/i915_gem.c            | 33 +++++++---------
> >  drivers/gpu/drm/i915/i915_gem_context.c    |  6 ++-
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 61 ++++++++++++++++++++----------
> >  drivers/gpu/drm/i915/i915_gem_gtt.c        | 48 ++---------------------
> >  5 files changed, 62 insertions(+), 95 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 9995cdb..e8ae8fd 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2102,17 +2102,8 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >  
> >  /* i915_gem_gtt.c */
> >  void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
> > -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > -			    struct drm_i915_gem_object *obj,
> > -			    enum i915_cache_level cache_level);
> > -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > -			      struct drm_i915_gem_object *obj);
> > -
> >  void i915_gem_restore_gtt_mappings(struct drm_device *dev);
> >  int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
> > -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > -				enum i915_cache_level cache_level);
> > -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
> >  void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
> >  void i915_gem_init_global_gtt(struct drm_device *dev);
> >  void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index f6c8b0e..378d4ef 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2693,12 +2693,8 @@ int i915_vma_unbind(struct i915_vma *vma)
> >  
> >  	trace_i915_vma_unbind(vma);
> >  
> > -	if (obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_unbind_object(obj);
> > -	if (obj->has_aliasing_ppgtt_mapping) {
> > -		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> > -		obj->has_aliasing_ppgtt_mapping = 0;
> > -	}
> > +	vma->vm->unbind_vma(vma);
> > +
> >  	i915_gem_gtt_finish_object(obj);
> >  	i915_gem_object_unpin_pages(obj);
> >  
> > @@ -3155,7 +3151,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
> >  /**
> >   * Finds free space in the GTT aperture and binds the object there.
> >   */
> > -static int
> > +int
> >  i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >  			   struct i915_address_space *vm,
> >  			   unsigned alignment,
> > @@ -3424,7 +3420,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  				    enum i915_cache_level cache_level)
> >  {
> >  	struct drm_device *dev = obj->base.dev;
> > -	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	struct i915_vma *vma;
> >  	int ret;
> >  
> > @@ -3463,11 +3458,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  				return ret;
> >  		}
> >  
> > -		if (obj->has_global_gtt_mapping)
> > -			i915_gem_gtt_bind_object(obj, cache_level);
> > -		if (obj->has_aliasing_ppgtt_mapping)
> > -			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > -					       obj, cache_level);
> > +		list_for_each_entry(vma, &obj->vma_list, vma_link)
> > +			vma->vm->bind_vma(vma, cache_level, 0);
> >  	}
> >  
> >  	list_for_each_entry(vma, &obj->vma_list, vma_link)
> > @@ -3795,6 +3787,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  		    bool map_and_fenceable,
> >  		    bool nonblocking)
> >  {
> > +	const u32 flags = map_and_fenceable ? GLOBAL_BIND : 0;
> >  	struct i915_vma *vma;
> >  	int ret;
> >  
> > @@ -3823,20 +3816,22 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	if (!i915_gem_obj_bound(obj, vm)) {
> > -		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > -
> >  		ret = i915_gem_object_bind_to_vm(obj, vm, alignment,
> >  						 map_and_fenceable,
> >  						 nonblocking);
> >  		if (ret)
> >  			return ret;
> >  
> > -		if (!dev_priv->mm.aliasing_ppgtt)
> > -			i915_gem_gtt_bind_object(obj, obj->cache_level);
> > -	}
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> > +		vm->bind_vma(vma, obj->cache_level, flags);
> > +	} else
> > +		vma = i915_gem_obj_to_vma(obj, vm);
> >  
> > +	/* Objects are created map and fenceable. If we bind an object
> > +	 * the first time, and we had aliasing PPGTT (and didn't request
> > +	 * GLOBAL), we'll need to do this on the second bind.*/
> >  	if (!obj->has_global_gtt_mapping && map_and_fenceable)
> > -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > +		vm->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
> >  
> >  	obj->pin_count++;
> >  	obj->pin_mappable |= map_and_fenceable;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 1a877a5..d4eb88a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -422,8 +422,10 @@ static int do_switch(struct i915_hw_context *to)
> >  		return ret;
> >  	}
> >  
> > -	if (!to->obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
> > +	if (!to->obj->has_global_gtt_mapping) {
> > +		struct i915_vma *vma = i915_gem_obj_to_ggtt(to->obj);
> > +		vma->vm->bind_vma(vma, to->obj->cache_level, GLOBAL_BIND);
> > +	}
> >  
> >  	if (!to->is_initialized || is_default_context(to))
> >  		hw_flags |= MI_RESTORE_INHIBIT;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 0ce0d47..88c924f 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -286,8 +286,14 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  	if (unlikely(IS_GEN6(dev) &&
> >  	    reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
> >  	    !target_i915_obj->has_global_gtt_mapping)) {
> > -		i915_gem_gtt_bind_object(target_i915_obj,
> > -					 target_i915_obj->cache_level);
> > +		/* SNB shall not support full PPGTT. This path can only be taken
> > +		 * when the VM is the GGTT (aliasing PPGTT is not a real VM, and
> > +		 * therefore doesn't count).
> > +		 */
> > +		BUG_ON(vm != obj_to_ggtt(target_i915_obj));
> > +		vm->bind_vma(i915_gem_obj_to_ggtt(target_i915_obj),
> > +			     target_i915_obj->cache_level,
> > +			     GLOBAL_BIND);
> >  	}
> >  
> >  	/* Validate that the target is in a valid r/w GPU domain */
> > @@ -464,11 +470,12 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> >  				struct intel_ring_buffer *ring,
> >  				bool *need_reloc)
> >  {
> > -	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > +	struct drm_i915_gem_object *obj = vma->obj;
> >  	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
> >  	bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
> >  	bool need_fence, need_mappable;
> > -	struct drm_i915_gem_object *obj = vma->obj;
> > +	u32 flags = (entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
> > +		!vma->obj->has_global_gtt_mapping ? GLOBAL_BIND : 0;
> >  	int ret;
> >  
> >  	need_fence =
> > @@ -497,14 +504,6 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> >  		}
> >  	}
> >  
> > -	/* Ensure ppgtt mapping exists if needed */
> > -	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> > -		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > -				       obj, obj->cache_level);
> > -
> > -		obj->has_aliasing_ppgtt_mapping = 1;
> > -	}
> > -
> >  	if (entry->offset != vma->node.start) {
> >  		entry->offset = vma->node.start;
> >  		*need_reloc = true;
> > @@ -515,9 +514,7 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> >  		obj->base.pending_write_domain = I915_GEM_DOMAIN_RENDER;
> >  	}
> >  
> > -	if (entry->flags & EXEC_OBJECT_NEEDS_GTT &&
> > -	    !obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > +	vma->vm->bind_vma(vma, obj->cache_level, flags);
> >  
> >  	return 0;
> >  }
> > @@ -936,7 +933,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  	struct intel_ring_buffer *ring;
> >  	struct i915_ctx_hang_stats *hs;
> >  	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
> > -	u32 exec_start, exec_len;
> > +	u32 exec_len, exec_start = args->batch_start_offset;
> >  	u32 mask, flags;
> >  	int ret, mode, i;
> >  	bool need_relocs;
> > @@ -1118,8 +1115,34 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  	 * batch" bit. Hence we need to pin secure batches into the global gtt.
> >  	 * hsw should have this fixed, but let's be paranoid and do it
> >  	 * unconditionally for now. */
> > -	if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping)
> > -		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> > +	if (flags & I915_DISPATCH_SECURE) {
> > +		struct i915_address_space *ggtt = obj_to_ggtt(batch_obj);
> > +
> > +		/* Assuming all privileged batches are in the global GTT means
> > +		 * we need to make sure we have a global gtt offset, as well as
> > +		 * the PTEs mapped. As mentioned above, we can forego this on
> > +		 * HSW, but don't.
> > +		 */
> > +
> > +		ret = i915_gem_obj_ggtt_pin(batch_obj, 0, false, false);
> > +		if (ret)
> > +			goto err;
> > +
> > +		ggtt->bind_vma(i915_gem_obj_to_ggtt(batch_obj),
> > +			       batch_obj->cache_level,
> > +			       GLOBAL_BIND);
> > +
> > +		/* Since the active list is per VM, we need to make sure this
> > +		 * VMA ends up on the GGTT's active list to avoid premature
> > +		 * eviction.
> > +		 */
> > +		i915_vma_move_to_active(i915_gem_obj_to_ggtt(batch_obj), ring);
> > +
> > +		i915_gem_object_unpin(batch_obj);
> > +
> > +		exec_start += i915_gem_obj_ggtt_offset(batch_obj);
> > +	} else
> > +		exec_start += i915_gem_obj_offset(batch_obj, vm);
> >  
> >  	ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->vmas);
> >  	if (ret)
> > @@ -1161,8 +1184,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >  			goto err;
> >  	}
> >  
> > -	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> > -		args->batch_start_offset;
> >  	exec_len = args->batch_len;
> >  	if (cliprects) {
> >  		for (i = 0; i < args->num_cliprects; i++) {
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 65b61d4..e053f14 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -437,15 +437,6 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
> >  	dev_priv->mm.aliasing_ppgtt = NULL;
> >  }
> >  
> > -void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > -			    struct drm_i915_gem_object *obj,
> > -			    enum i915_cache_level cache_level)
> > -{
> > -	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > -				   i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -				   cache_level);
> > -}
> > -
> >  static void __always_unused
> >  gen6_ppgtt_bind_vma(struct i915_vma *vma,
> >  		    enum i915_cache_level cache_level,
> > @@ -458,14 +449,6 @@ gen6_ppgtt_bind_vma(struct i915_vma *vma,
> >  	gen6_ppgtt_insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
> >  }
> >  
> > -void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> > -			      struct drm_i915_gem_object *obj)
> > -{
> > -	ppgtt->base.clear_range(&ppgtt->base,
> > -				i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT,
> > -				obj->base.size >> PAGE_SHIFT);
> > -}
> > -
> >  static void __always_unused gen6_ppgtt_unbind_vma(struct i915_vma *vma)
> >  {
> >  	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
> > @@ -523,8 +506,10 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> >  				       dev_priv->gtt.base.total / PAGE_SIZE);
> >  
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> > +		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
> > +							   &dev_priv->gtt.base);
> >  		i915_gem_clflush_object(obj, obj->pin_display);
> > -		i915_gem_gtt_bind_object(obj, obj->cache_level);
> > +		vma->vm->bind_vma(vma, obj->cache_level, 0);
> >  	}
> >  
> >  	i915_gem_chipset_flush(dev);
> > @@ -687,33 +672,6 @@ static void gen6_ggtt_bind_vma(struct i915_vma *vma,
> >  	}
> >  }
> >  
> > -void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> > -			      enum i915_cache_level cache_level)
> > -{
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> > -
> > -	dev_priv->gtt.base.insert_entries(&dev_priv->gtt.base, obj->pages,
> > -					  entry,
> > -					  cache_level);
> > -
> > -	obj->has_global_gtt_mapping = 1;
> > -}
> > -
> > -void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> > -{
> > -	struct drm_device *dev = obj->base.dev;
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	const unsigned long entry = i915_gem_obj_ggtt_offset(obj) >> PAGE_SHIFT;
> > -
> > -	dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
> > -				       entry,
> > -				       obj->base.size >> PAGE_SHIFT);
> > -
> > -	obj->has_global_gtt_mapping = 0;
> > -}
> > -
> >  static void gen6_ggtt_unbind_vma(struct i915_vma *vma)
> >  {
> >  	struct drm_device *dev = vma->vm->dev;
> > -- 
> > 1.8.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center



More information about the Intel-gfx mailing list