[Intel-gfx] [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM

Tue Jul 16 06:00:54 CEST 2013

On Mon, Jul 15, 2013 at 08:35:43PM -0700, Ben Widawsky wrote:
> On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote:
> > On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote:
> > > As we plumb the code with more VM information, it has become more
> > > obvious that the easiest way to deal with bind and unbind is to simply
> > > put the function pointers in the vm, and let those choose the correct
> > > way to handle the page table updates. This change allows many places in
> > > the code to simply be vm->bind, and not have to worry about
> > > distinguishing PPGTT vs GGTT.
> > > 
> > > NOTE: At some point in the future, bringing back insert_entries may in
> > > fact be desirable in order to use 1 bind/unbind for multiple generations
> > > of PPGTT. For now however, it's just not necessary.
> > 
> > I need to check the -internal tree again, but I'm rather sure that we need
> > ->insert_entries. In that case I don't want to remove it here in the
> > upstream tree since I have no intention to carry the re-add patch in
> > -internal ;-)
> 
> We do use it for i915_ppgtt_bind_object(), however it should be easily
> fixable since the mini-series is exactly about removing
> i915_ppgtt_bind_object, and making it into vm->bind_object. I think it's
> fair if you ask me to fix this up on -internal as well, before merging
> it, but with that one exception - I still believe this is the right
> direction to go in.
> 
> > 
> > > 
> > > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h     |  9 +++++
> > >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 81 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index e6694ae..c2a9c98 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -484,9 +484,18 @@ struct i915_address_space {
> > >  	/* FIXME: Need a more generic return type */
> > >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > >  				     enum i915_cache_level level);
> > > +
> > > +	/** Unmap an object from an address space. This usually consists of
> > > +	 * setting the valid PTE entries to a reserved scratch page. */
> > > +	void (*unbind_object)(struct i915_address_space *vm,
> > > +			      struct drm_i915_gem_object *obj);
> > 
> > 	void (*unbind_vma)(struct i915_vma *vma);
> > 	void (*bind_vma)(struct i915_vma *vma,
> > 			 enum i915_cache_level cache_level);
> > 
> > I think if you do this as a follow-up we might as well bikeshed the
> > interface a bit. Again (I know, broken record) for me it feels
> > semantically much cleaner to talk about binding/unbinding a vma instead
> > of an (obj, vm) pair ...
> > 
> 
> So as mentioned (and I haven't yet responded to the other email, but
> I'll be broken record there also) - I don't disagree with you. My
> argument is the performance difference should be negligible, and the code
> as is, is decently tested. Changing this requires changing so much, I'd
> rather do the conversion on top. See the other mail thread for more...
> 
> > >  	void (*clear_range)(struct i915_address_space *vm,
> > >  			    unsigned int first_entry,
> > >  			    unsigned int num_entries);
> > > +	/* Map an object into an address space with the given cache flags. */
> > > +	void (*bind_object)(struct i915_address_space *vm,
> > > +			    struct drm_i915_gem_object *obj,
> > > +			    enum i915_cache_level cache_level);
> > >  	void (*insert_entries)(struct i915_address_space *vm,
> > >  			       struct sg_table *st,
> > >  			       unsigned int first_entry,
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > index c0d0223..31ff971 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > > @@ -45,6 +45,12 @@
> > >  #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
> > >  #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
> > >  
> > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > > +				   struct drm_i915_gem_object *obj,
> > > +				   enum i915_cache_level cache_level);
> > > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
> > > +				     struct drm_i915_gem_object *obj);
> > > +
> > >  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
> > >  				      enum i915_cache_level level)
> > >  {
> > > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> > >  	}
> > >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> > >  	ppgtt->enable = gen6_ppgtt_enable;
> > > +	ppgtt->base.unbind_object = gen6_ppgtt_unbind_object;
> > >  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > > +	ppgtt->base.bind_object = gen6_ppgtt_bind_object;
> > >  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> > >  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> > >  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> > >  			   cache_level);
> > >  }
> > >  
> > > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > > +				   struct drm_i915_gem_object *obj,
> > > +				   enum i915_cache_level cache_level)
> > > +{
> > > +	const unsigned long entry = i915_gem_obj_offset(obj, vm);
> > > +
> > > +	gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT,
> > > +				  cache_level);
> > > +	obj->has_aliasing_ppgtt_mapping = 1;
> > 
> > Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a
> > bit wrong here. Either we do the ppgtt insert_entries call as part of the
> > global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special
> > global gtt binding call for execbuf.
> > 
> > Thinking about this some more we might need bind flags with
> > 
> > #define VMA_BIND_CPU  (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */
> > #define VMA_BIND_GPU  (1<<1) /* ensure ppgtt mappings exist for aliasing ppgtt */
> > 
> > since otherwise we can't properly encapsulate the aliasing ppgtt binding
> > logic into vm->bind. So in the end we'd have
> > 
> > void ggtt_bind_vma(vma, bind_flags, cache_level)
> > {
> > 	ggtt_vm = vma->vm;
> > 	WARN_ON(ggtt_vm != &dev_priv->gtt.base);
> > 
> > 	if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & VMA_BIND_CPU)) &&
> > 	    !vma->obj->has_global_gtt_mapping) {
> > 		ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_level);
> > 		vma->obj->has_global_gtt_mapping = true;
> > 	}
> > 
> > 	if ((ggtt_vm->aliasing_ppgtt && (bind_flags & VMA_BIND_GPU)) &&
> > 	    !vma->obj->has_ppgtt_mapping) {
> > 		ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj,
> > 							vma->node.start,
> > 							cache_level);
> > 		vma->obj->has_ppgtt_mapping = true;
> > 	}
> > }
> > 
> > Obviously completely untested, but I hope I could get the idea across.
> > 
> > Cheers, Daniel
> 
> To me, aliasing ppgtt is just a wart that doesn't fit well with
> anything. As such, my plan was to hide as much of it as possible in ggtt
> functions. Using some kind of flag on ggtt_bind() we can determine if
> the user actually wants ggtt, and if so bind to both, else just use
> aliasing ppgtt. None of that code appears here because I want to make
> the diff churn as small as possible, and hadn't completely thought it
> all through.
> 
> Now after typing that (and this really did happen), I just looked at
> your function, and it seems to be more or less exactly what I just
> typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you
> really just want one flag which specifies "bind it in the global gtt,
> sucka"
> 
> Now having just typed /that/, it was indeed my plan. So as long as
> nothing really bothers you with the bind/unbind() stuff, I can move
> forward with a patch on top to fix it.
> 

I changed my mind already. A patch on top doesn't make sense. I'll try
to fix this one up as is.
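
To make the single-flag idea from the quoted discussion concrete, here is a
rough standalone sketch. It is not driver code: every name in it
(BIND_GLOBAL, struct address_space, struct vma, fake_insert_entries) is a
hypothetical stand-in, the mapping state lives on the vma rather than on the
object, and insert_entries only counts calls instead of writing PTEs. It only
illustrates the rule "bind to both when the caller wants the global GTT,
otherwise use just the aliasing PPGTT".

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the real i915 types. */
enum cache_level { CACHE_NONE, CACHE_LLC };

/* Hypothetical flag: the caller needs a mapping in the global GTT. */
#define BIND_GLOBAL (1u << 0)

struct address_space;

struct vma {
	struct address_space *vm;	/* address space the vma lives in */
	unsigned long start;		/* offset inside that address space */
	bool has_global_gtt_mapping;
	bool has_aliasing_ppgtt_mapping;
};

struct address_space {
	struct address_space *aliasing_ppgtt;	/* NULL if there is none */
	unsigned int inserts;			/* number of insert_entries calls */
	void (*insert_entries)(struct address_space *vm,
			       unsigned long start,
			       enum cache_level level);
	void (*bind_vma)(struct vma *vma, unsigned int flags,
			 enum cache_level level);
};

/* Stand-in for the PTE writer: it only records that it was called. */
void fake_insert_entries(struct address_space *vm, unsigned long start,
			 enum cache_level level)
{
	(void)start;
	(void)level;
	vm->inserts++;
}

/* Bind callback for the global GTT address space. */
void ggtt_bind_vma(struct vma *vma, unsigned int flags,
		   enum cache_level level)
{
	struct address_space *ggtt = vma->vm;
	struct address_space *aliasing = ggtt->aliasing_ppgtt;

	/* Global GTT: always without an aliasing PPGTT, else only on request. */
	if ((!aliasing || (flags & BIND_GLOBAL)) &&
	    !vma->has_global_gtt_mapping) {
		ggtt->insert_entries(ggtt, vma->start, level);
		vma->has_global_gtt_mapping = true;
	}

	/* Aliasing PPGTT: mapped whenever one exists. */
	if (aliasing && !vma->has_aliasing_ppgtt_mapping) {
		aliasing->insert_entries(aliasing, vma->start, level);
		vma->has_aliasing_ppgtt_mapping = true;
	}
}
```

With an aliasing PPGTT present, calling bind_vma with flags == 0 touches only
the PPGTT; a later call with BIND_GLOBAL adds the global GTT mapping without
redoing the PPGTT one.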

-- 
Ben Widawsky, Intel Open Source Technology Center