[Intel-gfx] [PATCH 1/3] drm/i915: Add bind/unbind object functions to VM

Ben Widawsky ben at bwidawsk.net
Tue Jul 16 05:35:43 CEST 2013


On Sat, Jul 13, 2013 at 11:33:22AM +0200, Daniel Vetter wrote:
> On Fri, Jul 12, 2013 at 09:45:54PM -0700, Ben Widawsky wrote:
> > As we plumb the code with more VM information, it has become more
> > obvious that the easiest way to deal with bind and unbind is to simply
> > put the function pointers in the vm, and let those choose the correct
> > way to handle the page table updates. This change allows many places in
> > the code to simply be vm->bind, and not have to worry about
> > distinguishing PPGTT vs GGTT.
> > 
> > NOTE: At some point in the future, brining back insert_entries may in
> > fact be desirable in order to use 1 bind/unbind for multiple generations
> > of PPGTT. For now however, it's just not necessary.
> 
> I need to check the -internal tree again, but I'm rather sure that we need
> ->insert_entries. In that case I don't want to remove it here in the
> upstream tree since I have no intention to carry the re-add patch in
> -internal ;-)

We do use it for i915_ppgtt_bind_object(), however it should be easily
fixable since the mini-series is exactly about removing
i915_ppgtt_bind_object, and making into vm->bind_object. I think it's
fair if you ask me to fix this up on -internal as well, before merging
it, but with that one exception - I still believe this is the right
direction to go in.

> 
> > 
> > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h     |  9 +++++
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 72 +++++++++++++++++++++++++++++++++++++
> >  2 files changed, 81 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index e6694ae..c2a9c98 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -484,9 +484,18 @@ struct i915_address_space {
> >  	/* FIXME: Need a more generic return type */
> >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> >  				     enum i915_cache_level level);
> > +
> > +	/** Unmap an object from an address space. This usually consists of
> > +	 * setting the valid PTE entries to a reserved scratch page. */
> > +	void (*unbind_object)(struct i915_address_space *vm,
> > +			      struct drm_i915_gem_object *obj);
> 
> 	void (*unbind_vma)(struct i915_vma *vma);
> 	void (*bind_vma)(struct i915_vma *vma,
> 			 enum i915_cache_level cache_level);
> 
> I think if you do this as a follow-up we might as well bikeshed the
> interface a bit. Again (I know, broken record) for me it feels
> semantically much cleaner to talk about binding/unbindinig a vma instead
> of an (obj, vm) pair ...
> 

So as mentioned (and I haven't yet responded to the other email, but
I'll be broken record there also) - I don't disagree with you. My
argument is the performance difference should be negligible, and the code
as is, is decently tested. Changing this requires changing so much, I'd
rather do the conversion on top. See the other mail thread for more...

> >  	void (*clear_range)(struct i915_address_space *vm,
> >  			    unsigned int first_entry,
> >  			    unsigned int num_entries);
> > +	/* Map an object into an address space with the given cache flags. */
> > +	void (*bind_object)(struct i915_address_space *vm,
> > +			    struct drm_i915_gem_object *obj,
> > +			    enum i915_cache_level cache_level);
> >  	void (*insert_entries)(struct i915_address_space *vm,
> >  			       struct sg_table *st,
> >  			       unsigned int first_entry,
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index c0d0223..31ff971 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -45,6 +45,12 @@
> >  #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
> >  #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
> >  
> > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > +				   struct drm_i915_gem_object *obj,
> > +				   enum i915_cache_level cache_level);
> > +static void gen6_ppgtt_unbind_object(struct i915_address_space *vm,
> > +				     struct drm_i915_gem_object *obj);
> > +
> >  static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
> >  				      enum i915_cache_level level)
> >  {
> > @@ -285,7 +291,9 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  	}
> >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> >  	ppgtt->enable = gen6_ppgtt_enable;
> > +	ppgtt->base.unbind_object = gen6_ppgtt_unbind_object;
> >  	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
> > +	ppgtt->base.bind_object = gen6_ppgtt_bind_object;
> >  	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
> >  	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
> >  	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
> > @@ -397,6 +405,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  			   cache_level);
> >  }
> >  
> > +static void gen6_ppgtt_bind_object(struct i915_address_space *vm,
> > +				   struct drm_i915_gem_object *obj,
> > +				   enum i915_cache_level cache_level)
> > +{
> > +	const unsigned long entry = i915_gem_obj_offset(obj, vm);
> > +
> > +	gen6_ppgtt_insert_entries(vm, obj->pages, entry >> PAGE_SHIFT,
> > +				  cache_level);
> > +	obj->has_aliasing_ppgtt_mapping = 1;
> 
> Since this is the bind function for ppgtt the aliasing ppgtt stuff looks a
> bit wrong here. Either we do the ppgtt insert_entries call as part of the
> global gtt bind call (if vm->aliasing_ppgtt is set) or we have a special
> global gtt binding call for execbuf.
> 
> Thinking about this some more we might need bind flags with
> 
> #define VMA_BIND_CPU  (1<<0) /* ensure ggtt mapping exists for aliasing ppgtt */
> #define VMA_BIND_GPU  (1<<1) /* ensure ppgtt mappings exists for aliasing ppgtt */
> 
> since otherwise we can't properly encapsulate the aliasing ppgtt binding
> logic into vm->bind. So in the end we'd have
> 
> void ggtt_bind_vma(vma, bind_flags, cache_level)
> {
> 	ggtt_vm = vma->vm;
> 	WARN_ON(ggtt_vm != &dev_priv->gtt.base);
> 
> 	if ((!ggtt_vm->aliasing_ppgtt || (bind_flags & BIND_CPU)) &&
> 	    !obj->has_global_gtt_mapping) {
> 		ggtt_vm->insert_entries(vma->obj, vma->node.start, cache_leve);
> 		vma->obj->has_global_gtt_mapping = true;
> 	}
> 
> 	if ((ggtt_vm->aliasing_ppgtt && (bind_flags & BIND_GPU)) &&
> 	    !obj->has_ppgtt_mapping) {
> 		ggtt_vm->aliasing_ppgtt->insert_entries(vma->obj,
> 							vma->node.start,
> 							cache_leve);
> 		vma->obj->has_ppgtt_mapping = true;
> 	}
> }
> 
> Obviously completely untested, but I hope I could get the idea accross.
> 
> Cheers, Daniel

To me, aliasing ppgtt is just a wart that doesn't fit well with
anything. As such, my plan was to hide as much of it as possible in ggtt
functions. Using some kind of flag on ggtt_bind() we can determine if
the user actually wants ggtt, and if so bind to both, else just use
aliasing ppgtt. None of that code appears here because I want to make
the diff churn as small as possible, and hadn't completely thought it
all through.

Now after typing that (and this really did happen), I just looked at
your function, and it seems to be more or less exactly what I just
typed. Cool! The GPU/CPU naming scheme seems off to me, and I think you
really just want one flag which specifies "bind it in the global gtt,
sucka"

Now having just typed /that/, it was indeed my plan. So as long as
nothing really bothers you with the bind/unbind() stuff, I can move
forward with a patch on top to fix it.


-- 
Ben Widawsky, Intel Open Source Technology Center



More information about the Intel-gfx mailing list