[Intel-gfx] [PATCH 06/11] drm/i915: plumb VM into object operations

Tue Jul 16 05:57:13 CEST 2013

On Fri, Jul 12, 2013 at 06:46:51PM +0200, Daniel Vetter wrote:
> On Fri, Jul 12, 2013 at 08:46:48AM -0700, Ben Widawsky wrote:
> > On Fri, Jul 12, 2013 at 08:26:07AM +0200, Daniel Vetter wrote:
> > > On Thu, Jul 11, 2013 at 07:23:08PM -0700, Ben Widawsky wrote:
> > > > On Tue, Jul 09, 2013 at 09:15:01AM +0200, Daniel Vetter wrote:
> > > > > On Mon, Jul 08, 2013 at 11:08:37PM -0700, Ben Widawsky wrote:
> 
> [snip]
> 
> > > > > > @@ -3333,12 +3376,15 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > > >  	}
> > > > > >  
> > > > > >  	if (!i915_gem_valid_gtt_space(dev, &vma->node, cache_level)) {
> > > > > > -		ret = i915_gem_object_unbind(obj);
> > > > > > +		ret = i915_gem_object_unbind(obj, vm);
> > > > > >  		if (ret)
> > > > > >  			return ret;
> > > > > >  	}
> > > > > >  
> > > > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
> > > > > > +		if (!i915_gem_obj_bound(obj, vm))
> > > > > > +			continue;
> > > > > 
> > > > > Hm, shouldn't we have a per-object list of vmas? Or will that follow later
> > > > > on?
> > > > > 
> > > > > Self-correction: It exists already ... why can't we use this here?
> > > > 
> > > > Yes. That should work, I'll fix it and test it. It looks slightly worse
> > > > IMO in terms of code clarity, but I don't mind the change.
> > > 
> > > Actually I think it'd gain in clarity, doing pte updatest (which
> > > set_cache_level does) on the vma instead of the (obj, vm) pair feels more
> > > natural. And we'd be able to drop lots of (obj, vm) -> vma lookups here.
> > 
> > That sounds good to me. Would you mind a patch on top?
> 
> If you want I guess we can refactor this after everything has settled. Has
> the upside that assessing whether using vma or (obj, vm) is much easier.
> So fine with me.

I think our time to get these merged in is quickly evaporating. I am
sure you know this - just reminding you.

> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > >  		ret = i915_gem_object_finish_gpu(obj);
> > > > > >  		if (ret)
> > > > > >  			return ret;
> > > > > > @@ -3361,7 +3407,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> > > > > >  			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > > > > >  					       obj, cache_level);
> > > > > >  
> > > > > > -		i915_gem_obj_ggtt_set_color(obj, cache_level);
> > > > > > +		i915_gem_obj_set_color(obj, vm, cache_level);
> > > > > >  	}
> > > > > >  
> > > > > >  	if (cache_level == I915_CACHE_NONE) {
> > > > > > @@ -3421,6 +3467,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > > > >  			       struct drm_file *file)
> > > > > >  {
> > > > > >  	struct drm_i915_gem_caching *args = data;
> > > > > > +	struct drm_i915_private *dev_priv;
> > > > > >  	struct drm_i915_gem_object *obj;
> > > > > >  	enum i915_cache_level level;
> > > > > >  	int ret;
> > > > > > @@ -3445,8 +3492,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
> > > > > >  		ret = -ENOENT;
> > > > > >  		goto unlock;
> > > > > >  	}
> > > > > > +	dev_priv = obj->base.dev->dev_private;
> > > > > >  
> > > > > > -	ret = i915_gem_object_set_cache_level(obj, level);
> > > > > > +	/* FIXME: Add interface for specific VM? */
> > > > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
> > > > > >  
> > > > > >  	drm_gem_object_unreference(&obj->base);
> > > > > >  unlock:
> > > > > > @@ -3464,6 +3513,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > > >  				     u32 alignment,
> > > > > >  				     struct intel_ring_buffer *pipelined)
> > > > > >  {
> > > > > > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> > > > > >  	u32 old_read_domains, old_write_domain;
> > > > > >  	int ret;
> > > > > >  
> > > > > > @@ -3482,7 +3532,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > > >  	 * of uncaching, which would allow us to flush all the LLC-cached data
> > > > > >  	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
> > > > > >  	 */
> > > > > > -	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
> > > > > > +	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
> > > > > > +					      I915_CACHE_NONE);
> > > > > >  	if (ret)
> > > > > >  		return ret;
> > > > > >  
> > > > > > @@ -3490,7 +3541,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > > > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > > > > >  	 * always use map_and_fenceable for all scanout buffers.
> > > > > >  	 */
> > > > > > -	ret = i915_gem_object_pin(obj, alignment, true, false);
> > > > > > +	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
> > > > > >  	if (ret)
> > > > > >  		return ret;
> > > > > >  
> > > > > > @@ -3633,6 +3684,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> > > > > >  
> > > > > >  int
> > > > > >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > > > +		    struct i915_address_space *vm,
> > > > > >  		    uint32_t alignment,
> > > > > >  		    bool map_and_fenceable,
> > > > > >  		    bool nonblocking)
> > > > > > @@ -3642,26 +3694,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> > > > > >  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
> > > > > >  		return -EBUSY;
> > > > > >  
> > > > > > -	if (i915_gem_obj_ggtt_bound(obj)) {
> > > > > > -		if ((alignment && i915_gem_obj_ggtt_offset(obj) & (alignment - 1)) ||
> > > > > > +	BUG_ON(map_and_fenceable && !i915_is_ggtt(vm));
> > > > > 
> > > > > WARN_ON, since presumably we can keep on going if we get this wrong
> > > > > (albeit with slightly corrupted state, so render corruptions might
> > > > > follow).
> > > > 
> > > > Can we make a deal, can we leave this as BUG_ON with a FIXME to convert
> > > > it at the end of merging?
> > > 
> > > Adding a FIXME right above it will cause equal amounts of conflicts, so I
> > > don't see the point that much ...
> > 
> > I'm just really fearful that in doing the reworks I will end up with
> > this condition, and I am afraid I will miss them if it's a WARN_ON.
> > Definitely it's more likely to miss than a BUG.
> > 
> > Also, and we've disagreed on this a few times by now, this is an
> > internal interface which I think should carry such a fatal error for
> > this level of mistake.
> 
> Ime every time I argue this with myself and state your case it ends up
> biting me horribly because I'm regularly too incompetent and hit my very
> on BUG_ONs ;-) Hence why I insist so much on using WARN_ON wherever
> possible. Of course if people don't check they're logs that's a different
> matter (*cough* Jesse *cough*) ...
> 
> > In any case I've made the change locally. Will yell at you later if I
> > was right.
> 
> Getting yelled at is part of my job, so bring it on ;-)
> 
> [snip]
> 
> > > > > > @@ -4645,11 +4719,93 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > > > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > > > > >  		if (obj->pages_pin_count == 0)
> > > > > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > > > > -	list_for_each_entry(obj, &vm->inactive_list, mm_list)
> > > > > > -		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > > > -			cnt += obj->base.size >> PAGE_SHIFT;
> > > > > > +
> > > > > > +	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
> > > > > > +		list_for_each_entry(obj, &vm->inactive_list, global_list)
> > > > > > +			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > > > > +				cnt += obj->base.size >> PAGE_SHIFT;
> > > > > 
> > > > > Isn't this now double-counting objects? In the shrinker we only care about
> > > > > how much physical RAM an object occupies, not how much virtual space it
> > > > > occupies. So just walking the bound list of objects here should be good
> > > > > enough ...
> > > > > 
> > > > 
> > > > Maybe I've misunderstood you. My code is wrong, but I think you're idea
> > > > requires a prep patch because it changes functionality, right?
> > > > 
> > > > So let me know if I've understood you.
> > > 
> > > Don't we have both the bound and unbound list? So we could just switch
> > > over to counting the bound objects here ... Otherwise yes, we need a prep
> > > patch to create the bound list first.
> > 
> > Of course there is a bound list.
> > 
> > The old code automatically added the size of unbound objects with
> > unpinned pages, and unpinned inactive objects with unpinned pages.
> > 
> > The latter check for inactive, needs to be checked for all VMAs. That
> > was my point.
> 
> Oh right. The thing is that technically there's no reason to not also scan
> the active objects, i.e. just the unbound list. So yeah, sounds like we
> need a prep patch to switch to the unbound list here first. My apologies
> for being dense and not fully grasphing this right away.
> 
> > 
> > > 
> > > > 
> > > > > >  
> > > > > >  	if (unlock)
> > > > > >  		mutex_unlock(&dev->struct_mutex);
> > > > > >  	return cnt;
> > > > > >  }
> > > > > > +
> > > > > > +/* All the new VM stuff */
> > > > > > +unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
> > > > > > +				  struct i915_address_space *vm)
> > > > > > +{
> > > > > > +	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
> > > > > > +	struct i915_vma *vma;
> > > > > > +
> > > > > > +	if (vm == &dev_priv->mm.aliasing_ppgtt->base)
> > > > > > +		vm = &dev_priv->gtt.base;
> > > > > > +
> > > > > > +	BUG_ON(list_empty(&o->vma_list));
> > > > > > +	list_for_each_entry(vma, &o->vma_list, vma_link) {
> > > > > 
> > > > > Imo the vma list walking here and in the other helpers below indicates
> > > > > that we should deal more often in vmas instead of (object, vm) pairs. Or
> > > > > is this again something that'll get fixed later on?
> > > > > 
> > > > > I just want to avoid diff churn, and it also makes reviewing easier if the
> > > > > foreshadowing is correct ;-) So generally I'd vote for more liberal
> > > > > sprinkling of obj_to_vma in callers.
> > > > 
> > > > It's not something I fixed in the whole series. I think it makes sense
> > > > conceptually, to keep some things as <obj,vm> and others as direct vma.
> > > > 
> > > > If you want me to change something, you need to be more specific since
> > > > no action specifically comes to mind at this point in the series.
> > > 
> > > It's just that the (obj, vm) -> vma lookup is a list-walk, so imo we
> > > should try to avoid it whenever possible. Since the vma has both and obj
> > > and a vm pointer the vma is imo strictly better than the (obj, vm) pair.
> > > And the look-up should be pushed down the callchain as much as possible.
> > > 
> > > So I think generally we want to pass the vma around to functions
> > > everywhere, and the (obj, vm) pair would be the exception (which needs
> > > special justification).
> > > 
> > 
> > Without actually coding it, I am not sure. I think there are probably a
> > decent number of reasonable exceptions where we want the object (ie.
> > it's not really that much of a special case). In any case, I think we'll
> > find you have to do this list walk at some point in the call chain
> > anyway, but I can try to start changing around the code as a patch on
> > top of this. I really want to leave as much as this patch in place as
> > is, since it's decently tested (pre-rebase at least).
> 
> Ok, I can life with this if we clean things up afterwards. But imo
> vma->obj isn't worse for readability than just obj, and passing pairs of
> (obj,vm) around all the time just feels wrong conceptually. In C we have
> structs for this, and since we already have a suitable one created we
> might as well use it.
> 
> Aside: I know that we have the (ring, seqno) pair splattered all over the
> code. It's been on my todo to fix that ever since I've proposed to add a
> i915_gpu_sync_cookie with the original multi-ring enabling. As you can see
> I've been ignored, but I hope we can finally fix this with the dma_buf
> fence rework.
> 
> And we did just recently discover a bug where such a (ring, seqno) pair
> got mixed up, so imo not using vma is fraught with unecessary peril.
> 
> [snip]

As I wrote this code, I would say things as, "do this operation on that
object, in so and so address space." That's more or less why I kept
everything as obj, vm pairs. Since GEM is about BOs, at some fundamental
level we're always talking about objects, and you need to translate an
<obj,VM> pair into a VMA. The other motivation was of course to make the
diffs more manageable and easier to write. It's way easier to add a new
VM argument, than it is to do even more plumbing. As stated now in several
places, I agree it very well might make sense to convert many functions
in the call chain to VMAs, and do the lookup at the entry point from
userspace, or whenever it's logical; and indeed in the long run I think
it will lead to less buggy code.

> 
> > > > > > @@ -158,13 +158,18 @@ int
> > > > > >  i915_gem_evict_everything(struct drm_device *dev)
> > > > > 
> > > > > I suspect evict_everything eventually wants a address_space *vm argument
> > > > > for those cases where we only want to evict everything in a given vm. Atm
> > > > > we have two use-cases of this:
> > > > > - Called from the shrinker as a last-ditch effort. For that it should move
> > > > >   _every_ object onto the unbound list.
> > > > > - Called from execbuf for badly-fragmented address spaces to clean up the
> > > > >   mess. For that case we only care about one address space.
> > > > 
> > > > The current thing is more or less a result of Chris' suggestions. A
> > > > non-posted iteration did plumb the vm, and after reworking to the
> > > > suggestion made by Chris, the vm didn't make much sense anymore.
> > > > 
> > > > For point #1, it requires VM prioritization I think. I don't really see
> > > > any other way to fairly manage it.
> > > 
> > > The shrinker will rip out  objects in lru order by walking first unbound
> > > and then bound objects. That's imo as fair as it gets, we don't need
> > > priorities between vms.
> > 
> > If you pass in a vm, the semantics would be, evict everything for the
> > vm, right?
> 
> Yes.
> 
> > 
> > > 
> > > > For point #2, that I agree it might be useful, but we can easily create
> > > > a new function, and not call it "shrinker" to do it. 
> > > 
> > > Well my point was that this function is called
> > > i915_gem_evict_everything(dev, vm) and for the first use case we simply
> > > pass in vm = NULL. But essentially thrashing the vm should be rare enough
> > > that for now we don't need to care.
> > > 
> > 
> > IIRC, this is exactly how my original patch worked pre-Chris.
> 
> Oops. Do you or Chris still now the argument for changing things? Maybe I
> just don't see another facet of the issue at hand ...
> 
> [snip]
> 
> > > > > > @@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
> > > > > >  static int
> > > > > >  i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > > > >  			    struct list_head *objects,
> > > > > > +			    struct i915_address_space *vm,
> > > > > >  			    bool *need_relocs)
> > > > > >  {
> > > > > >  	struct drm_i915_gem_object *obj;
> > > > > > @@ -529,32 +538,37 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
> > > > > >  		list_for_each_entry(obj, objects, exec_list) {
> > > > > >  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> > > > > >  			bool need_fence, need_mappable;
> > > > > > +			u32 obj_offset;
> > > > > >  
> > > > > > -			if (!i915_gem_obj_ggtt_bound(obj))
> > > > > > +			if (!i915_gem_obj_bound(obj, vm))
> > > > > >  				continue;
> > > > > 
> > > > > I wonder a bit how we could avoid the multipler (obj, vm) -> vma lookups
> > > > > here ... Maybe we should cache them in some pointer somewhere (either in
> > > > > the eb object or by adding a new pointer to the object struct, e.g.
> > > > > obj->eb_vma, similar to obj->eb_list).
> > > > > 
> > > > 
> > > > I agree, and even did this at one unposted patch too. However, I think
> > > > it's a premature optimization which risks code correctness. So I think
> > > > somewhere a FIXME needs to happen to address that issue. (Or if Chris
> > > > complains bitterly about some perf hit).
> > > 
> > > If you bring up code correctness I'd vote strongly in favour of using vmas
> > > everywhere - vma has the (obj, vm) pair locked down, doing the lookup all
> > > the thing risks us mixing them up eventually and creating a hella lot of
> > > confusion ;-)
> > 
> > I think this is addressed with the previous comments.
> 
> See my example for (ring, seqno). I really strongly believe passing pairs
> is the wrong thing and passing structs is the right thing. Especially if
> we have one at hand. But I'm ok if you want to clean this up afterwards.
> 
> Ofc if the cleanup afterwards doesn't happend I'll be a bit pissed ;-)
> 
> [snip]

I think it's fair to not merge anything until I do it. I'm just asking
that you let me do it as a patch on top of the series as opposed to
doing a bunch of rebasing which I've shown historically time and again
to screw up. Whether or not I should be better at rebasing isn't up for
debate.

> 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > > index 245eb1d..bfe61fa 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > > > @@ -391,7 +391,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > > > >  	if (gtt_offset == I915_GTT_OFFSET_NONE)
> > > > > >  		return obj;
> > > > > >  
> > > > > > -	vma = i915_gem_vma_create(obj, &dev_priv->gtt.base);
> > > > > > +	vma = i915_gem_vma_create(obj, vm);
> > > > > >  	if (!vma) {
> > > > > >  		drm_gem_object_unreference(&obj->base);
> > > > > >  		return NULL;
> > > > > > @@ -404,8 +404,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > > > >  	 */
> > > > > >  	vma->node.start = gtt_offset;
> > > > > >  	vma->node.size = size;
> > > > > > -	if (drm_mm_initialized(&dev_priv->gtt.base.mm)) {
> > > > > > -		ret = drm_mm_reserve_node(&dev_priv->gtt.base.mm, &vma->node);
> > > > > > +	if (drm_mm_initialized(&vm->mm)) {
> > > > > > +		ret = drm_mm_reserve_node(&vm->mm, &vma->node);
> > > > > 
> > > > > These two hunks here for stolen look fishy - we only ever use the stolen
> > > > > preallocated stuff for objects with mappings in the global gtt. So keeping
> > > > > that explicit is imo the better approach. And tbh I'm confused where the
> > > > > local variable vm is from ...
> > > > 
> > > > If we don't create a vma for it, we potentially have to special case a
> > > > bunch of places, I think. I'm not actually sure of this, but the
> > > > overhead to do it is quite small.
> > > > 
> > > > Anyway, I'll look this over again nd see what I think.
> > > 
> > > I'm not against the vma, I've just wonedered why you do the
> > > /dev_priv->gtt.base/vm/ replacement here since
> > > - it's never gonna be used with another vm than ggtt
> > > - this patch doesn't add the vm variable, so I'm even more confused where
> > >   this started ;-)
> > 
> > It started from the rebase. In the original series, I did that
> > "deferred_offset" thing, and having a vm variable made that code pass
> > the checkpatch.pl. There wasn't a particular reason for naming it vm
> > other than I had done it all over the place.
> > 
> > I've fixed this locally, leaving the vma, and renamed the local variable
> > ggtt. It still has 3 uses in this function, so it's a bit less typing.
> 
> Yeah, make sense to keep it then.
> 
> Cheers, Daniel

So here is the plan summing up from this mail thread and the other
one...

I'm calling the series done until we get feedback from Chris on the
eviction/shrinker code. Basically whatever he signs off on, I am willing
to implement (assuming you agree). I am confident enough in my rebase
abilities to at least fix that up when we get to it.

On top of this series, I am going smash the 3 patches which introduce
bind/unbind function pointers. I am doing this because all this code
already exists, plays nicely together, and has been tested.

Finally on top of this, I am going to have a [fairly large] patch to
squash <obj,vm> pair into <vma> where I feel its appropriate. My plan
for this is to basically look at a squashed version of all the patches
where I introduce both arguments in one call, think about if it makes
sense to me to think about the operation in terms of an object, or a
vma, and then change as appropriate.

So the feedback I need from you:
1. Does that sound reasonable? If you want to call it weird, that's fine,
but I think the diff churn is acceptable, and the series should be quite
bisectable.
2. Are you aware of anything else I missed in other patch review which I
haven't mentioned. I cannot find anything, and I think except for this
patch I had updated everything locally.

You can see the 14 patches (11 here + 3 from the others series):
http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt2

Hopefully this was all taken in good humor. I am not upset or anything,
we just need to figure out how we can make some progress on this soon
without making total crap, and without taking all my time (since I have
a few other things to work on). I know you warned me about the latter a
few weeks ago, but well, reality.

-- 
Ben Widawsky, Intel Open Source Technology Center