[PATCH 1/8] drm: execution context for GEM buffers v2
Daniel Vetter
daniel at ffwll.ch
Wed May 11 15:23:49 UTC 2022
On Mon, May 09, 2022 at 05:01:33PM +0200, Christian König wrote:
> Am 09.05.22 um 16:31 schrieb Daniel Vetter:
> > On Wed, May 04, 2022 at 09:47:32AM +0200, Christian König wrote:
> > > [SNIP]
> > > +/* Make sure we have enough room and add an object to the container */
> > > +static int drm_exec_objects_add(struct drm_exec_objects *container,
> > > + struct drm_gem_object *obj)
> > > +{
> > > + if (unlikely(container->num_objects == container->max_objects)) {
> > > + size_t size = container->max_objects * sizeof(void *);
> > > + void *tmp;
> > > +
> > > + tmp = kvrealloc(container->objects, size, size + PAGE_SIZE,
> > > + GFP_KERNEL);
> > > + if (!tmp)
> > > + return -ENOMEM;
> > > +
> > > + container->objects = tmp;
> > > + container->max_objects += PAGE_SIZE / sizeof(void *);
> > Might be worth querying the actual allocation size here, since for
> > kmalloc the generic buckets only come in doubling sizes, so once the
> > allocation is big it grows a lot faster than PAGE_SIZE.
> >
> > But also krealloc checks this internally already so maybe better to not
> > break the abstraction.
>
> How can I actually do this? ksize() only works with kmalloc().
>
> Or do we have a function to figure out if vmalloc or kmalloc was used by
> kvrealloc()?
kvfree() has an is_vmalloc_addr() check, so it would boil down to
open-coding that a bit.
Probably not worth the trouble really. Otoh, looking at kvrealloc() it
doesn't use krealloc() underneath, so it's not doing that check either.
Maybe we should just push that check into kvrealloc() for the
!is_vmalloc_addr() case.
> > > [SNIP]
> > > +/**
> > > + * drm_exec_cleanup - cleanup when contention is detected
> > > + * @exec: the drm_exec object to cleanup
> > > + *
> > > + * Cleanup the current state and return true if we should stay inside the retry
> > > + * loop, false if there wasn't any contention detected and we can keep the
> > > + * objects locked.
> > > + */
> > > +bool drm_exec_cleanup(struct drm_exec *exec)
> > > +{
> > > + if (likely(!exec->contended)) {
> > > + ww_acquire_done(&exec->ticket);
> > > + return false;
> > > + }
> > > +
> > > + if (likely(exec->contended == DRM_EXEC_DUMMY)) {
> > > + exec->contended = NULL;
> > > + ww_acquire_init(&exec->ticket, &reservation_ww_class);
> > Not sure why this is here instead of in _init()? I thought you were playing
> > some really dangerous tricks with re-initting the acquire ctx, which would
> > at least be questionable, but it doesn't look like that.
>
> That was my initial design, but the problem with this approach is that all
> locks taken between drm_exec_init() and the loop suddenly have a lockdep
> dependency on reservation_ww_class. And that in turn goes boom immediately.
>
> Took me a moment to realize what's wrong with that as well.
Uh crap, indeed. I think minimally this needs to be documented, but
personally I'm leaning towards a drm_exec_prepare_init(), which does this
explicitly.
I do agree we need this split, especially so we can eventually add helpers
for bo lookup, or maybe userptr/hmm prep and things like that, which all
have to happen outside of the acquire_ctx.
> > [SNIP]
> > +/**
> > + * drm_exec_has_duplicates - check for duplicated GEM object
> > + * @exec: drm_exec object
> > + *
> > + * Return true if the drm_exec object has encountered some already locked GEM
> > + * objects while trying to lock them. This can happen if multiple GEM objects
> > + * share the same underlying resv object.
> > + */
> > +static inline bool drm_exec_has_duplicates(struct drm_exec *exec)
> > +{
> > + return exec->duplicates.num_objects > 0;
> > Definitely an aside, but in our i915 efforts to get rid of temporary pins
> > we ran into some fun where the eviction code couldn't differentiate between
> > memory we need reserved for the CS and memory we just keep locked because
> > we evicted it, and fun stuff like that. So maybe we need a bit more
> > tracking here eventually, but that's only once we have this somehow glued
> > into the ttm eviction code.
>
> Hehe, yeah that's what I was thinking about as well. But then I thought one
> step at a time.
>
> > Also the even more massive step would be to glue this into dma-buf so you
> > can do dynamic dma-buf eviction and still keep track of all the buffers. I
> > think with some clever pointer tagging and a bit more indirection we could
> > nest drm_exec structures (so that a driver could insert its entire
> > drm_exec structure with a drm_exec-level callback for handling refcounting
> > and stuff like that).
>
> I considered in which component to put this quite a bit as well, but then
> intentionally decided against DMA-buf.
>
> One major reason was that not all buffers which needs to be locked this way
> are actually exported as DMA-buf.
>
> Another reason is that DMA-buf doesn't necessary need a concept of an
> execution context. As far as I can see that's something GPU/DRM driver
> specific.
Yeah, I think putting this into the driver subsystem is right. I just
wanted to point out that even with that design we can still pretty easily
put this into dma-buf eventually, and still have the benefit that each
driver has a structure which operates on its native buffer object.
> > So anyway I think this all looks good, just one more thing before I think
> > we should land this:
> >
> > The gem helpers already have something practically compatible in
> > drm_gem_lock_reservations(), except that it bulk-adds the entire set of
> > objects. I think if you add a bulk-prepare function then we could also
> > replace all those. Maybe even nicer if the bulk-prepare takes the array of
> > handles and does the handle lookup too, but at least something which can
> > substitute drm_gem_lock_reservations() with drm_exec would be nice, to
> > validate the helpers a bit more and really make sure we only have one of
> > them left.
>
> I was considering that as well, but then also thought one step at a time.
> Not sure if it's possible to look up handles without running into some
> locking fun, though.
Since you pointed out the acquire_ctx fun I think that's really the only
issue. It's the combo of e.g. v3d_lookup_bos() and
v3d_lock_bo_reservations().
-Daniel
> Thanks for the review,
> Christian.
>
> >
> > Thoughts?
> > -Daniel
> >
> > > +}
> > > +
> > > +void drm_exec_init(struct drm_exec *exec, bool interruptible);
> > > +void drm_exec_fini(struct drm_exec *exec);
> > > +bool drm_exec_cleanup(struct drm_exec *exec);
> > > +int drm_exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
> > > + unsigned int num_fences);
> > > +
> > > +#endif
> > > --
> > > 2.25.1
> > >
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch