[Intel-gfx] [PATCH v3] drm/amdgpu: Transfer fences to dmabuf importer

Tue Aug 7 12:47:12 UTC 2018

On Tue, Aug 07, 2018 at 02:43:22PM +0200, Daniel Vetter wrote:
> On Tue, Aug 07, 2018 at 01:23:19PM +0200, Christian König wrote:
> > Am 07.08.2018 um 13:05 schrieb Chris Wilson:
> > > amdgpu only uses shared-fences internally, but dmabuf importers rely on
> > > implicit write hazard tracking via the reservation_object.fence_excl.
> > 
> > Well I would rather suggest a solution where we stop doing this.
> > 
> > The problem here is that i915 assumes that a write operation always needs
> > exclusive access to an object protected by an reservation object.
> > 
> > At least for amdgpu, radeon and nouveau this assumption is incorrect, but
> > only amdgpu has a design where userspace is not lying to the kernel about
> > it's read/write access.
> > 
> > What we should do instead is to add a flag to each shared fence to note if
> > it is a write operation or not. Then we can trivially add a function to wait
> > on on those in i915.
> > 
> > I should have pushed harder for this solution when the problem came up
> > initially,
> 
> For shared buffers in an implicit syncing setup like dri2 the exclusive
> fence _is_ your write fence. That's how this stuff works. Note it's only
> for implicit fencing for dri2 shared buffers. i915 lies as much as
> everyone else for explicit syncing.

Just realized that dri3 is exactly in the same boat still afaiui. Anyway,
tldr here isn't that i915 is the exception, but amdgpu. If you have a
shared buffer used in an implicitly synced protocol, you have to set the
exclusive fence to guard writes. How you do that is up to you really. And
if a bitfield in reservation_object would help in tracking these I guess
we could add that, but I think that could as well just be put into an
amdgpu structure somewhere. And would be less confusing for everyone else
if it's not in struct reservation_object.
-Daniel

> 
> Now as long as you only share within amdgpu you can do whatever you want
> to, but for anything shared outside of amdgpu, if the buffer isn't shared
> through an explicit syncing protocol, then you do have to set the
> exclusive fence. That's at least how this stuff works right now with all
> other drivers.
> 
> For i915 we had to do some uapi trickery to make this work in all cases,
> since only really userspace knows whether a write should be a shared or
> exclusive write. But that's stricly an issue limited to your driver, and
> dosn't need changes to reservation object all throughout the stack.
> 
> Aside: If you want to attach multiple write fences to the exclusive fence,
> that should be doable with the array fences.
> -Daniel
> 
> > Christian.
> > 
> > > For example, the importer use the write hazard for timing a page flip to
> > > only occur after the exporter has finished flushing its write into the
> > > surface. As such, on exporting a dmabuf, we must either flush all
> > > outstanding fences (for we do not know which are writes and should have
> > > been exclusive) or alternatively create a new exclusive fence that is
> > > the composite of all the existing shared fences, and so will only be
> > > signaled when all earlier fences are signaled (ensuring that we can not
> > > be signaled before the completion of any earlier write).
> > > 
> > > v2: reservation_object is already locked by amdgpu_bo_reserve()
> > > 
> > > Testcase: igt/amd_prime/amd-to-i915
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Alex Deucher <alexander.deucher at amd.com>
> > > Cc: "Christian König" <christian.koenig at amd.com>
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 68 ++++++++++++++++++++---
> > >   1 file changed, 60 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> > > index 1c5d97f4b4dd..576a83946c25 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> > > @@ -37,6 +37,7 @@
> > >   #include "amdgpu_display.h"
> > >   #include <drm/amdgpu_drm.h>
> > >   #include <linux/dma-buf.h>
> > > +#include <linux/dma-fence-array.h>
> > >   static const struct dma_buf_ops amdgpu_dmabuf_ops;
> > > @@ -188,6 +189,57 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
> > >   	return ERR_PTR(ret);
> > >   }
> > > +static int
> > > +__reservation_object_make_exclusive(struct reservation_object *obj)
> > > +{
> > > +	struct reservation_object_list *fobj;
> > > +	struct dma_fence_array *array;
> > > +	struct dma_fence **fences;
> > > +	unsigned int count, i;
> > > +
> > > +	fobj = reservation_object_get_list(obj);
> > > +	if (!fobj)
> > > +		return 0;
> > > +
> > > +	count = !!rcu_access_pointer(obj->fence_excl);
> > > +	count += fobj->shared_count;
> > > +
> > > +	fences = kmalloc_array(count, sizeof(*fences), GFP_KERNEL);
> > > +	if (!fences)
> > > +		return -ENOMEM;
> > > +
> > > +	for (i = 0; i < fobj->shared_count; i++) {
> > > +		struct dma_fence *f =
> > > +			rcu_dereference_protected(fobj->shared[i],
> > > +						  reservation_object_held(obj));
> > > +
> > > +		fences[i] = dma_fence_get(f);
> > > +	}
> > > +
> > > +	if (rcu_access_pointer(obj->fence_excl)) {
> > > +		struct dma_fence *f =
> > > +			rcu_dereference_protected(obj->fence_excl,
> > > +						  reservation_object_held(obj));
> > > +
> > > +		fences[i] = dma_fence_get(f);
> > > +	}
> > > +
> > > +	array = dma_fence_array_create(count, fences,
> > > +				       dma_fence_context_alloc(1), 0,
> > > +				       false);
> > > +	if (!array)
> > > +		goto err_fences_put;
> > > +
> > > +	reservation_object_add_excl_fence(obj, &array->base);
> > > +	return 0;
> > > +
> > > +err_fences_put:
> > > +	for (i = 0; i < count; i++)
> > > +		dma_fence_put(fences[i]);
> > > +	kfree(fences);
> > > +	return -ENOMEM;
> > > +}
> > > +
> > >   /**
> > >    * amdgpu_gem_map_attach - &dma_buf_ops.attach implementation
> > >    * @dma_buf: shared DMA buffer
> > > @@ -219,16 +271,16 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf,
> > >   	if (attach->dev->driver != adev->dev->driver) {
> > >   		/*
> > > -		 * Wait for all shared fences to complete before we switch to future
> > > -		 * use of exclusive fence on this prime shared bo.
> > > +		 * We only create shared fences for internal use, but importers
> > > +		 * of the dmabuf rely on exclusive fences for implicitly
> > > +		 * tracking write hazards. As any of the current fences may
> > > +		 * correspond to a write, we need to convert all existing
> > > +		 * fences on the resevation object into a single exclusive
> > > +		 * fence.
> > >   		 */
> > > -		r = reservation_object_wait_timeout_rcu(bo->tbo.resv,
> > > -							true, false,
> > > -							MAX_SCHEDULE_TIMEOUT);
> > > -		if (unlikely(r < 0)) {
> > > -			DRM_DEBUG_PRIME("Fence wait failed: %li\n", r);
> > > +		r = __reservation_object_make_exclusive(bo->tbo.resv);
> > > +		if (r)
> > >   			goto error_unreserve;
> > > -		}
> > >   	}
> > >   	/* pin buffer into GTT */
> > 
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch