[PATCH v4] drm/amdgpu: Transfer fences to dmabuf importer

Tue Aug 7 18:14:19 UTC 2018

Quoting Christian König (2018-08-07 18:57:16)
> Am 07.08.2018 um 18:08 schrieb Chris Wilson:
> > amdgpu only uses shared-fences internally, but dmabuf importers rely on
> > implicit write hazard tracking via the reservation_object.fence_excl.
> > For example, the importer use the write hazard for timing a page flip to
> > only occur after the exporter has finished flushing its write into the
> > surface. As such, on exporting a dmabuf, we must either flush all
> > outstanding fences (for we do not know which are writes and should have
> > been exclusive) or alternatively create a new exclusive fence that is
> > the composite of all the existing shared fences, and so will only be
> > signaled when all earlier fences are signaled (ensuring that we can not
> > be signaled before the completion of any earlier write).
> >
> > v2: reservation_object is already locked by amdgpu_bo_reserve()
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341
> > Testcase: igt/amd_prime/amd-to-i915
> > References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Alex Deucher <alexander.deucher at amd.com>
> > Cc: "Christian König" <christian.koenig at amd.com>
> > ---
> >
> > This time, hopefully proofread and references complete.
> > -Chris
> >
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 68 ++++++++++++++++++++---
> >   1 file changed, 60 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> > index 1c5d97f4b4dd..dff2b01a3d89 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> > @@ -37,6 +37,7 @@
> >   #include "amdgpu_display.h"
> >   #include <drm/amdgpu_drm.h>
> >   #include <linux/dma-buf.h>
> > +#include <linux/dma-fence-array.h>
> >   
> >   static const struct dma_buf_ops amdgpu_dmabuf_ops;
> >   
> > @@ -188,6 +189,57 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
> >       return ERR_PTR(ret);
> >   }
> >   
> > +static int
> > +__reservation_object_make_exclusive(struct reservation_object *obj)
> > +{
> > +     struct reservation_object_list *fobj;
> > +     struct dma_fence_array *array;
> > +     struct dma_fence **fences;
> > +     unsigned int count, i;
> > +
> > +     fobj = reservation_object_get_list(obj);
> > +     if (!fobj)
> > +             return 0;
> > +
> > +     count = !!rcu_access_pointer(obj->fence_excl);
> > +     count += fobj->shared_count;
> > +
> > +     fences = kmalloc_array(count, sizeof(*fences), GFP_KERNEL);
> > +     if (!fences)
> > +             return -ENOMEM;
> > +
> > +     for (i = 0; i < fobj->shared_count; i++) {
> > +             struct dma_fence *f =
> > +                     rcu_dereference_protected(fobj->shared[i],
> > +                                               reservation_object_held(obj));
> > +
> > +             fences[i] = dma_fence_get(f);
> > +     }
> > +
> > +     if (rcu_access_pointer(obj->fence_excl)) {
> > +             struct dma_fence *f =
> > +                     rcu_dereference_protected(obj->fence_excl,
> > +                                               reservation_object_held(obj));
> > +
> > +             fences[i] = dma_fence_get(f);
> > +     }
> > +
> > +     array = dma_fence_array_create(count, fences,
> > +                                    dma_fence_context_alloc(1), 0,
> > +                                    false);
> > +     if (!array)
> > +             goto err_fences_put;
> > +
> > +     reservation_object_add_excl_fence(obj, &array->base);
> > +     return 0;
> > +
> > +err_fences_put:
> > +     for (i = 0; i < count; i++)
> > +             dma_fence_put(fences[i]);
> > +     kfree(fences);
> > +     return -ENOMEM;
> > +}
> > +
> 
> This can be simplified a lot. See amdgpu_pasid_free_delayed() for an 
> example:

{
	if (!reservation_object_get_list(obj))
		return 0;

	r = reservation_object_get_fences_rcu(obj, NULL, &count, &fences);
	if (r)
		return r;

	array = dma_fence_array_create(count, fences,
				       dma_fence_context_alloc(1), 0,
				       false);
	if (!array)
		goto err_fences_put;

	reservation_object_add_excl_fence(obj, &array->base);
	return 0;

err:
	...
}

My starting point was going to be use get_fences_rcu, but get_fences_rcu
can hardly be called simple for where the lock is required to be held
start to finish ;)
-Chris