drm prime/dma-buf import lifetime issue

Daniel Vetter daniel at ffwll.ch
Tue Dec 18 00:18:34 PST 2012


Hi Dave!

On Tue, Dec 18, 2012 at 5:38 AM, Dave Airlie <airlied at gmail.com> wrote:
> So I've gotten back to playing with prime for a day, and found some
> old intel/radeon tests I had failing,
>
> Tracked it down to a lifetime issue with the current code and can
> think of two fixes,
>
> The problem scenario is
>
> 1. i915: create gem object
> 2. i915: export gem object to prime
> 3. radeon: import gem object
> 4. close prime fd
> 5. radeon: unref object
> 6. i915: unref object
>
> So we end up at this point, with a imported buffer record for the
> dma_buf on the i915 file private.
>
> Now if a subsequent test (without closing the drm fd) reallocates the
> dma_buf with the same address, we can end up seeing that.
>
> So why doesn't that reference get cleaned up?
>
> So the reference gets added above at step 2, and when radeon unrefs
> the object, i915 gets the dma-buf release callback, however at that
> stage
> we don't actually have the file priv to remove the pointer from, so it
> dangles there.

I haven't checked yet, but we seem to have similar leaks by simply
self-importing a dma-buf on a different open fd of the same driver
already (i.e. the wayland/dri3/rendernode use-case).

> Possible fixes:
>
> a) take a reference on the dma_buf attached to the gem handle when we
> export it, keep it until the gem handle goes away. I'm unsure if this
> could create zombie objects, since the dma buf has a reference on the
> gem object, but since the gem handle is separate to the object it
> might work.

I'm leaning towards this option, since it'll give us cleaner lifetime rules:
- As long as we still have gem handle around, we keep an explicit
reference to the dma_buf. We need to keep a full reference since
otherwise re-exporting by doing reusing the cached dma_buf will be a
royal pain and fraught with races. From a drm core/prime perspective I
think we could even unify the 2 dma_buf poiners we have currently -
drivers could keep track of whether a gem bo is native or imported in
their own state with just one bit ...
- If gem->handle_refcount drops to zero we clean up the dma-buf
pointer&cache entry and drop that reference. We need to be careful
here with races against other users resurrect the object with a new
userspace gem handle. But essentially it's the same dance as with
flink, with the only complication that we now have two ways to create
a new gem handle while we're doing the teardown when the last gem
handle is closed.

This way dma-buf mirrors the lifetime rules of flink, where we hold a
reference on the lookup object (flink name/dma-buf) for as long a gem
handle exists. flink names are driver private there's no need for a
real refcount, but the current rules amount to an implicit refcount
with only 0,1 as values. The lookup-object themselves also hold a
reference onto the real backing storage.

The second reason why I think we should give the dma-buf (re-)import
cache a real reference is my framebuffer lifetime rework. I've played
around with tons of different schemes which all kinda looked somewhat
silly, until settling on holding a reference while the object is in
the lookup idr. This reference gets dropped when the real user
disappears: Either at rmfb ioctl time for userspace fbs, or at driver
destruction for the fbdev fb (or whenever the driver drops an
internally-created fb).

Ime that pattern of keeping an explicit reference just leads to less
locking headache.

> b) don't keep track of dma_buf, keep track of gem objects, when we get
> a lookup, check inside the gem object, since currently we NULL out the
> export_dma_buf when we hit the release path, apart from the fact I'm
> sure the locking is foobar,

Scrolled through your patch quickly, but didn't do a full review. Good
thing is that I've resurrect my ivb/nvc0 test machine and beaten the
igt/prime tests a bit into shape, so I should be able to test things
(not like last time around, where my attempt to fix up lifetime rules
a bit resulted in some unduly oopses).

I'll look at this in-depth next year, pls yell in case I'll forget by then ;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


More information about the dri-devel mailing list