[PATCH] drm: Replace kref with a simple atomic reference count

Daniel Vetter daniel at ffwll.ch
Sun Nov 28 07:13:42 PST 2010


Hi Thomas,

On Sun, Nov 28, 2010 at 03:19:50PM +0100, Thomas Hellstrom wrote:
> I'm not saying that the current gem code is incorrect. In
> particular, the gem name system allows upping the handle_count even
> if it's zero and the name is just about to be destroyed. I see that
> more as a curiosity, and gem would work just as well if it used
> kref_get_unless_zero on the handle_count.

Yeah, noticed that, too. But that's just userspace racing against itself,
so no problem.

> However this is a generic problem.
> It appears in the general user-visible object case. For example,
> currently gem needs to take the struct_mutex *before* unreferencing
> to avoid hitting this problem with a racing mmap lookup (At least I
> think that's why the struct_mutex is taken).  If mmap were using
> kref_get_unless_zero, and returning error on failure, the
> struct_mutex would be needed only inside the kref release function,
> and that is what Dave was referring to in his mail.

Indeed, I haven't noticed that. drm/i915 doesn't hold a ref to the
underlying bo for the offset_hash, which requires the dev->struct_mutex
around unref. An actual mmap gets itself a ref (protected by
dev->struct_mutex) so from there on its all fine. I think the offset_hash
should hold a reference too, which gets destroyed when handle_count
reaches 0 (like the flink name). Haven't checked what ttm does for this
case.

I think having a pointer without an accompanying reference is a bug, i.e.
I've jotted this down on my list of things to fix. Then we could move the
mutex taking inside the actual free function.

> In particular this also applies to the rcu case, were we cannot
> block readers. If we were to move over all object lookup to use RCU
> locks, we'd want to be completely 100% sure that once a refcount
> reaches zero, it won't be upped again, and we can safely go ahead
> and use call_rcu for destruction. Let's say this wasn't true,
> call_rcu could in principle be called multiple times for the same
> object, and that isn't allowed until the callback has been run...

Well, if the lookup structure has a reference to the underlying object we
only need to defer the unref till all readers are gone with call_rcu.

> And the simple solution to this problem is kref_get_unless_zero().

IMHO kref is for kernel references. And shouldn't be abused for userspace
lookup with strange semantics. So kref_get_unless_zero sounds like a very
bad idea hiding brittle locking semantics at best, bugs at worst.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


More information about the dri-devel mailing list