[PATCH] drm: Replace kref with a simple atomic reference count

Thomas Hellstrom thomas at shipmail.org
Sun Nov 28 09:20:55 PST 2010


On 11/28/2010 04:13 PM, Daniel Vetter wrote:
> Hi Thomas,
>
> On Sun, Nov 28, 2010 at 03:19:50PM +0100, Thomas Hellstrom wrote:
>    
>> I'm not saying that the current gem code is incorrect. In
>> particular, the gem name system allows upping the handle_count even
>> if it's zero and the name is just about to be destroyed. I see that
>> more as a curiosity, and gem would work just as well if it used
>> kref_get_unless_zero on the handle_count.
>>      
> Yeah, noticed that, too. But that's just userspace racing against itself,
> so no problem.
>
>    
>> However this is a generic problem.
>> It appears in the general user-visible object case. For example,
>> currently gem needs to take the struct_mutex *before* unreferencing
>> to avoid hitting this problem with a racing mmap lookup (At least I
>> think that's why the struct_mutex is taken).  If mmap were using
>> kref_get_unless_zero, and returning error on failure, the
>> struct_mutex would be needed only inside the kref release function,
>> and that is what Dave was referring to in his mail.
>>      
> Indeed, I haven't noticed that. drm/i915 doesn't hold a ref to the
> underlying bo for the offset_hash, which requires the dev->struct_mutex
> around unref. An actual mmap gets itself a ref (protected by
> dev->struct_mutex) so from there on its all fine. I think the offset_hash
> should hold a reference too, which gets destroyed when handle_count
> reaches 0 (like the flink name). Haven't checked what ttm does for this
> case.
>
>    

For TTM the "name" does not hold a reference just like the gem vm 
handle, but we take a lock around unref'ing so that destruction of the 
"name" and unreffing is atomic. That doesn't work with RCU since we 
can't block readers.

> I think having a pointer without an accompanying reference is a bug, i.e.
> I've jotted this down on my list of things to fix. Then we could move the
> mutex taking inside the actual free function.
>
>    
>> In particular this also applies to the rcu case, were we cannot
>> block readers. If we were to move over all object lookup to use RCU
>> locks, we'd want to be completely 100% sure that once a refcount
>> reaches zero, it won't be upped again, and we can safely go ahead
>> and use call_rcu for destruction. Let's say this wasn't true,
>> call_rcu could in principle be called multiple times for the same
>> object, and that isn't allowed until the callback has been run...
>>      
> Well, if the lookup structure has a reference to the underlying object we
> only need to defer the unref till all readers are gone with call_rcu.
>    

So how do you determine when to delete the lookup structure (or in the 
gem case, the name)? Currently gem does this when all handles are gone, 
but then again, how do you block a handle from registering a new 
reference just after you've removed the name and called rcu, but before 
the rcu callback is called? You can't block readers while you remove 
first the name and then the reference.

What you *can* do is to check again rcu in the callback for the refcount 
to be nonzero and abort, but then again you need a protection to avoid 
the rcu callback to be called again when the refcount reaches zero the 
second time, while the rcu head sits on the run list.... This quickly 
gets quite ugly, but if you have a good solution to this problem, I'm 
all ears.

>    
>> And the simple solution to this problem is kref_get_unless_zero().
>>      
>    
> IMHO kref is for kernel references. And shouldn't be abused for userspace
> lookup with strange semantics.
These are perfectly valid kernel references. Typically the pointer in 
the handle's referencing idr entry. If kref_get_unless_zero() returns an 
error, you're not allowed to create a new pointer, and lookup returns an 
error. It's as easy as that.

>   So kref_get_unless_zero sounds like a very
> bad idea hiding brittle locking semantics at best, bugs at worst.
>    

I don't agree. IMHO it's a good thing to do in situations where you 
might be racing with lookup object removal, but the underlying object is 
guaranteed to not have been destroyed. Perfectly suitable for RCU-like 
situations.

/Thomas



> Cheers, Danie
>    



More information about the dri-devel mailing list