[cairo] Fixing concurrency bugs in cairo's reference counting

Kristian Høgsberg krh at bitplanet.net
Mon Dec 18 12:36:02 PST 2006


On 12/18/06, Jamey Sharp <jamey at minilop.net> wrote:
> On Thu, Dec 14, 2006 at 05:35:01AM -0800, Carl Worth wrote:
> > Similarly, we test the value for 0 after decrementing. It's harder to
> > guarantee that that's safe, but I think it is.
>
> I'm not sure if this is the same thing Kristian just pointed out, but a
> decrement and test implemented like this is not safe:

No, I was discussing the point below, but you're right, it has to be
an atomic dec-and-test operation.

> > The thing to look out for is a race condition between acquiring a
> > pointer to an object and incrementing the reference count. If another
> > thread can decrement the reference counter to 0 during that window we
> > lose. (And note that Monty's mutex locks wouldn't even help here.)
>
> I had to read this a few times to understand the issue and figure out
> that it wasn't related to the previous sentence. I guess the only option
> is to ensure that the source of the pointer continues to hold a
> reference until you can get your own. That should be pretty easy to
> ensure statically, at least with the help of a mutex here and there.

This was the point I was trying to argue against (I realize that
perhaps my mail wasn't too coherent): I don't think it's feasible to
punt the locking to users for this case.  If you look up an object
from a cairo internal data structure (say, a font from a font cache)
and the object returned only have a reference belonging to the cache,
how exactly do you design your locking to prevent cairo from dropping
that reference before you can add you own?  Either cairo needs to
expose the lock that protects the data structure in question, or the
user need to serialize all acccess to cairo, so that another thread
doesn't inadvertently evict the font you just looked up.

The only way to do this is to add the reference on behalf of the
caller when you're traversing the data structure.  At that point you
have the lock and can safely add another reference.  When you release
the lock and return the object, the data structure might be accessed
from another thread and drop that object, but at that point you have
reference to return to the caller.

Given that the create functions and lookup functions like the
situation described above now take a reference on behalf of the caller
(cases 1 and 3 from Carls original mail), it just makes a lot of sense
to also make the "unshared resource" case (case 2) take a reference.
This breaks API, though, since those getter functions (eg
cairo_pattern_get_surface) already exist and doesn't take a reference.

cheers,
Kristian


More information about the cairo mailing list