[Bug 43239] Tp::SharedPtr tries to be thread-safe, but secretly isn't

Sun Dec 4 22:59:39 CET 2011

https://bugs.freedesktop.org/show_bug.cgi?id=43239

--- Comment #5 from Olli Salli <ollisal at gmail.com> 2011-12-04 13:59:39 PST ---
(In reply to comment #3)
> The thing is that some places, private class constructors mainly, cannot store
> a WeakPtr of the "parent" class, as our WeakPtr impl cannot be created from a
> bare pointer, it has to be created from a SharedPtr which would unref itself in
> the own constructor, thus deleting the object.

But we CAN make weak pointers able to be created from bare pointers! Because we
can make them do anything, as long as it is correct. Think about why e.g.
boost::weak_ptr doesn't allow this. It is because if you promoted the weak ref
thus created to a strong ref, there wouldn't be any way of digging up the
correct refcount structure. So it's the same problem as that which prevents us
from using boost or std::(tr1::) shared_ptr or QSharedPointer.

When you construct a boost::weak_ptr from a boost::shared_ptr, it takes a
reference to the shared_count structure from a member variable of shared_ptr
(perhaps through a protected/private friended getter, haven't looked). There is
no way for it to look up it from the pointed-to object.

But!

We DO have a way to dig up the correct refcount structure from just a
pointed-to object (reachable through the raw pointer)! Because our RefCounted
auxiliary "mix-in" class stores a pointer to it. So we can make Tp::WeakPtr
safely constructable from a raw pointer. It will just look up this pointer and
initialize itself with it. The existence of the object the raw pointer points
to guarantees that the SharedCount is still valid as well (because it holds the
single ref).

Or is there some other issue I don't see?

> So yeah, I checked everything and the places using QPointer now have to be
> changed to use QWeakPointer but cannot use WeakPtr. I've changed all other
> places that can use WeakPtr safely to properly use it.

So when you've added that, please proceed with using WeakPtr even in those
places, if it makes sense (it's a more common operation to promote to strong
and use the pointed to object, than to check whether the pointed to object
still exists). Or otherwise, please point out why specifically the WeakPtr we
ourselves design shouldn't be constructable from a raw pointer :)

> > a public header... not desirable. (It needs to be exported though if it is
> > created and destroyed by inline code)
> Will see what I can do, but I think I prefer it to be hidden from doxygen, as
> well as the public "sc" var from Tp::RefCounted. We need them public as we
> cannot friend templates (SharedPtr<T>, WeakPtr<T>) from non-template classes.
> Or can we?
> 

I thought template <class T> friend class SharedPtr; worked just fine? i.e. the
exact same syntax as for template class and function forward declarations.

> > -    explicit inline SharedPtr(const QWeakPointer<T> &o)
> > +    explicit inline SharedPtr(const WeakPtr<T> &o)
> > 
> > Please document separately that we've removed the QWeakPointer constructor
> > because it wasn't thread safe and if people are absolutely sure that their code
> > doesn't EVER need multithreading for ANYTHING they can still construct
> > SharedPtrs from them by getting the raw pointer.
> So where to add this documentation? Or you mean a commit log, NEWS, or
> something?

SharedPtr overall doxygen description and NEWS API changes section for 0.9.0
preferably. Commit log it should self-evidently be in - you've removed it in a
commit, right? So the commit message should explain that.

> As long as we have at least one WeakPtr that ever pointed to a valid (non-null)
> SharedPtr, "WeakPtr::sc" or "o.sc", will exist. So sc will either be null or
> non-null during the whole method, won't change due to threads.

Right, existence of a WeakPtr alone keeps the SharedCount alive.

> If "sc" is non-null, let's see if it is still pointing to a valid data. In

If the RefCounted object is still valid - a pointer to which is inside the
SharedCount

> order to check we need to get the number of strongrefs and check if we can

Mm, though I wonder whether there is some real reason you are using the
fetch-and-add with *fully ordered* semantics there? The memory read to check
the current value doesn't depend on anything other than the, well, current
value. And the only negative way in which the current value might surprise us
if it has gone to zero. But that can only happen through deref(), which makes
sure that no other memory read after it reads the old value (implemented on
e.g. x86/x86_64 using the LOCK prefix). And our test-and-set operation makes
even that harmless... because if the refcount changes slightly before or after
the read to check doesn't matter, the test and set op guarantees that only if
that didn't happen *at all* is the refcount set to a new, higher value.

So I think that could safely be relaxed as well. But it seems that on x86 and
x86_64 (only archs whose ASM I can understand) the implementation for ordered
access is fast enough so that a separate relaxed implementation doesn't even
exist in Qt. So let's keep the safe one (most guarantees, i.e. ordered).

This bit we *can* always change later if it proves to be a performance problem
e.g. on some ARM architecture (and somebody explains why that is so :>) - as it
operates on the same data, so nothing needs to be changed binary layout wise.

> atomically increase from "current value" to "current value + 1", with "current
> value" being > 0 (at least one strongref before the test). testAndSetRelaxed
> will atomically test if the "current value" is still the same ("tmp") and if so
> increment it by 1 returning true, if not, nothing is changed, which means that
> on success we hold a strongref and are guaranteed to be pointing to a valid
> data.

Yeah, would seem to me that it is correct indeed. Good job!

Now for the test attempting to trigger the potential races here. Overall I
think you caught my idea pretty well, the layout and macro structure being
pretty much exactly what I had in mind. However I have a few small suggestions

+            QVERIFY(!ptrtmp.isNull());
+
+            usleep(10);
+            DataPtr ptrtmp2(ptrtmp);

What are you trying to achieve with the sleeps? To reproduce the race
conditions  even on a single logical processor system by forcing a context
switch to a different thread? I don't think that's a very good idea. By doing
this, the context switch points become very deterministic - when the OS resumes
execution of the thread it will ALWAYS for sure execute it until the next
sleep, because that's just a few dozen or maybe hundred instructions, but
anyway well below any pre-emption timeslice. While the point of the
repetitions, more so, could be to try and trigger races from a context switch
sometimes happening at an unfavorable point of time - for that, we need
variance on where it happens. Also you could increase the loop counts
significantly while still running the loops in the same amount of wall clock
time if there weren't sleeps, further helping in triggering races.

Also, because the Thread instance already holds a single strong ref, I think
the run() method is too heavy on how it keeps the refs to trigger the most
interesting cases. Because even within a single thread, a single run() loop
iteration increases the strong refs held by that thread first to two before the
first sleep, then to three before the second, and finally *four* for the single
thread before dropping all of them when the loop iteration finishes. In
particular because more than one ref is held during the sleeps, the only
pre-emption points any practical OS kernel scheduler is going to give to this
code, there is an awful lot of head room for the ref counts to momentarily
screw up but still stay safely above zero in any thread executing at a given
time.

Thus, more likely to trigger any races would be to 
1) ditch the sleeps
2) reset the source strong refs and weak refs after they've been copied to the
next pointer - you're still supposed to be able to promote back to strong etc
even from a single weak ref in the loop, because there is one strong ref as a
member variable of the object

+    QVERIFY(!weakPtr.isNull());
+
+    for (int i = 0; i < 5; ++i) {
+        t[i] = new Thread(ptr, this);
+        t[i]->start();
+    }
+
+    ptr.reset();
+    QVERIFY(ptr.isNull());
+

Might be sensible to verify that neither weakPtr or ptr is null before the
reset. Because start() is what, well, starts executing a thread, not join()
(threads are not futures) so it's very likely that anything the threads cause
to affect the refcounts are already done by the time you even do that reset().
Including screwing up refcounting and killing the object. (Would manifest as
SIGSEGV or valgrind reported memory corruption more likely than assertion fail,
though). You could also at that point again verify that ptr.data() still is
equal to savedData.

So still to go:

1) make WeakPtr constructible from raw pointer, if there isn't some issue I'm
overlooking, and use it wherever sensible
2) explain in the docs, NEWS API changes section and commit message why
promotion from QWeakPointer is gone
3) ponder making SharedCount more of an internal implementation detail
(compiler checkably) than a public class, because we *can* if my memory serves
me correctly on the template <class X...> friend class Y; syntax
4) adjust tests

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA Contact for the bug.