Bug 43239

Summary: Tp::SharedPtr tries to be thread-safe, but secretly isn't
Product: Telepathy Reporter: Olli Salli <ollisal>
Component: tp-qtAssignee: Andre Moreira Magalhaes <andrunko>
Status: RESOLVED FIXED QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: blocker    
Priority: high CC: ollisal
Version: git masterKeywords: patch
Hardware: Other   
OS: All   
URL: http://cgit.collabora.com/git/user/andrunko/telepathy-qt4.git/log/?h=shared-ptr
Whiteboard: review+
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 43240    

Description Olli Salli 2011-11-25 03:20:07 UTC
As discovered when porting TpQt4 to Qt5,

    explicit inline SharedPtr(const QWeakPointer<T> &o)
    {
        if (o.data() && o.data()->strongref.fetchAndAddOrdered(0) > 0) {
            d = static_cast<T*>(o.data());
            d->ref();
        }

Used like that, the atomic integer used to represent the reference count of Tp::RefCounted objects is completely useless. First, after the
o.data() check, an another thread can cause data() to go NULL, and then after the
refcount check, an another thread can cause the refcount to drop to zero and the object
be deleted.

The second issue is more severe. The same (by-value) WeakPtr instance doesn't really need to be accessed safely from multiple threads. But the reference count part of the RefCounted object pointed to by o.data() needs to, because multiple QWeakPointer and Tp::SharedPtr value instances can point to it (exactly one instance).

This is perhaps also why they removed the implicit integer conversion in qt5 -
the whole point of the atomic integer is to be able to implement
inspection/test AND modify in one atomic operation to guard against an another
thread ruining our day in between them. And that removal of API helped us to
notice the pathological usage here.

The resolution can't simply be "even this part of tp-qt4 doesn't need to be thread-safe". Why then
would we even be using an atomic integer? It'd serve no purpose other than
to slow things down in solely single-threaded usage...

However, passing object refs across threads safely (even if not using the objects pointed to from multiple threads at a time) IS useful. So we must fix this before committing this inline implementation to tp-qt >= 0.9 release headers (that results in an ABI commitment, because all clients will contain this code, and try to access RefCounted in the exact way presented here so we can't change it afterwards, in addition to suffering from no thread safety).

I suggest taking a look at other Qt shared pointer impls (webkit qt's ?
qsharedpointers? qexplicitlyshared...'s ?) for ideas on getting the thread
safety right. I expect that to turn out as either a certain pattern of atomic integer operations, or an explicit locking of the target object for the weak pointer to strong pointer promotion.

Could that actually be as simple as doing fetchAndAddOrdered(1)? And if that only yields <= 1 (not < 1, note), consider the object dead and initialize the SharedPtr as null. If it yields > 1, make the SharedPtr point to the object and keep the ref.
Comment 1 Andre Moreira Magalhaes 2011-12-01 19:31:33 UTC
Ok, first attempt to fix this at URL.
Comment 2 Olli Salli 2011-12-02 07:19:20 UTC
-    QWeakPointer<Connection> conn;
+    QPointer<Connection> conn;

What on earth are these? QPointer was wholly and completely deprecated already in Qt 4.6 or so. Strange if it wasn't gone completely by the time Qt5 final is out. Note that *we can still use QWeakPointer for tracking QObjects getting deleted*, or use our own WeakPtr, now reintroduced, for that, whatever.

Other than the deprecation, there's supposedly a performance advantage to using QWeakPointer instead of QPointer. Promotion to strong ref (Tp::SharedPtr) from a weak ref is going to be even faster with our own WeakPtr though, but checking if the object is still there is going to be faster with QWeakPointer (requires an implicit temporary promotion to strong ref with WeakPtr/SharedPtr). And weak ptr -> strong ptr is only thread safe with our own, but yeah for pointers which are completely hidden inside tp-qt4, that's not an issue.

So given those considerations, try and decide between QWeakPointer and Tp::WeakPtr as appropriate, but never use QPointer.

+    }
+    inline virtual ~RefCounted()
+    {

empty line

+
+class TP_QT_EXPORT SharedCount
+{

Make that a private class inside RefCounted (with friends as needed) or at least somehow hide it from Doxygen. Currently it's an exported public class in a public header... not desirable. (It needs to be exported though if it is created and destroyed by inline code)

-    explicit inline SharedPtr(const QWeakPointer<T> &o)
+    explicit inline SharedPtr(const WeakPtr<T> &o)

Please document separately that we've removed the QWeakPointer constructor because it wasn't thread safe and if people are absolutely sure that their code doesn't EVER need multithreading for ANYTHING they can still construct SharedPtrs from them by getting the raw pointer.

+        SharedCount *sc = o.sc;
+        if (sc) {
+            // increase the strongref, but never up from zero
+            // or less (negative is used on untracked objects)
+            register int tmp = sc->strongref.fetchAndAddOrdered(0);
+            while (tmp > 0) {
+                // try to increment from "tmp" to "tmp + 1"
+                if (sc->strongref.testAndSetRelaxed(tmp, tmp + 1)) {
+                    // succeeded
+                    break;
+                }
+                // failed, try again
+                tmp = sc->strongref.fetchAndAddOrdered(0);
+            }
+
+            if (tmp > 0) {
+                d = dynamic_cast<T*>(sc->d);
+                Q_ASSERT(d != NULL);
+            } else {
+                d = 0;
+            }
         } else {
             d = 0;
         }

You'll have to explain to me if and WHY this works. In some other fashion than saying "it's from Qt which has a similar pointer concept". Similar, but not identical.

Have you considered what happens if the corresponding RefCounted destructor or SharedCount destructor even more specifically happens to be executed in ALL points of the execution of that loop (including its condition)?

Didn't look at the test thoroughly yet. But as it's not deterministic, I want to make sure we first convince ourselves by thinking about the properties of the code that it should work. There's only a few lines of (the critical parts of) it anyway!
Comment 3 Andre Moreira Magalhaes 2011-12-02 08:12:08 UTC
(In reply to comment #2)
> -    QWeakPointer<Connection> conn;
> +    QPointer<Connection> conn;
> 
> What on earth are these? QPointer was wholly and completely deprecated already
> in Qt 4.6 or so. Strange if it wasn't gone completely by the time Qt5 final is
> out. Note that *we can still use QWeakPointer for tracking QObjects getting
> deleted*, or use our own WeakPtr, now reintroduced, for that, whatever.
> 
> Other than the deprecation, there's supposedly a performance advantage to using
> QWeakPointer instead of QPointer. Promotion to strong ref (Tp::SharedPtr) from
> a weak ref is going to be even faster with our own WeakPtr though, but checking
> if the object is still there is going to be faster with QWeakPointer (requires
> an implicit temporary promotion to strong ref with WeakPtr/SharedPtr). And weak
> ptr -> strong ptr is only thread safe with our own, but yeah for pointers which
> are completely hidden inside tp-qt4, that's not an issue.
> 
> So given those considerations, try and decide between QWeakPointer and
> Tp::WeakPtr as appropriate, but never use QPointer.
Yeah, as for QPointer, I don't know what I was thinking, but I should have kept using QWeakPointer. I will revert this specific changes. 
The thing is that some places, private class constructors mainly, cannot store a WeakPtr of the "parent" class, as our WeakPtr impl cannot be created from a bare pointer, it has to be created from a SharedPtr which would unref itself in the own constructor, thus deleting the object.
So yeah, I checked everything and the places using QPointer now have to be changed to use QWeakPointer but cannot use WeakPtr. I've changed all other places that can use WeakPtr safely to properly use it.

> +    }
> +    inline virtual ~RefCounted()
> +    {
> 
> empty line
Will do

> +
> +class TP_QT_EXPORT SharedCount
> +{
> 
> Make that a private class inside RefCounted (with friends as needed) or at
> least somehow hide it from Doxygen. Currently it's an exported public class in
> a public header... not desirable. (It needs to be exported though if it is
> created and destroyed by inline code)
Will see what I can do, but I think I prefer it to be hidden from doxygen, as well as the public "sc" var from Tp::RefCounted. We need them public as we cannot friend templates (SharedPtr<T>, WeakPtr<T>) from non-template classes. Or can we?
 
> -    explicit inline SharedPtr(const QWeakPointer<T> &o)
> +    explicit inline SharedPtr(const WeakPtr<T> &o)
> 
> Please document separately that we've removed the QWeakPointer constructor
> because it wasn't thread safe and if people are absolutely sure that their code
> doesn't EVER need multithreading for ANYTHING they can still construct
> SharedPtrs from them by getting the raw pointer.
So where to add this documentation? Or you mean a commit log, NEWS, or something?

> +        SharedCount *sc = o.sc;
> +        if (sc) {
> +            // increase the strongref, but never up from zero
> +            // or less (negative is used on untracked objects)
> +            register int tmp = sc->strongref.fetchAndAddOrdered(0);
> +            while (tmp > 0) {
> +                // try to increment from "tmp" to "tmp + 1"
> +                if (sc->strongref.testAndSetRelaxed(tmp, tmp + 1)) {
> +                    // succeeded
> +                    break;
> +                }
> +                // failed, try again
> +                tmp = sc->strongref.fetchAndAddOrdered(0);
> +            }
> +
> +            if (tmp > 0) {
> +                d = dynamic_cast<T*>(sc->d);
> +                Q_ASSERT(d != NULL);
> +            } else {
> +                d = 0;
> +            }
>          } else {
>              d = 0;
>          }
> 
> You'll have to explain to me if and WHY this works. In some other fashion than
> saying "it's from Qt which has a similar pointer concept". Similar, but not
> identical.
> 
> Have you considered what happens if the corresponding RefCounted destructor or
> SharedCount destructor even more specifically happens to be executed in ALL
> points of the execution of that loop (including its condition)?
> 
> Didn't look at the test thoroughly yet. But as it's not deterministic, I want
> to make sure we first convince ourselves by thinking about the properties of
> the code that it should work. There's only a few lines of (the critical parts
> of) it anyway!
As long as we have at least one WeakPtr that ever pointed to a valid (non-null) SharedPtr, "WeakPtr::sc" or "o.sc", will exist. So sc will either be null or non-null during the whole method, won't change due to threads.
If "sc" is non-null, let's see if it is still pointing to a valid data. In order to check we need to get the number of strongrefs and check if we can atomically increase from "current value" to "current value + 1", with "current value" being > 0 (at least one strongref before the test). testAndSetRelaxed will atomically test if the "current value" is still the same ("tmp") and if so increment it by 1 returning true, if not, nothing is changed, which means that on success we hold a strongref and are guaranteed to be pointing to a valid data.
Comment 4 Andre Moreira Magalhaes 2011-12-02 08:41:09 UTC
Branch updated.
Comment 5 Olli Salli 2011-12-04 13:59:39 UTC
(In reply to comment #3)
> The thing is that some places, private class constructors mainly, cannot store
> a WeakPtr of the "parent" class, as our WeakPtr impl cannot be created from a
> bare pointer, it has to be created from a SharedPtr which would unref itself in
> the own constructor, thus deleting the object.

But we CAN make weak pointers able to be created from bare pointers! Because we can make them do anything, as long as it is correct. Think about why e.g. boost::weak_ptr doesn't allow this. It is because if you promoted the weak ref thus created to a strong ref, there wouldn't be any way of digging up the correct refcount structure. So it's the same problem as that which prevents us from using boost or std::(tr1::) shared_ptr or QSharedPointer.

When you construct a boost::weak_ptr from a boost::shared_ptr, it takes a reference to the shared_count structure from a member variable of shared_ptr (perhaps through a protected/private friended getter, haven't looked). There is no way for it to look up it from the pointed-to object.

But!

We DO have a way to dig up the correct refcount structure from just a pointed-to object (reachable through the raw pointer)! Because our RefCounted auxiliary "mix-in" class stores a pointer to it. So we can make Tp::WeakPtr safely constructable from a raw pointer. It will just look up this pointer and initialize itself with it. The existence of the object the raw pointer points to guarantees that the SharedCount is still valid as well (because it holds the single ref).

Or is there some other issue I don't see?

> So yeah, I checked everything and the places using QPointer now have to be
> changed to use QWeakPointer but cannot use WeakPtr. I've changed all other
> places that can use WeakPtr safely to properly use it.

So when you've added that, please proceed with using WeakPtr even in those places, if it makes sense (it's a more common operation to promote to strong and use the pointed to object, than to check whether the pointed to object still exists). Or otherwise, please point out why specifically the WeakPtr we ourselves design shouldn't be constructable from a raw pointer :)

> > a public header... not desirable. (It needs to be exported though if it is
> > created and destroyed by inline code)
> Will see what I can do, but I think I prefer it to be hidden from doxygen, as
> well as the public "sc" var from Tp::RefCounted. We need them public as we
> cannot friend templates (SharedPtr<T>, WeakPtr<T>) from non-template classes.
> Or can we?
> 

I thought template <class T> friend class SharedPtr; worked just fine? i.e. the exact same syntax as for template class and function forward declarations.

> > -    explicit inline SharedPtr(const QWeakPointer<T> &o)
> > +    explicit inline SharedPtr(const WeakPtr<T> &o)
> > 
> > Please document separately that we've removed the QWeakPointer constructor
> > because it wasn't thread safe and if people are absolutely sure that their code
> > doesn't EVER need multithreading for ANYTHING they can still construct
> > SharedPtrs from them by getting the raw pointer.
> So where to add this documentation? Or you mean a commit log, NEWS, or
> something?

SharedPtr overall doxygen description and NEWS API changes section for 0.9.0 preferably. Commit log it should self-evidently be in - you've removed it in a commit, right? So the commit message should explain that.

> As long as we have at least one WeakPtr that ever pointed to a valid (non-null)
> SharedPtr, "WeakPtr::sc" or "o.sc", will exist. So sc will either be null or
> non-null during the whole method, won't change due to threads.

Right, existence of a WeakPtr alone keeps the SharedCount alive.

> If "sc" is non-null, let's see if it is still pointing to a valid data. In

If the RefCounted object is still valid - a pointer to which is inside the SharedCount

> order to check we need to get the number of strongrefs and check if we can

Mm, though I wonder whether there is some real reason you are using the fetch-and-add with *fully ordered* semantics there? The memory read to check the current value doesn't depend on anything other than the, well, current value. And the only negative way in which the current value might surprise us if it has gone to zero. But that can only happen through deref(), which makes sure that no other memory read after it reads the old value (implemented on e.g. x86/x86_64 using the LOCK prefix). And our test-and-set operation makes even that harmless... because if the refcount changes slightly before or after the read to check doesn't matter, the test and set op guarantees that only if that didn't happen *at all* is the refcount set to a new, higher value.

So I think that could safely be relaxed as well. But it seems that on x86 and x86_64 (only archs whose ASM I can understand) the implementation for ordered access is fast enough so that a separate relaxed implementation doesn't even exist in Qt. So let's keep the safe one (most guarantees, i.e. ordered).

This bit we *can* always change later if it proves to be a performance problem e.g. on some ARM architecture (and somebody explains why that is so :>) - as it operates on the same data, so nothing needs to be changed binary layout wise.

> atomically increase from "current value" to "current value + 1", with "current
> value" being > 0 (at least one strongref before the test). testAndSetRelaxed
> will atomically test if the "current value" is still the same ("tmp") and if so
> increment it by 1 returning true, if not, nothing is changed, which means that
> on success we hold a strongref and are guaranteed to be pointing to a valid
> data.

Yeah, would seem to me that it is correct indeed. Good job!

Now for the test attempting to trigger the potential races here. Overall I think you caught my idea pretty well, the layout and macro structure being pretty much exactly what I had in mind. However I have a few small suggestions

+            QVERIFY(!ptrtmp.isNull());
+
+            usleep(10);
+            DataPtr ptrtmp2(ptrtmp);

What are you trying to achieve with the sleeps? To reproduce the race conditions  even on a single logical processor system by forcing a context switch to a different thread? I don't think that's a very good idea. By doing this, the context switch points become very deterministic - when the OS resumes execution of the thread it will ALWAYS for sure execute it until the next sleep, because that's just a few dozen or maybe hundred instructions, but anyway well below any pre-emption timeslice. While the point of the repetitions, more so, could be to try and trigger races from a context switch sometimes happening at an unfavorable point of time - for that, we need variance on where it happens. Also you could increase the loop counts significantly while still running the loops in the same amount of wall clock time if there weren't sleeps, further helping in triggering races.

Also, because the Thread instance already holds a single strong ref, I think the run() method is too heavy on how it keeps the refs to trigger the most interesting cases. Because even within a single thread, a single run() loop iteration increases the strong refs held by that thread first to two before the first sleep, then to three before the second, and finally *four* for the single thread before dropping all of them when the loop iteration finishes. In particular because more than one ref is held during the sleeps, the only pre-emption points any practical OS kernel scheduler is going to give to this code, there is an awful lot of head room for the ref counts to momentarily screw up but still stay safely above zero in any thread executing at a given time.

Thus, more likely to trigger any races would be to 
1) ditch the sleeps
2) reset the source strong refs and weak refs after they've been copied to the next pointer - you're still supposed to be able to promote back to strong etc even from a single weak ref in the loop, because there is one strong ref as a member variable of the object

+    QVERIFY(!weakPtr.isNull());
+
+    for (int i = 0; i < 5; ++i) {
+        t[i] = new Thread(ptr, this);
+        t[i]->start();
+    }
+
+    ptr.reset();
+    QVERIFY(ptr.isNull());
+

Might be sensible to verify that neither weakPtr or ptr is null before the reset. Because start() is what, well, starts executing a thread, not join() (threads are not futures) so it's very likely that anything the threads cause to affect the refcounts are already done by the time you even do that reset(). Including screwing up refcounting and killing the object. (Would manifest as SIGSEGV or valgrind reported memory corruption more likely than assertion fail, though). You could also at that point again verify that ptr.data() still is equal to savedData.

So still to go:

1) make WeakPtr constructible from raw pointer, if there isn't some issue I'm overlooking, and use it wherever sensible
2) explain in the docs, NEWS API changes section and commit message why promotion from QWeakPointer is gone
3) ponder making SharedCount more of an internal implementation detail (compiler checkably) than a public class, because we *can* if my memory serves me correctly on the template <class X...> friend class Y; syntax
4) adjust tests
Comment 6 Olli Salli 2011-12-04 14:01:16 UTC
(In reply to comment #5)
> So it's the same problem as that which prevents us
> from using boost or std::(tr1::) shared_ptr or QSharedPointer.
> 

I mean in general, at all, because we occasionally need to construct strong pointers directly from raw pointers as well.
Comment 7 Andre Moreira Magalhaes 2011-12-05 07:50:30 UTC
(In reply to comment #5)
> 1) make WeakPtr constructible from raw pointer, if there isn't some issue I'm
> overlooking, and use it wherever sensible
Added

> 2) explain in the docs, NEWS API changes section and commit message why
> promotion from QWeakPointer is gone
Done in the docs (NEWS will be updated when merging), but I am not really happy with this, suggestions are welcome.

> 3) ponder making SharedCount more of an internal implementation detail
> (compiler checkably) than a public class, because we *can* if my memory serves
> me correctly on the template <class X...> friend class Y; syntax
Done

> 4) adjust tests
Done
Comment 8 Olli Salli 2011-12-06 07:12:33 UTC
(In reply to comment #7)
> (In reply to comment #5)
> > 1) make WeakPtr constructible from raw pointer, if there isn't some issue I'm
> > overlooking, and use it wherever sensible
> Added

Yeah looks good now.

> 
> > 2) explain in the docs, NEWS API changes section and commit message why
> > promotion from QWeakPointer is gone
> Done in the docs (NEWS will be updated when merging), but I am not really happy
> with this, suggestions are welcome.

Otherwise it's ~fine but this bit

+ * \brief The WeakPtr class holds a weak reference to a shared pointer.

is a bit weird. The reference is to a "object managed by SharedPtrs", not to a shared pointer (the SharedPtr).

You could also mention that it's useful for breaking reference cycles which would result from using a SharedPtr for both ends of a pair of mutually linked objects to refer to each other.

> 
> > 3) ponder making SharedCount more of an internal implementation detail
> > (compiler checkably) than a public class, because we *can* if my memory serves
> > me correctly on the template <class X...> friend class Y; syntax
> Done
> 

Mm, couldn't you further move the entire declaration of class SharedCount here

> class TP_QT_EXPORT RefCounted
> {
>     Q_DISABLE_COPY(RefCounted)
> 
> public:

Before the start of the public section? This would hide the class completely from external (unintended) usage. (Needs changing the three lines which refer to SharedCount as global in WeakPtr and SharedPtr of course in addition, but nothing else afaics?). Then the header would only declare RefCounted, WeakPtr and SharedPtr as public types, which is what we want.

> > 4) adjust tests
> Done

Looks good now IMO.

With these, merge please.
Comment 9 Andre Moreira Magalhaes 2011-12-06 08:23:52 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #5)
> > > 1) make WeakPtr constructible from raw pointer, if there isn't some issue I'm
> > > overlooking, and use it wherever sensible
> > Added
> 
> Yeah looks good now.
> 
> > 
> > > 2) explain in the docs, NEWS API changes section and commit message why
> > > promotion from QWeakPointer is gone
> > Done in the docs (NEWS will be updated when merging), but I am not really happy
> > with this, suggestions are welcome.
> 
> Otherwise it's ~fine but this bit
> 
> + * \brief The WeakPtr class holds a weak reference to a shared pointer.
> 
> is a bit weird. The reference is to a "object managed by SharedPtrs", not to a
> shared pointer (the SharedPtr).
> 
> You could also mention that it's useful for breaking reference cycles which
> would result from using a SharedPtr for both ends of a pair of mutually linked
> objects to refer to each other.
Done
 
> > 
> > > 3) ponder making SharedCount more of an internal implementation detail
> > > (compiler checkably) than a public class, because we *can* if my memory serves
> > > me correctly on the template <class X...> friend class Y; syntax
> > Done
> > 
> 
> Mm, couldn't you further move the entire declaration of class SharedCount here
> 
> > class TP_QT_EXPORT RefCounted
> > {
> >     Q_DISABLE_COPY(RefCounted)
> > 
> > public:
> 
> Before the start of the public section? This would hide the class completely
> from external (unintended) usage. (Needs changing the three lines which refer
> to SharedCount as global in WeakPtr and SharedPtr of course in addition, but
> nothing else afaics?). Then the header would only declare RefCounted, WeakPtr
> and SharedPtr as public types, which is what we want.
Done

> With these, merge please.
Thanks for the review, changes merged upstream. It will be in tp-qt 0.9.0

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.