Created attachment 19153 [details] [review] PoC patch If two threads are simultaneously sending messages to the bus it is possible for one to enter this function to add a timeout, drop the connection lock (setting timeouts to NULL) and for the other one to enter at the same time to remove a timeout and find that timeouts is NULL, thus not trying to remove the timeout. If this last timeout expires in the event loop, an invalid access is due, as the pending structure is already freed. The attached patch is an "elegant solution" (read: dirty, ugly, shameful hack) to test this condition.
This does look like a race condition.
This thread may be helpful: http://lists.freedesktop.org/archives/dbus/2007-July/008146.html As the thread and I think some comments in the code explain, protected_change_timeout is a broken hack to begin with. Note that any fixes will also apply to protected_change_watch()
Hmm, see also http://lists.freedesktop.org/archives/dbus/2008-January/009100.html
*** Bug 20495 has been marked as a duplicate of this bug. ***
It's time we applied this patch. The error is serious. See Qt bugs reported: http://bugreports.qt.nokia.com/browse/QTBUG-7475 http://bugreports.qt.nokia.com/browse/QTBUG-10751
I did a little investigating with Google Code Search, and while a lot of things appear fine, e.g. this code would break with Havoc's patch: http://google.com/codesearch/p?hl=en#4FSOSMZ6Pxc/distfiles/Net-DBus-0.33.4.tar.gz%7CnuJfXtX-isw/Net-DBus-0.33.4/DBus.xs&q=dbus_connection_set_timeout_functions (For anyone who hasn't tried it, I highly recommend Code Search as a way to find out how people are actually using one's APIs in the real world)
The specific problem with the Perl bindings is that it calls dbus_connection_get_data, which tries to acquire the lock.
Almost everything else appears fine. I honestly don't have a strong opinion. Havoc knows the code here the best, so I'd be fine taking his patch and just saying the Perl bindings need to be patched for 1.4 (and doing so probably isn't hard). If we wanted to be maximally compatible we could make it some sort of opt-in behavior (dbus_connection_set_blocking_timeout_functions? very eww, I admit). In the big picture, I'm still really uncomfortable with these ad-hoc thread changes over time, because I'm not at all convinced that we're telling a coherent story about how things work to API consumers. If we need to e.g. change which thread things get called in in certain circumstances, it's going to be really painful.
Havoc introduced a separate lock for the get_data / set_data functions, so that those can be used from inside the timeout/watch callbacks. The perl bindings shouldn't break. I'm not sure I understand what you meant by "these ad-hoc thread changes".
(In reply to comment #9) > Havoc introduced a separate lock for the get_data / set_data functions, so that > those can be used from inside the timeout/watch callbacks. The perl bindings > shouldn't break. Ah, indeed. OK.
Could this patch be applied unconditional yet (In reply to comment #9) > Havoc introduced a separate lock for the get_data / set_data functions, so that > those can be used from inside the timeout/watch callbacks. The perl bindings > shouldn't break. > I do not see a patch in the last weeks from havoc in dbus-1.2 branch Could this patch be applied unconditional now or is still some "api fix enable" functions like dbus_set_api_policy(DBUS_API_THREAD_SAFETY_FIX,1) required, which have to be called from the related binding to activate this fix.
That would be bad API. The patch that is in this bug report is for the 1.3 branch. Forget 1.2 for it. For 1.2, I would recommend a separate configuration for the callback functions. Forget "policy" -- it doesn't make sense and the config is not global, it has to be per connection anyway.
patch applied to master branch
I thought you were going to use my patch to add a separate lock, instead of the loop-and-sleep hack
Please, take out that patch from the repository. Especially with my name on it. :-P That patch wasn't intended as a solution to the problem, only as a proof of concept that there is a problem related to the current locking/unlocking mechanism, as I told in my first comment. I guess the correct solution is Havoc's patch with a separate lock for the protected_change_* functions.
Patch reverted with e40c45fb4635321f17f3d6e4f9a4566fb62b3623
Ok, but will Havoc's patch make its way to 1.3 ?
Yes. I'm running the patch for a few days in my own computer under real-life situations to see if there are problems. Then I'll push it to 1.3.
Fix is committed as 6ff1d079316cb730a54b4e0e95bd3e6e31f439de (dbus-1.3.1-2-g6ff1d07)
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.