Bug 98521

Summary: dbus-python: Segmentation fault v 1.2.4
Product: dbus Reporter: Tony Asleson <tasleson>
Component: pythonAssignee: Simon McVittie <smcv>
Status: RESOLVED MOVED QA Contact: D-Bus Maintainers <dbus>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: Core file
All thread back trace

Description Tony Asleson 2016-10-31 20:40:35 UTC
Created attachment 127653 [details]
Core file

While running the unit tests:

https://git.fedorahosted.org/cgit/lvm2.git/tree/test/dbus

for:

https://git.fedorahosted.org/cgit/lvm2.git/tree/daemons/lvmdbusd

if I run the unit test in a loop, usually with a count <= 20 the dbus service will seg. fault.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  DBusPyConnection_GetObjectPathHandlers (self=self@entry=0x7fe326f47990, path=path@entry=0x7fe314cc70b0) at ../_dbus_bindings/conn.c:141
141	    Py_INCREF(callbacks);

bt
#0  0x00007efc180dc9c2 in DBusPyConnection_GetObjectPathHandlers (self=self@entry=<SystemBus(_bus_type=1, _signals_lock=<_thread.lock at remote 0x7efc2478e7d8>, _bus_names=<WeakValueDictionary(_pending_removals=[], _iterating=set(), _remove=<function at remote 0x7efc246dc6a8>, data={'com.redhat.lvmdbus1': <KeyedRef at remote 0x7efc14714bf8>}) at remote 0x7efc14725160>, _signal_recipients_by_object_path={}, _dbus_Connection_initialized=1, _Connection__call_on_disconnection=[], _signal_sender_matches={}) at remote 0x7efc246dfa40>, path=path@entry=b'com.redhat.lvmdbus1.Snapshot') at ../_dbus_bindings/conn.c:141
#1  0x00007efc180dd490 in _object_path_message (conn=<optimized out>, message=message@entry=0x5608172c8f50, user_data=user_data@entry=0x7efc147329b0) at ../_dbus_bindings/conn-methods.c:98
#2  0x00007efc17ea7a33 in _dbus_object_tree_dispatch_and_unlock (tree=0x56081706b730, message=message@entry=0x5608172c8f50, found_object=found_object@entry=0x7fff5c0dd648) at ../../dbus/dbus-object-tree.c:1020
#3  0x00007efc17e990a4 in dbus_connection_dispatch (connection=connection@entry=0x5608172c46d0) at ../../dbus/dbus-connection.c:4744
#4  0x00007efc15af38d5 in message_queue_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at dbus-gmain.c:90
#5  0x00007efc155d26ba in g_main_context_dispatch (context=0x5608172c62f0) at gmain.c:3154
#6  0x00007efc155d26ba in g_main_context_dispatch (context=context@entry=0x5608172c62f0) at gmain.c:3769
#7  0x00007efc155d2a70 in g_main_context_iterate (context=0x5608172c62f0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3840
#8  0x00007efc155d2d92 in g_main_loop_run (loop=0x5608172c4bb0) at gmain.c:4034
#9  0x00007efc1c3ecc58 in ffi_call_unix64 () at ../src/x86/unix64.S:76
#10 0x00007efc1c3ec6ba in ffi_call (cif=cif@entry=0x5608172e4808, fn=<optimized out>, rvalue=<optimized out>, rvalue@entry=0x7fff5c0dd970, avalue=0x5608171fb418) at ../src/x86/ffi64.c:525
#11 0x00007efc14d9cf4c in pygi_invoke_c_callable (function_cache=0x5608172e4760, state=<optimized out>, py_args=<optimized out>, py_kwargs=<optimized out>) at pygi-invoke.c:674
#12 0x00007efc14d9ea1a in pygi_function_cache_invoke (function_cache=<optimized out>, py_args=py_args@entry=(<MainLoop(_signal_source=4) at remote 0x7efc1470f8e8>,), py_kwargs=<optimized out>) at pygi-cache.c:859
#13 0x00007efc14d9d6a8 in pygi_callable_info_invoke (info=<optimized out>, py_args=py_args@entry=(<MainLoop(_signal_source=4) at remote 0x7efc1470f8e8>,), kwargs=<optimized out>, cache=<optimized out>, user_data=user_data@entry=0x0)
    at pygi-invoke.c:717
#14 0x00007efc14d9d6dc in _wrap_g_callable_info_invoke (self=<optimized out>, py_args=py_args@entry=(<MainLoop(_signal_source=4) at remote 0x7efc1470f8e8>,), kwargs=<optimized out>) at pygi-invoke.c:754
#15 0x00007efc14d92619 in _callable_info_call (self=0x7efc14741c70, args=(), kwargs=0x0) at pygi-info.c:557
#16 0x00007efc241d7c67 in PyObject_Call (func=<gi.FunctionInfo at remote 0x7efc14741c70>, arg=<optimized out>, kw=<optimized out>) at /usr/src/debug/Python-3.5.1/Objects/abstract.c:2165
#17 0x00007efc2429267e in PyEval_EvalFrameEx (nk=<optimized out>, na=0, pp_stack=0x7fff5c0ddb80, func=<optimized out>) at /usr/src/debug/Python-3.5.1/Python/ceval.c:4916
#18 0x00007efc2429267e in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7fff5c0ddb80) at /usr/src/debug/Python-3.5.1/Python/ceval.c:4712
#19 0x00007efc2429267e in PyEval_EvalFrameEx (f=f@entry=Frame 0x7efc147483c8, for file /usr/lib64/python3.5/site-packages/gi/overrides/GLib.py, line 573, in run (self=<MainLoop(_signal_source=4) at remote 0x7efc1470f8e8>), throwflag=throwflag@entry=0) at /usr/src/debug/Python-3.5.1/Python/ceval.c:3209
#20 0x00007efc24297753 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x56081725b4f0, argcount=1, kws=0x56081725b4f8, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=(<cell at remote 0x7efc14fd83d8>,), name='run', qualname='MainLoop.run') at /usr/src/debug/Python-3.5.1/Python/ceval.c:3995
#21 0x00007efc24293e19 in PyEval_EvalFrameEx (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=0x7fff5c0ddd90, func=<optimized out>) at /usr/src/debug/Python-3.5.1/Python/ceval.c:4793
#22 0x00007efc24293e19 in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7fff5c0ddd90) at /usr/src/debug/Python-3.5.1/Python/ceval.c:4710
#23 0x00007efc24293e19 in PyEval_EvalFrameEx (f=f@entry=Frame 0x56081725b338, for file /home/tasleson/projects/lvm2/daemons/lvmdbusd/main.py, line 190, in main (start=<float at remote 0x7efc1bd530c0>, parser=<ArgumentParser(prog='lvmdbusd', usage=None, epilog=None, _mutually_exclusive_groups=[], _negative_number_matcher=<_sre.SRE_Pattern at remote 0x7efc147a63d8>, _defaults={}, formatter_class=<type at remote 0x5608172b6bf8>, allow_abbrev=True, _optionals=<_ArgumentGroup(prefix_chars='-', _group_actions=[<_HelpAction(dest='help', help='show this help message and exit', container=<...>, const=None, option_strings=['-h', '--help'], required=False, default='==SUPPRESS==', choices=None, metavar=None, type=None, nargs=0) at remote 0x7efc246d9080>, <_StoreTrueAction(dest='use_udev', help='Use udev for updating state', container=<...>, const=True, option_strings=['--udev'], required=False, default=False, choices=None, metavar=None, type=None, nargs=0) at remote 0x7efc246e5d68>, <_StoreTrueAction(dest='debug', help='Dump debug messages', container=<...>, const=True, optio...(truncated), throwflag=throwflag@entry=0) at /usr/src/debug/Python-3.5.1/Python/ceval.c:3209
#24 0x00007efc24295d3b in PyEval_EvalFrameEx (nk=<optimized out>, na=<optimized out>, n=0, pp_stack=0x7fff5c0dded0, func=<optimized out>) at /usr/src/debug/Python-3.5.1/Python/ceval.c:4783
#25 0x00007efc24295d3b in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7fff5c0dded0) at /usr/src/debug/Python-3.5.1/Python/ceval.c:4710
#26 0x00007efc24295d3b in PyEval_EvalFrameEx (f=f@entry=Frame 0x7efc247f8828, for file ./lvmdbusd, line 16, in <module> (), throwflag=throwflag@entry=0) at /usr/src/debug/Python-3.5.1/Python/ceval.c:3209
#27 0x00007efc24297753 in _PyEval_EvalCodeWithName (_co=_co@entry=<code at remote 0x7efc24762930>, globals=globals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, locals=locals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, args=args@entry=0x0, argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at /usr/src/debug/Python-3.5.1/Python/ceval.c:3995
#28 0x00007efc24297833 in PyEval_EvalCodeEx (_co=_co@entry=<code at remote 0x7efc24762930>, globals=globals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, locals=locals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, args=args@entry=0x0, argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at /usr/src/debug/Python-3.5.1/Python/ceval.c:4016
#29 0x00007efc2429785b in PyEval_EvalCode (co=co@entry=<code at remote 0x7efc24762930>, globals=globals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, locals=locals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}) at /usr/src/debug/Python-3.5.1/Python/ceval.c:797
#30 0x00007efc242b6d84 in run_mod (mod=mod@entry=0x560816fe3ab0, filename=filename@entry='./lvmdbusd', globals=globals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, locals=locals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, flags=flags@entry=0x7fff5c0de1c0, arena=arena@entry=0x560816fcece0)
    at /usr/src/debug/Python-3.5.1/Python/pythonrun.c:970
#31 0x00007efc242b9325 in PyRun_FileExFlags (fp=fp@entry=0x560816f71f60, filename_str=filename_str@entry=0x7efc247ae050 "./lvmdbusd", start=start@entry=257, globals=globals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, locals=locals@entry={'sys': <module at remote 0x7efc247c3e08>, '__builtins__': <module at remote 0x7efc248335e8>, '__file__': './lvmdbusd', '__name__': '__main__', '__doc__': None, '__spec__': None, '__cached__': None, 'main': <function at remote 0x7efc14723b70>, '__loader__': <SourceFileLoader(name='__main__', path='./lvmdbusd') at remote 0x7efc24758128>, '__package__': None}, closeit=closeit@entry=1, flags=0x7fff5c0de1c0)
    at /usr/src/debug/Python-3.5.1/Python/pythonrun.c:923
#32 0x00007efc242b9495 in PyRun_SimpleFileExFlags (fp=fp@entry=0x560816f71f60, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fff5c0de1c0) at /usr/src/debug/Python-3.5.1/Python/pythonrun.c:396
#33 0x00007efc242b9903 in PyRun_AnyFileExFlags (fp=fp@entry=0x560816f71f60, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fff5c0de1c0) at /usr/src/debug/Python-3.5.1/Python/pythonrun.c:80
#34 0x00007efc242d002c in Py_Main (p_cf=0x7fff5c0de1c0, filename=0x560816f0f260 L"./lvmdbusd", fp=0x560816f71f60) at /usr/src/debug/Python-3.5.1/Modules/main.c:318
#35 0x00007efc242d002c in Py_Main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/Python-3.5.1/Modules/main.c:768
#36 0x0000560815813b70 in main (argc=2, argv=<optimized out>) at /usr/src/debug/Python-3.5.1/Programs/python.c:69

From looking around in the core I see that:

(gdb) print callbacks
$1 = 0x0
(gdb) whatis callbacks
type = PyObject *

Which is quite interesting considering the code explicitly checks for
NULL on line 135.

127 /* On KeyError or if unregistration is in progress, return None. */
128 PyObject *
129 DBusPyConnection_GetObjectPathHandlers(PyObject *self, PyObject *path)
130 {
131     PyObject *callbacks;
132
133     TRACE(self);
134     callbacks = PyDict_GetItem(((Connection *)self)->object_paths,path);
135     if (!callbacks) {
136         if (PyErr_ExceptionMatches(PyExc_KeyError)) {
137             PyErr_Clear();
138             Py_RETURN_NONE;
139         }
140     }
141     Py_INCREF(callbacks);
142     return callbacks;
143 }

Specifics (Fedora 24 x86_64):
dbus-python: py3 1.2.4
Python: 3.5.1
Comment 1 Tony Asleson 2016-11-02 19:12:13 UTC
From looking at the core file some more I can see that we have one thread that removed a dbus object and then was in the middle of sending a signal to announce that the object was removed (object manager).  In another thread that seg. faulted we appear to be trying to reference the same dbus object that was removed by the other thread.  However, I'm still at a loss how a race condition between threads could cause this, as the variable in question is on the stack.

However, to mitigate this issue I've come up with a change to the dbus service where all dbus library calls are executed in the main event loop thread.  With this change the service has run > 12 hours continuously in a loop without crashing.
Comment 2 Simon McVittie 2016-11-02 19:21:29 UTC
(In reply to Tony Asleson from comment #1)
> From looking at the core file some more I can see that we have one thread
> that removed a dbus object and then was in the middle of sending a signal to
> announce that the object was removed (object manager).  In another thread
> that seg. faulted we appear to be trying to reference the same dbus object
> that was removed by the other thread.  However, I'm still at a loss how a
> race condition between threads could cause this, as the variable in question
> is on the stack.

Interesting. Please post a backtrace from all threads, "thread apply all bt"? (I am not running Fedora 24, so the core dump is not directly useful to me.)
Comment 3 Tony Asleson 2016-11-02 20:09:52 UTC
Created attachment 127703 [details]
All thread back trace

This is from a different core file, but one that I have debug logs from service too.
Comment 4 Tony Asleson 2016-11-02 20:15:31 UTC
(In reply to Simon McVittie from comment #2)
> Interesting. Please post a backtrace from all threads, "thread apply all
> bt"? (I am not running Fedora 24, so the core dump is not directly useful to
> me.)

I've added the bt from all threads for a different core file as a file attachment.  Using this as I have some application debug logs which go with it for more context.

The last bit of application logs:

Nov 01 10:24:15.044272: 1454:1454 - SIGNAL: InterfacesRemoved(/com/redhat/lvmdbus1/Job/12, ['com.redhat.lvmdbus1.Job'])
Nov 01 10:24:15.044629: 1454:1454 - Removing /com/redhat/lvmdbus1/Job/12 object complete!
Nov 01 10:24:15.475326: 1454:1454 - SIGNAL: InterfacesAdded(/com/redhat/lvmdbus1/Job/13, {'com.redhat.lvmdbus1.Job': {'Percent': dbus.Double(0.0), 'Complete': dbus.Boolean(False), 'GetError': dbus.Struct((-1, 'Job is not complete!'), signature=dbus.Signature('(is)')), 'Result': dbus.ObjectPath('/')}})
Nov 01 10:24:15.475879: 1454:1468 - Running method: <function merge at 0x7f29fdaa6d90> with args ('com.redhat.lvmdbus1.Snapshot', dbus.String('JKQOMV-Cdvg-3sQ8-eXPB-N7Lq-xcMn-Lt9mVA'), 'mvgzobfg_vg/wwszcyej_lv_snap', dbus.Dictionary({}, signature=dbus.Signature('sv')), <lvmdbusd.job.JobState object at 0x7f29f6477b00>)
Nov 01 10:24:15.481401: 1454:2368 - Background process for ['/usr/sbin/lvm', 'lvconvert', '--merge', '-i', '1', 'mvgzobfg_vg/wwszcyej_lv_snap'] is 2369
Nov 01 10:24:16.222282: 1454:1467 - Removing thread: thread job.Wait: /com/redhat/lvmdbus1/Job/12
Nov 01 10:24:16.222544: 1454:1467 - Removing thread: thread job.Wait: /com/redhat/lvmdbus1/Job/11
Nov 01 10:24:16.222732: 1454:1467 - Removing thread: thread job.Wait: /com/redhat/lvmdbus1/Job/10
Nov 01 10:24:17.857064: 1454:2368 - Background process 2369 complete!
Nov 01 10:24:17.857503: 1454:1468 - load entry
Nov 01 10:24:17.857961: 1454:1468 - lvmdb - refresh entry
Nov 01 10:24:17.868701: 1454:1454 - SIGNAL: InterfacesRemoved(/com/redhat/lvmdbus1/Job/13, ['com.redhat.lvmdbus1.Job'])
Nov 01 10:24:17.869170: 1454:1454 - Removing /com/redhat/lvmdbus1/Job/13 object complete!
Nov 01 10:24:18.176634: 1454:1468 - lvmdb - refresh exit
Nov 01 10:24:18.179295: 1454:1468 - SIGNAL: PropertiesChanged(/com/redhat/lvmdbus1/Pv/3, com.redhat.lvmdbus1.Pv, {'PeSegments': dbus.Array([('0', '4'), ('4', '4287')], signature=dbus.Signature('(tt)')), 'UsedBytes': dbus.UInt64(16777216), 'Lv': dbus.Array([('/com/redhat/lvmdbus1/Lv/8', [('0', '3', 'linear')])], signature=dbus.Signature('(oa(tts))')), 'FreeBytes': dbus.UInt64(17980981248), 'PeAllocCount': dbus.UInt64(4)}, [])
Nov 01 10:24:18.180491: 1454:1468 - SIGNAL: PropertiesChanged(/com/redhat/lvmdbus1/Vg/4, com.redhat.lvmdbus1.Vg, {'SnapCount': dbus.UInt64(0), 'FreeCount': dbus.UInt64(17160), 'FreeBytes': dbus.UInt64(71974256640), 'LvCount': dbus.UInt64(1), 'Lvs': dbus.Array(['/com/redhat/lvmdbus1/Lv/8'], signature=dbus.Signature('o')), 'Seqno': dbus.UInt64(7)}, [])
Nov 01 10:24:18.181863: 1454:1468 - SIGNAL: PropertiesChanged(/com/redhat/lvmdbus1/Lv/8, com.redhat.lvmdbus1.LvCommon, {'TargetType': dbus.Struct(('-', 'Unspecified'), signature=dbus.Signature('(ss)')), 'Attr': dbus.String('-wi-a-----'), 'VolumeType': dbus.Struct(('-', 'Unspecified'), signature=dbus.Signature('as')), 'Roles': dbus.Array(['public'], signature=dbus.Signature('s'))}, [])
Nov 01 10:24:18.182213: 1454:1468 - SIGNAL: InterfacesRemoved(/com/redhat/lvmdbus1/Lv/9, ['com.redhat.lvmdbus1.LvCommon', 'com.redhat.lvmdbus1.Lv', 'com.redhat.lvmdbus1.Snapshot'])
Segmentation fault (core dumped)


The pid:tid is in each debug message. tid 1468 (gdb thread 5) which is broadcasting some signals and then the object removal when tid: 1454 seg. faults (gdb thread 1).
Comment 5 GitLab Migration User 2018-08-22 22:04:33 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus-python/issues/8.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.