Bug 51657 - dbus_bus_get cause crash when start using dbus application in early start stage
Summary: dbus_bus_get cause crash when start using dbus application in early start stage
Status: RESOLVED FIXED
Alias: None
Product: dbus
Classification: Unclassified
Component: core (show other bugs)
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
: medium critical
Assignee: Simon McVittie
QA Contact: John (J5) Palmieri
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-02 09:34 UTC by Weng Xuetian
Modified: 2012-07-03 12:13 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments
DBusTransport: do not assert that autolaunch address is non-empty (1.03 KB, patch)
2012-07-03 01:31 UTC, Simon McVittie
Details | Splinter Review
Set enable-developer default to 'no' (1.21 KB, patch)
2012-07-03 08:26 UTC, Simon McVittie
Details | Splinter Review

Description Weng Xuetian 2012-07-02 09:34:30 UTC
dbus-1.6.2 on debian sid.

This might be a special case, an application using dbus and is started before dbus-daemon started.

downgrade to 1.6.0 will fix this.

related backtrace:

#4  0x00007fef749c8475 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fef749cb6f0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007fef733ecc85 in _dbus_abort () at ../../dbus/dbus-sysdeps.c:94
#7  0x00007fef733dbd21 in _dbus_real_assert (func=<optimized out>, line=256, 
    file=0x7fef733f8209 "../../dbus/dbus-transport.c", 
    condition_text=0x7fef733f823b "*address != '\\0'", 
    condition=<optimized out>) at ../../dbus/dbus-internals.c:931
#8  _dbus_real_assert (condition=<optimized out>, 
    condition_text=condition_text@entry=0x7fef733f823b "*address != '\\0'", 
    file=file@entry=0x7fef733f8209 "../../dbus/dbus-transport.c", 
    line=line@entry=256, func=func@entry=0x7fef733f84a3 "check_address")
    at ../../dbus/dbus-internals.c:921
#9  0x00007fef733d69de in check_address (error=0x7fff85e60490, 
    address=0x137d280 "") at ../../dbus/dbus-transport.c:256
#10 _dbus_transport_new_for_autolaunch (error=0x7fff85e60490, 
    scope=<optimized out>) at ../../dbus/dbus-transport.c:299
#11 _dbus_transport_open_autolaunch (error=0x7fff85e60490, 
    transport_p=0x7fff85e60488, entry=0x137cff0)
    at ../../dbus/dbus-transport.c:324
#12 _dbus_transport_open_autolaunch (entry=0x137cff0, 
    transport_p=0x7fff85e60488, error=0x7fff85e60490)
    at ../../dbus/dbus-transport.c:311
#13 0x00007fef733d6582 in _dbus_transport_open (entry=0x137cff0, 
    error=error@entry=0x7fff85e60510) at ../../dbus/dbus-transport.c:392
#14 0x00007fef733b7842 in connection_try_from_address_entry (
    error=0x7fff85e60510, entry=<optimized out>)
    at ../../dbus/dbus-connection.c:1801
#15 _dbus_connection_open_internal (address=<optimized out>, 
    shared=shared@entry=1, error=error@entry=0x7fff85e60640)
    at ../../dbus/dbus-connection.c:1870
#16 0x00007fef733b8004 in dbus_connection_open (address=<optimized out>, 
    error=error@entry=0x7fff85e60640) at ../../dbus/dbus-connection.c:2580
#17 0x00007fef733b2823 in internal_bus_get (type=DBUS_BUS_SESSION, private=0, 
    error=0x7fff85e60640) at ../../dbus/dbus-bus.c:480
Comment 1 Simon McVittie 2012-07-03 01:19:15 UTC
(In reply to comment #0)
> This might be a special case, an application using dbus and is started before
> dbus-daemon started.

How, exactly, can I reproduce this?

Is it an application run from Xsession.d before 75dbus_dbus-launch, for instance?
Comment 2 Simon McVittie 2012-07-03 01:29:40 UTC
> #8  _dbus_real_assert (condition=<optimized out>, 
>     condition_text=condition_text@entry=0x7fef733f823b "*address != '\\0'",

This assertion is wrong: _dbus_get_autolaunch_address() can return an empty or invalid address, and it's up to its caller to validate it. The next line of code after the assertion is a call to dbus_parse_address() which will raise a graceful DBusError if it's empty, so the assertion can safely be removed.

That doesn't solve autolaunch apparently regressing, but it will avoid the crash.
Comment 3 Simon McVittie 2012-07-03 01:31:50 UTC
Created attachment 63751 [details] [review]
DBusTransport: do not assert that autolaunch address is  non-empty

dbus-launch can apparently return an empty address under certain
circumstances, and dbus_parse_address() in the next line will return
a nice DBusError for an empty address rather than aborting the process.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=51657
Bug-Debian: http://bugs.debian.org/680027
Comment 4 Simon McVittie 2012-07-03 01:55:25 UTC
(In reply to comment #0)
> This might be a special case, an application using dbus and is started before
> dbus-daemon started.

I'm afraid I'm going to need more information than that.

Which application is this?

How is the application started?

Does the application run with DISPLAY in its environment? What X server does DISPLAY point to?

If the application is started in an X session, what errors appear in your ~/.xsession-errors?

-----

I can't reproduce this by starting a new X server and running dbus-launch --autolaunch in a clean environment, like this:

Xephyr :2 &
env - DISPLAY=:2 dbus-launch --autolaunch `cat /var/lib/dbus/machine-id`

When I do that, I correctly get a new dbus-daemon for the X display, for instance:

DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-M9fLnj7H12,guid=abde0ce647027bf97b6a400d4ff2b189
DBUS_SESSION_BUS_PID=699
DBUS_SESSION_BUS_WINDOWID=2097153

and process 699 exits when I close the X server (Xephyr window).

Similarly, this works:

Xephyr :2 &
env - DISPLAY=:2 dbus-launch --autolaunch `cat /var/lib/dbus/machine-id` --binary-syntax | xxd

and so does this:

Xephyr :2 &
env - DISPLAY=:2 dbus-monitor
Comment 5 Simon McVittie 2012-07-03 01:58:52 UTC
(In reply to comment #4)
> I'm afraid I'm going to need more information than that.

One more question: what is in the file /var/lib/dbus/machine-id?
Comment 6 Weng Xuetian 2012-07-03 02:38:30 UTC
fcitx, started by /etc/X11/Xsession.d/80im-config_launch

related code is here,
https://github.com/fcitx/fcitx/blob/master/src/module/dbus/dbusstuff.c

assert happened at the very first dbus_bus_get call.

if the startup is delayed (say, manually sleep for seconds), it won't be able to reproduce this, so that's why I say in "early" stage.

It's not a dbus-daemon crash, dbus-daemon startup normally and if I try to start fcitx later in desktop, it's also ok.
Comment 7 Simon McVittie 2012-07-03 02:57:24 UTC
(In reply to comment #6)
> fcitx, started by /etc/X11/Xsession.d/80im-config_launch

Is the actual startup done by 80im-config_launch sourcing data/22_fcitx.rc in im-config, which basically runs "fcitx &"?

This startup sequence isn't ideal - fcitx starts before dbus-launch is actually run, so it will be trying to autolaunch a dbus-daemon itself, and it and any child processes won't have the DBUS_SESSION_BUS_ADDRESS in their environment. This is *meant* to work, but autolaunching is a bit shaky, so I'm not surprised there can be problems.

> if the startup is delayed (say, manually sleep for seconds), it won't be able
> to reproduce this, so that's why I say in "early" stage.

What exact patch did you use for this?

If you insert a sleep call into fcitx, dbus-launch will win the race and fcitx won't have to autolaunch a dbus-daemon.

> It's not a dbus-daemon crash, dbus-daemon startup normally and if I try to
> start fcitx later in desktop, it's also ok.

Right, when you've got to the desktop stage you'll have DBUS_SESSION_BUS_ADDRESS in your environment, so autolaunching will be unnecessary.
Comment 8 Simon McVittie 2012-07-03 03:23:51 UTC
Which display manager (gdm3/gdm/xdm/kdm/etc.), or are you using startx or something?
Comment 9 Weng Xuetian 2012-07-03 03:51:57 UTC
I totally agree your idea.

Well.. that's why I think from im-config to im-switch it have never been right. Though that's a different problem. So what you suggest is, make fcitx start later than dbus-launch? I will try to ask im-config maintainer to fix this.

But hey.. the bug is still there.. such assert() call is still not very friendly, dbus_error should be used in most case.

BTW, there is a way to disable autolaunch for dbus IIRC? (not sure compiling time or runtime time, opensuse seems don't have autolaunch enabled).
Comment 10 Weng Xuetian 2012-07-03 03:52:34 UTC
(In reply to comment #8)
> Which display manager (gdm3/gdm/xdm/kdm/etc.), or are you using startx or
> something?

gdm3 and startx both have problem.
Comment 11 Simon McVittie 2012-07-03 04:31:07 UTC
(In reply to comment #9)
> But hey.. the bug is still there.. such assert() call is still not very
> friendly, dbus_error should be used in most case.

Yes, I believe Attachment #63751 [details] should fix that.
Comment 12 Simon McVittie 2012-07-03 08:24:59 UTC
(In reply to comment #0)
> downgrade to 1.6.0 will fix this.

The reason why this appears to be a regression is that before 1.6.0, assertions were disabled, so this (wrong) assertion was not compiled.

In 1.6.2, assertions were meant to still be disabled by default, but were actually enabled by default due to incorrect punctuation in configure.ac.
Comment 13 Simon McVittie 2012-07-03 08:26:35 UTC
Created attachment 63771 [details] [review]
Set enable-developer default to 'no'

Misplaced [] and () led to enable_developer=no being part of the
option's documentation instead of actually being the default value.

Regression in 1.6.2, caused by #34671.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=51657
Bug-Debian: http://bugs.debian.org/680027
Comment 14 Simon McVittie 2012-07-03 10:37:54 UTC
Attachment #63771 [details] fixes the regression between 1.6.0 and 1.6.2 (assertions are off-by-default in 1.6.0 but are wrongly on-by-default in 1.6.2).

Attachment #63751 [details] fixes the incorrect assertion, which has been there all along. I haven't been able to reproduce this assertion failure on a Debian wheezy virtual machine upgraded to dbus 1.6.2-1, though.

I'll upload a new dbus version to Debian with Attachment #63751 [details], and --disable-developer (equivalent to Attachment #63771 [details]).

(In reply to comment #9)
> BTW, there is a way to disable autolaunch for dbus IIRC? (not sure compiling
> time or runtime time, opensuse seems don't have autolaunch enabled).

Yes, it is possible to disable autolaunch: "./configure --disable-x11-autolaunch", or configure without X11 support, or don't install the dbus-launch binary (on Debian, dbus-launch is in the dbus-x11 package).

(In reply to comment #9)
> Well.. that's why I think from im-config to im-switch it have never been right.
> Though that's a different problem. So what you suggest is, make fcitx start
> later than dbus-launch? I will try to ask im-config maintainer to fix this.

Let's discuss that on the Debian bug <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=680027> rather than here, since Xsession.d is rather Debian-specific.
Comment 15 David Zeuthen (not reading bugmail) 2012-07-03 10:37:57 UTC
Comment on attachment 63751 [details] [review]
DBusTransport: do not assert that autolaunch address is  non-empty

Review of attachment 63751 [details] [review]:
-----------------------------------------------------------------

This looks good to me.
Comment 16 David Zeuthen (not reading bugmail) 2012-07-03 10:41:51 UTC
Comment on attachment 63771 [details] [review]
Set enable-developer default to 'no'

Review of attachment 63771 [details] [review]:
-----------------------------------------------------------------

Looks good to me (now matches similar constructs in configure.ac)
Comment 17 Simon McVittie 2012-07-03 12:13:48 UTC
Fixed in git for 1.6.4


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.