Bug 28355 - dbus-daemon hangs while starting if users are in LDAP/NIS/etc.
Summary: dbus-daemon hangs while starting if users are in LDAP/NIS/etc.
Status: REOPENED
Alias: None
Product: dbus
Classification: Unclassified
Component: core
Version: 1.5
Hardware: All All
Importance: medium normal
Assignee: D-Bus Maintainers
QA Contact: D-Bus Maintainers
URL:
Whiteboard:
Keywords:
Duplicates: 66867
Depends on:
Blocks:
 
Reported: 2010-06-02 03:11 UTC by BinLi
Modified: 2014-09-25 14:51 UTC (History)
Description BinLi 2010-06-02 03:11:28 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=502072

Please review this bug for more information.

Description of problem:
The messagebus service hangs on boot on a system with LDAP authentication configured.

Version-Release number of selected component (if applicable):
dbus-0.60-7.2
kernel-2.6.15-1.1969_FC5

How reproducible:
always

Steps to Reproduce:
1. Boot
  
Actual results:
hang until bored

Expected results:
fedora niceness

Additional info:
The workaround is to remove the entry for ldap from the group line in
/etc/nsswitch.conf, which is not an acceptable long-term solution.
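For illustration, the workaround would look roughly like the fragment below. The exact contents of the reporter's file are not shown in this report, so the "before" line is an assumption about a typical LDAP setup.

```text
# /etc/nsswitch.conf (illustrative; the reporter's actual file is not shown)

# before: group lookups consult local files, then LDAP
group:  files ldap

# after: LDAP is no longer consulted for group lookups
group:  files
```

With this change, groups that are defined only in LDAP are no longer visible to anything on the system, which is why it is not a long-term fix.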
Comment 1 BinLi 2010-06-02 03:18:03 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=502072
The Red Hat bug above describes the same issue.

I also see this issue on SUSE Linux. Any ideas? Thanks!
Comment 2 Andreas Mueller 2010-06-02 03:55:24 UTC
Over on the Red Hat Bugzilla, I provided the following analysis for
Red Hat bug 182464, which also has to do with this same bug.

------ copied from redhat bugzilla bug 182464 -------

The root cause apparently has not been investigated yet. Reading the
source code of dbus-daemon has revealed the following:

dbus-daemon reads all the groups of the user root when it parses
the user="root" attributes in the configuration file. This triggers
many ldap lookups, that trigger the exponential back off of the
bind_policy hard setting in /etc/ldap.conf. So parsing the config
file takes long, and dbus-daemon forks only after parsing the config.
At that point, the boot continues.

The point is that dbus-daemon has a logical error in it. It is
not necessary to read the list of groups of a user ever. Such a
list is dynamic, it changes when naming services become available,
or when the ldap contents are changed. So dbus-daemon should rather
check group memberships when it needs to, i.e. when it has to
authorize a request. This could be done much more efficiently
using the getgrent family of calls instead of the getgrouplist
call dbus-daemon is currently using.

So I propose that the upstream providers of dbus-daemon are contacted
to get dbus-daemon fixed. Possible fixes:

1. quick and dirty: add an option to stop dbus-daemon from expanding
   group lists.

2. fix the logical error, don't use getgrouplist, check group membership
   late and rely on nscd's caching mechanism for performance.    

------ end of copy ------
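A minimal sketch of what "check group membership late" could look like, assuming a single named group is checked at authorization time with getgrnam() (one of the calls documented alongside getgrent()). The helper names user_in_group and member_list_contains are hypothetical and do not come from the dbus source. Note that getgrnam() is not thread-safe, and each call may still trigger a name-service lookup.

```c
#include <grp.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Return 1 if `user` appears in a NULL-terminated member list, else 0. */
static int member_list_contains(char *const *members, const char *user)
{
    if (members == NULL)
        return 0;
    for (size_t i = 0; members[i] != NULL; i++) {
        if (strcmp(members[i], user) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: check ONE group at the time a request is authorized,
 * instead of expanding every group of the user at startup.
 * Returns 1 if the user is a member, 0 if not or if the group is unknown. */
static int user_in_group(const char *user, gid_t primary_gid,
                         const char *group_name)
{
    struct group *gr = getgrnam(group_name);   /* a single NSS lookup */

    if (gr == NULL)
        return 0;
    if (gr->gr_gid == primary_gid)             /* primary group matches */
        return 1;
    return member_list_contains(gr->gr_mem, user);
}
```

Whether this is actually cheaper than getgrouplist() depends on how often authorization decisions involve groups and on whether a cache such as nscd sits in front of the name service.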

In addition to what BinLi reports, there _is_ a better workaround, although
again an undesirable one: don't use "bind_policy hard" in /etc/ldap.conf,
use "bind_policy soft" instead. This causes the ldap lookups to fail, so
dbus-daemon will not get the LDAP groups but instead will quickly continue,
allowing the boot to go forward.
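As a sketch, the relevant line in /etc/ldap.conf would change as follows. Only this one setting is shown; the rest of the file is assumed to stay as it is.

```text
# /etc/ldap.conf (only the relevant line is shown)

# before: keep retrying, with exponential backoff, while the server is unreachable
# bind_policy hard

# after: fail the lookup promptly instead of retrying
bind_policy soft
```

The trade-off is that LDAP-only groups are silently missing from whatever dbus-daemon reads at startup.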
Comment 3 Colin Walters 2010-06-02 06:44:07 UTC
No, we can't call getgroups() dynamically; that implies parsing /etc/group on every message the daemon processes.  This is obviously even worse with LDAP.  While I haven't measured it, I'm sure it would be noticeable overhead even in the non-LDAP case.

The operating system needs a caching layer for this stuff.  And it turns out one exists:

https://fedoraproject.org/wiki/Features/SSSD

Actually there are two things here:

1) Move system bus services to PolicyKit, and thus gradually phase out all dbus daemon authorization.  Actually...an intermediate step here is to detect if any config file specifies group="".  If not, then we don't call getgroups().

2) Cache the groups, and get a notification from SSSD (over dbus even!) when the group list changes, and then do a reload.
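The intermediate step in 1) could be sketched roughly as below. The struct policy_rule type and the function any_rule_uses_group are hypothetical, simplified stand-ins for whatever the config parser really produces. They are not the dbus-daemon data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one parsed policy rule. */
struct policy_rule {
    const char *user;    /* value of user="...", or NULL if absent */
    const char *group;   /* value of group="...", or NULL if absent */
};

/* True if at least one rule names a group. Only in that case would the
 * daemon need to expand group lists at all. */
static bool any_rule_uses_group(const struct policy_rule *rules, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].group != NULL)
            return true;
    }
    return false;
}
```

If this returns false after all config files have been parsed, the group lookup can be skipped entirely, so a slow or offline LDAP server would no longer delay startup for configurations that never mention a group.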
Comment 4 Lennart Poettering 2010-09-06 10:34:42 UTC
(In reply to comment #3)
> No, we can't call getgroups() dynamically; that implies parsing /etc/group on
> every message the daemon processes.  This is obviously even worse with LDAP. 
> While I haven't measured it, I'm sure it would be noticeable overhead even in
> the non-LDAP case.

Maybe do the compromise? Lazily calling getgroups(), only when needed but then cache it for later?
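A rough sketch of the lazy-then-cache idea, under some assumptions: the comments above say getgroups(), but comment 2 identifies getgrouplist() as the call dbus-daemon uses, so that is what this sketch calls, with the glibc signature. The struct group_cache type, its functions, and the fixed capacity of 64 are all hypothetical. Cache invalidation (for example on a config reload) is deliberately left out.

```c
#include <grp.h>
#include <stdbool.h>
#include <sys/types.h>

#define MAX_CACHED_GROUPS 64   /* arbitrary capacity chosen for this sketch */

struct group_cache {
    const char *user;
    gid_t       primary_gid;
    bool        loaded;        /* has the group list been fetched yet? */
    int         lookups;       /* how many times the name service was queried */
    int         count;
    gid_t       gids[MAX_CACHED_GROUPS];
};

static void group_cache_init(struct group_cache *c, const char *user,
                             gid_t primary_gid)
{
    c->user = user;
    c->primary_gid = primary_gid;
    c->loaded = false;
    c->lookups = 0;
    c->count = 0;
}

/* Returns true if the user belongs to `gid`. The (potentially slow) lookup
 * happens on the first call only; later calls reuse the cached list. */
static bool group_cache_contains(struct group_cache *c, gid_t gid)
{
    if (!c->loaded) {
        int n = MAX_CACHED_GROUPS;

        c->lookups++;
        if (getgrouplist(c->user, c->primary_gid, c->gids, &n) < 0)
            return false;      /* list did not fit; stay unloaded and retry later */
        c->count = n;
        c->loaded = true;
    }
    for (int i = 0; i < c->count; i++) {
        if (c->gids[i] == gid)
            return true;
    }
    return false;
}
```

Nothing is queried at init time, so startup no longer blocks on the name service. The first authorization check that needs groups still pays the full lookup cost, which is the part a caching layer such as SSSD would address.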
Comment 5 Simon McVittie 2013-08-27 15:20:07 UTC
*** Bug 66867 has been marked as a duplicate of this bug. ***
Comment 6 Simon McVittie 2013-08-27 15:26:23 UTC
(In reply to comment #4)
> Maybe do the compromise? Lazily calling getgroups(), only when needed but
> then cache it for later?

I'd consider patches, but it sounds as though SSSD is a better solution to the problem of slow/potentially-offline NIS and LDAP than we're going to be able to NIH in libdbus.

(In reply to comment #3)
> Actually...an intermediate step here is to
> detect if any config file specifies group="".  If not, then we don't call
> getgroups().

I'd certainly consider patches for that - it sounds relatively unintrusive and moves us towards where we think we ought to be anyway.

