Bug 28355

Summary: dbus-daemon hangs while starting if users are in LDAP/NIS/etc.
Product: dbus
Reporter: BinLi <binli>
Component: core
Assignee: D-Bus Maintainers <dbus>
Status: RESOLVED MOVED
QA Contact: D-Bus Maintainers <dbus>
Severity: normal
Priority: medium
CC: jimc, msniko14, walters
Version: 1.5   
Hardware: All   
OS: All   
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=182464
Whiteboard:

Description BinLi 2010-06-02 03:11:28 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=502072

Please review this bug for more information.

Description of problem:
messagebus service hangs on boot on a system with LDAP auth configured.

Version-Release number of selected component (if applicable):
dbus-0.60-7.2
kernel-2.6.15-1.1969_FC5

How reproducible:
always

Steps to Reproduce:
1. boot
  
Actual results:
hang until bored

Expected results:
fedora niceness

Additional info:
The workaround is to remove the entry for ldap from the group line in
/etc/nsswitch.conf, which is not an acceptable long-term solution.
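
For illustration, the workaround amounts to an edit like the one below. The "before" line is an assumption (the report does not quote the actual file); only the removal of ldap from the group line comes from the report.

```
# /etc/nsswitch.conf
# before (assumed): group: files ldap
group: files
```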
Comment 1 BinLi 2010-06-02 03:18:03 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=502072
That bug is also this issue.

I also see this issue on SUSE Linux. Any ideas? Thanks!
Comment 2 Andreas Mueller 2010-06-02 03:55:24 UTC
Over on the redhat bugzilla, I provided the following analysis for
redhat bug 182464, which also has to do with this same bug.

------ copied from redhat bugzilla bug 182464 -------

The root cause apparently has not been investigated yet. Reading the
source code of dbus-daemon has revealed the following:

dbus-daemon reads all the groups of the user root when it parses
the user="root" attributes in the configuration file. This triggers
many ldap lookups, that trigger the exponential back off of the
bind_policy hard setting in /etc/ldap.conf. So parsing the config
file takes long, and dbus-daemon forks only after parsing the config.
At that point, the boot continues.

The point is that dbus-daemon has a logical error in it. It is
not necessary to read the list of groups of a user ever. Such a
list is dynamic, it changes when naming services become available,
or when the ldap contents are changed. So dbus-daemon should rather
check group memberships when it needs to, i.e. when it has to
authorize a request. This could be done much more efficiently
using the getgrent family of calls instead of the getgrouplist
call dbus-daemon is currently using.

So I propose that the upstream providers of dbus-daemon be contacted
to get dbus-daemon fixed. Possible fixes:

1. quick and dirty: add an option to stop dbus-daemon from expanding
   group lists.

2. fix the logical error, don't use getgrouplist, check group membership
   late and rely on nscd's caching mechanism for performance.    

------ end of copy ------
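
The late membership check proposed as fix 2 in the copied analysis could look roughly like the Python sketch below. This is not dbus-daemon code (the daemon is written in C). `user_in_group` is a hypothetical helper built on Python's `pwd` and `grp` wrappers around getpwnam and getgrnam. It resolves only the one group named in a rule instead of enumerating every group of the user, and it assumes a cache such as nscd keeps repeated lookups cheap.

```python
import grp
import pwd


def user_in_group(username: str, groupname: str) -> bool:
    """Return True if username belongs to groupname, resolved at call time.

    Hypothetical helper for illustration only. Only the named group is
    looked up; the user's full group list is never enumerated.
    """
    try:
        group = grp.getgrnam(groupname)
        user = pwd.getpwnam(username)
    except KeyError:
        # Unknown user or group: treat as "not a member".
        return False
    # Member via the primary GID, or listed as a supplementary member.
    return user.pw_gid == group.gr_gid or username in group.gr_mem


if __name__ == "__main__":
    print(user_in_group("root", "root"))
```

The lookup can still block if the name service is unreachable. The sketch only moves the cost from daemon startup to the first authorization that needs it.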

In addition to what BinLi reports, there _is_ a better workaround, although
again an undesirable one: don't use "bind_policy hard" in /etc/ldap.conf,
use "bind_policy soft" instead. This causes the ldap lookups to fail, so
dbus-daemon will not get the LDAP groups but instead will quickly continue,
allowing the boot to go forward.
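
As a config fragment, that workaround would be a single setting. The file path and the option values are the ones named above; the comment line is added for illustration.

```
# /etc/ldap.conf: fail the lookup instead of retrying with back off
bind_policy soft
```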
Comment 3 Colin Walters 2010-06-02 06:44:07 UTC
No, we can't call getgroups() dynamically; that implies parsing /etc/group on every message the daemon processes.  This is obviously even worse with LDAP.  While I haven't measured it, I'm sure it would be noticeable overhead even in the non-LDAP case.

The operating system needs a caching layer for this stuff.  And it turns out one exists:

https://fedoraproject.org/wiki/Features/SSSD

Actually there are two things here:

1) Move system bus services to PolicyKit, and thus gradually phase out all dbus daemon authorization.  Actually...an intermediate step here is to detect if any config file specifies group="".  If not, then we don't call getgroups().

2) Cache the groups, and get a notification from SSSD (over dbus even!) when the group list changes, and then do a reload.
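
The intermediate step in 1) could be sketched as follows, in Python for brevity rather than the daemon's C. `config_uses_groups` and `org.example.Service` are hypothetical names, and the sketch assumes that only the policy, allow and deny elements can carry a group attribute. It inspects a single configuration document; a real check would also have to follow any included files.

```python
import xml.etree.ElementTree as ET

# Elements assumed to be able to carry a group attribute.
GROUP_AWARE_TAGS = ("policy", "allow", "deny")


def config_uses_groups(config_xml: str) -> bool:
    """Return True if any policy, allow or deny element names a group."""
    root = ET.fromstring(config_xml)
    return any(
        element.get("group") is not None
        for tag in GROUP_AWARE_TAGS
        for element in root.iter(tag)
    )


EXAMPLE = """
<busconfig>
  <policy user="root">
    <allow own="org.example.Service"/>
  </policy>
</busconfig>
"""

if __name__ == "__main__":
    # No group attribute anywhere, so group expansion could be skipped.
    print(config_uses_groups(EXAMPLE))
```

If the function returns False for every loaded file, the daemon would have no rule that depends on group membership and could skip the group lookup entirely.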
Comment 4 Lennart Poettering 2010-09-06 10:34:42 UTC
(In reply to comment #3)
> No, we can't call getgroups() dynamically; that implies parsing /etc/group on
> every message the daemon processes.  This is obviously even worse with LDAP. 
> While I haven't measured it, I'm sure it would be noticeable overhead even in
> the non-LDAP case.

Maybe do the compromise? Lazily calling getgroups(), only when needed but then cache it for later?
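
A minimal sketch of that compromise, in Python rather than the daemon's C: resolve a user's groups on first use, then serve later requests from memory. `LazyGroupCache` and `lookup_groups` are hypothetical names. The `invalidate` method stands in for whatever reload trigger is chosen (for example, the SSSD notification suggested in comment #3).

```python
import os
import pwd
from typing import Callable, Dict, List


def lookup_groups(username: str) -> List[int]:
    """Ask the name service for all group IDs of username (may be slow)."""
    user = pwd.getpwnam(username)
    return os.getgrouplist(username, user.pw_gid)


class LazyGroupCache:
    """Resolve a user's groups on first use and remember the answer."""

    def __init__(self, lookup: Callable[[str], List[int]] = lookup_groups):
        self._lookup = lookup
        self._cache: Dict[str, List[int]] = {}

    def groups_for(self, username: str) -> List[int]:
        if username not in self._cache:
            # First request for this user: pay the lookup cost now.
            self._cache[username] = self._lookup(username)
        return self._cache[username]

    def invalidate(self) -> None:
        """Drop all cached answers, e.g. on a config reload."""
        self._cache.clear()


if __name__ == "__main__":
    cache = LazyGroupCache()
    print(cache.groups_for("root"))
```

Startup does no lookups at all with this design. The trade-off is that the first message needing a group check pays the full lookup latency, and cached answers go stale until something calls `invalidate`.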
Comment 5 Simon McVittie 2013-08-27 15:20:07 UTC
*** Bug 66867 has been marked as a duplicate of this bug. ***
Comment 6 Simon McVittie 2013-08-27 15:26:23 UTC
(In reply to comment #4)
> Maybe do the compromise? Lazily calling getgroups(), only when needed but
> then cache it for later?

I'd consider patches, but it sounds as though SSSD is a better solution to the problem of slow/potentially-offline NIS and LDAP than we're going to be able to NIH in libdbus.

(In reply to comment #3)
> Actually...an intermediate step here is to
> detect if any config file specifies group="".  If not, then we don't call
> getgroups().

I'd certainly consider patches for that - it sounds relatively unintrusive and moves us towards where we think we ought to be anyway.
Comment 7 GitLab Migration User 2018-10-12 21:06:54 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/27.
