Please review this bug for more information.
Description of problem:
messagebus service hangs on boot on a system with ldap auth configured.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
hang until bored
The workaround is to remove the entry for ldap from the group line in
/etc/nsswitch.conf, which is not an acceptable long-term solution.
It's also this issue.
And in SUSE Linux, I also found this issue. Any idea? Thanks!
Over on the redhat bugzilla, I provided the following analysis for
redhat bug 182464, which also has to do with this same bug.
------ copied from redhat bugzilla bug 182464 -------
The root cause apparently has not been investigated yet. Reading the
source code of dbus-daemon has revealed the following:
dbus-daemon reads all the groups of the user root when it parses
the user="root" attributes in the configuration file. This triggers
many ldap lookups, that trigger the exponential back off of the
bind_policy hard setting in /etc/ldap.conf. So parsing the config
file takes long, and dbus-daemon forks only after parsing the config.
At that point, the boot continues.
The point is that dbus-daemon has a logical error in it. It is
not necessary to read the list of groups of a user ever. Such a
list is dynamic, it changes when naming services become available,
or when the ldap contents are changed. So dbus-daemon should rather
check group memberships when it needs to, i.e. when it has to
authorize a request. This could be done much more efficiently
using the getgrent family of calls instead of the getgrouplist
call dbus-daemon is currently using.
So I propose that the upstream providers of dbus-daemon are contacted
to get dbus-daemon fixed. Possible fixes:
1. quick and dirty: add an option to stop dbus-daemon from expanding
2. fix the logical error, don't use getgrouplist, check group membership
late and rely on nscd's caching mechanism for performance.
------ end of copy ------
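For illustration, the late membership check proposed in fix 2 could look roughly like the sketch below. It is not taken from the dbus-daemon source: user_in_group() is a hypothetical helper name, and it uses only the POSIX getgrnam() and getpwnam() calls to test one user against one named group at authorization time, instead of expanding every group up front.

```c
#include <grp.h>
#include <pwd.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not part of dbus): returns true if `user` is a
 * member of `group`, looked up at the moment the answer is needed. */
static bool user_in_group(const char *user, const char *group)
{
    struct group *gr = getgrnam(group);
    if (gr == NULL)
        return false;               /* unknown group, or lookup failed */

    gid_t gid = gr->gr_gid;         /* copy: gr points at static storage */

    /* Supplementary membership: the user is listed in the group entry. */
    for (char **m = gr->gr_mem; m != NULL && *m != NULL; m++) {
        if (strcmp(*m, user) == 0)
            return true;
    }

    /* Primary membership: the group is the user's login group. */
    struct passwd *pw = getpwnam(user);
    return pw != NULL && pw->pw_gid == gid;
}
```

Each call still goes through the name service, so with a slow or unreachable LDAP server this only helps if something like nscd answers from a cache, as the analysis above assumes.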
In addition to what BinLi reports, there _is_ a better workaround, although
again an undesirable one: don't use "bind_policy hard" in /etc/ldap.conf,
use "bind_policy soft" instead. This causes the ldap lookups to fail, so
dbus-daemon will not get the LDAP groups but instead will quickly continue,
allowing the boot to go forward.
No, we can't call getgroups() dynamically; that implies parsing /etc/group on every message the daemon processes. This is obviously even worse with LDAP. While I haven't measured it, I'm sure it would be noticeable overhead even in the non-LDAP case.
The operating system needs a caching layer for this stuff. And it turns out one exists:
Actually there are two things here:
1) Move system bus services to PolicyKit, and thus gradually phase out all dbus daemon authorization. Actually...an intermediate step here is to detect if any config file specifies group="". If not, then we don't call getgroups().
2) Cache the groups, and get a notification from SSSD (over dbus even!) when the group list changes, and then do a reload.
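The intermediate step in 1) could be sketched as follows. This is only an illustration under assumptions: struct policy_rule and policy_needs_groups() are hypothetical names, not the daemon's real data structures, and the rule list is a simplified stand-in for the parsed configuration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one parsed policy rule. */
struct policy_rule {
    const char *user;   /* value of a user="..." attribute, or NULL */
    const char *group;  /* value of a group="..." attribute, or NULL */
};

/* Returns true if at least one rule mentions a group. Only in that case
 * would the daemon need to look up group membership at all. */
static bool policy_needs_groups(const struct policy_rule *rules, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].group != NULL)
            return true;
    }
    return false;
}
```

With a check like this, a configuration that only uses user="..." rules would never touch the group database, so a slow LDAP server could not stall startup through that path.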
(In reply to comment #3)
> No, we can't call getgroups() dynamically; that implies parsing /etc/group on
> every message the daemon processes. This is obviously even worse with LDAP.
> While I haven't measured it, I'm sure it would be noticeable overhead even in
> the non-LDAP case.
Maybe do the compromise? Lazily calling getgroups(), only when needed but then cache it for later?
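A rough sketch of that compromise, under stated assumptions: the group list is fetched on first use, kept afterwards, and dropped again when something signals that it changed. All names here (group_cache, cache_has_group, cache_invalidate, demo_loader) are hypothetical and not from dbus. The loader is passed as a callback so the sketch stays independent of how groups are actually fetched; demo_loader is a fixed stand-in for a real lookup such as getgrouplist().

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_GROUPS 64

/* Loader contract (assumed): fill `out` with up to `max` gids for `user`
 * and return how many were written, or -1 on failure. */
typedef int (*group_loader)(const char *user, gid_t *out, int max);

struct group_cache {
    bool  loaded;               /* false until the first successful load */
    int   count;
    gid_t gids[MAX_GROUPS];
};

/* Looks up the groups only on first use, then answers from the cache. */
static bool cache_has_group(struct group_cache *c, const char *user,
                            gid_t gid, group_loader load)
{
    if (!c->loaded) {
        int n = load(user, c->gids, MAX_GROUPS);
        if (n < 0)
            return false;       /* lookup failed; try again next time */
        c->count = n > MAX_GROUPS ? MAX_GROUPS : n;
        c->loaded = true;
    }
    for (int i = 0; i < c->count; i++) {
        if (c->gids[i] == gid)
            return true;
    }
    return false;
}

/* Drop the cached list, e.g. after a "groups changed" notification. */
static void cache_invalidate(struct group_cache *c)
{
    c->loaded = false;
}

/* Demo stand-in for a real lookup: counts calls, returns fixed gids. */
static int demo_loader_calls = 0;

static int demo_loader(const char *user, gid_t *out, int max)
{
    (void)user;
    demo_loader_calls++;
    if (max < 2)
        return -1;
    out[0] = 0;
    out[1] = 10;
    return 2;
}
```

A zero-initialized struct group_cache starts out empty. Repeated cache_has_group() calls then cost one lookup in total until cache_invalidate() is called.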
*** Bug 66867 has been marked as a duplicate of this bug. ***
(In reply to comment #4)
> Maybe do the compromise? Lazily calling getgroups(), only when needed but
> then cache it for later?
I'd consider patches, but it sounds as though SSSD is a better solution to the problem of slow/potentially-offline NIS and LDAP than we're going to be able to NIH in libdbus.
(In reply to comment #3)
> Actually...an intermediate step here is to
> detect if any config file specifies group="". If not, then we don't call
> getgroups().
I'd certainly consider patches for that - it sounds relatively unintrusive and moves us towards where we think we ought to be anyway.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/27.