|Summary:||dbus-daemon hangs while starting if users are in LDAP/NIS/etc.|
|Component:||core||Assignee:||D-Bus Maintainers <dbus>|
|Status:||RESOLVED MOVED||QA Contact:||D-Bus Maintainers <dbus>|
|Priority:||medium||CC:||jimc, msniko14, walters|
Description BinLi 2010-06-02 03:11:28 UTC
Please see https://bugzilla.redhat.com/show_bug.cgi?id=502072 for more information.

Description of problem:
The messagebus service hangs on boot on a system with LDAP authentication configured.

Version-Release number of selected component (if applicable):
dbus-0.60-7.2
kernel-2.6.15-1.1969_FC5

How reproducible:
Always

Steps to Reproduce:
1. Boot

Actual results:
Hangs until bored

Expected results:
Fedora niceness

Additional info:
A workaround is to remove the entry for ldap from the group line in /etc/nsswitch.conf, but that is not an acceptable long-term solution.
Comment 1 BinLi 2010-06-02 03:18:03 UTC
The Red Hat bug at https://bugzilla.redhat.com/show_bug.cgi?id=502072 is this same issue, and I have also hit it on SUSE Linux. Any ideas? Thanks!
Comment 2 Andreas Mueller 2010-06-02 03:55:24 UTC
Over on the Red Hat Bugzilla, I provided the following analysis for Red Hat bug 182464, which also has to do with this same bug.

------ copied from Red Hat Bugzilla bug 182464 -------
The root cause apparently has not been investigated yet. Reading the source code of dbus-daemon has revealed the following:

dbus-daemon reads all the groups of the user root when it parses the user="root" attributes in the configuration file. This triggers many LDAP lookups, which in turn trigger the exponential backoff of the "bind_policy hard" setting in /etc/ldap.conf. So parsing the config file takes a long time, and dbus-daemon forks only after parsing the config. At that point, the boot continues.

The point is that dbus-daemon has a logical error in it. It is never necessary to read the list of groups of a user. Such a list is dynamic: it changes when naming services become available, or when the LDAP contents are changed. So dbus-daemon should rather check group memberships when it needs to, i.e. when it has to authorize a request. This could be done much more efficiently using the getgrent family of calls instead of the getgrouplist call dbus-daemon is currently using.

So I propose that the upstream providers of dbus-daemon are contacted to get dbus-daemon fixed. Possible fixes:
1. Quick and dirty: add an option to stop dbus-daemon from expanding group lists.
2. Fix the logical error: don't use getgrouplist, check group membership late, and rely on nscd's caching mechanism for performance.
------ end of copy ------

In addition to what BinLi reports, there _is_ a better workaround, although again an undesirable one: don't use "bind_policy hard" in /etc/ldap.conf; use "bind_policy soft" instead. This causes the LDAP lookups to fail, so dbus-daemon will not get the LDAP groups but will instead continue quickly, allowing the boot to go forward.
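As a rough illustration of the "check group membership late" idea in fix 2, the sketch below resolves a single named group at the moment a decision is needed instead of expanding the user's full group list up front. It is written in Python rather than dbus-daemon's C, and the function name user_in_group and its injectable lookup parameters are hypothetical, chosen only for this example. It is not how dbus-daemon works, and it says nothing about the per-call cost of the lookups, which depends on the configured name services.

```python
import grp
import pwd


def user_in_group(user, group, getpwnam=pwd.getpwnam, getgrnam=grp.getgrnam):
    """Return True if `user` belongs to `group`, resolving only that one group.

    The lookup functions are parameters (an assumption made for this sketch)
    so that they can be swapped out or wrapped with a cache.
    """
    try:
        group_entry = getgrnam(group)
    except KeyError:
        return False  # unknown group: nobody is a member
    if user in group_entry.gr_mem:
        return True  # listed as a supplementary member of the group
    try:
        # Not listed explicitly: the group may still be the user's primary group.
        return getpwnam(user).pw_gid == group_entry.gr_gid
    except KeyError:
        return False  # unknown user
```

A call such as user_in_group("root", "wheel") then costs one group lookup and at most one user lookup, whatever the number of groups the user belongs to.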
Comment 3 Colin Walters 2010-06-02 06:44:07 UTC
No, we can't call getgroups() dynamically; that implies parsing /etc/group on every message the daemon processes. This is obviously even worse with LDAP. While I haven't measured it, I'm sure it would be noticeable overhead even in the non-LDAP case.

The operating system needs a caching layer for this stuff. And it turns out one exists: https://fedoraproject.org/wiki/Features/SSSD

Actually there are two things here:

1) Move system bus services to PolicyKit, and thus gradually phase out all dbus daemon authorization. Actually...an intermediate step here is to detect if any config file specifies group="". If not, then we don't call getgroups().

2) Cache the groups, and get a notification from SSSD (over dbus even!) when the group list changes, and then do a reload.
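The intermediate step (skip group lookups entirely when no config file mentions a group) can be sketched as a scan over the parsed configuration. This is a simplified Python illustration, not dbus-daemon code: the function names are hypothetical, it treats any group attribute on any element as a reason to look up groups, and it takes the text of each config file as input, so a real implementation would also need to follow included files.

```python
import xml.etree.ElementTree as ET


def config_mentions_group(xml_text):
    """Return True if any element in one bus config document has a group attribute."""
    root = ET.fromstring(xml_text)
    return any("group" in element.attrib for element in root.iter())


def needs_group_lookup(config_texts):
    """Decide whether group lists are needed, given the text of every config file."""
    return any(config_mentions_group(text) for text in config_texts)
```

If needs_group_lookup() returns False for the full set of config files, the daemon in this sketch could skip the group expansion at startup and avoid the slow LDAP path described above.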
Comment 4 Lennart Poettering 2010-09-06 10:34:42 UTC
(In reply to comment #3)
> No, we can't call getgroups() dynamically; that implies parsing /etc/group on
> every message the daemon processes. This is obviously even worse with LDAP.
> While I haven't measured it, I'm sure it would be noticeable overhead even in
> the non-LDAP case.

Maybe do the compromise? Lazily calling getgroups(), only when needed but then cache it for later?
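The lazy-plus-cache compromise might look roughly like the following Python sketch. The class name LazyGroupCache and its methods are hypothetical; the default lookup is os.getgrouplist (Unix only), and invalidate() stands in for whatever reload trigger is available, such as the change notification mentioned in comment 3. Nothing here is taken from the dbus source.

```python
import os


class LazyGroupCache:
    """Look up a user's groups on first use only, then serve them from memory."""

    def __init__(self, lookup=None):
        # `lookup(user, primary_gid)` returns an iterable of group IDs.
        # Defaulting to os.getgrouplist is an assumption made for this sketch.
        self._lookup = lookup or os.getgrouplist
        self._groups = {}

    def groups_for(self, user, primary_gid):
        key = (user, primary_gid)
        if key not in self._groups:
            # First request for this user: pay the (possibly slow) lookup once.
            self._groups[key] = frozenset(self._lookup(user, primary_gid))
        return self._groups[key]

    def invalidate(self):
        """Drop all cached entries, e.g. when the group database changes."""
        self._groups.clear()
```

With this shape, nothing is looked up while the config is parsed; the first authorization decision for a user pays for the lookup, and later decisions reuse the cached set until invalidate() is called. The trade-off is that cached memberships can go stale between invalidations.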
Comment 5 Simon McVittie 2013-08-27 15:20:07 UTC
*** Bug 66867 has been marked as a duplicate of this bug. ***
Comment 6 Simon McVittie 2013-08-27 15:26:23 UTC
(In reply to comment #4)
> Maybe do the compromise? Lazily calling getgroups(), only when needed but
> then cache it for later?

I'd consider patches, but it sounds as though SSSD is a better solution to the problem of slow/potentially-offline NIS and LDAP than we're going to be able to NIH in libdbus.

(In reply to comment #3)
> Actually...an intermediate step here is to
> detect if any config file specifies group="". If not, then we don't call
> getgroups().

I'd certainly consider patches for that - it sounds relatively unintrusive and moves us towards where we think we ought to be anyway.
Comment 7 GitLab Migration User 2018-10-12 21:06:54 UTC
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/27.