Bug 16727 - dbus has problems with larger groups for policies ("Unknown group "dev-usb" in message bus configuration file")
Status: RESOLVED FIXED
Alias: None
Product: dbus
Classification: Unclassified
Component: core
Version: 1.2.x
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assignee: Havoc Pennington
QA Contact: John (J5) Palmieri
URL: http://bugs.debian.org/489738
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-07-16 03:07 UTC by Noèl Köthe
Modified: 2008-07-28 13:17 UTC


Attachments
use sysconf to retrieve buffer size (4.79 KB, patch)
2008-07-23 07:10 UTC, Loïc Minier
Use sysconf to get initial buf size, then try exponential backoff if we get ERANGE (5.56 KB, patch)
2008-07-27 04:55 UTC, Marc Brockschmidt

Description Noèl Köthe 2008-07-16 03:07:02 UTC
Hello,

running dbus 1.2.1 on Debian lenny i386.

dbus cannot use/handle larger groups for policies:

I configured the group dev-usb (the group is in ldap via nsswitch) in 
/etc/dbus-1/system.d/hal.conf for a policy.

# dbus-daemon --system
Unknown group "dev-usb" in message bus configuration file
# getent group dev-usb
dev-usb:*:10110:usera,userb,userc,userd,...... (very long list with 1411
users)

I rebuilt dbus with verbose mode enabled:

# DBUS_VERBOSE=1 dbus-daemon --system
4531: Allocated slot 0 on allocator 0xb7f13730 total 1 slots allocated 1 used
4531: /dev/urandom fd 3 opened
4531: Read 12 bytes from /dev/urandom
4531: file fd 3 opened
4531: file fd 4 opened
4531: No cache for UID 0
4531: No cache for user "root"
4531: file fd 4 opened
4531: No cache for user "haldaemon"
4531: No cache for user "root"
4531: No cache for groupname "powerdev"
4531: No cache for groupname "dev-usb"
Unknown group "dev-usb" in message bus configuration file
4531: No cache for user "root"
4531: file fd 4 opened
4531: No cache for user "root"
4531: file fd 4 opened
4531: No cache for user "avahi"
4531: No cache for user "root"
4531: No cache for groupname "netdev"
4531: listening on unix socket /var/run/dbus/system_bus_socket abstract=0
4531: socket fd 3 opened
4531: /dev/urandom fd 4 opened
4531: Read 12 bytes from /dev/urandom
4531: Initialized server on address unix:path=/var/run/dbus/system_bus_socket,guid=f9a7a7bedfd68fc32e9192e7487daeb3
4531: Adding a read watch on fd 3 using newly-set add watch function
4531: Failed to open directory /usr/local/share/dbus-1/system-services: Failed to read directory "/usr/local/share/dbus-1/system-services": No such file or directory
4531: Allocated slot 0 on allocator 0xb7f136e8 total 1 slots allocated 1 used
4531: No cache for user "messagebus"
4531: Forking and becoming daemon
4531: Becoming a daemon...
4531: chdir to /
4531: forking...
4532: in child, closing std file descriptors
4531: writing pid file /var/run/dbus/pid
4531: No pid pipe to write to
4531: parent exiting

I tested it again with smaller groups (all are LDAP groups)
and it worked (no error message when starting dbus):

group1:*:153:test-a2-3,test-ab01,test-a2-1,test-a2-2
(4 users, 40 characters)

group2:*:152:test-a1-1,test-a1-2,test-a1-3
(3 users, 29 characters)

domain-admins:*:...
(56 users, 519 characters)

The next larger group has 75 users (819 characters) and with this group
I get the error again.

I reported to Debian:

http://bugs.debian.org/489738

and found a similar bug report in the Gentoo bug tracker:

http://bugs.gentoo.org/show_bug.cgi?id=225895
Comment 1 Havoc Pennington 2008-07-16 06:46:19 UTC
Seems less likely to be a dbus issue than an OS issue. dbus just calls the C library routines ...

Kind of tough to debug for most dbus devs, since it involves ldap and debian and so forth.
Comment 2 Noèl Köthe 2008-07-21 00:53:07 UTC
I can reproduce the described problem when using local groups instead of LDAP groups.

How to reproduce:

(as root)

# for i in `seq 1 100`;do adduser --disabled-password --gecos testuser$i test$i;done

# addgroup testgroup

# for i in `seq 1 100`;do addgroup test$i testgroup; done

Add testgroup to a policy (for example in /etc/dbus-1/system.d/hal.conf); when I restart dbus I get the error:

Unknown group "testgroup" in message bus configuration file

So it's not related to LDAP, and given the description on Gentoo (http://bugs.gentoo.org/show_bug.cgi?id=225895) it's not related to Debian either, IMHO.

With more than 100 users in the group it is always reproducible.

Can you reproduce the problem with these 4 steps?

Thank you.
Comment 3 Havoc Pennington 2008-07-21 03:49:32 UTC
Looking at the code just now, maybe it's simple - dbus-sysdeps-util-unix.c:fill_group_info():

#ifdef HAVE_POSIX_GETPWNAM_R

    if (group_c_str)
      result = getgrnam_r (group_c_str, &g_str, buf, sizeof (buf),
                           &g);
    else
      result = getgrgid_r (gid, &g_str, buf, sizeof (buf),
                           &g);
#else
    g = getgrnam_r (group_c_str, &g_str, buf, sizeof (buf));
    result = 0;
#endif /* !HAVE_POSIX_GETPWNAM_R */
    if (result == 0 && g == &g_str)
      {
        return fill_user_info_from_group (g, info, error);
      }
    else
      {
        dbus_set_error (error, _dbus_error_from_errno (errno),
                        "Group %s unknown or failed to look it up\n",
                        group_c_str ? group_c_str : "???");
        return FALSE;
      }
  }

getgrnam_r() will fail and return ERANGE if the buffer is too small, as it probably is in this case.
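
A minimal sketch of the failure mode described above, assuming the POSIX five-argument getgrnam_r(). The names getgr_outcome, classify_getgr_result and lookup_group_once are hypothetical and are not part of dbus. The point it illustrates: POSIX getgrnam_r() reports failure through its return value, so the caller has to inspect that value to tell ERANGE (buffer too small) apart from a group that does not exist.

```c
#include <errno.h>
#include <grp.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical outcome codes, for illustration only (not dbus API). */
enum getgr_outcome {
  GETGR_FOUND,            /* entry was filled in */
  GETGR_NOT_FOUND,        /* call succeeded but no such group exists */
  GETGR_BUFFER_TOO_SMALL, /* ERANGE: retry with a larger buffer */
  GETGR_ERROR             /* any other failure */
};

/* Interpret the return value and result pointer of getgrnam_r()
 * or getgrgid_r(). */
enum getgr_outcome
classify_getgr_result (int result, const struct group *g)
{
  if (result == ERANGE)
    return GETGR_BUFFER_TOO_SMALL;
  if (result != 0)
    return GETGR_ERROR;
  return g != NULL ? GETGR_FOUND : GETGR_NOT_FOUND;
}

/* One lookup attempt with a caller-supplied buffer of buflen bytes.
 * *gid_out is written only when the group is found. */
enum getgr_outcome
lookup_group_once (const char *name, char *buf, size_t buflen,
                   gid_t *gid_out)
{
  struct group g_str;
  struct group *g = NULL;
  int result = getgrnam_r (name, &g_str, buf, buflen, &g);
  enum getgr_outcome outcome = classify_getgr_result (result, g);

  if (outcome == GETGR_FOUND)
    *gid_out = g->gr_gid;
  return outcome;
}
```

With a fixed-size stack buffer, a group whose member list does not fit would be expected to come back as GETGR_BUFFER_TOO_SMALL rather than GETGR_NOT_FOUND, which is the distinction the "Unknown group" message hides.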
Comment 4 Noèl Köthe 2008-07-22 07:15:43 UTC
:) Fine. Will you fix this problem/bug?

Or do you need anything else (from me)?

Thank you.
Comment 5 Havoc Pennington 2008-07-22 07:40:55 UTC
I probably won't get to this bug personally, but one of the more active dbus maintainers might, or anyone affected by it is welcome to submit a patch ;-)
In any case it's tracked here until someone patches it.

Comment 6 Loïc Minier 2008-07-23 07:10:56 UTC
Created attachment 17838
use sysconf to retrieve buffer size

Hi,

this is an initial patch which I thought would have helped with the hardcoded buffer sizes, but the reporter said it didn't help.

Noèl, could you please update the VERBOSE run output with the patch applied?

Thanks,
Comment 7 Loïc Minier 2008-07-23 07:13:37 UTC
Noèl, I'm tempted to agree with Havoc's comment that the issue might be with ERANGE handling.  Do you think you could gdb to this point and check whether ERANGE is effectively returned?
Comment 8 Colin Walters 2008-07-24 10:30:35 UTC
On my system (Fedora/glibc-2.8-3.x86_64):

os.sysconf('SC_GETGR_R_SIZE_MAX') => 1024

glibc bug?  Or maybe we need to avoid sysconf, and handle ERANGE and double the buffer size each time.
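
For comparison, the same query can be sketched in C, assuming a POSIX system where <unistd.h> defines _SC_GETGR_R_SIZE_MAX. The helper names and the 1024-byte fallback are assumptions for illustration, not dbus code. The value is best treated as a starting size only, since the lookup can still fail with ERANGE for a large group.

```c
#include <stddef.h>
#include <unistd.h>

/* Fallback used when sysconf() gives no usable hint. 1024 matches the
 * value reported in this thread; otherwise it is an arbitrary assumption. */
#define GETGR_FALLBACK_BUFLEN 1024

/* Turn a sysconf() result into a usable starting size. sysconf() may
 * return -1 when it has no value to report. */
size_t
buflen_from_hint (long hint, size_t fallback)
{
  return hint > 0 ? (size_t) hint : fallback;
}

/* Suggested initial buffer size for getgrnam_r()/getgrgid_r(). */
size_t
initial_getgr_buflen (void)
{
  return buflen_from_hint (sysconf (_SC_GETGR_R_SIZE_MAX),
                           GETGR_FALLBACK_BUFLEN);
}
```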
Comment 9 Noèl Köthe 2008-07-25 00:56:07 UTC
(In reply to comment #7)
> Noèl, I'm tempted to agree with Havoc's comment that the issue might be with
> ERANGE handling.  Do you think you could gdb to this point and check whether
> ERANGE is effectively returned?
> 

Will try on Tuesday.
Comment 10 Marc Brockschmidt 2008-07-27 04:25:44 UTC
I had some problems using gdb (no idea why), but hard-coding some debug output showed that ERANGE (34) is returned:

root@pindar:/tmp/dbus-1.2.1# sudo -u messagebus dbus-daemon --system
[...]
26349: No cache for groupname "testgroup"
26349: return value from getgr*_r: 34
Unknown group "testgroup" in message bus configuration file
No cache for groupname "testgroup"
[...]

I'll whip up a patch to increase the buf size until this succeeds or a reasonable (512 KB?) limit is reached.

Marc
Comment 11 Loïc Minier 2008-07-27 04:42:49 UTC
Sysconf support is clearly orthogonal now. My system's man page recommends it as the way to compute the buffer size, so would it be useful to merge this sysconf support? If yes, should I file a separate bug for it?
Comment 12 Marc Brockschmidt 2008-07-27 04:53:15 UTC
Well, done. With a bit of verbose test output (not in the patch), I get this:

[...]
10646: No cache for groupname "testgroup"
10646: Tried getgr*_r with buflen 1024, got result 34
10646: Tried getgr*_r with buflen 2048, got result 34
10646: Tried getgr*_r with buflen 4096, got result 0
[...]
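
The retry pattern behind the log above can be sketched roughly as follows. This is an illustrative reconstruction and not the attached patch: the function names, the 1024-byte starting size and the use of realloc() are assumptions, and the 512 KB cap is the limit floated in comment 10.

```c
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

#define GETGR_START_BUFLEN 1024          /* assumed starting size */
#define GETGR_MAX_BUFLEN   (512 * 1024)  /* cap floated in comment 10 */

/* Double the size; return 0 once doubling would exceed the cap. */
size_t
next_buflen (size_t current, size_t max)
{
  return current <= max / 2 ? current * 2 : 0;
}

/* Look up a group's gid by name. Returns 0 on success, -1 if the group
 * is unknown, the cap is reached, or any other error occurs. */
int
lookup_group_gid (const char *name, gid_t *gid_out)
{
  size_t buflen = GETGR_START_BUFLEN;
  char *buf = NULL;
  int status = -1;

  while (buflen != 0)
    {
      struct group g_str;
      struct group *g = NULL;
      char *new_buf = realloc (buf, buflen);
      int result;

      if (new_buf == NULL)
        break;                  /* out of memory; old buf is freed below */
      buf = new_buf;

      result = getgrnam_r (name, &g_str, buf, buflen, &g);
      if (result == ERANGE)
        {
          buflen = next_buflen (buflen, GETGR_MAX_BUFLEN);
          continue;             /* retry with a larger buffer */
        }
      if (result == 0 && g != NULL)
        {
          *gid_out = g->gr_gid;
          status = 0;
        }
      break;                    /* found, not found, or hard error */
    }

  free (buf);                   /* free (NULL) is a no-op */
  return status;
}
```

Starting from 1024 bytes, the sizes tried are 1024, 2048, 4096 and so on, which matches the sequence in the log. Freeing the buffer on every exit path is the kind of detail that is easy to miss in such a loop.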

Marc
Comment 13 Marc Brockschmidt 2008-07-27 04:55:35 UTC
Created attachment 17913
Use sysconf to get initial buf size, then try exponential backoff if we get ERANGE

This patch is based on lool's work and probably works around a libc bug (it was no problem to get a group with enough users to exceed the announced 1024 byte limit, provided that one used a buffer big enough).
Comment 14 Colin Walters 2008-07-27 05:25:23 UTC
Small patch nitpick: would be good to consistently match the DBus code style which has a space between identifiers and parens; while (1) and not while(1), dbus_free (foo) and not dbus_free(foo) etc.

Other than that it looks good to me, if no one else adds any comments I'll tweak the style and commit in a day or two.
Comment 15 Colin Walters 2008-07-28 13:17:48 UTC
Ok, I modified this patch to fix a memory leak caught by "make check", removed spurious dbus_free(buf) calls that wouldn't compile in the non-recursive ifdef case, and tweaked the rest to fit the code style guidelines.

Thanks for the patch!

commit 9d51f086b05df196b94234d6a0d388594feedd73
Author: Marc Brockschmidt <he@debian.org>
Date:   Mon Jul 28 16:09:53 2008 -0400

    Bug 16727: Handle ERANGE for getgr; fixes user in many groups
    
    	Patch originally from Noèl Köthe.
    	Modified by Colin Walters <walters@verbum.org>
    
    	* dbus/dbus-sysdeps-unix.c, dbus/dbus-sysdeps-unix-utils.c:
    	Use a while() loop to reallocate buffer if we get ERANGE
    	return.  This fixes the case where a user is in a large
    	number of groups.

