systemd sends DBus messages for every unit added and removed, which is all of them for reloads. This is on v208 with the Fedora 20 patches, but it looks like the same occurs in master. systemd-logind can't keep up with these: Dec 17 23:07:02 519149-11pod01 dbus[1636]: [system] Rejected: destination has a full message queue, 0 matched rules; type="signal", sender=":1.55216" (uid=0 pid=1 comm="/usr/lib/systemd/systemd --system --deserialize 26") interface="org.freedesktop.systemd1.Manager" member="UnitRemoved" error name="(unset)" requested_reply="0" destination="org.freedesktop.DBus" (uid=0 pid=32361 comm="/usr/lib/systemd/systemd-logind ") Dec 17 23:07:02 519149-11pod01 dbus[1636]: [system] Rejected: destination has a full message queue, 0 matched rules; type="signal", sender=":1.55216" (uid=0 pid=1 comm="/usr/lib/systemd/systemd --system --deserialize 26") interface="org.freedesktop.systemd1.Manager" member="UnitRemoved" error name="(unset)" requested_reply="0" destination="org.freedesktop.DBus" (uid=0 pid=32361 comm="/usr/lib/systemd/systemd-logind ") Here's what it looks like from dbus-monitor --system: signal sender=:1.55216 -> dest=(null destination) serial=24348 path=/org/freedesktop/systemd1; interface=org.freedesktop.systemd1.Manager; member=UnitRemoved string "nginx_c4d9e5763f294ca19d6abf242a2575cc.socket" object path "/org/freedesktop/systemd1/unit/nginx_5fc4d9e5763f294ca19d6abf242a2575cc_2esocket" signal sender=:1.55216 -> dest=(null destination) serial=24349 path=/org/freedesktop/systemd1; interface=org.freedesktop.systemd1.Manager; member=UnitRemoved string "nginx_add76a5f40704da0832bb6065de7f342.socket" object path "/org/freedesktop/systemd1/unit/nginx_5fadd76a5f40704da0832bb6065de7f342_2esocket" signal sender=:1.55216 -> dest=(null destination) serial=24350 path=/org/freedesktop/systemd1; interface=org.freedesktop.systemd1.Manager; member=UnitRemoved
Because messages get lost, systemd-logind gets out of sync with systemd's unit data. This causes a proliferation of orphaned systemd-related processes on the system.
This also causes heavy context-switching in systemd and DBus. Linux 3.11.10-200.fc19.x86_64 (519149-11pod01.panth.io.pod1.panth.io) 12/17/2013 _x86_64_ (24 CPU) 11:22:09 PM UID TGID TID cswch/s nvcswch/s Command [...] Average: 81 1636 - 419.12 0.37 dbus-daemon Average: 81 - 1636 419.12 0.37 |__dbus-daemon [...] Average: 0 1 - 645.82 0.09 systemd Average: 0 - 1 645.82 0.09 |__systemd That's a 10-second sample, and it's sustained for a long time when a few reloads get queued.
Because we don't actually need systemd user sessions, we're working around this issue by masking logind: # systemctl mask systemd-logind.service Because the PAM module checks for a path created by logind to see if it's running, masking the service and rebooting puts things in good shape for us.
Just removing the lagging subscriber leaves D-Bus (and the journal, which has to log D-Bus's complaints) much happier. This is the same sort of 10-second sample from pidstat as before, just with logind masked. Linux 3.11.10-200.fc19.x86_64 (519149-11pod01.panth.io.pod1.panth.io) 12/19/2013 _x86_64_ (24 CPU) 12:29:53 AM UID TGID TID cswch/s nvcswch/s Command [...] Average: 81 1808 - 16.90 0.00 dbus-daemon Average: 81 - 1808 16.90 0.00 |__dbus-daemon
I am pretty sure this has been fixed since. If this is reproducible on current sytemd versions please file a new bug on github.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.