|Summary:||stop dbus-daemon memory usage ballooning if a client is slow to read|
|Product:||dbus||Reporter:||Simon McVittie <smcv>|
|Component:||core||Assignee:||D-Bus Maintainers <dbus>|
|Status:||REOPENED ---||QA Contact:||D-Bus Maintainers <dbus>|
|Priority:||medium||CC:||alban.crequy, bart.cerneels, me, msniko14, robin.bateboerop, walters, zeuthen|
|Bug Depends on:||35358|
Description Simon McVittie 2011-01-27 10:12:14 UTC
In the thread starting here <http://lists.freedesktop.org/archives/dbus/2010-September/013575.html>, and continued in October <http://lists.freedesktop.org/archives/dbus/2010-October/013584.html>, Alban Crequy wrote:

> What happens when a process does not read its incoming D-Bus messages on
> the socket and other D-Bus peers keep sending messages to it?
>
> With the default configuration on the session bus, dbus-daemon's memory
> usage is growing and the limit is really high:
> <limit name="max_incoming_bytes">1000000000</limit>
> <limit name="max_outgoing_bytes">1000000000</limit>

Later in the thread, Colin and Havoc wrote: <http://lists.freedesktop.org/archives/dbus/2010-October/013621.html>

> > Well, one thing we have discovered is that the current semantics of
> > the bus daemon are pretty unfortunate, specifically the random
> > arbitrary limits (127MiB and 1GB respectively for the system and
> > session bus).
>
> It would be within the spec, if the spec existed, to set these limits
> to infinite, if that's what you're suggesting.
> Another approach, I think you could make the sender do the buffering
> without changing any semantic. [*] Right now the daemon reads a message,
> decides it can't buffer it, and generates an error. Instead the daemon
> could read a message header, decide it can't buffer a message of that
> size, and then just stop reading from the socket until it can buffer
> it. Now the sender buffers messages and no messages ever fail to send.
> The exception would be if the single message is itself too large and
> can never succeed, then the daemon could generate an error.
>
> Those are probably both better than sending the error, since the error
> isn't easily recoverable.

The approach I've marked [*] above sounds reasonable to me; this means that if a sender is particularly spammy, it's that sender that suffers (by being made to buffer messages). At the moment it's dbus-daemon that suffers, by consuming the messages as fast as possible, not being able to get rid of them fast enough and ballooning in size.

Turning this into a bug so it stays "on the radar" over time.
Comment 1 Colin Walters 2011-01-27 13:40:53 UTC
(In reply to comment #0)
> The approach I've marked [*] above sounds reasonable to me; this means that if
> a sender is particularly spammy, it's that sender that suffers (by being made
> to buffer messages). At the moment it's dbus-daemon that suffers, by consuming
> the messages as fast as possible, not being able to get rid of them fast enough
> and ballooning in size.

This has the converse problem though: if a *receiver* is broken for some reason, then an innocent sender that happens to emit a signal for which the receiver has a match rule is the one that gets punished.

To be concrete, consider the system bus and upower's "I'm suspending" signal, and a buggy IM application which set up a match rule for that signal and is synchronously blocking on say an IM server. It'd be pretty bad in this situation for the system upower process to block.

I think the right approach is to do punishment by uid, and not by process.
Comment 2 Simon McVittie 2011-01-28 03:26:37 UTC
(In reply to comment #1)
> To be concrete, consider the system bus and upower's "I'm suspending" signal,
> and a buggy IM application which set up a match rule for that signal and is
> synchronously blocking on say an IM server.

Doctor, doctor, it hurts when I (stick a fork in my leg | block on network traffic)! :-)

But yes, that's a potential flaw in this approach, although I'm not sure why upower would have to block.

My plan was to have some sort of quota per sender unique name ("queue quota"), counting bytes of undelivered messages from that sender (or something). Anything that would fit in the queue quota would be queued up in dbus-daemon as is done now; anything that doesn't fit would cause dbus-daemon to stop polling the socket until the queue gets smaller than the queue quota.

Because the maximum message size we support is currently unrealistically large, we'd either have to reduce the maximum message size to be no larger than the queue quota, or have a special case where we're willing to receive a single message larger than the queue quota as long as the queue is otherwise empty, or always allow the queue quota to be exceeded by up to 1 message, or something.

With that approach, I think upower would only get its messages delayed (in practice this means: held in a queue inside the upower process rather than inside dbus-daemon) if it had so many messages queued that it exceeded the queue quota, and it'd only *block* if it called dbus_connection_flush() while in that situation?
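
For illustration only, the queue quota idea above could be modelled roughly like this. It is a sketch in Python rather than dbus-daemon code; the class and method names are hypothetical, and it picks the "may exceed the quota by up to one message" variant from the options listed above.

```python
from collections import deque


class SenderQueue:
    """Hypothetical per-sender queue of undelivered messages, counted in bytes."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.queued = deque()      # sizes of undelivered messages, oldest first
        self.queued_bytes = 0

    def should_poll(self):
        # Keep reading the sender's socket only while the queue is under quota.
        return self.queued_bytes < self.quota_bytes

    def receive(self, size):
        # Assumed variant: the quota may be exceeded by up to one message,
        # so a message larger than the remaining quota is still accepted.
        if not self.should_poll():
            return False           # message stays in the sender's own buffers
        self.queued.append(size)
        self.queued_bytes += size
        return True

    def deliver_one(self):
        # The oldest message has been delivered; release its bytes.
        if self.queued:
            self.queued_bytes -= self.queued.popleft()
```

With a quota of 100 bytes, two 60-byte messages are accepted (the second one overshoots), after which the socket is no longer polled until something has been delivered.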
Comment 3 Simon McVittie 2011-01-28 08:13:29 UTC
One rather drastic possibility would be to keep track of processes that are reading (all|broadcast) messages far too slowly, and if they're clearly unable to keep up, disconnect them. That's quite a blunt instrument, because typical D-Bus bindings handle disconnection poorly, and by definition the offending client isn't keeping up well enough that it's useful for us to stuff an error message into its socket either... but it's probably better than the session dbus-daemon becoming arbitrarily large and consuming all your memory. We can't just drop some of the messages on the floor without violating reliable delivery, which is a pretty important property. For what it's worth, disconnecting the offending client if it takes too long to receive a message is basically the same behaviour as the Clique protocol that telepathy-salut uses for reliable UDP-multicast chatrooms (and D-Bus tubes) on link-local XMPP. Clique doesn't provide total ordering like the bus daemon does, but it does guarantee causal ordering.
Comment 4 Colin Walters 2011-01-31 10:02:48 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > To be concrete, consider the system bus and upower's "I'm suspending" signal,
> > and a buggy IM application which set up a match rule for that signal and is
> > synchronously blocking on say an IM server.
>
> Doctor, doctor, it hurts when I (stick a fork in my leg | block on network
> traffic)! :-)

Oh it's undeniably an application bug; but the problem here is that a buggy application would then have a bad effect on other innocent processes, and in this case in particular it's a *system* process. We definitely want to minimize the effect of buggy applications on the system.

> But yes, that's a potential flaw in this approach, although I'm not sure why
> upower would have to block.

Well the upower process in this case (AFAIK) doesn't really *do* anything other than send/receive on DBus; also remember that while it could read further incoming messages, it couldn't reply to them.

> With that approach, I think upower would only get its messages delayed (in
> practice this means: held in a queue inside the upower process rather than
> inside dbus-daemon) if it had so many messages queued that it exceeded the
> queue quota, and it'd only *block* if it called dbus_connection_flush() while
> in that situation?

Again though that's a serious problem; think of say gnome-shell trying to send a Suspend request to upower and expecting a reply, but not getting it. Then the UI could be confused because it expected that successful reply to e.g. launch the screensaver or show a fade out animation.

My instinct again is really that uids need to be punished, or really that resource control needs to be per-uid. Or maybe on Linux it makes sense to have the limit be per-cgroup.
Comment 5 Havoc Pennington 2011-02-01 08:31:57 UTC
Seems worth keeping this as simple as possible, since it's kind of a corner case that will be rarely-tested.

What if we combined:

* if a sender has too much stuff that hasn't been received yet, block that sender (as in comment 0)
* if a receiver has had the same message pending read for N seconds without reading it, disconnect the receiver as unresponsive (which may then unblock senders). pretty harsh though.

On the other hand, in the session bus, does it really matter if the big buffers are in the daemon or in the app? Seems equally bad either way.

Perhaps more valuable would be some sort of reporting when this situation is happening, to allow apps to be fixed.

Another brainstorm could be: add a message flag "can be dropped," meaning if the buffer is full they can get dumped. This is a sort of primitive event compression. I guess only useful if the cases where this is happening would be able to work this way. I'm guessing just a flag is too simple to work.
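
As a rough illustration of the second bullet (disconnect a receiver whose oldest pending message has not been read for N seconds), something like the following would be enough to detect the condition. This is a Python sketch, not anything from the bus code; `ReceiverWatch` and its methods are hypothetical names, and time is passed in explicitly so the logic is easy to check.

```python
class ReceiverWatch:
    """Hypothetical tracker for how long the head of an outgoing queue has waited."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.head_since = None     # when the current head message became the head

    def on_head_changed(self, now, queue_empty):
        # Call when the receiver finishes reading a message, or when a
        # message is queued for a receiver whose queue was empty.
        self.head_since = None if queue_empty else now

    def is_unresponsive(self, now):
        # True once the same message has been pending for the whole timeout.
        return (self.head_since is not None
                and now - self.head_since >= self.timeout_seconds)
```

The choice of N is the hard part and is not addressed here; the sketch only shows that the bookkeeping per connection is a single timestamp.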
Comment 6 Colin Walters 2011-02-03 11:17:46 UTC
(In reply to comment #5)
> Seems worth keeping this as simple as possible, since it's kind of a corner
> case that will be rarely-tested.
>
> What if we combined:
>
> * if a sender has too much stuff that hasn't been received yet, block that
> sender (as in comment 0)

I don't think it's acceptable to do this, see my followup in comment #4.

> * if a receiver has had the same message pending read for N seconds without
> reading it, disconnect the receiver as unresponsive (which may then unblock
> senders). pretty harsh though.

This doesn't seem too bad, but it feels very "heuristic".

> On the other hand, in the session bus, does it really matter if the big buffers
> are in the daemon or in the app? Seems equally bad either way.

Yeah we're mostly concerned about the system bus.

> Perhaps more valuable would be some sort of reporting when this situation is
> happening, to allow apps to be fixed.

Yes.
Comment 7 Havoc Pennington 2011-02-03 14:21:04 UTC
> I don't think it's acceptable to do this

the disconnect of a stuck receiver is supposed to address the dangers of this. though, maybe the timeout for disconnecting the receiver would have to be too long so senders would stick for too long.

Here's another idea: try to figure out whose 'fault' the buffering is. For example, count bytes sent per second and received per second on each connection. Possibly the patch for that would be reasonably simple... I don't know. You could detect ridiculously spammy senders or sluggish receivers for sure. I guess this is basically Simon's idea in comment 3.

You would only have to take action on spammers and slow-readers when the buffers actually got too big, the rest of the time you could cut them slack.
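
To make the "whose fault is it" idea a bit more concrete, here is a minimal Python sketch of per-connection byte counting. Everything in it is an assumption for illustration: the class, the function and the thresholds are made up, and real accounting would need a sliding window rather than a single one.

```python
class ConnectionStats:
    """Hypothetical per-connection byte counters for one observation window."""

    def __init__(self):
        self.bytes_from_client = 0   # bytes the client sent to the daemon
        self.bytes_to_client = 0     # bytes the client actually read from the daemon

    def rates(self, window_seconds):
        # Bytes per second in each direction over the window.
        return (self.bytes_from_client / window_seconds,
                self.bytes_to_client / window_seconds)


def classify(stats, window_seconds, backlog_bytes, spam_rate, slow_rate):
    """Return 'spammer', 'slow-reader' or 'ok'. Thresholds are placeholders."""
    send_rate, read_rate = stats.rates(window_seconds)
    if send_rate > spam_rate:
        return "spammer"
    # Only a connection that has data waiting for it can be a slow reader.
    if backlog_bytes > 0 and read_rate < slow_rate:
        return "slow-reader"
    return "ok"
```

In line with the last paragraph above, `classify` would only be consulted once the buffers had actually grown too large; the rest of the time the counters would just accumulate.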
Comment 8 Simon McVittie 2011-02-04 03:30:13 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > On the other hand, in the session bus, does it really matter if the big buffers
> > are in the daemon or in the app? Seems equally bad either way.
>
> Yeah we're mostly concerned about the system bus.

On Maemo, we're concerned about the session bus too; saying "it's user error" doesn't necessarily work on a somewhat low-memory device with a mixture of platform-supplied and third-party software.

At the moment, it's possible to get the dbus-daemon's memory usage to balloon to an unreasonable size by spamming it with messages; due to memory fragmentation in dbus-daemon (I haven't tracked down where yet), the memory apparently doesn't get returned when the message-spamming stops, leading to poor performance until most of it gets swapped out.

One mitigation that's been suggested is reducing the (essentially arbitrary) limits to values that are reasonable for the device's available memory - most likely, equal values for each bus. To do that safely, I'll need to make sure the behaviour when the limits are hit is sane.

Hopefully these criteria don't turn out to be mutually incompatible:

* total ordering: deliver all messages to all receivers in the same order, until the receiver is disconnected (voluntarily or otherwise)

* receivers can't force a sender to be disconnected (DoS): don't disconnect a sender that sends at a reasonable rate just because slow receivers can't (or deliberately don't) keep up with its broadcasts
  - otherwise this would penalize services, like upower in Comment #1, for being listened to by a bad application

* receivers can't stall a sender's messages (DoS): a slow receiver shouldn't prevent other receivers from seeing a sender's messages; I think it'd be OK for it to *delay* the sender's messages, as long as it's for a bounded time (and we can adjust the upper bound)
  - otherwise this would penalize services, like upower in Comment #1, for being listened to by a bad application
  - this means messages must only be queued between the sender and the bus daemon for a finite time; they can be queued indefinitely between the bus daemon and the receiver, memory permitting

* prevent receiver disconnection by senders: don't disconnect a receiver that reads at a reasonable rate just because a rapid sender floods it with messages (DoS)

* reliable no-reply messages: having accepted a signal, method return or error from a sender, dbus-daemon must deliver it to all connected recipients or tell the sender it couldn't (which it can only do by disconnecting the sender, or disconnecting recipients to make the set of connected recipients smaller)

* reliable methods: having accepted a method call from a sender, dbus-daemon must either deliver it to the recipient or synthesize an error reply

* finite memory consumption: because nothing survives dbus-daemon restarts, dbus-daemon needs to not be killed by the OOM killer, regardless of what applications do to it; for performance, it should also avoid getting itself swapped out

Are there any other invariants we need?
Comment 9 Simon McVittie 2011-02-04 10:28:49 UTC
First some background information, for those (like me) who didn't write this stuff. There are two potentially large queues in each DBusConnection in the bus daemon:

         |     DBusConnection                 DBusConnection(s)|
         |     representing                   representing     |
         |     sender                         destination(s)   |
  kernel |                    dispatching                      | kernel
  socket ----> incoming queue -----\                           | socket
  buffer |                          \-------> outgoing queue ----> buffer
         |                                                     |

In each main loop iteration, we do these (not necessarily in this order):

* read from any readable sockets into the corresponding incoming queue, unless the queue is full already
* drain any non-empty outgoing queues into the corresponding socket, unless the kernel's socket buffer is full (poll does not report POLLOUT)
* move incoming messages into recipients' outgoing queues, unless they're full

The incoming queues effectively feed back to the sender: if the incoming queue gets full, then the socket buffer in the kernel isn't drained, so the sending process can't write to that socket, and the sending process can choose to report an error or just queue up messages. If the sender uses libdbus, it'll queue up messages indefinitely unless it takes special action.

The outgoing queues only feed back as far as the dispatching stage: once a message has been dispatched, the sender isn't "charged" for it.

I wonder whether it'd be reasonable to count outgoing messages towards the sender's incoming quota as well as the recipients' outgoing quotas?

(In reply to comment #0)
> > Another approach, I think you could make the sender do the buffering
> > without changing any semantic. [*] Right now the daemon reads a message,
> > decides it can't buffer it, and generates an error. Instead the daemon
> > could read a message header, decide it can't buffer a message of that
> > size, and then just stop reading from the socket until it can buffer
> > it.

From my reading of the bus code, the current behaviour is nearly this: the daemon will keep reading messages in units of max_bytes_read_per_iteration bytes (currently hard-coded to 2048) until max_incoming_bytes is *exceeded*, which means it can exceed the limit by up to the maximum length of a message.

In fact, if we wanted to push the maximum amount of buffering into senders, we could probably set the arbitrary incoming queue length to 0, and dbus-daemon would dispatch 2048 bytes or one whole message at a time (whichever is greater), minimizing memory use but reducing throughput.

(In reply to comment #8)
> * reliable no-reply messages: having accepted a signal, method return or
> error from a sender, dbus-daemon must deliver it to all connected
> recipients or tell the sender it couldn't (which it can only do
> by disconnecting the sender, or disconnecting recipients to make the set
> of connected recipients smaller)

It seems this is already violated; when the outgoing byte limit for a connection is exceeded, broadcast messages that match that recipient are silently dropped. It's obviously a less important criterion than I'd thought...
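
The overshoot described above (reading continues until max_incoming_bytes is *exceeded*) can be shown with a deliberately simplified model. This is a Python sketch, not the bus code: it works in whole messages and ignores the 2048-byte read unit, and the function name is hypothetical.

```python
def read_until_throttled(message_sizes, max_incoming_bytes):
    """Return (accepted sizes, live bytes) at the point where reading stops.

    Simplified model: the limit is checked before a message is read, never
    while it is being read, so the last accepted message can overshoot.
    """
    accepted = []
    live_bytes = 0
    for size in message_sizes:
        if live_bytes > max_incoming_bytes:
            break                  # limit exceeded: stop reading this socket
        accepted.append(size)
        live_bytes += size
    return accepted, live_bytes
```

With a limit of 1500 bytes, `read_until_throttled([1000, 1000, 5000], 1500)` accepts the first two messages and stops at 2000 live bytes. With a very small limit, a single large message is still accepted, which corresponds to the "one whole message at a time" behaviour mentioned above.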
Comment 10 Simon McVittie 2011-02-23 08:58:11 UTC
16:32 < thiago> in the case of a slow reader, the sender's queue is full and the replier's queue is empty
16:33 < thiago> so the sender is penalised because it's communicating with the slow reader
16:34 < thiago> that means any other messages that it tries to send (to fast readers) will also be subject to the same queueing
16:34 < thiago> while messages are transmitted in a single, sequential channel like Unix sockets, this will remain the case
16:35 < thiago> it's similar to problems people face with IP-over-TCP (like VPN over SSH)
16:36 < smcv> that's not the only problem with IP-over-TCP (duelling exponential backoffs is the one I've seen more widely documented), but yes
16:38 < thiago> this could be fixed by changing the D-Bus wire protocol to require "ACK" from the other side
16:38 < thiago> which in turn means senders need to implement resending
16:38 < smcv> interesting idea...
16:39 < thiago> in this case, the daemon would keep reading the messages from the spammer and discarding the ones that are outgoing to the slow-reader, without ACK
16:39 < thiago> when it sees a message whose destination is somewhere else (that isn't throttled), it ACKs
16:40 < smcv> oh I see, so wire-level ACKs per-hop, not end-to-end
16:40 < smcv> hmm
16:40 < thiago> yes
16:40 < smcv> that does re-order messages though, if you retransmit?
16:41 < thiago> depends on whether you reorder when retransmitting
16:41 < smcv> I mean in the general case of "send A; send B; back to main loop; oh crap, A didn't arrive"
16:41 < thiago> I don't know if you can consider ordering when calling different destinations
16:41 < thiago> if A and B don't communicate, this is not an issue
16:42 < smcv> broadcasts are the problematic case
16:42 < thiago> if they do communicate, this depends on the system scheduler
16:42 < smcv> bits of Telepathy rely on: if you emit a signal, then reply to a method
16:42 < smcv> then the caller of the method will see the signal before the reply
16:43 < thiago> another upside of this is that if another client tries to send to the slow reader, it's *also* throttled (no ACK)
16:43 < smcv> I think we have a couple of places guaranteed to be the other way round, even
16:43 < thiago> hmm... broadcasts, that's an interesting issue
16:43 < smcv> yeah, it'd all be much easier if it was unicast, the broadcasts are the really tricky thing
16:43 < smcv> if the sender is upower and the slow-reading recipient is trying to DoS the system,
16:44 < thiago> reliable broadcasting is hard
16:44 < thiago> IP multicast, for example, isn't reliable
16:44 < walters_> signals are one of the big things dbus buys us over things just talking raw sockets
16:44 < smcv> you probably don't want upower to be unable to emit any more signals :-)
16:44 < smcv> yeah, most D-Bus APIs don't make any attempt to deal with signals that don't arrive
16:44 < smcv> which I only realised can happen by looking at the dbus-daemon source
16:45 < thiago> even without the ACK: what happens if the spammer is sending signals?
16:45 < thiago> and the receiver is *not* reading off its socket, but has a match rule for that signal?
16:45 < thiago> won't you in the end block off the sender from sending anything?
16:45 < smcv> who is "the sender"?
16:45 < smcv> a third process?
16:45 < thiago> any program emitting signals
16:46 < thiago> say: upower or udisks or anything system
16:46 < thiago> it's simply sending signals, going on its merry way
16:46 < thiago> but there's one program that connects to the bus, adds match rules for the sender's signals, but never processes the socket
16:46 < smcv> if max_incoming_bytes worked, then the spamming process's socket wouldn't be read very often and it gets throttled
16:47 < thiago> right
16:47 < thiago> so through no fault of its own, a legitimate program is considered to be a spammer and gets throttled
16:47 < smcv> but yes, the current (designed) semantics of max_incoming_bytes allow a slow reader to DoS any broadcast-sender
16:47 < thiago> worst case of DoS ever: connect and do *nothing*
16:47 < smcv> and like I said, as a separate issue, max_incoming_bytes doesn't do what it seems to be meant to do and I haven't worked out why yet
16:48 < thiago> with an ACK system, we attach the problem to its rightful side
16:49 < thiago> we'd need a grace for broadcast messages: keep them in the destination's queue
16:49 < thiago> and an algorithm to decide whether the fault is the sender or the receiver: which one we should disconnect
Comment 11 Simon McVittie 2011-03-02 08:43:24 UTC
There are several situations here, which the comments above may have conflated.

Before I start, here's a diagram. Vertical lines are userland/kernel boundaries, with K representing the kernel and the other regions representing processes. To explain things consistently, I'll only discuss situations where messages flow from left to right, except for when a reply comes from the dbus-daemon. This means that the left client is either sending (either at a sensible rate, or too fast as part of a flood attack) or sending and then receiving a response from the bus daemon (either at a sensible rate, or too slowly as part of a "slow loris" attack), while the right client is only receiving.

left client |K|              dbus-daemon               |K| right client
            | |                                        | |
left      socket    left       (B-G)       right     socket    right
client ===========> server ------\-------> server ===========> client
conn        | |     conn   <-----/         conn        | |     conn
            | |   (A) and some errors                  | |

It seems to me that there are the following situations:

(A) the left sends a method call to the bus driver, which responds with either a method reply or an error

The method call is queued in the left server connection (subject to its DBusTransport's limit on live messages, called the "incoming" limit in configuration files). If the limit on live messages is exceeded, the left socket is not polled until the counter drops below the limit again.

The reply is sent from bus_transaction_send_from_driver and queued in the left server connection's outgoing queue. If the left process is reading too slowly, or being DoS'd by a process writing faster than it can read, the left server connection's outgoing queue might be full, in which case the reply is silently dropped on the floor. The left process will eventually fail the method call with a timeout.

The left can DoS the dbus-daemon by flooding it with method calls and not reading the replies.
This is not currently neutralized by the incoming limit, because the method call is discarded as soon as the reply has been constructed. It is neutralized by the outgoing limit, which prevents the left server connection's outgoing queue from growing too large. (B) dbus-daemon sends an unsolicited signal to the right If a malicious dbus-daemon wants to DoS clients, it certainly can, so for this discussion I'll assume it's non-malicious and behaving correctly. It sends the following signals: * NameOwnerChanged (broadcast) This is emitted as a result of either the right process or another process (the left process, say) taking or releasing a name. If the right process is reading too slowly or being DoS'd, the right server connection's outgoing queue might be full, in which case the signal is silently dropped on the floor. The right process has no way to "catch up", or even find out that it might need to. The left process can DoS the dbus-daemon by taking and releasing names rapidly. This is mitigated by the left server connection's incoming limit, but only fully neutralized by the right server connection's outgoing limit (because the presence of NameOwnerChanged messages referring to the left, in the right server connection's outgoing queue, doesn't prevent the left socket from being polled). The left process can also DoS the right process by taking and releasing names rapidly enough to keep the right server connection's outgoing queue at its limit, resulting in other processes being unable to send messages to the right process (method calls will get errors and other messages will be dropped). If it's the right process that takes and releases names, the only concern is whether it can DoS the dbus-daemon; this is mitigated by the right server connection's incoming limit, and neutralized by its outgoing limit. 
* NameAcquired (unicast) This is emitted as a result of either the left process releasing a name for which the right process was in the queue, or the right process taking a name. The same considerations as for NameOwnerChanged apply. * NameLost (unicast) This is emitted as a result of either the left process claiming a name which the right process owned (with replacement allowed), or the right process releasing a name. The same considerations as for NameOwnerChanged apply. (C) the left sends a method call to the right The message from the left is counted towards the left server connection's "incoming" limit (internally, the DBusTransport's live_messages counter), for as long as it exists in memory. If the left process has too many pending method calls without the NO_REPLY flag, the method call gets an error reply sent back to the left, which behaves just like situation (A). In particular, if the left server connection's outgoing queue is full, the error reply is dropped on the floor; that's not *so* bad, because the left process will eventually time out. If the right server connection's outgoing queue is full because the right process is either reading too slowly or being DoS'd, the method call gets an error reply sent back to the left, which again behaves like situation (A). The left can DoS the dbus-daemon by flooding it with messages, but this is neutralized by the "incoming" limit: when too many messages are in live_messages, the dbus-daemon stops reading the left socket. The right can DoS the left or the dbus-daemon by not reading the messages (or reading them incredibly slowly - a "slow loris" attack). 
If it doesn't read anything, the messages build up in the right server connection's outgoing queue, where they: - occupy the dbus-daemon's memory - neutralized by the outgoing limit - for non-NO_REPLY messages, partially neutralized by the limit on the number of pending method calls - still count towards the left server connection's live_messages, causing the left socket not to be polled - not neutralized by the outgoing limit, because several "right" processes can work together to DoS a "left" process that calls methods on all of them, and if the right process isn't reading at all, then the messages will never leave You could claim that if the left is calling methods on the right, this indicates that it "trusts" the right, but in some APIs (Telepathy ChannelDispatcher calling into Clients, BlueZ system service calling into PIN agents) the caller doesn't necessarily trust the service very much. The left process can DoS the right process by calling its methods rapidly enough to keep the right server connection's outgoing queue at its limit, resulting in other processes being unable to send messages to the right process (method calls will get errors and other messages will be dropped). 
(D) the left sends a method reply (or error) to the right This is the same as (C), except that: - the limit on pending method calls isn't relevant - if the right server connection's outgoing queue is full, the reply is dropped on the floor instead of provoking a reply, and the *right* process eventually times out - on the system bus, the left process cannot DoS the right process like this, because unsolicited method replies/errors are filtered by the dbus-daemon; on a typical session bus, it still can (E) the left sends a unicast signal to the right (rarely used) This is the same as (C), except that: - the limit on pending method calls isn't relevant - if the right server connection's outgoing queue is full, the error reply will be sent, but (in every binding I've seen) will be ignored by the left process (F) the left sends a broadcast signal for which the right has a match rule This is the same as (E), except that: - there can be more than one right process, and any of them can DoS the left process by retaining messages in the right server connection - if the left process can emit messages fast enough that it could DoS the right process in (C-E), here it can DoS *all* the right processes simultaneously with the same rate of sending - there is no error reply (G) the left sends any unicast message on which the right is eavesdropping This is mechanically the same as (F), but the semantics are different (it may be more acceptable to say that if the eavesdropper isn't keeping up, it doesn't deserve to see all the messages).
Comment 12 Simon McVittie 2011-03-02 09:14:59 UTC
Some ideas I'm going to think about further: some might be rubbish, some might be useful.

* in situations like (A) where the dbus-daemon sends a reply to a caller, count that message towards the caller's live_messages ("incoming") quota, so reading from that caller can be throttled until it accepts some of its replies
  - perhaps also "charge" the caller of RequestName/ReleaseName for any NameOwnerChanged, NameAcquired, NameLost signals that result from it? This would make requesting/releasing a name essentially equivalent to (F), so is only viable if we can come up with a good solution for (F)

* when an outgoing queue is full, drop messages from the right end of the queue until there's space to enqueue the new message on the left, so forwards progress is always made

* put a time limit on each message in the outgoing queue, and if it can't be delivered, drop it

* invent an org.freedesktop.DBus.MessagesSkipped() signal, which replaces n dropped messages with a single fixed-size failure report which can be preallocated

(In reply to comment #10)
> 16:47 < smcv> and like I said, as a separate issue, max_incoming_bytes doesn't
> do what it seems to be meant to do and I haven't worked out why
> yet

I believe this was either Bug #34393 (for which I now have a patch) or user error during testing (I might have accidentally been calling methods on the dbus-daemon, in which case the incoming limit is ineffective and it's only the outgoing limit - which I didn't change - that prevents inflation).

(In reply to comment #9)
> I wonder whether it'd be reasonable to count outgoing messages towards the
> sender's incoming quota as well as the recipients' outgoing quotas?

For the record, this is in fact already done.
> (In reply to comment #0) > In fact, if we wanted to push the maximum amount of buffering into > senders, we could probably set the arbitrary incoming queue length to 0 In fact it has to be at least 1 byte (non-positive values are ignored), but yes, this works, and if I turn off various arbitrarily-large caches inside dbus-daemon to reduce heap fragmentation (bug# forthcoming), heap growth mostly seems to be avoided. I haven't yet benchmarked the effects on throughput. You can still DoS the dbus-daemon by calling methods on *it*, unless either there's a finite outgoing limit, or the dbus-daemon is modified to count its outgoing replies towards the caller's incoming limit (resulting in reading from the left socket being throttled after a while, until the left connection accepts some replies back). (In reply to comment #4) > My instinct again is really that uids need to be punished, or really that > resource control needs to be per-uid. Or maybe on Linux it makes sense > to have the limit be per-cgroup. Either way, you have a limit imposed on an equivalence class of connections, and you have to decide what to do when the limit is hit, so I don't think this necessarily helps us. Currently our equivalence classes each only contain one connection, which is easy to implement and reason about; if you want to add aggregation into larger equivalence classes, go ahead, but I'm not convinced it makes anything easier :-( On Maemo, the security framework (<http://meego.gitorious.org/meego-platform-security>) can result in processes sharing a uid but having very different capabilities, so dividing them up by uid wouldn't really make sense there.
Comment 13 Simon McVittie 2011-03-09 06:33:38 UTC
(In reply to comment #12)
> * when an outgoing queue is full, drop messages from the right end
> of the queue until there's space to enqueue the new message on the left,
> so forwards progress is always made

The rightmost message in the outgoing queue might have been half-sent, so DBusConnection and DBusTransport need to work together to make sure a half-sent message isn't dropped. Similarly, MessagesSkipped would have to be inserted to the left of the half-sent message, if any.
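
A rough model of that combination (drop the oldest messages when full, never drop a half-sent one, and report the gap with a single MessagesSkipped) might look like the Python sketch below. It is only an illustration under several assumptions: the class is hypothetical, index 0 of the list plays the role of the "right end" of the queue, the MessagesSkipped report is treated as preallocated and therefore not counted against the limit, and MessagesSkipped itself is a proposed signal, not an existing one.

```python
class OutgoingQueue:
    """Hypothetical bounded outgoing queue; index 0 is the oldest message."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.messages = []          # (name, size) tuples, oldest first
        self.total_bytes = 0
        self.head_half_sent = False # True while the oldest message is partly written
        self.skipped = 0            # count carried by the pending MessagesSkipped

    def enqueue(self, name, size):
        # A half-sent head must not be dropped, so dropping starts after it.
        first_droppable = 1 if self.head_half_sent else 0
        while (self.total_bytes + size > self.limit_bytes
               and len(self.messages) > first_droppable):
            _, dropped_size = self.messages.pop(first_droppable)
            self.total_bytes -= dropped_size
            self.skipped += 1
        # If nothing droppable is left, the new message is queued anyway.
        self.messages.append((name, size))
        self.total_bytes += size

    def send_order(self):
        # Order in which things would reach the socket: the half-sent head
        # first, then the single skip report, then the surviving messages.
        names = [name for name, _ in self.messages]
        if self.skipped:
            position = 1 if self.head_half_sent else 0
            names.insert(position, f"MessagesSkipped({self.skipped})")
        return names
```

For example, with a 100-byte limit, queued messages "a" and "b" of 40 bytes each and "a" half-sent, enqueuing a 40-byte "c" drops "b" and yields the send order a, MessagesSkipped(1), c.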