Bug 101903 - Containers (#100344): allow container to live longer than container manager
Summary: Containers (#100344): allow container to live longer than container manager
Status: ASSIGNED
Alias: None
Product: dbus
Classification: Unclassified
Component: core (show other bugs)
Version: git master
Hardware: All All
Importance: medium enhancement
Assignee: Simon McVittie
QA Contact: D-Bus Maintainers
URL:
Whiteboard:
Keywords:
Depends on: 101354
Blocks: 100344
Reported: 2017-07-24 20:14 UTC by Simon McVittie
Modified: 2018-01-12 14:33 UTC (History)
CC List: 5 users

See Also:
Description Simon McVittie 2017-07-24 20:14:21 UTC
+++ This bug was initially created as a clone of Bug #100344 +++

Following on from Bug #101354, we want to let short-lived container manager helper processes create a container with a lifetime and then get out of the way. In particular, Flatpak does not have a "supervisor" process once the sandboxed app is running, except for bwrap (which is minimal, sometimes setuid, and does not use D-Bus) and the flatpak-dbus-proxy (which, long term, we want to remove).

In Bug #101354, the container's server stops listening when its creator falls off the bus. This is not suitable for Flatpak in the long term.

The design we had at the GTK hackfest was to fd-pass in the read end of a pipe. Until it polls readable, the container server keeps listening. When it polls readable, the container server stops.

This can go in the named parameters a{sv}.
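
A rough sketch of what this could look like from the container manager's side, using GDBus for illustration. The Containers1 interface name, the AddServer method, its argument order and the "LifetimeFd" key are all assumptions based on this proposal, not a finished API:

/* Sketch only: interface, method and key names below are assumptions. */
#include <gio/gio.h>
#include <gio/gunixfdlist.h>
#include <unistd.h>

static int
create_container_server (GDBusConnection *bus,
                         int             *lifetime_write_fd)
{
  int pipefd[2];
  g_autoptr(GError) error = NULL;

  if (pipe (pipefd) < 0)
    return -1;

  /* Pass the read end to the bus and keep the write end: per the design
   * above, the server keeps listening until the read end polls readable. */
  GUnixFDList *fds = g_unix_fd_list_new ();
  gint handle = g_unix_fd_list_append (fds, pipefd[0], &error);

  if (handle < 0)
    {
      g_object_unref (fds);
      close (pipefd[0]);
      close (pipefd[1]);
      return -1;
    }

  GVariantBuilder named;
  g_variant_builder_init (&named, G_VARIANT_TYPE ("a{sv}"));
  g_variant_builder_add (&named, "{sv}", "LifetimeFd",
                         g_variant_new_handle (handle));

  g_autoptr(GVariant) reply =
    g_dbus_connection_call_with_unix_fd_list_sync (
        bus, "org.freedesktop.DBus", "/org/freedesktop/DBus",
        "org.freedesktop.DBus.Containers1", "AddServer",
        g_variant_new ("(ss@a{sv}a{sv})",
                       "org.flatpak",        /* container type */
                       "org.example.App",    /* app identifier */
                       g_variant_new_array (G_VARIANT_TYPE ("{sv}"), NULL, 0),
                       &named),              /* named parameters */
        NULL, G_DBUS_CALL_FLAGS_NONE, -1,
        fds, NULL, NULL, &error);

  g_object_unref (fds);
  close (pipefd[0]);          /* the bus holds its own duplicate now */

  if (reply == NULL)
    {
      close (pipefd[1]);
      return -1;
    }

  *lifetime_write_fd = pipefd[1];   /* signalling on this fd later ends the server's lifetime */
  return 0;
}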
Comment 1 James Henstridge 2018-01-04 08:33:23 UTC
From the snapd point of view, it would be useful to have a mode of operation where the app container remains active until it is explicitly deleted (possibly by some other D-Bus connection).

While the snapd process is usually running, its lifecycle is not tied to the runtime of confined applications.  It's designed so it can be restarted (e.g. when it is upgraded) without interrupting anything else.  This is done by having state stored in the file system rather than trying to pass resources across generations of the daemon directly.

So tying the lifetime to a pipe file descriptor rather than a D-Bus connection isn't much better.  In contrast, we can easily store the container ID for a future StopInstance call (possibly made by a future generation of the daemon).
Comment 2 Simon McVittie 2018-01-04 14:49:48 UTC
(In reply to James Henstridge from comment #1)
> From the snapd point of view, it would be useful to have a mode of operation
> where the app container remains active until it is explicitly deleted
> (possibly by some other D-Bus connection).

Straw man: a named parameter { "KeepListeningAfterInitiatorExit": <true> }?

I want the default to be "when the initiator dies, the server stops listening" so that in simple uses of the API, we know the server will eventually get cleaned up (they're a limited resource, particularly on the system bus, so avoiding accidental or deliberate denial of service is desirable).

If snapd explicitly passes some parameter to say "I promise I will clean this up eventually", then it's clear whose bug it is if it fails to do so :-)
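
A minimal sketch of what that opt-in could look like in the named parameters, assuming GDBus for illustration; the key name is only the straw man above, not a finalized parameter:

/* Sketch: a manager like snapd explicitly opting out of automatic cleanup. */
#include <gio/gio.h>

static GVariant *
build_manual_cleanup_parameters (void)
{
  GVariantBuilder named;

  g_variant_builder_init (&named, G_VARIANT_TYPE ("a{sv}"));
  /* "I promise I will clean this up eventually": the caller takes
   * responsibility for calling StopListening()/StopInstance() itself. */
  g_variant_builder_add (&named, "{sv}", "KeepListeningAfterInitiatorExit",
                         g_variant_new_boolean (TRUE));
  return g_variant_builder_end (&named);
}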

(The cleanup action currently taken when the container manager dies is the equivalent of StopListening(), not StopInstance() - the server stops, but existing connections stay alive.)

> This is done by having state stored in the file system rather than trying to
> pass resources across generations of the daemon directly.

In /run so it won't persist between boots, I hope?

If you're doing this, and you have arranged things so that you can be confident that a crashed or otherwise inoperative snapd will restart reasonably soon and will detect app exit in a timely way, then I think it's OK to opt-in to taking responsibility for cleaning up the per-app servers.
Comment 3 James Henstridge 2018-01-08 12:39:15 UTC
I think defaulting to guaranteed cleanup is fine.  I agree that if snapd wants to enable manual cleanup it can ask for it.

We are already using /run to store some runtime data (mainly to do with the mount namespaces for sandboxed apps), so the path of least resistance would be to store app container IDs there too.
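
As a rough sketch, a later generation of the daemon could then tear down an instance whose object path an earlier generation stored under /run. The state-file location, the StopInstance method name and its signature are assumptions based on this proposal, not a real API:

/* Sketch only: state-file path and method signature are assumptions. */
#include <gio/gio.h>

static gboolean
stop_stored_instance (GDBusConnection *bus,
                      const char      *state_file)  /* e.g. somewhere under /run/snapd/ */
{
  g_autofree char *container_path = NULL;
  g_autoptr(GError) error = NULL;

  /* An earlier generation of the daemon stored the container object path here. */
  if (!g_file_get_contents (state_file, &container_path, NULL, &error))
    return FALSE;

  g_strstrip (container_path);
  if (!g_variant_is_object_path (container_path))
    return FALSE;

  g_autoptr(GVariant) reply =
    g_dbus_connection_call_sync (bus,
                                 "org.freedesktop.DBus",
                                 "/org/freedesktop/DBus",
                                 "org.freedesktop.DBus.Containers1",
                                 "StopInstance",
                                 g_variant_new ("(o)", container_path),
                                 NULL, G_DBUS_CALL_FLAGS_NONE, -1,
                                 NULL, &error);
  return reply != NULL;
}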
Comment 4 David Herrmann 2018-01-12 11:42:36 UTC
(In reply to James Henstridge from comment #1)
> So tying the lifetime to a pipe file descriptor rather than a D-Bus
> connection isn't much better.  In contrast, we can easily store the
> container ID for a future StopInstance call (possibly made by a future
> generation of the daemon).

So you're saying your apps have no "managing parent process" that survives the lifetime of the App?

In general, I like the proposal from Simon. It is very similar to kernel APIs: you *want* to return a file descriptor that defines the lifetime of an object you created. But unlike kernel APIs, we cannot create our own FDs. Hence, we simply pass one in, chosen by the client (always make sure to pass them in, rather than return them from the daemon, so they're accounted to the right UID by the kernel).

Note that you could just "leak" (clear O_CLOEXEC) this FD into your client App, and thus naturally limit its lifetime to the App. The only problem would be if the client App iterates /proc/self/fd/ and closes anything it doesn't care about. That sounds highly unlikely, though, since even glibc is allowed to open FDs behind your back.
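
As a sketch, "leaking" the fd just means clearing FD_CLOEXEC on it before exec, so the launched app inherits it:

/* Sketch: clear FD_CLOEXEC so the fd survives exec into the client app,
 * tying the container server's lifetime to the app's initial process. */
#include <fcntl.h>

static int
leak_fd_across_exec (int fd)
{
  int flags = fcntl (fd, F_GETFD);

  if (flags < 0)
    return -1;
  return fcntl (fd, F_SETFD, flags & ~FD_CLOEXEC);
}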
Comment 5 Alexander Larsson 2018-01-12 14:15:05 UTC
Leaking an fd means your app dies if it double-forks though, or if for some other reason the initial process dies.
Comment 6 Alexander Larsson 2018-01-12 14:15:41 UTC
Well, the "sandbox" dies.
Comment 7 David Herrmann 2018-01-12 14:33:22 UTC
(In reply to Alexander Larsson from comment #5)
> Leaking an fd means your app dies if it double-forks though, or if for some
> other reason the initial process dies.

You would leak recursively, unless you take precautions and close them after fork(). Not saying this is nice, just saying it is possible.

I still don't see why it is a good idea to have an app outlive its babysitter. This introduces a feature that allows resources to be pinned arbitrarily in dbus-daemon, with no way of automatic cleanup. If the sandboxed app (or its manager) segfaults, you leak the resources in dbus-daemon until reboot. Nobody likes SysV IPC for behaviour like this.

By requiring an FD, you also require a process to be alive, and thus you pin any related resources and get natural responsibility and accounting.

By not requiring an FD, you might get dangling resources in dbus-daemon with no real-world connection whatsoever. What if all the processes related to that App have died, and systemd recycles the UID? What if someone exploits a sandbox bug to DoS dbus-daemon and exhaust resources?

Sure, if nothing has bugs and nothing fails, pinning resources in dbus-daemon without an FD is fine. But... really?

I will not oppose this feature, but I really think it is a bad idea.

