Bug 38784

Summary: Add standard DBus interface for managing the lifetime of objects based on shared ownership
Product: dbus Reporter: Will Manley <freedesktop>
Component: coreAssignee: D-Bus Maintainers <dbus>
Status: RESOLVED MOVED QA Contact: D-Bus Maintainers <dbus>
Severity: enhancement    
Priority: low CC: bugzilla, lars, lennart, msniko14, smcv
Version: 1.5Keywords: patch
Hardware: All   
OS: All   
Whiteboard: waiting for more users
i915 platform: i915 features:
Attachments: Patch to DBus specification describing interface RefCounted
Patch to DBus specification describing interface RefCounted V2

Description Will Manley 2011-06-29 14:34:24 UTC
Created attachment 48571 [details] [review]
Patch to DBus specification describing interface RefCounted

In our system[1] we have a DBus server that is in charge of playing back, recording, buffering, etc. video media.  Clients that want a video stream recorded or played back ask a DBus server to create an object for doing this.  While the object lives in the server, really it belongs to the client, and as such the server should remove it from its connection and and destroy it when the client becomes disinterested in it.  There is the additional complexity that the server might have sent one of these objects to multiple clients or may have sent the same object to a particular client multiple times.

I wanted to extend the warm fuzzy feeling that in-process reference counted pointers gives you, but to our multi-process system.  I wanted it to be asynchronous and non-chatty and I think I have achieved this.

Given that it was quite tricky to get right I thought it might be useful to other DBus users for it to become a standard DBus interface.  That way it is more likely that it might be implemented in bindings (other than my own).

I have attached a patch to the DBus spec describing what I've done.  I'm not expecting it to be immediately accepted but I created the patch so we would have something concrete to discuss.  As smcv said on IRC today "my rule of thumb would be, propose it as a standard D-Bus interface when you find your second use-case ;)".  Hopefully this use-case will surface.

[1] YouView TV set top boxes ( http://www.youview.com )
Comment 1 Simon McVittie 2011-07-11 04:01:20 UTC
Review of attachment 48571 [details] [review]:

As I said, I'd rather not have this as a "standard" interface until a second potential user appears (YouView + something else), but for the future:

::: doc/dbus-specification.xml
@@ +3204,3 @@
+      </para>
+      <para>
+        <literal>DropRef</literal> will not return.

I don't like this - in principle, every D-Bus method call should return (but as an optimization, if the NO_REPLY flag is set, the service may avoid returning).

So, I'd prefer it phrased more like:

DropRef does not return a useful result, and calling it with the NO_REPLY flag is suggested.

@@ +3213,3 @@
+        every RefCounted object.  One is in the client and one is in the
+        remote application.  (In fact the remote application has a
+        per client reference count for each object).  In the discussion

It'd probably be clearer if you moved the definition of "server" to the top, then used that?

The fact that the object might be refcounted inside the server is really an implementation detail. What's important is: the server maintains a separate refcount on behalf of each client (identifying clients by their unique connection name) and keeps the object intact as long as at least one client has a reference; and each client cooperates with the server to keep its refcount up to date.

@@ +3222,3 @@
+              Every time an object is returned to a client from a remote
+              application the reference count in the server for that
+              object/client is incremented.

Perhaps:

Every time a client calls a method that sends a RefCounted object's path in its method reply message, the reference count maintained on behalf of that client is incremented.

@@ +3228,3 @@
+            <para>
+              Whenever the client receives an object path the reference count
+              in the client for that object is incremented.

The fact that the client is expected to track refs internally is just a consequence of the API; it's not really part of the API, as such.

@@ +3271,3 @@
+              Reference counts are not affected for DBus paths sent as a
+              signal payload as it is impossible to know who has received
+              these messages.

In Telepathy we have the concept of "handles", which are/were a bit like RefCounted. We've found that it's often useful for APIs to guarantee that if a "parent" object emits a signal mentioning a refcounted "child", then the "child" will exist for at least as long as the "parent" does, even if no client holds a ref.

This does need to be case-by-case, though.

@@ +3294,3 @@
+        it has been confirmed that Client B has called <literal>AddRef</literal>
+        on the object to avoid race conditions.  <literal>AddRef</literal>
+        will not return.

Again, I'd prefer to turn this into "calling it with NO_REPLY is suggested". If called without NO_REPLY, it should reply (that's part of the - mostly unwritten - D-Bus object model).

@@ +3300,3 @@
+          The <literal>org.freedesktop.DBus.RefCounted</literal>
+          interface was added in version 0.18 of the D-Bus
+          specification.

In Telepathy we have a convention of not mentioning concrete version numbers until things are at least merged, and usually also about-to-be-released - we call it "version UNRELEASED" or "version 0.UNRELEASED" or something, and grep for UNRELEASED/fix the references just before releasing.

I think that might be useful for the D-Bus Specification too, particularly for things that are sitting in bugzilla awaiting a second use-case.
Comment 2 Simon McVittie 2011-07-27 03:10:29 UTC
(The D-Bus specification is actually dbus/core, dbus/doc is something else.)
Comment 3 Will Manley 2011-08-10 13:11:55 UTC
Created attachment 50102 [details] [review]
Patch to DBus specification describing interface RefCounted V2
Comment 4 Will Manley 2011-08-10 13:25:32 UTC
Thanks for the review.  I've made some changes accordingly

(In reply to comment #1)
> Review of attachment 48571 [details] [review]:
> 
> As I said, I'd rather not have this as a "standard" interface until a second
> potential user appears (YouView + something else), but for the future:
> 
> ::: doc/dbus-specification.xml
> @@ +3204,3 @@
> +      </para>
> +      <para>
> +        <literal>DropRef</literal> will not return.
> 
> I don't like this - in principle, every D-Bus method call should return (but as
> an optimization, if the NO_REPLY flag is set, the service may avoid returning).

Sounds sensible.  Fixed.

> @@ +3213,3 @@
> +        every RefCounted object.  One is in the client and one is in the
> +        remote application.  (In fact the remote application has a
> +        per client reference count for each object).  In the discussion
> 
> It'd probably be clearer if you moved the definition of "server" to the top,
> then used that?

Quite right, I was clinging to "remote application" as I saw it in other parts of the spec.  It sounds a lot less clumsy with server defined higher.

> The fact that the object might be refcounted inside the server is really an
> implementation detail. What's important is: the server maintains a separate
> refcount on behalf of each client (identifying clients by their unique
> connection name) and keeps the object intact as long as at least one client has
> a reference; and each client cooperates with the server to keep its refcount up
> to date.

I agree that much of this section is an implementation detail, but I believe there is probably only one sensible implementation so spelling it out seemed clearer.  Originally I had used "should" all over the place but it didn't read so well.  In particular when I've presented this to others in my organisation the initial confusion was related to not understanding that there were actually *two* reference counts in the system so I thought it was important state it explicitly.

> @@ +3222,3 @@
> +              Every time an object is returned to a client from a remote
> +              application the reference count in the server for that
> +              object/client is incremented.
> 
> Perhaps:
> 
> Every time a client calls a method that sends a RefCounted object's path in its
> method reply message, the reference count maintained on behalf of that client
> is incremented.

Agreed, that is much clearer.

> 
> @@ +3228,3 @@
> +            <para>
> +              Whenever the client receives an object path the reference count
> +              in the client for that object is incremented.
> 
> The fact that the client is expected to track refs internally is just a
> consequence of the API; it's not really part of the API, as such.

I still think it is important to state for the same reasons as above.

> @@ +3271,3 @@
> +              Reference counts are not affected for DBus paths sent as a
> +              signal payload as it is impossible to know who has received
> +              these messages.
> 
> In Telepathy we have the concept of "handles", which are/were a bit like
> RefCounted. We've found that it's often useful for APIs to guarantee that if a
> "parent" object emits a signal mentioning a refcounted "child", then the
> "child" will exist for at least as long as the "parent" does, even if no client
> holds a ref.
> 
> This does need to be case-by-case, though.

Sounds interesting.  I see no reason that the fact that a RefCounted object be solely kept alive by the refcount as long as it is kept alive when the refcount is >0.  Do you think this or something like it should be explicitly called out?

> @@ +3294,3 @@
> +        it has been confirmed that Client B has called
> <literal>AddRef</literal>
> +        on the object to avoid race conditions.  <literal>AddRef</literal>
> +        will not return.
> 
> Again, I'd prefer to turn this into "calling it with NO_REPLY is suggested". If
> called without NO_REPLY, it should reply (that's part of the - mostly unwritten
> - D-Bus object model).

Yep, done

> @@ +3300,3 @@
> +          The <literal>org.freedesktop.DBus.RefCounted</literal>
> +          interface was added in version 0.18 of the D-Bus
> +          specification.
> 
> In Telepathy we have a convention of not mentioning concrete version numbers
> until things are at least merged, and usually also about-to-be-released - we
> call it "version UNRELEASED" or "version 0.UNRELEASED" or something, and grep
> for UNRELEASED/fix the references just before releasing.
> 
> I think that might be useful for the D-Bus Specification too, particularly for
> things that are sitting in bugzilla awaiting a second use-case.

Sounds sensible - done
Comment 5 Philip Withnall 2016-10-01 12:31:56 UTC
Lars, Lennart, this would fix the systemd race where the org.freedesktop.systemd1.Job object created by calling StartUnit() can disappear before the caller manages to query its properties. What do you think?

StartUnit() would implicitly increase the reference count for the caller on the Job object whose path it returns. Note that this would be an API break, as it requires the caller to now call DropRef() on the Job once it’s done with the Job. I guess that means creating a new variant of StartUnit() for clients who care about avoiding that race.
Comment 6 Philip Withnall 2016-10-01 13:41:46 UTC
Comment on attachment 50102 [details] [review]
Patch to DBus specification describing interface RefCounted V2

Review of attachment 50102 [details] [review]:
-----------------------------------------------------------------

I’m not a D-Bus maintainer, but here are my thoughts about the API.

::: doc/dbus-specification.xml
@@ +3203,5 @@
> +
> +    <sect2 id="standard-interfaces-refcounted">
> +      <title><literal>org.freedesktop.DBus.RefCounted</literal></title>
> +      <para>
> +        The intent of this interface is to allow DBus clients to own (be

Nitpick: ‘D-Bus’, rather than ‘DBus’.

@@ +3221,5 @@
> +        be destroyed:
> +      </para>
> +      <para>
> +        <programlisting>
> +          org.freedesktop.DBus.RefCounted.DropRef (in UINT32 refcount);

The documentation doesn’t specify what the ‘refcount’ argument is for. Is it the number of references to drop? Given that the expected use cases cover each client having a single reference, is there much point in having this argument?

Ah, it is mentioned below, as part of the distributed reference count mechanism. It should be documented briefly here for clarity too.

@@ +3241,5 @@
> +          <listitem>
> +            <para>
> +              Every time a client calls a method that sends a RefCounted
> +              object's path in its method reply message, the reference
> +              count maintained on behalf of that client is incremented.

‘the reference count maintained on behalf of that client’ — i.e. the reference count stored in the server?

I think this paragraph pertains entirely to the server’s behaviour, so perhaps it would be clearer if it started with “Every time a method on the server is called that returns a RefCounted object’s path…”?

@@ +3270,5 @@
> +          <listitem>
> +            <para>
> +              In addition, to cope with unclean client exits, the server
> +              should reset the reference count for those clients that drop off
> +              the bus.  To do this without races the application should call:

s/application/server/?

@@ +3288,5 @@
> +            </para>
> +          </listitem>
> +          <listitem>
> +            <para>
> +              Reference counts are not affected for DBus paths sent as a

Nitpick: ‘D-Bus’ rather than ‘DBus’.

@@ +3289,5 @@
> +          </listitem>
> +          <listitem>
> +            <para>
> +              Reference counts are not affected for DBus paths sent as a
> +              signal payload as it is impossible to know who has received

Perhaps suggest that in this case (when a client has received an object path via a signal), it is expected that the client will call AddRef if it wants to use the object?

That’s racy though — the server could destroy the object before any of the clients have managed to call AddRef.

An alternative would be to suggest that if the server advertises object paths via signals, it should also emit a corresponding signal when one of those objects is removed. i.e. How ObjectManager works. That can be race-free if the ObjectManager.InterfacesRemoved signal is emitted *before* the server actually destroys the object in memory, so that any method calls by clients on that object can be handled correctly by the server between the object being advertised and InterfacesRemoved being emitted.

@@ +3311,5 @@
> +      <para>
> +        which should increment the server reference count for the
> +        calling client.  If Client A is passing an object to Client B it
> +        should not call <literal>DropRef</literal> on the object until
> +        it has been confirmed that Client B has called <literal>AddRef</literal>

How can client A confirm that client B has called AddRef? That introduces a new race condition, where client A passes an object path to client B, client B is slow in calling AddRef, and client A ends up calling DropRef before client B calls AddRef.

I guess the race could either be fixed on a case-by-case basis, by ensuring that the way client A passes an object path to client B is a round trip, such as a method call on one of client B’s objects — client B must call AddRef before returning from this method call, and hence afterwards client A is safe to call DropRef.

Alternatively, it could be fixed by having a TransferRef(unique name) method instead of AddRef. Client A would call TransferRef on the object on the server, and pass client B’s unique name. That would atomically add a reference on the server for (object, client B), and drop a reference for (object, client A). If client B has disappeared by the time the server handles the TransferRef call, it would end up being equivalent to a DropRef call from client A.

TransferRef seems a little less general than AddRef, but less prone to being implemented in a racy manner by clients.
Comment 7 GitLab Migration User 2018-10-12 21:09:14 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/49.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.