Bug 85261

Summary: Add an API to escape and unescape dbus names
Product: dbus Reporter: Andy Grover <agrover>
Component: pythonAssignee: Andy Grover <agrover>
Status: RESOLVED NOTABUG QA Contact: D-Bus Maintainers <dbus>
Severity: normal    
Priority: medium CC: smcv
Version: git master   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: patch.
patch v2

Description Andy Grover 2014-10-20 23:02:00 UTC
Created attachment 108141 [details] [review]
patch.

Instead of having callers do it, escape the object path. This saves new code using the library from having to do it.

No compat issues -- already-escaped object paths are unaltered by _escape_object_path().
Comment 1 Simon McVittie 2014-10-21 10:35:38 UTC
No, sorry, I don't want this change. It is up to the designers of D-Bus APIs to decide (and document, if appropriate) how they will map their identifiers into the restricted character-set of object paths.

(In reply to Andy Grover from comment #0)
> No compat issues -- already-escaped object paths are unaltered by
> _escape_object_path().

You cannot have this property without ambiguity, and indeed, your escaping algorithm is ambiguous: it is possible for an application to have two distinct identifiers, for instance "_5c" and "\", such that your escaping algorithm maps them both to "_5c". This seems undesirable: distinct inputs to an algorithm described as "escaping" should lead to distinct outputs.

I would not oppose the addition of an algorithm resembling Telepathy's escape_as_identifier(), or systemd's similar (but not identical) algorithm that is optimized for filenames, as a function in dbus that can be invoked explicitly; but "explicit is better than implicit", and there is no single algorithm that is the most suitable thing for all situations. For instance, algorithms based on "_%02x" % ord(c) are optimal for English strings that are almost valid object paths already, but pessimal for natural-language text in arbitrary languages.
Comment 2 Andy Grover 2014-10-21 19:21:08 UTC
Hi Simon, thanks for the review.

So the thing that we would need to fix (and that systemd and Telepathy's versions of the code already do) is to escape '_' as well.

We could almost just take Telepathy's code (from telepathy-glib tools/libtpcodegen.py), except there is no unescape implementation, so we'll need to convert systemd's bus_label_unescape (from shared/bus-label.c) to Python for this. So:

1) in Python
2) Separate functions that must be called explicitly
3) both escape and unescape
4) Unambiguous

Any more requirements?

Please excuse any process mistakes as I haven't contibuted to dbus before, but I think the appropriate thing is for me to reopen the bug and assign to myself, with you CC'd, until I have a v2 patch for you to review. Thanks again.
Comment 3 Andy Grover 2014-10-21 23:21:31 UTC
Created attachment 108215 [details] [review]
patch v2

Hi, please consider the attached patch.

In [1]: from dbus import service as s

In [2]: x = s.escape_object_path("/das/_d::.\\a___")

In [3]: x
Out[3]: '/das/_5fd_3a_3a_2e_5ca_5f_5f_5f'

In [4]: s.unescape_object_path(x)
Out[4]: '/das/_d::.\\a___'

In [5]: s.unescape_object_path(x + "_")
ValueError: invalid literal for int() with base 16: ''
Comment 4 Simon McVittie 2014-10-22 12:20:23 UTC
(In reply to Andy Grover from comment #2)
> Any more requirements?

MIT/X11 licensed like the rest of dbus-python, please. This rules out direct copying from systemd, I think, unless you ask the copyright holder for permission. (An independent reimplementation of its algorithm would be fine though.)

(In reply to Andy Grover from comment #2)
> Please excuse any process mistakes as I haven't contibuted to dbus before,
> but I think the appropriate thing is for me to reopen the bug and assign to
> myself, with you CC'd, until I have a v2 patch for you to review. Thanks
> again.

That's appropriate. Please leave it assigned to yourself as long as you're actively working on this feature request - I get bugmail regardless of Cc or assignee (I'm watching the dbus@maint.invalid pseudo-address).
Comment 5 Simon McVittie 2014-10-22 12:56:26 UTC
(In reply to Andy Grover from comment #3)
> In [2]: x = s.escape_object_path("/das/_d::.\\a___")

Not every input to your escape_object_path() yields a valid object path. "_", "abc", "/foo/", "//bar" are examples of strings that it can output, but that are not valid object paths.

If you are mechanically constructing object paths from some arbitrary string (filename or user's login name or whatever), I would really recommend having the result be an object path *component*, and substituting it into the fixed part of the object path *after* escaping, like Telepathy does with this pseudocode:

    ("/org/freedesktop/Telepathy/Connection/%s/%s/%s" %
        (connection_manager, protocol,
         escape_as_identifier(username_or_something))

(... just like you'd do construct a URL - they're fairly similar.)

> Add this to dbus-python so other clients can use this mechanism in a
> consistent way.

Other D-Bus APIs should not rely on their clients being able to decode the string, unless the D-Bus API documents how: this is not necessarily a universal convention. I think systemd does document its variation somewhere.

The way Telepathy uses this algorithm (to form an opaque string such that a human reader can see the important bits at a glance, and do the full decode with some effort, but applications are not meant to parse the string) is fine, but doesn't really need a decode function.

I'm somewhat hesitant to include a specific not-optimal-for-all-purposes implementation in dbus-python and risk misleading API authors into believing that it's somehow a requirement for "standard D-Bus". You can put whatever you like in your object paths, as long as the result is syntactically valid.

Here's an example of specifying how an object path can be parsed:<http://telepathy.freedesktop.org/spec/Account.html>.
Comment 6 Andy Grover 2014-10-22 16:58:09 UTC
(In reply to Simon McVittie from comment #5)
> I'm somewhat hesitant to include a specific not-optimal-for-all-purposes
> implementation in dbus-python and risk misleading API authors into believing
> that it's somehow a requirement for "standard D-Bus". You can put whatever
> you like in your object paths, as long as the result is syntactically valid.
> 
> Here's an example of specifying how an object path can be
> parsed:<http://telepathy.freedesktop.org/spec/Account.html>.

Ok makes sense. Given what you said, I'm inclined to drop this patch's development, and will just document my service's escape mechanism independently.
Comment 7 Simon McVittie 2015-01-06 21:01:51 UTC
(In reply to Andy Grover from comment #6)
> Given what you said, I'm inclined to drop this patch's
> development, and will just document my service's escape mechanism
> independently.

NOTABUG then.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.