Summary: | Implement Stream Tubes over ICE-UDP | ||
---|---|---|---|
Product: | Telepathy | Reporter: | Dario Freddi <drf54321> |
Component: | gabble | Assignee: | Dario Freddi <drf54321> |
Status: | RESOLVED MOVED | QA Contact: | Telepathy bugs list <telepathy-bugs> |
Severity: | enhancement | ||
Priority: | medium | CC: | olivier.crete, smcv, will, youness.alaoui |
Version: | unspecified | Keywords: | patch |
Hardware: | Other | ||
OS: | All | ||
URL: | http://cgit.freedesktop.org/~smcv/telepathy-gabble/log/?h=drf-tubes-ice-udp | ||
Whiteboard: | review- | ||
i915 platform: | i915 features: |
Description
Dario Freddi
2012-08-30 20:29:25 UTC
I would like to review this from an XMPP standardization point of view before it gets merged. Xavier and I tried to establish a Vino/Vinagre VNC stream tube using this branch and didn't succeed. The tube seems to open with no image appears in Vinagre. telepathy-gabble:14744): gabble-DEBUG: nice_component_state_changed (tube-stream.c:2779): libnice component state changed 3!!!! (telepathy-gabble:14744): gabble-DEBUG: nice_component_state_changed (tube-stream.c:2799): Nice component is now connected (telepathy-gabble:14744): gabble-DEBUG: nice_component_writable (tube-stream.c:2833): Component is writable, open the tube Do we need a special version of libnice or something? Also, Gabble crashes when I close vinagre: (telepathy-gabble:14744): GLib-CRITICAL **: g_hash_table_foreach_remove_or_steal: assertion `version == hash_table->version' failed #0 0x00007fd7dc1ba1e9 in g_logv (log_domain=0x7fd7dc2381ad "GLib", log_level=G_LOG_LEVEL_CRITICAL, format=0x7fd7dc23d080 "%s: assertion `%s' failed", args1=0x7fffc82ce538) at gmessages.c:758 domain = 0x0 data = 0x0 depth = 1 log_func = 0x4285c5 <log_handler> domain_fatal_mask = 5 masquerade_fatal = 0 test_level = 10 was_fatal = 0 was_recursion = 0 i = 3 #1 0x00007fd7dc1ba2dd in g_log (log_domain=0x7fd7dc2381ad "GLib", log_level=G_LOG_LEVEL_CRITICAL, format=0x7fd7dc23d080 "%s: assertion `%s' failed") at gmessages.c:792 args = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7fffc82ce610, reg_save_area = 0x7fffc82ce550}} #2 0x00007fd7dc1ba31e in g_return_if_fail_warning (log_domain=0x7fd7dc2381ad "GLib", pretty_function=0x7fd7dc238360 "g_hash_table_foreach_remove_or_steal", expression=0x7fd7dc238238 "version == hash_table->version") at gmessages.c:801 No locals. #3 0x00007fd7dc19e5dc in g_hash_table_foreach_remove_or_steal (hash_table=0x1d2a1e0, func=0x4366d9 <close_each_extra_bytestream>, user_data=0x1dbf040, notify=1) at ghash.c:1415 node_hash = 2 node_key = 0x1 node_value = 0x1d71520 deleted = 0 i = 2 version = 1 __PRETTY_FUNCTION__ = "g_hash_table_foreach_remove_or_steal" #4 0x00007fd7dc19e69b in g_hash_table_foreach_remove (hash_table=0x1d2a1e0, func=0x4366d9 <close_each_extra_bytestream>, user_data=0x1dbf040) at ghash.c:1454 __PRETTY_FUNCTION__ = "g_hash_table_foreach_remove" #5 0x0000000000437f76 in gabble_tube_iface_stream_close (tube=0x1dbf040, closed_remotely=0) at tube-stream.c:2068 self = 0x1dbf040 priv = 0x1dbf0a0 base = 0x1dbf040 cls = 0x1a271c0 ---Type <return> to continue, or q <return> to quit--- base_conn = 0x13c4370 conn = 0x13c4370 #6 0x000000000042f9e9 in gabble_tube_iface_close (self=0x1dbf040, closed_remotely=0) at tube-iface.c:45 virtual_method = 0x437ea4 <gabble_tube_iface_stream_close> __PRETTY_FUNCTION__ = "gabble_tube_iface_close" #7 0x0000000000437390 in gabble_tube_stream_close (base=0x1dbf040) at tube-stream.c:1686 No locals. #8 0x00007fd7dcf1bdbd in tp_base_channel_close (chan=0x1dbf040) at base-channel.c:541 klass = 0x1a271c0 __PRETTY_FUNCTION__ = "tp_base_channel_close" #9 0x00007fd7dcf1d29c in tp_base_channel_close_dbus (iface=0x1dbf040, context=0x13df8b0) at base-channel.c:1126 chan = 0x1dbf040 __PRETTY_FUNCTION__ = "tp_base_channel_close_dbus" #10 0x00007fd7dd04e822 in tp_svc_channel_close (self=0x1dbf040, context=0x13df8b0) at _gen/tp-svc-channel.c:57 impl = 0x7fd7dcf1d212 <tp_base_channel_close_dbus> #11 0x0000003fe4e05e90 in ffi_call_unix64 () at ../src/x86/unix64.S:75 No locals. #12 0x0000003fe4e058a0 in ffi_call (cif=0x7fffc82ceb30, fn=0x7fd7dd04e7d5 <tp_svc_channel_close>, rvalue=0x7fffc82ceaa0, avalue=<optimized out>) at ../src/x86/ffi64.c:486 classes = {X86_64_INTEGER_CLASS, X86_64_NO_CLASS, X86_64_INTEGER_CLASS, X86_64_NO_CLASS} stack = 0x7fffc82ce890 "@\360\333\001" argp = 0x7fffc82ce940 "" arg_types = <optimized out> gprcount = 3 ssecount = <optimized out> ngpr = 1 nsse = 0 i = <optimized out> avn = <optimized out> ret_in_memory = <optimized out> reg_args = 0x7fffc82ce890 #13 0x00007fd7dc6be787 in g_cclosure_marshal_generic (closure=0x7fffc82ced20, return_gvalue=0x0, n_param_values=2, param_values=0x1cfb950, invocation_hint=0x0, marshal_data=0x7fd7dd04e7d5) at gclosure.c:1454 rtype = 0x3fe4e062a0 rvalue = 0x7fffc82ceaa0 n_args = 3 ---Type <return> to continue, or q <return> to quit--- atypes = 0x7fffc82cea70 args = 0x7fffc82cea40 i = 2 cif = {abi = FFI_UNIX64, nargs = 3, arg_types = 0x7fffc82cea70, rtype = 0x3fe4e062a0, bytes = 0, flags = 0} cc = 0x7fffc82ced20 enum_tmpval = 0x7fffc82ceac0 tmpval_used = 0 #14 0x0000003fece0d73d in invoke_object_method (message=0x1d47d70, connection=0x1390f70, method=0x7fd7dd301000, object_info=0x7fd7dd2fdc60, object=0x1dbf040) at dbus-gobject.c:1889 had_error = <optimized out> value_array = 0x1db4280 gerror = 0x0 closure = {ref_count = 0, meta_marshal_nouse = 0, n_guards = 0, n_fnotifiers = 0, n_inotifiers = 0, in_inotify = 0, floating = 0, derivative_flag = 0, in_marshal = 0, is_invalid = 0, marshal = 0, data = 0x0, notifiers = 0x0} out_param_pos = <optimized out> have_retval = 0 send_reply = 0 in_signature = 0x1dbada0 "" out_param_count = <optimized out> out_param_gvalue_pos = <optimized out> retval_signals_error = 0 arg_metadata = <optimized out> is_async = <optimized out> out_param_values = 0x0 return_value = {g_type = 0, data = {{v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}} out_param_gvalues = 0x0 reply = 0x0 retval_is_synthetic = 0 retval_is_constant = 0 #15 object_registration_message (connection=0x1390f70, message=message@entry=0x1d47d70, user_data=user_data@entry=0x1ccb4e0) at dbus-gobject.c:2151 pspec = <optimized out> object = 0x1dbf040 setter = <optimized out> ---Type <return> to continue, or q <return> to quit--- getter = <optimized out> getall = <optimized out> s = <optimized out> requested_propname = <optimized out> wincaps_propiface = <optimized out> iter = {dummy1 = 0x0, dummy2 = 0x0, dummy3 = 0, dummy4 = 0, dummy5 = 20482064, dummy6 = 0, dummy7 = 20541136, dummy8 = 0, dummy9 = 0, dummy10 = 0, dummy11 = 0, pad1 = 0, pad2 = 0, pad3 = 0x1396a20} method = 0x7fd7dd301000 object_info = 0x7fd7dd2fdc60 ret = <optimized out> o = 0x1ccb4e0 #16 0x0000003fe761d685 in _dbus_object_tree_dispatch_and_unlock (tree=0x1390c30, message=message@entry=0x1d47d70) at dbus-object-tree.c:858 message_function = 0x3fece0cad0 <object_registration_message> user_data = 0x1ccb4e0 next = 0x0 path = 0x13e0ac0 exact_match = 0 list = 0x13b8168 link = <optimized out> result = DBUS_HANDLER_RESULT_NOT_YET_HANDLED subtree = <optimized out> #17 0x0000003fe760f90d in dbus_connection_dispatch (connection=connection@entry=0x1390f70) at dbus-connection.c:4685 message = 0x1d47d70 link = <optimized out> filter_list_copy = 0x0 message_link = 0x13b82b8 result = <optimized out> pending = <optimized out> reply_serial = <optimized out> status = <optimized out> __FUNCTION__ = "dbus_connection_dispatch" #18 0x0000003fece0ac45 in message_queue_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at dbus-gmain.c:90 connection = 0x1390f70 #19 0x00007fd7dc1b07ac in g_main_dispatch (context=0x1393c70) at gmain.c:2691 dispatch = 0x3fece0ac30 <message_queue_dispatch> ---Type <return> to continue, or q <return> to quit--- was_in_call = 0 user_data = 0x0 callback = 0 cb_funcs = 0x0 cb_data = 0x0 need_destroy = 20511392 current_source_link = {data = 0x1393d60, next = 0x0} source = 0x1393d60 current = 0x139fd90 i = 0 __PRETTY_FUNCTION__ = "g_main_dispatch" #20 0x00007fd7dc1b135c in g_main_context_dispatch (context=0x1393c70) at gmain.c:3195 No locals. #21 0x00007fd7dc1b153f in g_main_context_iterate (context=0x1393c70, block=1, dispatch=1, self=0x138d780) at gmain.c:3266 max_priority = 0 timeout = 0 some_ready = 1 nfds = 12 allocated_nfds = 12 fds = 0x1d2b910 #22 0x00007fd7dc1b196f in g_main_loop_run (loop=0x1398410) at gmain.c:3460 self = 0x138d780 __PRETTY_FUNCTION__ = "g_main_loop_run" #23 0x00007fd7dd035ace in tp_run_connection_manager (prog_name=0x4dcbc7 "telepathy-gabble", version=0x4dcbbe "0.17.0.1", construct_cm=0x4284e4 <construct_cm>, argc=1, argv=0x7fffc82cf278) at run.c:285 connection = 0x1390f70 bus_daemon = 0x1395890 error = 0x0 ret = 1 __PRETTY_FUNCTION__ = "tp_run_connection_manager" #24 0x0000000000428842 in gabble_main (argc=1, argv=0x7fffc82cf278) at gabble.c:179 loader = 0x138d660 out = 32727 fatal_mask = 13 #25 0x00000000004284e1 in main (argc=1, argv=0x7fffc82cf278) at main.c:33 Cc Youness, Olivier for their protocol design expertise. There should be a XEP-style document in docs/ describing how to implement Stream Tubes as a Jingle application, in the style of XEP-0234 or XEP-0247. (Indeed, I would like to submit it for standardization, but that's not necessarily on the critical path.) When this document has been written and looks basically sane, we should upload it to a location similar to its own namespace (xmlns) URL, and serve a pointer to it from its own namespace URL, so that it's somewhat self-documenting. (Just like I did for the original Tubes.) At the very least, there should be a document with some correct example stanzas illustrating the general flow (offer a tube, accept a tube, open two connections[1] to an accepted tube) and "FIXME" for the rest of the text... Here is my attempt to reverse-engineer the protocol, feel free to use this in writing the pseudo-XEP. -------------- When a tube is offered, the offering peer initiates a Jingle session with a <description/> in the namespace <http://telepathy.freedesktop.org/xmpp/tubes-ice>. The <description/> contains mandatory <service/>, <type/> and <id/> elements, and an optional <parameters/> element. <service/> contains a service name appropriate for use with TCP, in the sense used by the IANA Unified Service Name and Port Number Registry (e.g. "http", "rfb"). <type/> contains "1" (the numeric value of TP_TUBE_TYPE_STREAM, in ASCII decimal). <id/> contains a base-10 ASCII integer in the range 1 to 2**32-1 inclusive. Stream tubes can be uniquely identified by the pair (peer's full JID, content of <id/>). Offering a tube to a peer shares a service with that peer. It is analogous to listening on a TCP port with listen() and accept(). "Accepting" a tube is purely an XMPP action and has no TCP equivalent. It indicates that the accepter supports the stream tube protocol and wishes to connect to the shared service. After accepting a tube, the accepter can "connect to the tube" (analogous to connect() to a TCP port) up to 128 times. Each connection has an identifier in the range 0 to 127. Payload data for all connections to the tube is interleaved through a single Jingle content with a single component, using the following framing protocol: * 1 byte: the connection identifier, C * 2 bytes: the length of the frame, N, as an unsigned integer in either big- or little-endian. If 0, this frame is a "control frame" as described below, and does not contain any payload data. * N bytes: the next N bytes of payload from connection C Implementations may send frames of any length in the range 1 to 65535. The boundaries do not have to be preserved and must not be relied on. In addition, frames with N=0 are "control frames". There are two types of control frame, distinguished by their connection ID: * 0 <= C <= 127: end of connection with identifier C * 128 <= C <= 255: beginning of connection with identifier C The meaning of frames with N > 0 and C > 127 is undefined. Candidates whose protocol is UDP indicate use of "pseudo-TCP" compatible with Google libnice (FIXME: read libnice and document what that actually means), over ICE-UDP. Candidates whose protocol is TCP are not currently supported, but in future they should be. They will operate in the obvious way. [1] This is explicitly allowed, to support "payload" protocols that are conceptually connectionless, like HTTP. -------------- Review comments on that description: > http://telepathy.freedesktop.org/xmpp/tubes-ice I think this should be http://telepathy.freedesktop.org/xmpp/stream-tube or something (with urn:xmpp:jingle:apps:stream-tube:0 requested during standardization). There are two things wrong with this choice of namespace: * Stream tubes and D-Bus tubes have different semantics, so they should be defined as two separate Jingle "applications" each with its own namespace. * There's nothing to say that this version of stream tubes *has* to use ICE. What distinguishes them from the older stream-initiation tubes is that they're a Jingle application instead of using stream initiation. > <type/> contains "1" (the numeric value of TP_TUBE_TYPE_STREAM, > in ASCII decimal). Telepathy-specific enums shouldn't be part of our XMPP extension, and in any case, you should be able to tell it's a stream tube from the namespace of the <description/> instead. > <id/> contains a base-10 ASCII integer in the range 1 to > 2**32-1 inclusive. I think this should be an arbitrary string (in examples use a UUID for this), generated by the offerer. > Payload data for all connections to the tube is interleaved > through a single Jingle content with a single component, > using the following framing protocol Why is there a framing protocol? Why isn't each connection a new content or a new component? For instance, we could define it like this: * offerer adds a dummy content with some sort of special <description>, just to keep the Jingle session alive; it will never actually transfer payload data * accepter sends a content-add per connection * offerer sends content-accept when connect() to the service has succeeded or content-reject if it fails * (optionally) offerer may remove the dummy content as an equivalent of stopping listen()ing, which would solve Bug #13272 Alternatively, I've also considered a two-layer protocol which would make stream tubes in multi-user chat consistent with the 1-1 case that you're dealing with here: * offerer sends a non-Jingle <iq> (for the 1-1 case) or does some magic in its <presence> (for the MUC case) * accepter initiates a Jingle session per connection, with one content, and a reference to the tube advertisement <iq> in its <description> One advantage of the framing protocol is that you only have to do one ICE handshake for all n connections. If we keep the framing protocol, I would like to make the arbitrary number of connections larger (perhaps just using 2-byte connection IDs would be sufficient). > * 2 bytes: the length of the frame, N, as an unsigned integer in either > big- or little-endian This is clearly not right: an x86 peer and a PowerPC peer will fail to communicate. If we're using a framing protocol like this, we have to pick an endianness. Big-endian is conventional. ---------------- Drive-by review comments on the implementation (not necessarily complete): Some of your lines are pretty long. Please break argument lists, etc., before 80 characters where feasible, according to the guidelines in <http://telepathy.freedesktop.org/wiki/Style>. jingle-tube.c should be jingle-stream-tube.c, with the class renamed accordingly. If it turns out D-Bus tubes can share code, we can extract a superclass. jingle-*.c are meant to be basically ready to extract into Wocky, so they shouldn't include things that are Telepathy-, D-Bus- or Gabble-specific. In the case of jingle-tube.c, its use of connection.h is suspicious: it shouldn't need that. > + * <description xmlns='http://telepathy.freedesktop.org/xmpp/tubes-ice/stream'> > + * <service>webdav</service> > + * <id>666</id> > + * <parameters> > + * <parameter name='u' type='str'>anonymous</parameter> > + * <parameter name='p' type='str'>password</parameter> > + * <parameter name='path' type='str'>/plots/macbeth</parameter> > + * </parameters> > + * </description> This is not consistent with the implementation. The xmlns has changed, and the <type/> is missing. My suggested protocol changes actually make it closer to this example again :-) > + priv->id = g_ascii_strtoull (id_node->content, NULL, 10); This is made irrelevant by my suggested protocol changes, but anyway: priv->id is a guint32, so you're losing the top 32 bits. You should assign the result of g_ascii_strtoull to a guint64, range-check, and *then* (assuming success) assign to the guint32. > + guint32 tmp = self->priv->id; > + > + if (tmp == 0 || tmp > G_MAXUINT32) > + { > + DEBUG ("tube id is non-numeric or out of range: %u", tmp); > + return FALSE; > + } > + *tube_id = tmp; This is made irrelevant by my suggested protocol changes, but anyway: tmp is of type guint32. How can tmp > G_MAXUINT32 fail to be true? (Answer: it can't.) This range-checking should have happened while setting id, with the fixes above too. > + desc_node = wocky_node_add_child_with_content (content_node, "description", NULL); > + > + desc_node->ns = g_quark_from_string (NS_TUBES_ICE); You should be able to replace NULL with NS_TUBES_ICE in the first line, and then not need the second line. Even better, use wocky_node_add_build() and similar helpers, like this: wocky_node_add_build (content_node, '(', "description", ':', NS_TUBES_ICE, '(', "service", '$', type_str, ')', '(', "id", '$', id_str, ')', ... ')', NULL); > + /* At the moment, we don't support DTubes. This should > never happen, but just in case, return. */ You don't get to say "this should never happen" about things that come from the network, and in this case, the tube type comes directly from the network. With my suggested protocol changes, this wouldn't be an issue, because the tube type would be implicit in the namespace. > + case TP_CONNECTION_STATUS_CONNECTING: > + g_signal_connect (conn->jingle_mint, > + "incoming-session", > + G_CALLBACK (new_jingle_session_cb), self); This signal is never disconnected. At least use tp_g_signal_connect_object(). > + len = GPOINTER_TO_UINT (g_hash_table_lookup (priv->length_for_buffers, li->data)); > + if (!gibber_transport_send (transport, (const guint8 *) li->data, len, > + &error)) This list+hashtable is a really strange data-structure... > + size = *((guint16 *) &buffer[current_position + 1]); This is an unaligned access about half the time. It will fail on many non-x86 CPUs, sometimes with SIGBUS or sometimes with silent data corruption. You could avoid the alignment problems while also fixing the endianness, with: size = (256 * ((guchar) buffer[current_position + 1]) + (guchar) buffer[current_position + 2]); (or the opposite if you standardize on little-endian). You're also overrunning the end of the buffer if you receive exactly 1 or 2 bytes of the partial message. I think what you actually want to do is to replace all this TubePartialData stuff with a simple GByteArray containing unprocessed bytes, and do this: read() append all data from the read to the array pos = 0 /* pop complete messages until we run out of bytes */ while (array->len >= pos + 3) { connection ID = array[pos] size = from_big_endian (array[pos + 1 ... pos + 2, inclusive]) pos += 3 if (size + pos > array->len) break; dispatch message (connection ID, size, array[pos ... pos + size - 1, inclusive]) } /* discard processed bytes, move unprocessed bytes to the beginning */ g_byte_array_remove_range (array, 0, pos); (In reply to comment #3) > Also, Gabble crashes when I close vinagre: > > (telepathy-gabble:14744): GLib-CRITICAL **: > g_hash_table_foreach_remove_or_steal: assertion `version == > hash_table->version' failed This usually indicates that you did a g_hash_table_remove() (or g_hash_table_insert(), g_hash_table_replace() or similar) while iterating over the same hash table. I didn't spot where this could happen. (In reply to comment #4) > > + len = GPOINTER_TO_UINT (g_hash_table_lookup (priv->length_for_buffers, li->data)); > > + if (!gibber_transport_send (transport, (const guint8 *) li->data, len, > > + &error)) > > This list+hashtable is a really strange data-structure... I think a better way to do buffer management would be: * have a single GByteArray * when asked to write a blob of data, append it to the GByteArray and then attempt to write (you never know, it might work) * when the socket becomes writeable, attempt to write where "attempt to write" means: * write up to array->length bytes of array->data * if n >= 1 bytes are written, g_byte_array_remove_range (array, 0, n) and try again * if write fails with EINTR, try again * if write fails with EAGAIN or EWOULDBLOCK, go back to the main loop * if write fails with another error, close the connection (In reply to comment #4) > > + if (tmp == 0 || tmp > G_MAXUINT32) If your aim here is to be consistent with SI tubes, their ID range is actually 1 to G_MAXINT32, inclusive (i.e. the high bit in a 32-bit integer is never set). I would prefer to use arbitrary strings, though. (In reply to comment #0) > Tubes over ice-udp work reliably Something that jumps out at me about this branch is there is no regression test to ensure that this remains true. It needs at least this smoke-test: * Connect to the fake Jabber server twice (like exec_file_transfer_test in tests/twisted/jingle-share/file_transfer_helper.py) * On one connection, offer a stream tube * On the other connection, accept that stream tube * Connect to the stream tube twice * Send some bytes (either direction, doesn't matter) * Verify that they arrive at the other end * Disconnect and verify that *that* arrives at the other end (preferably with the offerer closing one connection and the accepter closing the other) Decent coverage for edge cases (rejecting the tube, trying to share a socket that isn't listening, simulating an incoming tube with an invalid <description/>, that sort of thing) would also be appreciated. Protocol comments: First, I agree with everything Simon said, please follow his advice. In the data framing protocol, please make the channel ID be 2 bytes instead of 1, so the header can be 4 bytes, not 3.. That will also match the ChannelData message in TURN. Branch archived in my repository on fd.o. <http://cgit.freedesktop.org/~smcv/telepathy-gabble/?h=drf-tubes-ice-udp> (In reply to comment #10) > Branch archived in my repository on fd.o. > <http://cgit.freedesktop.org/~smcv/telepathy-gabble/?h=drf-tubes-ice-udp> Er, make that <http://cgit.freedesktop.org/~smcv/telepathy-gabble/log/?h=drf-tubes-ice-udp> -- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/telepathy/telepathy-gabble/issues/240. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.