Bug 14341

Summary: Gabble blocked on recv() when creating a jabber account with SSL
Product: Telepathy Reporter: Alban Crequy <alban.crequy>
Component: gabble Assignee: Senko Rasic <senko>
Status: RESOLVED NOTOURBUG QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
URL: http://loudmouth.lighthouseapp.com/projects/17276/tickets/5
Whiteboard:

Description Alban Crequy 2008-02-02 16:55:53 UTC
When I try to create a new Jabber account with Empathy, Gabble blocks with this stack:

(gdb) bt
#0  0x00002b5719b4f885 in recv () from /lib/libc.so.6
#1  0x00002b571a01301a in ?? () from /usr/lib/libgnutls.so.13
#2  0x00002b571a0133ca in _gnutls_io_read_buffered ()
   from /usr/lib/libgnutls.so.13
#3  0x00002b571a01046a in _gnutls_recv_int () from /usr/lib/libgnutls.so.13
#4  0x00002b571a012956 in _gnutls_handshake_io_recv_int ()
   from /usr/lib/libgnutls.so.13
#5  0x00002b571a016d79 in _gnutls_recv_handshake ()
   from /usr/lib/libgnutls.so.13
#6  0x00002b571a017cc9 in _gnutls_handshake_client ()
   from /usr/lib/libgnutls.so.13
#7  0x00002b571a0181df in gnutls_handshake () from /usr/lib/libgnutls.so.13
#8  0x00002b5718cca6a9 in _lm_ssl_begin () from /usr/lib/libloudmouth-1.so.0
#9  0x00002b5718cc7bd3 in _lm_connection_succeeded ()
   from /usr/lib/libloudmouth-1.so.0
#10 0x00002b5718cc7ffc in ?? () from /usr/lib/libloudmouth-1.so.0
#11 0x00002b57197defd3 in g_main_context_dispatch ()
   from /usr/lib/libglib-2.0.so.0
#12 0x00002b57197e22dd in ?? () from /usr/lib/libglib-2.0.so.0
#13 0x00002b57197e25ea in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#14 0x00002b5718eee7ef in tp_run_connection_manager (
    prog_name=<value optimized out>, version=0x44f9c1 "0.7.2", 
    construct_cm=0x40dbd0 <tp_svc_channel_get_type@plt+344>, 
    argc=<value optimized out>, argv=<value optimized out>) at run.c:235
#15 0x00002b5719a98b44 in __libc_start_main () from /lib/libc.so.6
#16 0x000000000040dab9 in ?? ()
#17 0x00007fff92005728 in ?? ()
#18 0x0000000000000000 in ?? ()
(gdb) c
Continuing.

I cannot chat on my other Jabber accounts because Gabble is blocked. Restarting Empathy does not help because Gabble is still blocked.

According to strace and netstat, Gabble is waiting for bytes from the Jabber server, and the TCP connection is established.

There may be 2 problems here:
1/ The Jabber server sends nothing while Gabble is waiting to receive something. I don't know whether this is a bug in the server or in the client.
2/ Gabble does not use a non-blocking socket. If the server misbehaves, a blocking socket prevents the user from chatting with contacts on other Jabber accounts.
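
A minimal sketch of point 2, not Gabble or Loudmouth code: on a non-blocking socket, recv() from a silent peer returns immediately with EAGAIN instead of hanging as in frame #0 above. The helper names set_nonblocking() and silent_peer_returns_eagain() are made up for this illustration, and a socketpair stands in for the TCP connection to the server.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical demo: returns 1 if recv() on a silent peer came back at once
 * with EAGAIN/EWOULDBLOCK, 0 if it did anything else, -1 on setup failure. */
int silent_peer_returns_eagain(void)
{
    int sv[2];
    char buf[5];
    ssize_t n;
    int would_block;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (set_nonblocking(sv[0]) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* sv[1] plays the silent server: it never writes anything. */
    n = recv(sv[0], buf, sizeof buf, 0);
    would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(sv[0]);
    close(sv[1]);
    return would_block;
}
```

Without the set_nonblocking() call, the same recv() would block indefinitely, which matches the hang reported here.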
Comment 1 Guillaume Desmottes 2008-02-04 01:19:26 UTC
Which version of Gabble and Loudmouth are you using?
Comment 2 Alban Crequy 2008-02-04 04:09:08 UTC
On Ubuntu Gutsy,
- telepathy-gabble 0.7.2-0ubuntu1~ppa701+1
- libloudmouth1-0 1.2.3-2

I tested with the latest darcs/SVN versions of Gabble, MC and Empathy (but still libloudmouth1-0 1.2.3-2) and the bug also happens.

It is reproducible every time, even when I kill the CMs, MC and Empathy and then restart Empathy.

Same stack but with symbols:

(gdb) bt
#0  0x00002b5719b4f885 in recv () from /lib/libc.so.6
#1  0x00002b571a01301a in _gnutls_read (session=0x764f10, iptr=0x7692f0, sizeOfPtr=5, flags=0) at gnutls_buffers.c:323
#2  0x00002b571a0133ca in _gnutls_io_read_buffered (session=0x764f10, iptr=<value optimized out>, sizeOfPtr=<value optimized out>, 
    recv_type=<value optimized out>) at gnutls_buffers.c:579
#3  0x00002b571a01046a in _gnutls_recv_int (session=0x764f10, type=GNUTLS_HANDSHAKE, htype=GNUTLS_HANDSHAKE_SERVER_HELLO, 
    data=0x7659c8 "", sizeofdata=1) at gnutls_record.c:877
#4  0x00002b571a012956 in _gnutls_handshake_io_recv_int (session=0x764f10, type=GNUTLS_HANDSHAKE, 
    htype=GNUTLS_HANDSHAKE_SERVER_HELLO, iptr=0x7659c8, sizeOfPtr=1) at gnutls_buffers.c:1168
#5  0x00002b571a016d79 in _gnutls_recv_handshake (session=0x764f10, data=0x0, datalen=0x0, type=GNUTLS_HANDSHAKE_SERVER_HELLO, 
    optional=MANDATORY_PACKET) at gnutls_handshake.c:943
#6  0x00002b571a017cc9 in _gnutls_handshake_client (session=0x764f10) at gnutls_handshake.c:2204
#7  0x00002b571a0181df in gnutls_handshake (session=0x764f10) at gnutls_handshake.c:2111
#8  0x00002b5718cca6a9 in _lm_ssl_begin (ssl=0x6dd490, fd=11, server=0x6a5e30 "light.bluelinux.co.uk", error=0x7fff920054b8)
    at lm-ssl-gnutls.c:189
#9  0x00002b5718cc7bd3 in _lm_connection_succeeded (connect_data=0x6c8130) at lm-connection.c:392
#10 0x00002b5718cc7ffc in connection_connect_cb (source=0x72a220, condition=G_IO_OUT, connect_data=0x6c8130) at lm-connection.c:595
#11 0x00002b57197defd3 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#12 0x00002b57197e22dd in ?? () from /usr/lib/libglib-2.0.so.0
#13 0x00002b57197e25ea in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#14 0x00002b5718eee7ef in tp_run_connection_manager (prog_name=<value optimized out>, version=0x44f9c1 "0.7.2", 
    construct_cm=0x40dbd0 <tp_svc_channel_get_type@plt+344>, argc=<value optimized out>, argv=<value optimized out>) at run.c:235
#15 0x00002b5719a98b44 in __libc_start_main () from /lib/libc.so.6
#16 0x000000000040dab9 in ?? ()
#17 0x00007fff92005728 in ?? ()
#18 0x0000000000000000 in ?? ()
Comment 3 Simon McVittie 2008-02-04 04:34:25 UTC
I suspect this is Loudmouth's fault - IIRC, it mostly uses non-blocking sockets, and Gabble always uses it asynchronously, but there have been blocking sockets used in the TLS code in the past. Can you try with a backport of Loudmouth 1.3.3 from Debian or Hardy/universe?

Senko, could you take this bug? I think you're the LM expert at the moment...
Comment 4 Senko Rasic 2008-02-04 06:36:54 UTC
This is either an LM bug (maybe for some reason the gnutls library wasn't
properly initialised) or a GnuTLS bug. It'd be good if you could try to
reproduce it with backported 1.3.3 as Simon suggested, and see what happens.

In any case, when we're doing ssl init, we temporarily block (that part
of the code is quite old, but the comments suggest that it needed to be
done synchronously), even if we're using the async API otherwise.

Taking this bug.
Comment 5 Alban Crequy 2008-02-04 07:59:35 UTC
Same bug after rebuilding and installing libloudmouth1-0 1.3.3-1 (downloaded from Debian Unstable). The new stack is very close to the previous stack:

(gdb) bt
#0  0x00002b96e5748885 in recv () from /lib/libc.so.6
#1  0x00002b96e605401a in _gnutls_read (session=0x6ea280, iptr=0x6eafa0, sizeOfPtr=5, 
    flags=0) at gnutls_buffers.c:323
#2  0x00002b96e60543ca in _gnutls_io_read_buffered (session=0x6ea280, 
    iptr=<value optimized out>, sizeOfPtr=<value optimized out>, 
    recv_type=<value optimized out>) at gnutls_buffers.c:579
#3  0x00002b96e605146a in _gnutls_recv_int (session=0x6ea280, type=GNUTLS_HANDSHAKE, 
    htype=GNUTLS_HANDSHAKE_SERVER_HELLO, data=0x6ead38 "", sizeofdata=1)
    at gnutls_record.c:877
#4  0x00002b96e6053956 in _gnutls_handshake_io_recv_int (session=0x6ea280, 
    type=GNUTLS_HANDSHAKE, htype=GNUTLS_HANDSHAKE_SERVER_HELLO, iptr=0x6ead38, sizeOfPtr=1)
    at gnutls_buffers.c:1168
#5  0x00002b96e6057d79 in _gnutls_recv_handshake (session=0x6ea280, data=0x0, datalen=0x0, 
    type=GNUTLS_HANDSHAKE_SERVER_HELLO, optional=MANDATORY_PACKET) at gnutls_handshake.c:943
#6  0x00002b96e6058cc9 in _gnutls_handshake_client (session=0x6ea280)
    at gnutls_handshake.c:2204
#7  0x00002b96e60591df in gnutls_handshake (session=0x6ea280) at gnutls_handshake.c:2111
#8  0x00002b96e48608c8 in _lm_ssl_begin (ssl=0x6956f0, fd=5, 
    server=0x691610 "light.bluelinux.co.uk", error=0x7fffc646f510) at lm-ssl-gnutls.c:217
#9  0x00002b96e48627b1 in _lm_socket_ssl_init (socket=0x69aec0, delayed=0)
    at lm-socket.c:320
#10 0x00002b96e486298a in _lm_socket_succeeded (connect_data=0x691690) at lm-socket.c:385
#11 0x00002b96e4862b30 in socket_connect_cb (source=0x6a8bc0, condition=G_IO_OUT, 
    connect_data=0x691690) at lm-socket.c:509
#12 0x00002b96e53d7fd3 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#13 0x00002b96e53db2dd in ?? () from /usr/lib/libglib-2.0.so.0
#14 0x00002b96e53db5ea in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#15 0x00002b96e4ad43a1 in tp_run_connection_manager (prog_name=0x45da48 "telepathy-gabble", 
    version=0x45da40 "0.7.2.1", construct_cm=0x40dbc8 <construct_cm>, argc=1, 
    argv=0x7fffc646f808) at run.c:238
#16 0x000000000040dd26 in main (argc=1, argv=0x7fffc646f808) at gabble.c:83

Comment 6 Alban Crequy 2008-05-23 08:41:48 UTC
Stack reproduced on Debian Sid with a different server (talk.google.com) and a different scenario (I am just trying to connect, not registering an account):


#0  0x00007f4480f102a5 in recv () from /lib/libc.so.6
#1  0x00007f448059f44f in _gnutls_read (session=0x1e154b0, iptr=0x1ffb340, 
    sizeOfPtr=0, flags=0) at gnutls_buffers.c:313
#2  0x00007f448059f7e8 in _gnutls_io_read_buffered (session=0x1e154b0, 
    iptr=<value optimized out>, sizeOfPtr=<value optimized out>, 
    recv_type=<value optimized out>) at gnutls_buffers.c:571
#3  0x00007f448059cc1d in _gnutls_recv_int (session=0x1e154b0, 
    type=GNUTLS_HANDSHAKE, htype=GNUTLS_HANDSHAKE_SERVER_HELLO, 
    data=0x1e15fd8 "", sizeofdata=<value optimized out>) at gnutls_record.c:893
#4  0x00007f448059f166 in _gnutls_handshake_io_recv_int (session=0x1e154b0, 
    type=GNUTLS_HANDSHAKE, htype=GNUTLS_HANDSHAKE_SERVER_HELLO, 
    iptr=0x1e15fd8, sizeOfPtr=1) at gnutls_buffers.c:1124
#5  0x00007f44805a34c9 in _gnutls_recv_handshake (session=0x1e154b0, data=0x0, 
    datalen=0x0, type=GNUTLS_HANDSHAKE_SERVER_HELLO, optional=MANDATORY_PACKET)
    at gnutls_handshake.c:1023
#6  0x00007f44805a3f03 in _gnutls_handshake_client (session=0x1e154b0)
    at gnutls_handshake.c:2325
#7  0x00007f44805a4a5f in gnutls_handshake (session=0x1e154b0)
    at gnutls_handshake.c:2246
#8  0x00007f4481932ae9 in _lm_ssl_begin (ssl=0x1dc8360, fd=5, 
    server=0x1df3180 "talk.google.com", error=0x7fff8a3f8910)
    at lm-ssl-gnutls.c:217
#9  0x00007f4481934a21 in _lm_socket_ssl_init (socket=0x23d0780, delayed=0)
    at lm-socket.c:338
#10 0x00007f4481934c2a in _lm_socket_succeeded (connect_data=0x1dd4e90)
    at lm-socket.c:406
#11 0x00007f4481934ce7 in socket_connect_cb (source=0x23cde60, 
    condition=<value optimized out>, connect_data=0x1dd4e90) at lm-socket.c:559
#12 0x00007f448146f0f2 in IA__g_main_context_dispatch (context=0x1dbb2e0)
    at /tmp/buildd/glib2.0-2.16.3/glib/gmain.c:2009
#13 0x00007f4481472396 in g_main_context_iterate (context=0x1dbb2e0, block=1, 
    dispatch=1, self=<value optimized out>)
    at /tmp/buildd/glib2.0-2.16.3/glib/gmain.c:2642
#14 0x00007f4481472657 in IA__g_main_loop_run (loop=0x1dbb3c0)
    at /tmp/buildd/glib2.0-2.16.3/glib/gmain.c:2850
#15 0x00007f44812051a3 in tp_run_connection_manager (
    prog_name=0x451721 "telepathy-gabble", version=0x452ffb "0.7.6.1", 
    construct_cm=0x40df80 <construct_cm>, argc=1, argv=0x7fff8a3f8c08)

loudmouth version: git git://github.com/hallski/loudmouth.git 66f80949bc20d1127658904815098f0895e74680 (Tue Apr 15 17:45:12 2008 +0200)

gnutls version: from Debian Sid libgnutls26 2.2.5-1
Comment 7 Alban Crequy 2008-05-23 10:19:36 UTC
This seems to be a loudmouth bug to me:

The function _lm_socket_ssl_init() in loudmouth/lm-socket.c does the following:

1/ sets the socket to be blocking:
  /* GNU TLS requires the socket to be blocking */
  _lm_sock_set_blocking (socket->fd, TRUE);

2/ calls gnutls_handshake() through _lm_ssl_begin() without checking for the GNUTLS_E_AGAIN error code:
  if (!_lm_ssl_begin (socket->ssl, socket->fd, ssl_verify_domain, &error)) {

(_lm_ssl_begin only checks whether the return value is >=0 or <0, and returns TRUE or FALSE accordingly)

3/ sets the socket back to non-blocking
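
One way to avoid the blocking step, sketched under assumptions: keep the socket non-blocking, treat "no data yet" as a retryable result (the role GNUTLS_E_AGAIN plays for gnutls_handshake() on a non-blocking transport), and bound each wait. Everything named hs_* below is hypothetical and only stands in for the handshake; it is neither Loudmouth nor GnuTLS API. A real fix would probably return to the GLib main loop between retries rather than call poll() directly.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for the stand-in handshake. */
enum hs_status { HS_DONE = 0, HS_AGAIN = 1, HS_FATAL = -1, HS_TIMEOUT = -2 };

/* Stand-in handshake state: collects a 5-byte record header
 * (5 is the sizeOfPtr seen in frame #1 of the backtraces). */
struct hs_state { int fd; size_t got; char hdr[5]; };

/* Put fd in non-blocking mode and reset the state. Returns 0 on success. */
int hs_init(struct hs_state *st, int fd)
{
    int flags;

    assert(st != NULL);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    st->fd = fd;
    st->got = 0;
    return 0;
}

/* One handshake step: read what is available, never block. */
enum hs_status hs_step(struct hs_state *st)
{
    while (st->got < sizeof st->hdr) {
        ssize_t n = recv(st->fd, st->hdr + st->got,
                         sizeof st->hdr - st->got, 0);
        if (n > 0) {
            st->got += (size_t)n;
            continue;
        }
        if (n == 0)
            return HS_FATAL;            /* peer closed the connection */
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            return HS_AGAIN;            /* retry once the fd is readable */
        return HS_FATAL;
    }
    return HS_DONE;
}

/* Drive hs_step(), waiting at most timeout_ms for each batch of data. */
enum hs_status hs_run(struct hs_state *st, int timeout_ms)
{
    for (;;) {
        struct pollfd p;
        int r;
        enum hs_status s = hs_step(st);

        if (s != HS_AGAIN)
            return s;
        p.fd = st->fd;
        p.events = POLLIN;
        p.revents = 0;
        r = poll(&p, 1, timeout_ms);
        if (r == 0)
            return HS_TIMEOUT;          /* silent server: give up, do not hang */
        if (r < 0 && errno != EINTR)
            return HS_FATAL;
    }
}
```

With a silent server, hs_run(&st, 5000) returns HS_TIMEOUT after five seconds, so the caller can report a connection error instead of the whole process hanging in recv().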


Comment 8 Alban Crequy 2008-05-26 11:25:09 UTC
I think this Loudmouth bug has the same cause (blocking read):
http://developer.imendio.com/issues/browse/LM-44
Comment 9 Simon McVittie 2009-02-02 10:03:52 UTC
The new location of the bug formerly known as LM-44 seems to be:

http://loudmouth.lighthouseapp.com/projects/17276/tickets/5
Comment 10 Dafydd Harries 2009-09-24 09:50:23 UTC
This is a Loudmouth bug. Will worked on improving the SSL code to be async, but the GnuTLS code still has some sync bits that cause this.
