Bug 28747

Summary: [telepathy-butterfly 0.5.11-1~ppa10.04+1] telepathy-butterfly crashed with UnicodeDecodeError in _signal_text_received()
Product: Telepathy Reporter: Cristian Aravena <caravena>
Component: butterfly Assignee: Telepathy bugs list <telepathy-bugs>
Status: RESOLVED FIXED QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: normal    
Priority: medium CC: benjavalero
Version: git master   
Hardware: Other   
OS: All   
URL: http://git.collabora.co.uk/?p=telepathy-butterfly.git;a=commit;h=397f50a7018af53cd5fff17fcd4110ce05e5d7f8
Whiteboard:
Attachments: _usr_lib_telepathy_telepathy-butterfly.1000.crash
_usr_lib_telepathy_telepathy-butterfly.1000.crash

Description Cristian Aravena 2010-06-24 19:55:58 UTC
Created attachment 36482 [details]
_usr_lib_telepathy_telepathy-butterfly.1000.crash

ProblemType: Crash
Date: Thu Jun 24 22:48:23 2010
ExecutablePath: /usr/lib/telepathy/telepathy-butterfly
InterpreterPath: /usr/bin/python2.6
ProcCmdline: /usr/bin/python /usr/lib/telepathy/telepathy-butterfly
ProcCwd: /
ProcEnviron:
 SHELL=/bin/bash
 LANG=es_CL.utf8

DistroRelease: Ubuntu 10.04
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta i386 (20100406.1)
Package: telepathy-butterfly 0.5.11-1~ppa10.04+1
PackageArchitecture: all
SourcePackage: telepathy-butterfly
Tags: lucid
Title: telepathy-butterfly crashed with UnicodeDecodeError in _signal_text_received()
Uname: Linux 2.6.34-020634-generic i686
UnreportableReason: This is not a genuine Ubuntu package

Traceback:
 Traceback (most recent call last):
   File "/usr/lib/pymodules/python2.6/papyon/switchboard_manager.py", line 348, in _sb_message_received
     handler._on_message_received(message)
   File "/usr/lib/pymodules/python2.6/papyon/conversation.py", line 362, in _on_message_received
     self._dispatch("on_conversation_message_received", sender, msg)
   File "/usr/lib/pymodules/python2.6/papyon/event/__init__.py", line 44, in _dispatch
     if event_handler._dispatch_event(name, *args):
   File "/usr/lib/pymodules/python2.6/papyon/event/__init__.py", line 65, in _dispatch_event
     handler(*params)
   File "/usr/lib/python2.6/dist-packages/butterfly/channel/text.py", line 295, in on_conversation_message_received
     self._signal_text_received(id, timestamp, handle, type, 0, message.display_name, content)
   File "/usr/lib/python2.6/dist-packages/butterfly/channel/text.py", line 167, in _signal_text_received
     headers[dbus.String('sender-nickname')] = dbus.String(sender_nick)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 14: ordinal not in range(128)
Comment 1 Cristian Aravena 2010-06-24 20:00:30 UTC
Nick of my friend:
[b][c=0][a=2] ★ [/a][a=46] Hanzzman [/a] [/c][/b]

Problem with: ★
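
The byte and position in the traceback can be matched against this nickname. The following is a minimal sketch, assuming the nickname reaches butterfly as UTF-8 encoded bytes (the report does not state the encoding explicitly); the variable names are illustrative only.

```python
# Nickname from this report, with the star written as an escape (U+2605).
nick = u"[b][c=0][a=2] \u2605 [/a][a=46] Hanzzman [/a] [/c][/b]"

# Assumption: the nickname arrives from the protocol layer as UTF-8 bytes.
raw = nick.encode("utf-8")

# Decoding with the ASCII codec is what an implicit byte-string to
# unicode conversion does in Python 2.
try:
    raw.decode("ascii")
    failure = None
except UnicodeDecodeError as exc:
    failure = exc

# The first 14 bytes ("[b][c=0][a=2] ") are plain ASCII; the star follows,
# and its UTF-8 encoding starts with byte 0xe2.
print(failure.start, repr(raw[failure.start:failure.start + 1]))
```

Under that assumption, the failing offset and byte agree with "byte 0xe2 in position 14" in the traceback.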
Comment 2 Cristian Aravena 2010-06-24 20:11:32 UTC
Created attachment 36484 [details]
_usr_lib_telepathy_telepathy-butterfly.1000.crash
Comment 3 Benjamín Valero Espinosa 2010-07-24 11:45:56 UTC
This happens when a nickname has non-Latin characters, in this block of code:

    def _signal_text_received(self, id, timestamp, sender, type, flags, sender_nick, text):
        self.Received(id, timestamp, sender, type, flags, text)
        headers = dbus.Dictionary({dbus.String('message-received') : dbus.UInt64(timestamp),
                   dbus.String('pending-message-id') : dbus.UInt32(id),
                   dbus.String('message-sender') : dbus.UInt32(sender),
                   dbus.String('message-type') : dbus.UInt32(type)
                  }, signature='sv')

        if sender_nick not in (None, ''):
            headers[dbus.String('sender-nickname')] = dbus.String(sender_nick)

        body = dbus.Dictionary({dbus.String('content-type'): dbus.String('text/plain'),
                dbus.String('content'): dbus.String(text)
               }, signature='sv')
        message = dbus.Array([headers, body], signature='a{sv}')
        self.MessageReceived(message)

I am not an expert in Python, but I think that the problem is in the signature of the dbus dictionaries (sv), because it supposes that the string given as key is ascii. Perhaps that can be solved using "uv" as signature, or no signature at all.
Comment 4 Simon McVittie 2010-07-26 03:54:09 UTC
(In reply to comment #3)
> I think that the problem is in the signature
> of the dbus dictionaries (sv)

That's not it; the signatures are defined by D-Bus, not Python, and a D-Bus string is always UTF-8. dbus.String is a subclass of Python's unicode data type.

>         if sender_nick not in (None, ''):
>             headers[dbus.String('sender-nickname')] = dbus.String(sender_nick)

Based on what you said, the problem is probably that dbus.String(sender_nick) is basically the same as unicode(sender_nick), which will fail if sender_nick is non-ASCII. Butterfly should be using sender_nick.decode('SOME-ENCODING') where SOME-ENCODING is whatever the protocol uses - hopefully UTF-8?
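
A minimal sketch of the explicit decoding suggested here, not the change that was actually committed upstream. It assumes the nickname is UTF-8 (the "hopefully UTF-8?" case); `to_text` and `sender_nickname_header` are hypothetical helper names, and a plain dict stands in for `dbus.Dictionary` so the example runs without dbus-python.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string.

    Byte strings are decoded explicitly with the given encoding
    (assumed UTF-8 here); text strings are returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def sender_nickname_header(sender_nick):
    """Build the optional 'sender-nickname' header entry.

    Hypothetical helper: returns an empty dict when there is no nickname,
    mirroring the `not in (None, '')` check in the quoted code.
    """
    if not sender_nick:
        return {}
    return {u"sender-nickname": to_text(sender_nick)}


# A nickname containing the star from comment 1, as UTF-8 bytes.
print(sender_nickname_header(b"\xe2\x98\x85 Hanzzman"))
```

Decoding once, at the point where the value enters butterfly, means the later `dbus.String(...)` call would receive text rather than bytes and would not need the implicit ASCII conversion.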
Comment 5 Benjamín Valero Espinosa 2010-07-26 04:50:11 UTC
(In reply to comment #4)
> Based on what you said, the problem is probably that dbus.String(sender_nick)
> is basically the same as unicode(sender_nick), which will fail if sender_nick
> is non-ASCII. Butterfly should be using sender_nick.decode('SOME-ENCODING')
> where SOME-ENCODING is whatever the protocol uses - hopefully UTF-8?

I thought that so, but also looked what the D-Bus signature means, and the first letter "s" means "String" type, so I thought that maybe changing it to "u" (Unicode type) will do the trick.
Comment 6 Benjamín Valero Espinosa 2010-07-26 04:51:39 UTC
By the way, there are other blocks in the code that also use D-Bus dictionaries and do not specify a signature.
Comment 7 Will Thompson 2010-08-03 06:52:11 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > Based on what you said, the problem is probably that dbus.String(sender_nick)
> > is basically the same as unicode(sender_nick), which will fail if sender_nick
> > is non-ASCII. Butterfly should be using sender_nick.decode('SOME-ENCODING')
> > where SOME-ENCODING is whatever the protocol uses - hopefully UTF-8?
> 
> I thought that so, but also looked what the D-Bus signature means, and the
> first letter "s" means "String" type, so I thought that maybe changing it to
> "u" (Unicode type) will do the trick.

No... 's' is the D-Bus signature for a UTF-8 string. 'u' is an unsigned 32-bit integer. See the table in http://dbus.freedesktop.org/doc/dbus-specification.html#message-protocol-signatures
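
For reference, the basic type codes from that table can be written out as a small lookup. This is only an illustrative sketch: `DBUS_TYPE_CODES` and `describe_dict_signature` are hypothetical names, and the helper handles only two-character dictionary signatures made of basic types, such as the 'sv' used in the quoted code.

```python
# Basic D-Bus type codes, as listed in the D-Bus specification.
DBUS_TYPE_CODES = {
    "y": "byte",
    "b": "boolean",
    "n": "int16",
    "q": "uint16",
    "i": "int32",
    "u": "uint32",
    "x": "int64",
    "t": "uint64",
    "d": "double",
    "s": "UTF-8 string",
    "o": "object path",
    "g": "signature",
    "v": "variant",
}


def describe_dict_signature(signature):
    """Describe a dictionary signature given as key code + value code."""
    key_code, value_code = signature
    return DBUS_TYPE_CODES[key_code], DBUS_TYPE_CODES[value_code]


print(describe_dict_signature("sv"))  # string keys, variant values
print(describe_dict_signature("uv"))  # unsigned 32-bit integer keys
```

So a 'uv' dictionary would require integer keys, which does not fit header names such as 'sender-nickname'.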
Comment 8 Benjamín Valero Espinosa 2010-08-03 07:00:29 UTC
Oops, sorry!
Comment 9 Louis-Francis Ratté-Boulianne 2010-08-23 12:09:27 UTC
Fixed in git master
Comment 10 Louis-Francis Ratté-Boulianne 2010-08-25 09:45:01 UTC
*** Bug 29709 has been marked as a duplicate of this bug. ***
