Bug 42462 - Crash when getting a non-utf8 presence status
Status: RESOLVED FIXED
Alias: None
Product: Telepathy
Classification: Unclassified
Component: gabble
Version: git master
Hardware: Other All
Importance: medium blocker
Assignee: Telepathy bugs list
QA Contact: Telepathy bugs list
URL:
Whiteboard:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2011-11-01 02:31 UTC by Sjoerd Simons
Modified: 2011-11-29 04:00 UTC
0 users

See Also:


Attachments
straw-men patch (972 bytes, patch)
2011-11-01 02:34 UTC, Sjoerd Simons

Description Sjoerd Simons 2011-11-01 02:31:31 UTC
It seems someone in the Prosody MUC has the following byte sequence as their presence status: {0xef, 0xb7, 0xaf}. Apparently libxml2 either doesn't validate it or chokes on it, causing a nice crash when emitting this status message over D-Bus :/
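For reference, those three bytes form a structurally well-formed UTF-8 sequence that decodes to U+FDEF, which Unicode lists as a noncharacter. A minimal sketch of the decoding in plain C follows; the helper name `decode_utf8_3byte` is hypothetical and is not part of gabble, wocky or libxml2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not gabble/wocky API): decode one 3-byte UTF-8
 * sequence into a code point.
 * Returns 0 on success, -1 if the bytes are not a well-formed
 * 3-byte sequence. */
static int
decode_utf8_3byte (const unsigned char *bytes, uint32_t *out)
{
  assert (bytes != NULL && out != NULL);

  /* Lead byte must be 1110xxxx, continuation bytes 10xxxxxx. */
  if ((bytes[0] & 0xF0) != 0xE0
      || (bytes[1] & 0xC0) != 0x80
      || (bytes[2] & 0xC0) != 0x80)
    return -1;

  /* 4 payload bits from the lead byte, 6 from each continuation byte. */
  *out = ((uint32_t) (bytes[0] & 0x0F) << 12)
       | ((uint32_t) (bytes[1] & 0x3F) << 6)
       | (uint32_t) (bytes[2] & 0x3F);

  /* Reject overlong encodings and UTF-16 surrogates. */
  if (*out < 0x800 || (*out >= 0xD800 && *out <= 0xDFFF))
    return -1;

  return 0;
}
```

Passing `{0xef, 0xb7, 0xaf}` to this helper yields 0xFDEF, so the sequence passes a purely structural UTF-8 check and only fails a stricter check that also excludes noncharacters.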
Comment 1 Sjoerd Simons 2011-11-01 02:34:55 UTC
Created attachment 52982
straw-men patch

Silly patch that works around the issue. We still need to check all the places where we could get non-UTF-8 out of libxml and make sure we validate them all, plus add tests to make sure things are happy.

I'd really like to have some fuzzing tests at some point :)
Comment 2 Sjoerd Simons 2011-11-01 13:54:17 UTC
So it seems the issue stems from the fact that Prosody, and probably other XMPP servers, pass through all valid Unicode code points, even though some of those code points are specified as noncharacters, which should only be used internally.

D-Bus and GLib, on the other hand, only consider Unicode characters to be *valid*, not all Unicode code points.
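To make the distinction concrete: Unicode defines 66 noncharacters, namely U+FDD0..U+FDEF plus the last two code points of each of the 17 planes (U+FFFE, U+FFFF, U+1FFFE, and so on up to U+10FFFF). A rough sketch of such a check in plain C is below; `is_unicode_noncharacter` is a hypothetical name and this is not the code from the fix branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if cp is one of the 66 Unicode
 * noncharacters. The caller must pass a code point in the Unicode
 * range (0 .. 0x10FFFF). */
static bool
is_unicode_noncharacter (uint32_t cp)
{
  assert (cp <= 0x10FFFF);

  /* The contiguous block U+FDD0..U+FDEF (32 code points). */
  if (cp >= 0xFDD0 && cp <= 0xFDEF)
    return true;

  /* The last two code points of every plane: U+nFFFE and U+nFFFF. */
  return (cp & 0xFFFE) == 0xFFFE;
}
```

With this check, U+FDEF from the presence status above is flagged, while U+FFFD REPLACEMENT CHARACTER is not.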

Great fun!
Comment 3 Sjoerd Simons 2011-11-28 02:19:05 UTC
Fixed in my branch:
  http://cgit.collabora.com/git/user/sjoerd/wocky.git/log/?h=invalid-character-test
Comment 4 Simon McVittie 2011-11-28 02:33:38 UTC
This would benefit from https://bugzilla.gnome.org/show_bug.cgi?id=610969 being fixed. Maybe a GLib reviewer will notice that bug one day, or maybe based on your experience of writing this patch you can give feedback on which of the proposed features on that bug you would/wouldn't find useful...

+ g_string_append (result, "�");

I'm not sure how portable it is to have UTF-8 in our string constants: the version in GLib is "\357\277\275" with a comment explaining that it's U+FFFD REPLACEMENT CHARACTER.
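As an illustration of the escaped form, here is a small sketch in plain C that keeps the source file pure ASCII. The macro and helper names are hypothetical, and it uses a fixed buffer instead of a GString so that it builds without GLib.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* U+FFFD REPLACEMENT CHARACTER as UTF-8 (bytes EF BF BD), written with
 * octal escapes so the source does not depend on the file encoding. */
#define REPLACEMENT_CHARACTER_UTF8 "\357\277\275"

/* Hypothetical helper: append the replacement character to the
 * NUL-terminated string in buf, which has room for buf_size bytes.
 * Returns 0 on success, -1 if it does not fit. */
static int
append_replacement (char *buf, size_t buf_size)
{
  size_t used;
  size_t extra = sizeof REPLACEMENT_CHARACTER_UTF8 - 1;

  assert (buf != NULL);

  used = strlen (buf);
  if (used + extra + 1 > buf_size)
    return -1;

  /* Copy the three bytes plus the terminating NUL. */
  memcpy (buf + used, REPLACEMENT_CHARACTER_UTF8, extra + 1);
  return 0;
}
```

The octal escapes 357, 277 and 275 are 0xEF, 0xBF and 0xBD, which is the UTF-8 encoding of U+FFFD.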

Otherwise this looks fine.
Comment 5 Sjoerd Simons 2011-11-29 04:00:53 UTC
Fixed in git

