Bug 90118 - SyncEvolution crash trying to convert a string to a GVariant string
Summary: SyncEvolution crash trying to convert a string to a GVariant string
Status: RESOLVED FIXED
Alias: None
Product: SyncEvolution
Classification: Unclassified
Component: SyncEvolution (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: Patrick Ohly
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-04-20 17:53 UTC by renato filho
Modified: 2015-06-05 15:26 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
full crash backtrace (12.06 KB, text/plain)
2015-04-20 17:54 UTC, renato filho
Details
Avoid crash during the log (1.42 KB, patch)
2015-04-20 17:57 UTC, renato filho
Details | Splinter Review
incomplete attempt to enforce sane UTF-8 string in the D-Bus bindings (10.97 KB, patch)
2015-06-01 08:26 UTC, Patrick Ohly
Details | Splinter Review

Description renato filho 2015-04-20 17:53:04 UTC
During the sync, syncevolution try to log some information about the event and crash because the function 'g_variant_new_string' return a null pointer.


The event title contains '"@default/calendar_uoa_5: updating \"Lunch Thursday - Steph & Joe\377Lunch Thursday w/Steph\""' which is not a valid utf-8 string. due the '\377'. I am not sure if the char '\377' is part of the event title or it was generated by a previous conversion. 


This cause this function to crash: (/src/gdbusxx/gdbus-cxx-bridge.h:1510)

g_variant_builder_add_value(&builder, g_variant_new_string(value.c_str());

because the g_variant_new_string returns a null pointer.
Comment 1 renato filho 2015-04-20 17:54:55 UTC
Created attachment 115231 [details]
full crash backtrace
Comment 2 renato filho 2015-04-20 17:57:43 UTC
Created attachment 115232 [details] [review]
Avoid crash during the log

check for g_variant_new_string result before call g_variant_builder_add_value
Comment 3 Patrick Ohly 2015-05-05 12:43:44 UTC
Comment on attachment 115232 [details] [review]
Avoid crash during the log

Review of attachment 115232 [details] [review]:
-----------------------------------------------------------------

The commit message needs to include information about the root cause for the failure. As mentioned in the original bug report, this occurs when the "char *" string isn't really UTF-8 encoded.

I am not convinced that silently ignoring the issue by replacing the actual string with an empty one is the right thing to do. It's okay for a hotfix, but not for inclusion upstream.

If it is a string created by SyncEvolution, then throwing an error is the right thing to do. If it is a string coming from an uncertain source (like the calendar event), then it would be better to sanitize the data before sending it via D-Bus. For example, g_utf8_validate() could be used to copy just the valid characters and replace the rest with a special character like ?

Unfortunately catching all places where invalid strings might be passed into the gdbus layer is a lot of work, so pampering over the problem inside the layer by sanitizing all "char *" strings might be the only viable approach. Perhaps gdbus should not accept the ambiguous "char *" at all and instead rely on the caller to convert to a "UTF8Char *" first, either by casting (string known to be valid) or converting (untrusted string).
Comment 4 Patrick Ohly 2015-05-05 12:44:13 UTC
I'll have a look at fixing this properly.
Comment 5 Patrick Ohly 2015-06-01 08:26:00 UTC
Created attachment 116196 [details] [review]
incomplete attempt to enforce sane UTF-8 string in the D-Bus bindings

(In reply to Patrick Ohly from comment #4)
> I'll have a look at fixing this properly.

My attempt to introduce an explicit "this is an UTF-8 string" type ended up causing much more changes than hoped for, and still isn't complete (attached). I stopped when appending a std::pair<std::string, ...> as return value would not compile because it is treated like appending a parameter - adding a special exception as for normal return values would break the key point of the exercise for D-Bus method calls (forcing the caller to think about string content).

So I'll focus instead on silently removing invalid bytes.
Comment 6 Patrick Ohly 2015-06-05 15:26:48 UTC
Included the following patch in the upcoming 1.5.1:

commit a5f0139a1070cdde4ec45781059d8703d0341381
Author: Patrick Ohly <patrick.ohly@intel.com>
Date:   Mon Jun 1 15:05:16 2015 +0200

    syncing: avoid segfault for invalid text inside items (FDO #90118)
    
    As reported by Canonical, syncing fails if data items contain
    text which is not correct UTF-8 in one of the fields that
    SyncEvolution logs in the command line output (like SUMMARY of
    a calendar event).
    
    That is because the byte string coming from the item is passed
    unchecked to the D-Bus implementation for transmission via D-Bus. But
    D-Bus strings must be correct UTF-8, so depending on the D-Bus library
    in use, one gets a segfault (GIO D-Bus, due to an unchecked NULL
    pointer access) or an "out of memory" error (libdbus, which checks for
    NULL).
    
    What the D-Bus bindings now do is checking the string in advance (to
    avoid error messages inside the D-Bus implementations) and then
    replacing all invalid bytes with question marks. The rest of the
    string is preserved.
    
    Handling this inside the D-Bus binding layer is not the "correct"
    solution (which would be to check for UTF-8 in the higher layers were
    such bad data might get into SyncEvolution), but it is the less
    intrusive and more complete one. Changing the bindings such that all
    strings must be declared explicitly as "UTF-8 string" would have been
    a way to find all places where such checks are missing, but it turned
    out to be too complex and requiring too many changes.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.