Bug 77065

Summary: contact merge corrupts PHOTO
Product: SyncEvolution Reporter: renato filho <renatox>
Component: SyncEvolutionAssignee: renato filho <renatox>
Status: NEEDINFO --- QA Contact:
Severity: major    
Priority: highest CC: syncevolution-issues
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: Patch to fix photo merging.
server item wins, URI sent back
valgrind log of the sync operation which reults in this error
Syncevolution log fails to sync contact
Sync contact works

Description renato filho 2014-04-04 19:53:02 UTC
I using sync evolution to sync my contacts and after a while I started
to received the error 400 during the sync and now every sync
("two-way") returns the same error.
Comment 1 renato filho 2014-04-04 19:55:15 UTC
Created attachment 96920 [details]
Patch to fix photo merging.
Comment 2 renato filho 2014-04-04 19:56:58 UTC
After some discussion with patrik on IRC, we figure out this is a problem on avatar merge, patrick suggest a change on src/syncevo/configs/scripting/05vcard-merge.xml to fix it.
Comment 3 Patrick Ohly 2014-04-07 09:50:41 UTC
The problem occurs when local and remote item are both modified and PHOTO data was moved into a local file (as done by EDS). If the PHOTO is not modified, the local file should win the merge.

What happens instead is that 05vcard-merge.xml did not handle this particular case and left two different PHOTO values in the winning and loosing item. libsynthesis then concatenated the two values (URL + raw data).

It is unclear at the moment to me why the engine handled PHOTO like that. The "merge" attribute was not set, so it shouldn't have concatenated. Need to check.

But regardless of that, the merge needs to be done by the script because the engine does not know that PHOTO, PHOTO_TYPE and PHOTO_VALUE are related.

SyncEvolution doesn't have automated tests for merge handling because merge handling used to be under control of remote SyncML servers. Now we can use both http server mode in the nightly testing (for example client-test EDS<->EDS, with the existing testMerge) and/or local sync (as in test-dbus.py). The challenge remains that testing needs to work with various combinations of storages. This makes it more suitable to use client-test because can already be configured to run with different combinations. What's missing is a way to tell client-test what test data to use and what the expected result is.
Comment 4 Patrick Ohly 2014-04-09 15:59:53 UTC
Created attachment 97139 [details]
server item wins, URI sent back

I thought I knew what triggered the issue, but it looks like I was wrong. Here's the log file from the local side where merging has to be done for an update/update confict (look for "John 2 Doe") and it works correctly without the script patch.
Comment 5 Patrick Ohly 2014-04-09 16:00:36 UTC
Renato, can you provide a log file at level 4 which shows how the local, EDS side corrupts the PHOTO value?
Comment 6 renato filho 2014-04-09 18:22:45 UTC
(In reply to comment #5)
> Renato, can you provide a log file at level 4 which shows how the local, EDS
> side corrupts the PHOTO value?

Hi Patrick this is the log: http://ubuntuone.com/16V9ZZM8TsQYCuLfHBbm07
Comment 7 Patrick Ohly 2014-04-09 20:18:44 UTC
Do you have the possibility to run "syncevolution --daemon=no" for the failing sync under valgrind?

The steps during merging are the same as in my own test:
- script doesn't do anything
- winning PHOTO value is the URI from the local side
- when modifying the copy in EDS, a read/write cycle is involved
- during that step the PHOTO value gets corrupted.

This looks like plain old memory corruption to me. I'll try that myself tomorrow, perhaps using a larger photo.
Comment 8 Patrick Ohly 2014-04-09 20:40:18 UTC
(In reply to comment #7)
> Do you have the possibility to run "syncevolution --daemon=no" for the
> failing sync under valgrind?
> 
> The steps during merging are the same as in my own test:
> - script doesn't do anything
> - winning PHOTO value is the URI from the local side
> - when modifying the copy in EDS, a read/write cycle is involved
> - during that step the PHOTO value gets corrupted.

In your log. In my test the result is fine.
Comment 9 Patrick Ohly 2014-04-10 12:45:41 UTC
(In reply to comment #3)
> It is unclear at the moment to me why the engine handled PHOTO like that.
> The "merge" attribute was not set, so it shouldn't have concatenated. Need
> to check.

It didn't do that when I was checking. That means that there is an unidentified bug in the code somewhere. I consider this issue here unresolved until we find out what it is.

Updating the merge script hides the problem and is the right solution for photo merging, but it merely hides the reason for the corruption, it doesn't solve it.

Renato, as I can't reproduce the problem, can you investigate?

Running valgrind would be my first step. Next comes running under gdb.

The merge script compares photo data but doesn't write to it, so the two sides should still be okay when MERGEFIELDS(mode) gets called. You can set a breakpoint in sysync::TMFTypeFuncs::func_MergeFields for that. You need to find the code which prints:


[2014-04-04 19:06:36.981] Field 'PHOTO' available and enabled for merging, mode/sep=0x00FE, relevant
[2014-04-04 19:06:36.981] - assigned in winning / assigned in loosing
[2014-04-04 19:06:36.981] - try merging winning value 'file:///home/phablet/.local/share/evolut' with loosing value '�PNG

'
[2014-04-04 19:06:40.469]  merged something, winning value is 'file:///home/phablet/.local/share/evolut'
[2014-04-04 19:06:40.469] Winning and loosing Field 'PHOTO' not equal: 'file:///home/phablet/.local/sh' <> '�PNG

'
[2014-04-04 19:06:40.469] - updated fields such that both have same value 'file:///home/phablet/.local/share/evolut'

And then check the content of the fields for corruption.

Later, the local item is corrupted:

[2014-04-04 19:06:40.488] TStdLogicDS: Item updated with contents from remote
[2014-04-04 19:06:40.488] Item LocalID='pas-id-533EBFB40000000E', RemoteID='76045616125e182d', operation=replace, size: [maxlocal,maxremote,actual]
[2014-04-04 19:06:40.488] 
...
- 75 :     string PHOTO           [   0,   0,636325] : "file:///home/phablet/.local/share/evolution/add...0��ǐ��0���B��+�%�m9���V�ԭ�
- 76 :     string PHOTO_TYPE      [   0,   0,     7] : "unknown"
- 77 :     string PHOTO_VALUE     [   0,   0,     3] : "uri"

The item that gets sent back to remote is also corrupted:

[2014-04-04 19:06:41.423] 'Item_Generate' - generating SyncML item, SyncOp=replace, RemoteID=76045616125e182d [--][++] [->end] [->enclosing]

    +
    [2014-04-04 19:06:41.424] 'ScriptExecute' - Start executing Script, name=outgoingscript [--][++] [->end] [->enclosing]
    [2014-04-04 19:06:41.441] Generating....
    [2014-04-04 19:06:41.441] Item LocalID='', RemoteID='76045616125e182d', operation=replace, size: [maxlocal,maxremote,actual]
...
- 75 :     string PHOTO           [   0,   0,636325] : "file:///home/phablet/.local/share/evolution/add...0��ǐ��0���B��+�%�m9���V�ԭ�
- 76 :     string PHOTO_TYPE      [   0,   0,     0] : <unassigned>
- 77 :     string PHOTO_VALUE     [   0,   0,     3] : "uri"
Comment 10 renato filho 2014-04-11 17:09:17 UTC
Created attachment 97230 [details]
valgrind log of the sync operation which reults in this error
Comment 11 Patrick Ohly 2014-04-11 18:10:01 UTC
(In reply to comment #10)
> Created attachment 97230 [details]
> valgrind log of the sync operation which reults in this error

valgrind is not particularly happy about libdl, but that shouldn't be causing the current problem. I'm afraid you'll have to single-step through the PHOTO assignment to figure out why the data is not overwritten properly.
Comment 12 renato filho 2014-04-14 13:16:37 UTC
Created attachment 97349 [details]
Syncevolution log fails to sync contact
Comment 13 renato filho 2014-04-14 13:18:12 UTC
Created attachment 97350 [details]
Sync contact works
Comment 14 Patrick Ohly 2014-05-21 12:07:19 UTC
There are two separate issues left to work on:
* track down the reason for the memory corruption on ARM when the patch is not applied
* merge the PHOTO merge patch back into upstream SyncEvolution

Renato, any progress?
Comment 15 Patrick Ohly 2014-07-14 12:07:15 UTC
I committed the vcard merge fix into master. Getting to the bottom of the crash on ARM still would be good.

commit 1431fd4a7be7ef6fce5743cf757362c65ece0e8d
Author: Patrick Ohly <patrick.ohly@intel.com>
Date:   Fri Jul 11 02:35:41 2014 -0700

    datatypes: avoid PHOTO corruption during merge (FDO #77065)
    
    When handling an update/update conflict (both sides of the sync have an
    updated contact) and photo data was moved into a local file by EDS, the engine
    merged the file path and the photo data together and thus corrupted the photo.
    
    The engine does not know about the special role of the photo property.
    This needs to be handled by the merge script, and that script did not
    cover this particular situation. Now the loosing side is cleared,
    causing the engine to then copy the winning side over into the loosing
    one.
    
    Found by Renato Filho/Canonical when testing SyncEvolution for Ubuntu 14.04.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.