Bug 52800 - rewrite synccompare in C++
Summary: rewrite synccompare in C++
Status: RESOLVED MOVED
Alias: None
Product: SyncEvolution
Classification: Unclassified
Component: SyncEvolution
Version: unspecified
Hardware: All All
Importance: low enhancement
Assignee: SyncEvolution Community
Reported: 2010-04-13 01:34 UTC by SyncEvolution Community
Modified: 2018-10-13 12:39 UTC

Attachments
longest common sequence implementation with a notion of "better" sequences (deleted)
2010-04-13 01:36 UTC, SyncEvolution Community

Description Patrick Ohly 2012-07-29 18:36:00 UTC


---- Reported by jingke.zhang@intel.com 2010-04-13 01:34:14 +0000 ----

This issue is from BMO#2432 (http://bugzilla.moblin.org/show_bug.cgi?id=2432)

Currently synccompare is a Perl script because that was the easiest way to have
regular search/replace on all platforms. For performance reasons and for better
control of the result, a rewrite in C++ would be useful.

A better description of the task still needs to be added, so this is assigned
to me.

------- Comment #1 From pohly 2009-06-08 07:42:59 PST -------

synccompare is a tool which turns sets of contacts or calendar items into a
normal form that is slightly easier to read:
* uses indentation for BEGIN/END pairs
* same order of properties and values
* no redundant default values
* configurable line length
* etc.
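
As an illustration of the first bullet, here is a minimal C++ sketch of
BEGIN/END indentation. The function name indentBlocks and the two-space step
are assumptions made for this example and are not taken from synccompare. It
expects already unfolded content lines and matches BEGIN:/END: in upper case
only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: indent each content line by two spaces per
// BEGIN/END nesting level. Input lines must already be unfolded.
std::vector<std::string> indentBlocks(const std::vector<std::string>& lines) {
    std::vector<std::string> out;
    std::size_t depth = 0;
    for (const std::string& line : lines) {
        // An END: line closes the current level before it is printed.
        if (line.compare(0, 4, "END:") == 0 && depth > 0) {
            --depth;
        }
        out.push_back(std::string(2 * depth, ' ') + line);
        // A BEGIN: line opens a new level for the lines that follow.
        if (line.compare(0, 6, "BEGIN:") == 0) {
            ++depth;
        }
    }
    return out;
}
```

Sorting properties and dropping default values would be separate passes over
the same list of lines.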

In addition, it compares two sets and prints items which differ (and only those
items) in a side-by-side comparison similar to a context diff. The return code
indicates whether changes were found. The heading is configurable.

A normal form is necessary before comparison because vCard and iCalendar allow
many different equivalent formats for the same information.

synccompare is used by the syncevolution command line to report data changes.
It is used by client-test to find unwanted data modifications during testing.

For that second role, synccompare also supports transformations in the normal
form which are caused by known server deficiencies. For example, suppose server
foo doesn't support property BAR. This would show up in the diff of
Client::Sync::*::testItems as a lost property BAR. When invoked with
CLIENT_TEST_SERVER=foo, synccompare will remove all properties BAR and thus the
diff succeeds.
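
A rough C++ sketch of such a server-specific transformation follows. The
server name "foo" and the property BAR are the placeholders from the example
above. The function names (hasPropertyName, dropUnsupported,
serverFromEnvironment) are invented for this illustration. Only the
CLIENT_TEST_SERVER environment variable comes from the existing tool.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// True if the property name of an unfolded content line equals `name`.
// The name ends at the first ';' (parameters) or ':' (value).
bool hasPropertyName(const std::string& line, const std::string& name) {
    const std::size_t end = line.find_first_of(";:");
    return end != std::string::npos && line.compare(0, end, name) == 0;
}

// Hypothetical rule: server "foo" is known to lose property BAR, so BAR is
// removed from the normal form before comparing.
std::vector<std::string> dropUnsupported(const std::vector<std::string>& lines,
                                         const std::string& server) {
    std::vector<std::string> out;
    for (const std::string& line : lines) {
        if (server == "foo" && hasPropertyName(line, "BAR")) {
            continue;
        }
        out.push_back(line);
    }
    return out;
}

// Reads the server name the same way the test setup passes it in.
std::string serverFromEnvironment() {
    const char* value = std::getenv("CLIENT_TEST_SERVER");
    return value ? value : "";
}
```

Applying dropUnsupported() to both sides of the comparison makes the lost
property invisible in the diff, while all other servers still see BAR.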

synccompare is meant to find defects in the PIM data handling code used by
SyncEvolution. Therefore it cannot rely on that same code to generate the
normal form. Otherwise a bug which, for instance, drops a property would affect
the sync and the verification the same way and not show up in the diff.

Currently synccompare uses regular expression search/replace to manipulate
vCard 3.0 and iCalendar 2.0 data. These formats are regular enough to do simple
transformations (line unfolding) via regular expressions. vCard 2.1 and
vCalendar 1.0 have more difficult folding rules and therefore do not work 100%
correctly at the moment. It would be good to also support them.
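
For reference, line unfolding for vCard 3.0 and iCalendar 2.0 amounts to
removing every line break that is followed by a single space or tab. A minimal
C++ sketch using std::regex is shown here. The function name unfold is an
assumption for this example, and the different folding rules of vCard 2.1 and
vCalendar 1.0 are not handled.

```cpp
#include <regex>
#include <string>

// Unfold vCard 3.0 / iCalendar 2.0 text: a line break (CRLF or bare LF)
// followed by one space or tab marks a continuation and is removed.
std::string unfold(const std::string& text) {
    static const std::regex fold("\r?\n[ \t]");
    return std::regex_replace(text, fold, "");
}
```

Line breaks that are not followed by whitespace are left alone, so the
boundaries between properties survive.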

synccompare is implemented as a Perl script because that is the most widely
deployed implementation of regular expression search/replace with UTF-8 support
(useful for correct line length calculation!). It's also fairly simple to come
up with quick hacks for normalization... but it is hard to maintain and has
performance issues.

A rewrite with C++ and suitable libraries (Boost?) might overcome these
limitations. Transformations should be specified in a custom, higher-level
format instead of s// statements in the source code (to be defined - perhaps a
configurable list of regular expressions?). Not all target platforms will
necessarily have these libraries, so it might be useful to keep the Perl script
around and perhaps rewrite it so that it uses the same configuration.
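
One possible shape for the "configurable list of regular expressions" idea is
sketched below with std::regex. The Rule struct and the applyRules function
are hypothetical. The format is explicitly still to be defined, so this is
only meant to show that the transformations could live in data instead of in
s// statements.

```cpp
#include <regex>
#include <string>
#include <vector>

// One search/replace step; a list of these could be loaded from a
// configuration file instead of being hard-coded.
struct Rule {
    std::string pattern;      // ECMAScript regular expression
    std::string replacement;  // replacement text, may use $1, $2, ...
};

// Apply all rules in order, each one to the result of the previous one.
std::string applyRules(std::string text, const std::vector<Rule>& rules) {
    for (const Rule& rule : rules) {
        text = std::regex_replace(text, std::regex(rule.pattern), rule.replacement);
    }
    return text;
}
```

For example, the rules {"\\r\\n", "\n"} followed by {"\\n[ \\t]", ""} first
normalize line endings and then unfold continuation lines. A Perl script
could read the same list and apply it with s///.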

I already implemented a better diff algorithm in C++. "Better" in the sense
that if there are two equally short diffs, it will pick the one which
covers fewer items. I have the code somewhere, need to upload it here.
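
The code mentioned here is not reproduced in this report. As background only,
a textbook longest-common-subsequence computation over two lists of lines is
sketched below. It does not implement the tie-breaking rule for "better"
sequences, and the function name longestCommonSubsequence is chosen for this
example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Plain dynamic-programming LCS. Returns index pairs (i, j) with
// a[i] == b[j]; everything not listed is part of the diff.
std::vector<std::pair<std::size_t, std::size_t>>
longestCommonSubsequence(const std::vector<std::string>& a,
                         const std::vector<std::string>& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    // len[i][j] is the LCS length of the suffixes a[i..] and b[j..].
    std::vector<std::vector<std::size_t>> len(
        n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t j = m; j-- > 0;) {
            len[i][j] = (a[i] == b[j])
                            ? len[i + 1][j + 1] + 1
                            : std::max(len[i + 1][j], len[i][j + 1]);
        }
    }
    // Walk the table from the front to recover one longest sequence.
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < n && j < m) {
        if (a[i] == b[j]) {
            pairs.emplace_back(i, j);
            ++i;
            ++j;
        } else if (len[i + 1][j] >= len[i][j + 1]) {
            ++i;
        } else {
            ++j;
        }
    }
    return pairs;
}
```

When several sequences have the same length, this walk simply takes the first
one it finds. A notion of "better" would have to replace that choice.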

------- Comment #2 From pohly 2009-06-12 07:21:31 PST -------

Created an attachment
longest common sequence implementation with a notion of "better" sequences

This is the code which might be useful for implementing the diff in synccompare
in C++.



---- Additional Comments From jingke.zhang@intel.com 2010-04-13 01:36:17 +0000 ----

Created attachment 187
longest common sequence implementation with a notion of "better" sequences

Add the attachment of the BMO#2432 bug.



--- Bug imported by patrick.ohly@gmx.de 2012-07-29 20:36 UTC  ---

This bug was previously known as bug 675 at https://bugs.meego.com/show_bug.cgi?id=675
Imported an attachment (id=64881)
Comment 1 GitLab Migration User 2018-10-13 12:39:23 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/SyncEvolution/syncevolution/issues/23.

