Since I upgraded to SyncEvolution 1.5, synchronizing my calendar (in two-way mode) with my CalDAV server (Radicale) generates an abnormally high amount of network traffic.
This is the corresponding entry in my apache log:
2001:7a8:3d80:0:a00:27ff:fe70:ffdd - - [16/Nov/2014:00:23:09 +0100] "REPORT /sebastien/calendar.ics/ HTTP/1.1" 207 26684196 "-" "SyncEvolution"
As you can see, this simple two-way synchronization transfers about 26 MB, while the .ics file of the calendar is only about 300 kB.
This is of course highly inefficient, and also makes SyncEvolution almost unusable for me, because synchronization now takes 6 minutes (due to the low upload bandwidth of my server).
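For reference, the numbers above can be worked out directly from the access log entry. The following is only a rough sketch: the regular expression assumes Apache's combined log format, and the 300 kB file size and 6 minute duration are the approximate figures quoted in this report, not measured values.

```python
import re

LOG_LINE = (
    '2001:7a8:3d80:0:a00:27ff:fe70:ffdd - - [16/Nov/2014:00:23:09 +0100] '
    '"REPORT /sebastien/calendar.ics/ HTTP/1.1" 207 26684196 "-" "SyncEvolution"'
)

# Combined log format: "<method> <path> <protocol>" <status> <bytes> ...
PATTERN = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+)'
)


def response_size(line):
    """Return (method, status, bytes sent) for one access log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a combined-format log line")
    return match.group("method"), int(match.group("status")), int(match.group("size"))


method, status, size = response_size(LOG_LINE)
ics_size = 300 * 1000   # approximate size of calendar.ics (assumption from the report)
sync_seconds = 6 * 60   # approximate sync duration (assumption from the report)

ratio = size / ics_size
throughput = size / sync_seconds
print(f"{method} {status}: {size / 1e6:.1f} MB, "
      f"{ratio:.0f}x the .ics file, {throughput / 1000:.0f} kB/s")
```

So the single REPORT response is roughly two orders of magnitude larger than the calendar file itself.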
SyncEvolution 1.4 was much more efficient in this respect, and was doing the synchronization in a couple of seconds.
Any idea how to fix this? Is there a simple workaround that I could apply?
Thanks for creating SyncEvolution.
I've no idea why 1.5 should be worse than 1.4. Please help me investigate that further.
For that, increase the loglevel to 4 in the target-config of the CalDAV sync and run a sync.
Was it reported as a normal, two-way sync with no changes on either side?
Look at the syncevolution-log.html of the target-config side for that sync session. What is this REPORT which produces the excessive amount of data?
It would be useful if I could look at the log leading up to that REPORT. Perhaps you can send it to me at patrick.ohly at gmx.de? Feel free to strip out personal data, although I also of course promise to keep it confidential.
It may also be useful to downgrade to 1.4; the archives can be downloaded here: http://downloads.syncevolution.org/syncevolution/archive/
Don't bother if you don't have the time to try that. Perhaps the additional information about 1.5 already gives me enough clues.
The sync was indeed reported as normal, two-way sync.
I sent you a detailed log by private email.
The unexpectedly high traffic is caused by a "is the collection empty" check. This is relevant for slow sync prevention, because a slow sync is allowed to run when there is no data.
The 1.5 release does the check whereas 1.4 (unintentionally) skipped it. This was fixed for 220.127.116.11 in:
Author: Patrick Ohly <firstname.lastname@example.org>
Date: Wed Aug 20 16:23:15 2014 +0200
VirtualSyncSource, MapSyncSource: implement m_isEmpty
This is relevant for isUsable() and not so much (not at all?) for
slow sync prevention.
The comment explains why the change was made. It *is* relevant for slow sync prevention, it just wasn't the primary reason for changing the code.
The immediate problem is that the "is empty" check is currently done using code which downloads the entire calendar. This is necessary to work around bugs in (ironically) servers like Radicale 0.7 which did (do?) not implement a less costly operation reliably: when asked to report only about VEVENTs, VTODOs or VJOURNALs are also included in the response and vice versa.
The underlying problem is that a collection may contain a mixture of those, but SyncEvolution only syncs one kind and thus needs to determine the content type of each item. There is no good support for that in CalDAV.
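To illustrate why the item data has to be downloaded at all: the content type can only be determined by looking inside each item. A minimal sketch of such a client-side check follows. The function names are hypothetical and this is not SyncEvolution's actual code (which is C++); it merely shows the idea of classifying an item by its top-level component.

```python
def component_types(ical_text):
    """Return the names of the components directly inside VCALENDAR.

    VTIMEZONE is ignored because it only accompanies the real content.
    Nested components (for example a VALARM inside a VEVENT) are skipped.
    """
    kinds = set()
    depth = 0
    for raw in ical_text.splitlines():
        line = raw.strip().upper()
        if line.startswith("BEGIN:"):
            name = line[len("BEGIN:"):]
            if depth == 1 and name != "VTIMEZONE":
                kinds.add(name)
            depth += 1
        elif line.startswith("END:"):
            depth -= 1
    return kinds


def keep_for_sync(ical_text, wanted="VEVENT"):
    """True if the item contains the component type that is being synced."""
    return wanted in component_types(ical_text)
```

A client syncing events would call keep_for_sync() on every downloaded item and drop the ones for which it returns False, which is exactly the step that requires the full item content.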
There are three possible solutions:
1. restore the 1.4 behavior of always assuming that there is data (basically disabling the more intelligent slow-sync prevention)
2. implement a simpler "is empty" check with potentially false "is not empty" results when there are some items of unknown content (for example, we sync VEVENTs and the collection contains only VJOURNAL, but we don't know that because we avoid downloading the data and thus incorrectly assume that there might be VEVENTs)
3. implement logic which determines whether a collection really can contain a mixture of items and only do the more costly check in that case
Option 3 can be combined with 2.
I think I'll start with 2.
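The logic of option 2 can be sketched as follows. This is only an illustration under assumptions: the input is a hypothetical stream with one entry per item on the server, holding the component type if it happens to be known and None otherwise. The real check would work on the server's response instead.

```python
from typing import Iterable, Optional


def is_empty(items: Iterable[Optional[str]], wanted: str = "VEVENT") -> bool:
    """Conservative "is empty" check.

    Returns False as soon as an item is, or might be, of the wanted type.
    An unknown type (None) counts as "might be", so the result can be a
    false "not empty", but never a false "empty".
    """
    for kind in items:
        if kind is None or kind == wanted:
            return False
    return True


# A collection with only VJOURNAL items whose type is unknown to the client
# is reported as "not empty", which is the safe direction to err in.
print(is_empty([None, None]))
```

Because the loop returns at the first candidate item, the rest of the listing never needs to be read.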
Thanks for making progress on the issue.
What I still don't get, however, is why a transfer of 26 MB is needed for a 300 kB ICS file. It should theoretically be possible to keep your existing check (i.e. downloading the whole calendar) but use a more efficient download method. Is it a limitation of the CalDAV protocol? A bug in Radicale? I don't know the underlying technical details, so I may be missing something though…
(In reply to Sébastien Villemot from comment #4)
> What I still don't get, however, is why a transfer of 26 MB is needed for a
> 300 kB ICS file.
I forgot to comment on that. That happens on the server side. There was a recent change in libical that causes it to include the entire history of a time zone in each item. For example, most data logged for the first item in your log file is just the VTIMEZONE:
PRODID:-//Ximian//NONSGML Evolution Calendar//EN
Worse, the *same* item also contains other VTIMEZONEs. I don't know why that happens. Radicale should only send the VTIMEZONE definitions actually referenced in an item. Check your .ics file - perhaps a client stored the item as it later gets sent out by the server? But then the .ics file itself should already be much larger.
For libical, see
Actually radicale does not use libical, so I'm not sure I understand the connection to the libical change that you mentioned.
However it is clear that the problem is linked to timezones. I cleaned up my calendar.ics file by removing all the custom timezones (those with "freeassociation.sourceforge.net"), and now the REPORT request transfers 1.1 MB (instead of 26 MB previously). This is still 3 times as big as the whole calendar.ics, but it is nevertheless much better.
Looking at the log, it still looks like the spurious traffic comes from timezone information that is repeated for every VEVENT entry.
(In reply to Sébastien Villemot from comment #6)
> Actually radicale does not use libical,
You are right. I forgot that radicale is just fudging proper CalDAV semantics by doing text manipulations (radicale/ical.py).
> so I'm not sure I understand the
> connection to the libical change that you mentioned.
The large VTIMEZONE definitions in the log look exactly like the ones generated by the modified libical. If Radicale does not generate them, then it must have received them from a client (perhaps even SyncEvolution).
Current SyncEvolution should send the smaller VTIMEZONEs with just one entry for summer time and one for winter time.
> However it is clear that the problem is linked to timezones. I cleaned up my
> calendar.ics file by removing all the custom timezones (those with
> "freeassociation.sourceforge.net"), and now the REPORT request transfers
> 1.1 MB (instead of 26 MB previously). This is still 3 times as big as the
> whole calendar.ics, but it is nevertheless much better.
Removing them entirely might lead to missing time zone information in the items sent to clients. It might not matter in practice because clients (including SyncEvolution) often look at the TZID first to match it with an existing local definition.
> Looking at the log, it still looks like the spurious traffic comes from
> timezone information that is repeated for every VEVENT entry.
That's worth discussing with Radicale devs. I don't know how they currently decide which VTIMEZONEs need to be inserted into the individual items. If they include *all* of them, then it's not surprising that there is significant overhead once a calendar contains several different time zone definitions.
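As an illustration of what "only the VTIMEZONEs actually referenced" would mean, here is a small sketch based on plain text matching. It is not Radicale's code; the function names are made up, and the regular expression makes the simplifying assumption that TZID values contain no colon or semicolon.

```python
import re

# TZID parameter on a property, e.g. DTSTART;TZID=Europe/Paris:20141116T090000
TZID_PARAM = re.compile(r';TZID="?([^":;]+)"?[:;]')
# TZID property inside a VTIMEZONE definition
TZID_PROP = re.compile(r"^TZID:(.+)$", re.MULTILINE)


def referenced_tzids(component_text):
    """Return the set of TZIDs used by the properties of one item."""
    return set(TZID_PARAM.findall(component_text))


def needed_timezones(component_text, vtimezones):
    """Keep only those VTIMEZONE blocks whose TZID the item refers to."""
    wanted = referenced_tzids(component_text)
    result = []
    for tz in vtimezones:
        match = TZID_PROP.search(tz)
        if match and match.group(1).strip() in wanted:
            result.append(tz)
    return result
```

With a filter like this, the per-item overhead grows with the number of time zones an item uses rather than with the number of definitions stored in the whole calendar.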
(In reply to Patrick Ohly from comment #7)
> (In reply to Sébastien Villemot from comment #6)
> > Looking at the log, it still looks that the spurious traffic comes from
> > timezone information that is repeated for every VEVENT entry.
> That's worth discussing with Radicale devs. I don't know how they currently
> decide which VTIMEZONEs need to be inserted into the individual items. If
> they include *all* of them, then it's not surprising that there is
> significant overhead once a calendar contains several different time zone
> definitions.
My understanding is indeed that *all* VTIMEZONEs are inserted into all individual items in the REPORT response.
So in the end I am not sure that the issue under discussion is actually a problem in SyncEvolution. I leave it up to you to decide if you want to implement a workaround for what now looks to me like a bug in Radicale.
Thanks for your feedback on this issue, it is much appreciated.
Yes, I want to avoid the costly REPORT including the actual calendar data when possible. The network traffic might be particularly bad with Radicale, but it's still something that SyncEvolution shouldn't be doing. A normal sync with no changes should cause very little network traffic.
Just for the record, I have reported the Radicale issue: https://github.com/Kozea/Radicale/issues/243
The current SyncEvolution master is using a more efficient approach for the "is empty" check (point 2 from comment 3).
I don't intend to implement point 3, because the current solution for CalDAV (a REPORT without item content) should already be comparable in performance to the simpler PROPFIND and adding a slower, more accurate approach is of little value (isEmpty() is allowed to err on the side of caution and report "not empty" for cases where the server has mixed content and only items which would get ignored during the sync).
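For readers unfamiliar with CalDAV, a "REPORT without item content" roughly means a calendar-query (RFC 4791) that asks only for the ETag and not for calendar-data. The sketch below builds such a request body. It is an illustration, not the exact request SyncEvolution sends, and the function name and its with_data switch are made up for this example.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
CALDAV = "urn:ietf:params:xml:ns:caldav"


def calendar_query(component="VEVENT", with_data=False):
    """Build the XML body of a CalDAV calendar-query REPORT.

    with_data=False requests only the ETag of each matching item;
    with_data=True additionally requests the full iCalendar content.
    """
    ET.register_namespace("D", DAV)
    ET.register_namespace("C", CALDAV)
    query = ET.Element(f"{{{CALDAV}}}calendar-query")
    prop = ET.SubElement(query, f"{{{DAV}}}prop")
    ET.SubElement(prop, f"{{{DAV}}}getetag")
    if with_data:
        ET.SubElement(prop, f"{{{CALDAV}}}calendar-data")
    flt = ET.SubElement(query, f"{{{CALDAV}}}filter")
    vcal = ET.SubElement(flt, f"{{{CALDAV}}}comp-filter", name="VCALENDAR")
    ET.SubElement(vcal, f"{{{CALDAV}}}comp-filter", name=component)
    return ET.tostring(query, encoding="unicode")


print(calendar_query())
```

The body would be sent with the REPORT method and a "Depth: 1" header to the collection URL. Whether the server honors the comp-filter correctly is exactly the point where some servers are unreliable.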
Author: Patrick Ohly <email@example.com>
Date: Thu Feb 12 13:36:42 2015 +0100
CalDAV: more efficient "is empty" check (FDO #86335)
Since 18.104.22.168, syncing WebDAV collections always checks first
whether there are items in the collections. This was partly done for
slow sync prevention (which is not necessary for empty collections),
partly for the "is the datastore usable" check.
However, this did not take into account that for CalDAV collections,
the entire content gets downloaded for this check. That is because
filtering by item type (VEVENT vs. VJOURNAL) is not implemented
correctly by all servers. So now all CalDAV syncs, whether incremental
or slow, always transferred all items, which is not the
intention (incremental syncs should be fast and efficient).
This commit adds a more efficient isEmpty() check: for simple CardDAV
collections, only luid and etag get transferred, as in
listAllItems(). This is the behavior from 1.5.
For CalDAV, a report with a filter for the content type is used and
the transfer gets aborted after the first item, without actually
double-checking the content of the item. This is different from
listAllItems(), which really transfers the content. This extra content
check would only be needed for some old servers (Radicale 0.7) and is
not essential, because reporting "not empty" even when empty is safe.
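
The "abort after the first item" idea from the commit message can be sketched with an incremental XML parser. This is a Python illustration only, not the actual C++ implementation: it assumes the multistatus response arrives as a sequence of chunks and that any DAV:response element stands for one item.

```python
import xml.etree.ElementTree as ET


def has_any_item(chunks):
    """Return True as soon as one complete DAV:response element was parsed.

    chunks is an iterable of bytes as they arrive from the network. The
    function stops pulling chunks once the answer is known, so the caller
    can close the connection instead of reading the remaining data.
    """
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == "{DAV:}response":
                return True
    return False
```

The content of the item is deliberately not inspected here, which mirrors the trade-off described above: the answer may be "not empty" for a collection that holds only items of another type.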