Bug 48832 - Improved MOVE_EVENT handling
Summary: Improved MOVE_EVENT handling
Status: RESOLVED MOVED
Alias: None
Product: Zeitgeist
Classification: Unclassified
Component: Engine
Version: 0.9.x
Hardware: Other All
Importance: medium normal
Assignee: zeitgeist-bugs@lists.freedesktop.org
 
Reported: 2012-04-17 10:48 UTC by Siegfried-Angel Gevatter Pujals
Modified: 2018-05-31 09:12 UTC

Description Siegfried-Angel Gevatter Pujals 2012-04-17 10:48:25 UTC
From https://bugs.launchpad.net/zeitgeist/+bug/778140


MOVE EVENTS
============================================

PRESENTATION

By definition, Zeitgeist's events are immutable, and the subject meta-data
they contain is a snapshot of how a given resource was back when the event
happened.

To be useful, some way of linking event subjects to their physical
representation is needed. The primary identifier for doing this is the
subject's URI.

However, URIs, especially local ones, are transient and may change. To solve
this problem, a new field was added to subjects: `current_uri'. It is special
in that, unlike the rest of the subject, it is not immutable.

INITIAL IDEA

When a subject is inserted, its `current_uri' field is initially set to the
same value as its `uri' field. When Zeitgeist receives a MOVE_EVENT for that
file (with a coherent timestamp), the value of `current_uri' is updated to
its new file name.

The idea is that the update is reproducible: if we reset the `current_uri'
of all subjects and restored them by replaying all MOVE_EVENTs in the
database, the result would be the same as before.
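
The replay property described above can be sketched roughly as follows. This
is only an illustration in Python, not Zeitgeist's code: the `Event' and
`Move' records and the `rebuild_current_uris' helper are hypothetical names,
timestamps are plain integers, and a move is assumed to affect events with an
earlier timestamp whose current location equals the move's source.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: int
    uri: str               # immutable snapshot taken when the event happened
    current_uri: str = ""  # mutable; derived from uri plus the known moves


@dataclass(frozen=True)
class Move:
    timestamp: int
    old_uri: str
    new_uri: str


def rebuild_current_uris(events, moves):
    """Reset every current_uri to uri, then replay all moves in time order."""
    for event in events:
        event.current_uri = event.uri
    for move in sorted(moves, key=lambda m: m.timestamp):
        for event in events:
            # Assumption: a move only concerns events that happened before it
            # and that currently point at the move's source.
            if event.timestamp < move.timestamp and event.current_uri == move.old_uri:
                event.current_uri = move.new_uri
    return events
```

Because the moves are sorted by timestamp before being applied, the outcome
in this sketch does not depend on the order in which they reached the
database.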

CURRENT IMPLEMENTATION

As of now, `current_uri' is initially set to the same value as `uri'.
Once a MOVE_EVENT is inserted, all events with a timestamp before that of the
move are updated.

However, once the MOVE_EVENT has been inserted, it is never considered
again. This is for performance reasons, since the initial plan would require
pretty much "rebuilding the database" on every change.
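
A minimal sketch of this one-shot behavior, in Python for illustration only.
The `Log' class and its method names are hypothetical, and matching events
by their `current_uri' is an assumption; the description above only says
that earlier events are updated.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: int
    uri: str
    current_uri: str


class Log:
    """Hypothetical in-memory stand-in for the event database."""

    def __init__(self):
        self.events = []

    def insert_event(self, timestamp, uri):
        # current_uri starts out identical to uri.
        event = Event(timestamp, uri, current_uri=uri)
        self.events.append(event)
        return event

    def insert_move(self, timestamp, old_uri, new_uri):
        # Applied exactly once, at insertion time; this sketch keeps no
        # record of the move, so nothing can consult it afterwards.
        for event in self.events:
            if event.timestamp < timestamp and event.current_uri == old_uri:
                event.current_uri = new_uri
```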

PROBLEMS

There are numerous problems with this implementation, at least in theoretical
situations.

One problem is that of events which predate the move but reach Zeitgeist
after the MOVE_EVENT (maybe because the application is batching them). Since
the move is not considered again, their `current_uri' won't be updated.
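
To make the batching case concrete, here is a small Python illustration. The
dictionary layout and the `apply_move' helper are made up for this example,
and it assumes the one-shot update rule described under "current
implementation".

```python
def apply_move(events, move_ts, old_uri, new_uri):
    """One-shot update: touch only events that are already stored."""
    for event in events:
        if event["timestamp"] < move_ts and event["current_uri"] == old_uri:
            event["current_uri"] = new_uri


events = [{"timestamp": 5, "uri": "a.txt", "current_uri": "a.txt"}]

# The move a.txt -> b.txt (timestamp 10) is inserted first.
apply_move(events, 10, "a.txt", "b.txt")

# A batched event with timestamp 7 is delivered only afterwards.
late = {"timestamp": 7, "uri": "a.txt", "current_uri": "a.txt"}
events.append(late)

# Events that still point at the old location although they predate the move.
stale = [e for e in events if e["timestamp"] < 10 and e["current_uri"] == "a.txt"]
```

In this sketch `stale' contains exactly the late event: it predates the move,
yet nothing ever revisits it.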

We also have the opposite problem: a MOVE_EVENT coming in late, after another
conflicting MOVE_EVENT happened. For instance, suppose we have the following
events:
 > T5 a.txt, T10 a.txt, T15 a.txt
We receive a first MOVE_EVENT from a.txt to b.txt with timestamp T7. Now we
have (time / current_uri):
 > T5 a.txt, T10 b.txt, T15 b.txt
Finally, we receive a further MOVE_EVENT from a.txt to c.txt with timestamp T0.
The result is:
 > T5 c.txt, T10 b.txt, T15 b.txt
This is totally inconsistent; the correct result would have been:
 > T5 c.txt, T10 c.txt, T15 b.txt

Further, even if implemented as described in the "initial idea" section, the
concept is flawed in that events may be inserted retroactively with a `uri'
that already is the updated one. This could give rise to further
inconsistencies.

PROPOSAL

There is no evident way to avoid these problems. Maybe the best idea is to
formalize the current behavior by documenting it and requesting that MOVE
and DELETE events be inserted in near real time (for local files).

OUTSTANDING ISSUES

a) Deletion of MOVE_EVENT
What happens upon deletion of a MOVE_EVENT? Should the current_uri changes be reverted?

b) Insertion of other events
When inserting an event, should Zeitgeist check whether a MOVE_EVENT happened for that URI after the event's timestamp, and update it accordingly?

c) Directories
Should the insertion of a MOVE_EVENT with the renaming from "file:///home/user/dir1" to "file:///home/user/dir2" also update all events with uri "file:///home/user/dir1/*" to "file:///home/user/dir2/*"? I think so.
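
For issue (c), the prefix rewrite could look roughly like the Python sketch
below. `rewrite_moved_uri' is a hypothetical helper, not an existing
Zeitgeist function, and it treats URIs as plain strings without any escaping
or normalization.

```python
def rewrite_moved_uri(uri, old_dir, new_dir):
    """Map uri to its new location if it is old_dir or lies below it."""
    old_dir = old_dir.rstrip("/")
    new_dir = new_dir.rstrip("/")
    if uri == old_dir:
        return new_dir
    # Compare against old_dir plus a separator so that a sibling such as
    # ".../dir10" is not mistaken for a child of ".../dir1".
    if uri.startswith(old_dir + "/"):
        return new_dir + uri[len(old_dir):]
    return uri
```

Requiring the "/" separator after the directory name is the important detail
here; a bare prefix match would also rewrite unrelated siblings.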

SEE ALSO

Related to this, please also check my proposal for improved DELETE_EVENT handling at https://bugs.freedesktop.org/show_bug.cgi?id=48661.

Comment 1 GitLab Migration User 2018-05-31 09:12:04 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/zeitgeist/zeitgeist/issues/9.