Bug 58069 - FILESAVE in different path structure destroys relative hyperlinks
Summary: FILESAVE in different path structure destroys relative hyperlinks
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.0.0.0.beta1
Hardware: Other All
: medium critical
Assignee: Kohei Yoshida
URL:
Whiteboard: BSA target:4.1.0 target:4.0.0.2
Keywords: regression
Depends on:
Blocks: mab4.0
  Show dependency treegraph
 
Reported: 2012-12-10 07:41 UTC by Rainer Bielefeld Retired
Modified: 2013-01-11 05:20 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments
Test Kit (20.71 KB, application/zip)
2012-12-10 07:41 UTC, Rainer Bielefeld Retired
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Rainer Bielefeld Retired 2012-12-10 07:41:03 UTC
Created attachment 71253 [details]
Test Kit

Steps how to reproduce with parallel installation of  "LOdev  4.0.0.0.beta1   -  GERMAN UI / German Locale  [Build ID: 87906242e87d3ddb2ba9827818f2d1416d80cc7)]"  {tinderbox: @6, pull time 2012-12-06} on German WIN7 Home Premium (64bit) with separate /4 User Profile for Master Branch

0. download / unzip attached test kit to a folder like
  <C:\Users\Public\Documents\LibO\NEW_Links>
1. Launch LibO, make sure that in menu 'Tools -> Options -> 
   Loade/Save -> General - Save - "Save URLs relative to file 
   system"' is checked
2. Open "samplewithlinks.ods" from LibO Start Center file menu
3. check links, should wor except third one because that folder
   does not exist on your PC
4. Save "samplewithlinks.ods" as "samplewithlinks2.ods"
   In a different folder outside the path structure of
   the old document, for example under 
   <C:\Users\Public\Documents\OpenOffice\PRIVAT\LibOQA>
5. Check Link in A1, should work fine
6. Close / Reopen
7. Check Link in A1, 
   Expected: should work fine
   Actual: leads to same folder as "samplewithlinks2.ods" 
           instead of old location like
           <C:\Users\Public\Documents\OpenOffice\PRIVAT\LibOQA>
  
Additional Test whether also other applications are affected will have to be done
              
There might be a relation to "Bug 58055 - After Recovery hyperlinks in spreadsheet are broken"
Operating System: Windows 7
Last worked in: 3.6.4.3 rc
Comment 1 Rainer Bielefeld Retired 2012-12-10 07:42:12 UTC
Major due to dataloss
Comment 2 Rainer Bielefeld Retired 2012-12-10 07:47:32 UTC
I did the same test with a Writer document, problem did not appear
Comment 3 Julien Nabet 2012-12-10 21:06:40 UTC
On pc Debian x86-64 with master sources updated today (commit 24f0aa76c005d1506a6d13945c39dafc6e9b8d91), I reproduced this behaviour.

Eike/Markus/Kohei: would one of you have some time about this tracker?
Comment 4 Rainer Bielefeld Retired 2012-12-15 08:14:18 UTC
Modified OS due to Comment 3
Comment 5 Jorendc 2012-12-23 10:12:09 UTC
Reproducible with LibreOffice 4.0.0.0.beta2 (Bouw-id: 4104d660979c57e1160b5135634f732918460a0)
TinderBox: MacOSX TDF Release, Branch:libreoffice-4-0, Time: 2012-12-18_17:13:13
Mac OSX 10.8.2
Comment 6 Markus Mohrhard 2012-12-26 23:58:40 UTC
I wonder how this could have worked in old releases.

It is part of our old optimization to store the input stream for writing it out again unchanged when the stream is not marked dirty.

o the solution is to mark the stream dirty during import if it contains a link.

You can control that by adding a value to any cell between step 3 and 4. This forces an invalidation of the input stream and therefore a real export and not writing out the old input stream.
Comment 7 Markus Mohrhard 2012-12-27 02:31:09 UTC
And of course it is not as easy as expected. The hyperlink is parsed somewhere in the shared rich text import. I still have no idea where the hyperlink is later stored or where we would have the right place to mark the stream dirty.
Comment 8 Kohei Yoshida 2013-01-10 18:54:46 UTC
The solution that might work better is to mark all table streams dirty when saving as a different document.  It makes no sense to re-use streams unless you are overwriting an existing document anyway.
Comment 9 Kohei Yoshida 2013-01-10 19:13:30 UTC
There may be other things that depend on the full path of the host document, not just hyperlinks.  Hence my proposal.
Comment 10 Markus Mohrhard 2013-01-10 19:17:42 UTC
On the other hand the cached stream is a big speed-up for export. Our export is already quite slow so we might want to keep this optimization.

That does not mean that I'm generally in favor of this optimization just that we need to think about the performance impact of such a change.
Comment 11 Kohei Yoshida 2013-01-10 19:20:05 UTC
(In reply to comment #10)
> On the other hand the cached stream is a big speed-up for export. Our export
> is already quite slow so we might want to keep this optimization.
> 
> That does not mean that I'm generally in favor of this optimization just
> that we need to think about the performance impact of such a change.

Note, my proposal will not remove this optimization.  It will just disable it for "Save As" operation.  Let's ask ourselves, how many times do you expect the users to "save as" a document as opposed to just "save"?
Comment 12 Kohei Yoshida 2013-01-10 19:23:40 UTC
(In reply to comment #11)

> Note, my proposal will not remove this optimization.  It will just disable
> it for "Save As" operation.  Let's ask ourselves, how many times do you
> expect the users to "save as" a document as opposed to just "save"?

It will be conceptually equivalent of saving a new document.  We don't use this optimization when saving a new document, and IMO we shouldn't do it when saving an existing document as a different document either.
Comment 13 Kohei Yoshida 2013-01-10 19:46:40 UTC
Alternatively, we could just disable stream cache only when the host document is saved into different directory.

We could also do additional check and disable stream cache only when the sheet contains hyperlinks (or anything else that might depend on the full path of the host document), but that would require going through all cells in each sheet, which is a big overhead especially for large documents.  I'm not sure if it's worth it.
Comment 14 Kohei Yoshida 2013-01-10 20:28:13 UTC
Unless there is objection, here is what I'm going to do.

When saving, I'll check if the directory path changes before and after the save, and if it does, I'll mark those sheets that have a hyperlink dirty, in order to prevent their stream caches from being used on save.  As we discover other elements that depend on the full path of the host document, we will add more checks in the same routine.

This will slightly increase the cost of the file save, but hopefully not too much.
Comment 15 Kohei Yoshida 2013-01-10 21:17:54 UTC
Ok.  In the end, I've decided to just mark all sheets dirty unconditionally, to invalidate all sheet stream caches only when the directory path changes before and after the save.  Note that when you simply save as a different document into the same directory, it won't invalidate the cache.

This is probably more reasonable than checking every cell on each sheet, which may potentially make the save performance worse than not checking at all.
Comment 16 Not Assigned 2013-01-10 21:25:36 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=831d1b6e62e91e67f171bd00305651043731d496

fdo#58069: Invalidate sheet stream cache when directory path changes.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 17 Not Assigned 2013-01-11 04:58:30 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=31664fbc882652f6b308fa5aab1eac48e9d0ad0a&h=libreoffice-4-0

fdo#58069: Invalidate sheet stream cache when directory path changes.


It will be available in LibreOffice 4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 18 Kohei Yoshida 2013-01-11 05:20:43 UTC
Fixed for 4.0.