Bug 74359 - FILEOPEN: [RTF filter] Content piece of the table’s large cell is lost in file from Web page created in Word 2007
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.1.5.1 rc
Hardware: Other All
Importance: medium critical
Assignee: Not Assigned
Reported: 2014-02-02 10:09 UTC by ape
Modified: 2014-02-27 15:36 UTC
CC List: 4 users



Attachments
DOCX file (fdo#74357) saved as ODT by LibO_Dev-4.2.3.0.0+ (2.74 MB, application/vnd.oasis.opendocumentformat.text)
2014-02-24 08:57 UTC, ape

Description ape 2014-02-02 10:09:42 UTC
I saved a Web page as an MHT archive and converted the MHT file to RTF using WinWord-2007 (see the attachment). I opened the RTF file using LibreOffice-4.2.0, 4.1.5 and 4.0.6 and saw the error:
 The content of the large table cell is lost completely.
 I checked how other programs behave:
1. LibreOffice-4.0.6 makes the same mistake.
2. OpenOffice.org-3.1.1 opens the RTF file correctly, without this mistake.
3. LibO-3.5.7, 3.6.7 and AOO-4.0.1 have a different mistake (see bug 74356, bug 74357):
A large piece of the table cell’s content is lost. The table cell, located on the tenth page of the document, contains an image whose size equals the page size. All content of the cell located after this large image is lost.
--
This is a loss of information and a regression relative to older programs, so the severity is critical.
Comment 2 kompilainenn 2014-02-02 11:47:11 UTC
confirm bug
Comment 3 Joel Madero 2014-02-02 17:06:18 UTC
I see that it's different because it's RTF/DOC/DOCX. It could very well be fine to have it as one bug report, but I'll leave it as separate for now. RTF filter bugs most definitely do not belong on the MAB list, as they are quite rare and are not going to impact many users.
Comment 4 Cor Nouws 2014-02-07 14:22:06 UTC
(In reply to comment #0)
> I saved the Web page as an MHT archive. I converted the MHT file to the RTF
> file using WinWord-2007 (see an attachment). I opened the RTF file using
> LibreOffice-4.2.0; 4.1.5; 4.0.6 and saw the error:
>  [...]

Thanks for the report.

Do you know if the same problem happens if you create such a file
 - from scratch in LibreOffice
 - by pasting in LibreOffice and then saving as rtf
 - by pasting in LibreOffice and then saving as odt ?

regards,
Cor
Comment 5 ape 2014-02-08 05:55:19 UTC
(In reply to comment #4)
> Do you know if the same problems happens if you create such a file
>  - from scratch in LibreOffice
> ...
--
Hi, Cor!

  My guess:
1. The filter does not know how to process data (text, graphics) in a large table cell when that data follows a large image whose size equals the page size. This error occurs in earlier versions of the program: LibreOffice-3.5.x and 3.6.x.
2. Newer versions (LibreOffice-4.1.5 and 4.2.0) do not process the table cell at all if the cell contains some data that cannot be displayed.

  Copying the data from the MHT file and pasting it into a new Writer document does not give more information about the bug. Only text is stored in the clipboard in that case.

Regards, ape.
Comment 6 ape 2014-02-24 07:26:53 UTC
 Miklos, 
I added you to CC. Your patch solved the problem with DOCX files (bug 74357) very well. Please look into the similar issue with RTF files, if you have the time and opportunity.
--
ape
Comment 7 ape 2014-02-24 08:57:03 UTC
Created attachment 94633
DOCX file (fdo#74357) saved as ODT by LibO_Dev-4.2.3.0.0+

I used this version of the program:
LibreOfficeDev 4.2.3.0.0+ (Build ID: 5ba682c48e449f30e3cc1ec4acac75a6122ee6d7, TinderBox: Win-x86@42, Branch:libreoffice-4-2, Time: 2014-02-22_23:03:29)
--
1. The DOCX file (attachment 93208) was opened and then saved in ODT format (see the attachment).
2. The ODT file was saved as Rich Text Format (more than 220 MB).
3. I don't see the content of the first eighteen pages when I open this RTF file using Writer.
4. I can see the contents of all pages when I open this RTF file in WinWord-2007.
--
I guess that the RTF import filter does not know how to process a cell with a large image whose size equals the page size.
Comment 8 ape 2014-02-27 15:36:48 UTC
Part of the regression is fixed in this version:
 LibreOfficeDev-4.2.3.0.0+ (Build ID: 4274001144adeb0b0a1e7da05d52c1bedbe899e5,
 TinderBox: Win-x86@42, Branch: libreoffice-4-2, Time: 2014-02-27_08:31:36).
--
Now Writer shows the first eleven pages of the RTF file (see the URL in comment 1), the same as for the DOC file (see bug 74356).
But
 if you perform these actions:
  open the ODT file (see comment 7; attachment 94633)
  and save this ODT file in RTF format (size ~224 MB),
 then Writer (e.g. Writer 4.2.1.1, 4.0.6) opens the new RTF file fine and shows all contents of all pages.

