Bug 40186 - Command-line conversion from HTML produces HTML, not RTF, DOC, etc., if output filter name is not specified explicitly
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Libreoffice
Version: 3.4.2 release
Hardware: Other Mac OS X (All)
Importance: medium normal
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-17 19:30 UTC by em36
Modified: 2014-11-10 13:58 UTC
CC List: 5 users


Description em36 2011-08-17 19:30:06 UTC
This problem seems to have been introduced in 3.3 or later. Under OS X, when I try to convert an HTML file to another format (RTF, DOC) from the command line, the output file is still an HTML file. This is an example of the command line:

cd '/Applications/' ; LibreOffice.app/Contents/MacOS/soffice.bin --headless --nofirststartwizard --invisible --convert-to rtf --outdir /Users/username/ '/Users/username/testfile.html'

Other input formats work correctly. Is there anything I should be doing differently with the current version to make this work with HTML?
Comment 1 Björn Michaelsen 2011-12-23 12:32:54 UTC
[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:
http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1

more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
Comment 2 Roman Eisele 2012-05-07 09:21:13 UTC
Compare Bug 46026 - "Command line converter + Conversion issues/inconsistencies between odt->doc, docx, pdf".
Comment 3 Florian Reisinger 2012-08-14 13:57:33 UTC
Dear bug submitter!

Because there are many NEEDINFO bugs that have had no answer within the last six months, we are closing all of them.

To keep this message short, more information is available at https://wiki.documentfoundation.org/QA/NeedinfoClosure#Statement

Thanks for understanding and hopefully updating your bug, so that everything is prepared for developers to fix your problem.

Yours!

Florian
Comment 7 Roman Eisele 2012-08-16 15:40:39 UTC
Wait a minute -- I can (still) reproduce this bug:

REPRODUCIBLE with
* LibreOffice 3.5.6.2 (Build-ID: e0fbe70-dcba98b-297ab39-994e618-0f858f0)
* LibreOffice 3.6.0.4 (Build ID: 932b512)
both with German langpack installed, both running on MacOS X 10.6.8 (Intel).

Using the command line argument given in the original description, and a simple .html file named "testfile.html" and saved in my user folder, a file "testfile.rtf" is generated, which does not contain RTF data, but HTML data.

Of course, I am no expert on the LibreOffice --headless command line, and can’t tell if there is an error in the command line supplied by the original reporter (maybe the command line options of soffice.bin have changed, and therefore the --convert-to argument is no longer honored?). Someone else should confirm this.

But nevertheless I can confirm that the command does not work: it is strange (and really a bug) if we produce a file named *.rtf which contains HTML data.
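
A quick way to check for the symptom described above (a file named *.rtf that actually holds HTML) is to look at the first bytes of the output. The following is only a minimal sketch: the helper name is_rtf is hypothetical, and it relies on the fact that RTF documents begin with the signature {\rtf. It does not validate the rest of the file.

```shell
# Hypothetical helper (not part of LibreOffice): succeeds if the file
# starts with the RTF signature "{\rtf", fails otherwise.
is_rtf() {
  # Read only the first 5 bytes and compare them with the signature.
  [ "$(head -c 5 "$1")" = '{\rtf' ]
}

# Usage sketch; the path is the one from the original report.
if is_rtf /Users/username/testfile.rtf 2>/dev/null; then
  echo "output looks like RTF"
else
  echo "output is missing or is not RTF (possibly HTML under an .rtf name)"
fi
```

An analogous check for .doc output would need a different signature, which is not covered here.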
Comment 8 Roman Eisele 2012-08-16 16:19:40 UTC
Well, it DOES work with LibreOffice 3.6 if I specify the filter to use:

  "LibreOffice.app/Contents/MacOS/soffice.bin" --headless
  --nofirststartwizard --invisible --convert-to 'rtf:Rich Text Format'
  --outdir /Users/username/result '/Users/username/testfile.html'

(Note that 'rtf:Rich Text Format' seems necessary; rtf:Rich_Text_Format, with underscores and WITHOUT quotation marks, does not work.)

The same is true for other target file formats; e.g.,
  ... --convert-to 'doc:MS Word 97' ...
works, but
  ... --convert-to doc              ...
does not work: the generated file has the extension .doc, but still contains HTML data, so the generated file is invalid.


But I still don’t understand why the short version used by the original reporter:
  --convert-to rtf
does not work; according to

  http://help.libreoffice.org/Common/Starting_the_Software_With_Parameters

which gives the example
  --convert-to pdf
I would expect that it is not necessary to specify the filter explicitly.


Therefore I adjusted the Summary: the problem is that the output filter name is required, while the documentation says it is optional.
Comment 9 Roman Eisele 2012-08-16 16:33:33 UTC
@Stephan Bergmann:

Hello Stephan, I could not find out which developer(s) should be informed about this issue; I am adding you to the CC list because I remember (but I may be wrong ;-) that you have fixed some other issues with running LibO in headless mode.

Can you please take a short look at this issue and try to tell if this is
(a) a problem in LibreOffice (and then, which developer(s) could be interested
    in fixing it?), or if this is
(b) just a documentation error (if the output filter name is required
    in any case, the documentation is wrong in saying that it is optional)?

Or can you give me a hint who (if not you) could help here?

Thank you very much in advance for any hints!
Comment 10 Stephan Bergmann 2012-08-20 14:05:20 UTC
#libreoffice-dev:

<sberg> btw, any dev having insight into fdo#40186, "--convert-to rtf" not working while "--convert-to 'rtf:Rich Text Format'" does
<kendy> sberg: I'd try vmiklos, but there's public holiday in Hungary today :-(
<sberg> kendy, thanks, will cc him on the bug
<caolan> sberg: I might suspect some change in the filter module in source/config subdir. There's also some weirdness where Text Encoded in the file type list is now "csv,txt" which looks very odd to me
Comment 11 Maxim Monastirsky 2014-11-10 13:58:28 UTC
The reason for this bug seems quite simple. HTML files are opened in Writer/Web by default [1], but the RTF filter is registered with the Writer DocumentService [2], so a search for a filter for Writer/Web cannot find it. And indeed, when changing the DocumentService of the filter entry, or forcing the search to look for a filter for Writer's DocService, it gets the right filter and outputs RTF. So it should be easy to hack this to also search for a Writer filter when no filter is found for Writer/Web. But I wonder whether this still needs fixing, given that since 9df3a83c304f3dd0e0233d234dc6036ab5eefb77 there is an easy workaround (adding --writer to the command). Any thoughts?

[1] http://opengrok.libreoffice.org/xref/core/filter/source/textfilterdetect/filterdetect.cxx#140
[2] http://opengrok.libreoffice.org/xref/core/filter/source/config/fragments/filters/Rich_Text_Format.xcu#29
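
For reference, the two workarounds mentioned in this report (the explicit filter name from Comment 8 and --writer from Comment 11) can be put side by side in a small dry-run sketch. The function name print_convert_cmd and the SOFFICE variable are hypothetical, the function only prints a command and does not run it, and putting --writer before --convert-to is an assumption, since the comment only says to add it to the command.

```shell
# Assumed install path, taken from the original report; override as needed.
SOFFICE=${SOFFICE:-/Applications/LibreOffice.app/Contents/MacOS/soffice.bin}

# Hypothetical dry-run helper: prints (does not execute) a conversion command.
print_convert_cmd() {
  workaround=$1   # "filter" (Comment 8) or "writer" (Comment 11)
  input=$2
  outdir=$3
  case $workaround in
    filter)
      # Name the output filter explicitly; the quotes are needed
      # because the filter name contains spaces.
      printf "%s --headless --convert-to 'rtf:Rich Text Format' --outdir '%s' '%s'\n" \
        "$SOFFICE" "$outdir" "$input"
      ;;
    writer)
      # Load the HTML file in Writer instead of Writer/Web, so that the
      # RTF filter registered for Writer is found. The position of
      # --writer in the command is an assumption.
      printf "%s --headless --writer --convert-to rtf --outdir '%s' '%s'\n" \
        "$SOFFICE" "$outdir" "$input"
      ;;
    *)
      echo "unknown workaround: $workaround" >&2
      return 1
      ;;
  esac
}

print_convert_cmd writer /Users/username/testfile.html /Users/username
```

Printing the command first makes it easy to inspect the quoting before running it by hand.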

