Bug 51989 - PDF export performance decreased in 3.6, exponentially depending on file size
Status: RESOLVED INVALID
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 3.6.0.0.beta3 (earliest affected)
Hardware: Other macOS (All)
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2012-07-11 19:29 UTC by Roman Eisele
Modified: 2013-03-06 17:04 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Big book sample (3222 pp. .odt file in book typography, German, no images, graphics or tables, fonts included) (2.02 MB, application/zip)
2012-07-15 11:10 UTC, Roman Eisele

Description Roman Eisele 2012-07-11 19:29:03 UTC
When playing around with the sample file for bug 51817 (a 163 pp. .odt file with many images and formulas), I noticed a slight decrease of PDF export performance:
* in LibreOffice 3.5.5.3, PDF export took ca. 38 seconds,
* in LibreOffice 3.6.0 beta 3, it took about 48 seconds.

Both times I used the same export settings (export automatically inserted blank pages, embed standard fonts), and both exports were done on the same machine (a MacBook Pro with a 2.53 GHz Intel Core i5, 4 GB of 1067 MHz DDR3 RAM, and an SSD; the SSD in particular makes the machine rather fast for many tasks).

The decrease of performance is not very important for me, but I wonder if other people who use older and slower machines will suffer from a more drastic performance decrease.

So, I don't want to complain, I just invite other testers and QA volunteers to keep an eye on the performance of LibreOffice 3.6 (not only of Writer, but also of the other components), and to add a comment to this bug report if they notice something important. If we find evidence that the overall PDF export performance has really decreased significantly, we should encourage the developers to look into it. Thank you for reporting any additional observations!
Comment 1 Roman Eisele 2012-07-11 19:38:47 UTC
(In reply to comment #0)
> So, I don't want to complain, I just invite other testers and QA volunteers to
> keep an eye on the performance of LibreOffice 3.6 (not only of Writer, but
> also of the other components), and to add a comment to this bug report if they
> notice something important.

Well, I meant to say: only for the performance of *PDF export* from LibreOffice 3.6. If you notice other performance problems in 3.6, please create a new bug report for them; this report is about PDF export performance only. Thank you!
Comment 2 Björn Michaelsen 2012-07-11 21:20:26 UTC
Hi Roman,

a bibisect might help find the root cause for this: http://wiki.documentfoundation.org/Bibisect

Best,

Bjoern
Comment 3 Fernand 2012-07-12 07:45:03 UTC
Probably partly due to the much better image resampling, which is slower by nature.
Comment 4 Roman Eisele 2012-07-12 07:59:22 UTC
(In reply to comment #2)
> a bibisect might help find the root cause for this:
> http://wiki.documentfoundation.org/Bibisect

A good idea; however, I can't do this because I am (only) on MacOS X (and WinXP). But probably someone else who uses Linux could do that?
Comment 5 Roman Eisele 2012-07-12 08:07:54 UTC
(In reply to comment #3)
> Probably partly due to the much better image resampling, which is slower by
> nature.

This is a compelling idea, and if the PDF export performance is slower just for such reasons, I will not complain, but be happy instead. But is this the *only* cause of the performance decrease?

To check this, I tried another sample file, a 286 pp. proceedings volume which contains no (bitmap) images at all, only some tables and much text in English and German. (I can't attach the file here for copyright reasons and because it uses commercial .otf fonts which I can't share, sorry!) The results are (with identical settings):

* LibreOffice 3.5.5.3:        ca. 20 seconds
* LibreOffice 3.6.0.0 beta 3: ca. 40 seconds (!)

IMHO this means that the better image resampling cannot be the only source of the slower performance -- something else must have changed in the export of the text itself.
Comment 6 Roman Eisele 2012-07-15 10:36:24 UTC
After additional testing (report and sample file coming soon), I have to increase the severity and to adapt the summary.
Comment 7 Roman Eisele 2012-07-15 11:10:44 UTC
Created attachment 64232 [details]
Big book sample (3222 pp. .odt file in book typography, German, no images, graphics or tables, fonts included)

Update:
The decrease in PDF export performance does NOT (only) depend on images/graphics, it is measurable even with .odt files containing formatted text only, and it does depend exponentially on the .odt file size (count of pages, etc.): when you export a big .odt file to PDF, you will notice an even bigger decrease of performance.

Test:
To test with a real text sample (I am a bit distrustful about the value of tests with mere placeholder texts), I created a 3222 pp. .odt sample file in book typography -- no images, no graphics, no tables, only a lot of German text, some nice formatting with a bit of color, using (semi-)professional fonts and styles. (If you wonder which text is free of copyright, but 3222 pp. long: it is the Bible in Luther's original 1545 German translation ;-) The .odt file is 1.9 MB in size. You will find it, together with the (free) fonts, attached to this bug report.

To test I just open the file, do nothing else, do not even scroll, just select "File > Export as PDF...", leave most settings at default, just select
 * General: "Export bookmarks"
 * General: "Export automatically inserted blank pages"
 * General: "Embed standard fonts"
 * Links:   "Export bookmarks as named destinations"
but don't select General: "Embed OpenDocument file",
and click "Export".
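
For anyone who would rather not use a stopwatch, here is a rough Python sketch for timing an export from the command line. It is NOT the same as the GUI procedure above: it assumes a `soffice` binary on the PATH and uses the `--headless --convert-to pdf` mode of current LibreOffice versions, which applies default export settings rather than the options listed here. The helper names and the file names in the usage line are made up for illustration.

```python
import subprocess
import time


def build_export_command(odt_path, out_dir, soffice="soffice"):
    """Build a headless PDF conversion command.

    Assumption: `soffice` is on the PATH and supports --convert-to.
    """
    return [soffice, "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, odt_path]


def time_command(cmd):
    """Run cmd to completion and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, result.returncode
```

Usage would be something like `seconds, code = time_command(build_export_command("bigbook.odt", "/tmp"))`. The measured time includes program startup and loading the document, so it is only comparable with other measurements taken the same way.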

Results:
I distinguish three stages of PDF export. In stage I, the progress bar is not yet visible, but the cursor turns into the spinning cursor or watch (depending on your OS, I suppose). In stage II, the progress bar is visible, but does not work yet, i.e. does not show any progress. In stage III, the progress bar shows the progress of the export. Distinguishing these stages is important, because it may reveal something about where the decrease of performance occurs.

All values are in seconds; each is the mean of several tests with the respective version.


                           LibO 3.4.6    LibO 3.5.5.3   LibO 3.6.0.1
========================================================================
Export stage I              50            52             50
(spinning cursor)
------------------------------------------------------------------------
Export stage II              5             5            267 (!!!)
(progress bar visible,
but disabled)
------------------------------------------------------------------------
Export stage III           102           113            140
(progress bar
showing progress)
========================================================================
Total                      157           170            457 (!!!)


You see, we have a little decrease of performance in LibO 3.5.x as compared to 3.4.x, but a big decrease in LibO 3.6.0.1 as compared to both older versions.
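
To make the comparison explicit, the arithmetic behind the table can be sketched in a few lines of Python. The numbers are copied from the table; the function names are only illustrative.

```python
# Stage timings in seconds, copied from the table above.
STAGES = {
    "LibO 3.4.6":   {"I": 50, "II": 5,   "III": 102},
    "LibO 3.5.5.3": {"I": 52, "II": 5,   "III": 113},
    "LibO 3.6.0.1": {"I": 50, "II": 267, "III": 140},
}


def total(version):
    """Sum of all three stages for one version."""
    return sum(STAGES[version].values())


def slowdown(new, old, stage=None):
    """Ratio new/old, for one stage or (stage=None) for the total."""
    if stage is None:
        return total(new) / total(old)
    return STAGES[new][stage] / STAGES[old][stage]


def extra_seconds(new, old):
    """Per-stage difference new - old, in seconds."""
    return {s: STAGES[new][s] - STAGES[old][s] for s in STAGES[new]}


for version in STAGES:
    print(version, total(version), "s")
print("3.6.0.1 vs 3.5.5.3, total:",
      round(slowdown("LibO 3.6.0.1", "LibO 3.5.5.3"), 2))
print("3.6.0.1 vs 3.5.5.3, stage II:",
      round(slowdown("LibO 3.6.0.1", "LibO 3.5.5.3", "II"), 1))
print(extra_seconds("LibO 3.6.0.1", "LibO 3.5.5.3"))
```

Compared with 3.5.5.3, the total in 3.6.0.1 is about 2.7 times longer, and stage II alone accounts for 262 of the 287 additional seconds.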

The decrease happens partially in stage III of the export, but mostly in stage II. This stage, which took only 5 seconds in LibO 3.4.x and 3.5.x, and which is so short with small documents that you may not even notice it, now takes 267 seconds in LibreOffice 3.6.0.1. Wow! I already thought that LibO 3.6.0.1 had frozen, and this is a really bad user experience. It seems that the duration of this stage depends exponentially on file size, even without any graphics.

So the question is: WHAT does LibreOffice do in stage II? This is where the biggest performance regression against LibreOffice 3.5.x occurs, and this is where our developers should look to get the performance back.
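
Whether the stage II time really grows exponentially with document size, or rather like some power of the page count, cannot be decided from a single document. A possible check, sketched here in Python under the assumption that stage II has been timed for at least three documents of different length, is to fit both models on a log scale and see which one describes the measurements better. The function names are hypothetical, and no measurements beyond those in this report exist yet.

```python
import math


def _linear_fit(xs, ys):
    """Least-squares slope and R^2 of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r2 = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r2


def classify_growth(pages, seconds):
    """Compare t = a*exp(b*n) (exponential) with t = a*n**k (power law)."""
    if len(pages) < 3:
        raise ValueError("need at least three (pages, seconds) measurements")
    log_t = [math.log(t) for t in seconds]
    # Exponential growth is a straight line in (n, log t).
    rate, r2_exp = _linear_fit(pages, log_t)
    # Power-law growth is a straight line in (log n, log t).
    exponent, r2_pow = _linear_fit([math.log(n) for n in pages], log_t)
    best = "exponential" if r2_exp > r2_pow else "power law"
    return {"best": best, "rate": rate, "exponent": exponent,
            "r2_exponential": r2_exp, "r2_power_law": r2_pow}
```

With only a handful of data points the two fits can be close, so the result is a hint, not a proof.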
Comment 8 Thomas Hackert 2012-07-29 16:37:27 UTC
Hello Roman, *,
interesting bug ... ;) I tested it with LO Version 3.6.0.2 (Build ID: 815c576), installed Germanophone Help as well as Lang pack, under Debian Testing AMD64. My system is an IBM PC with an AMD Athlon X2 6000+ CPU (Dual Core 2x3.0 GHz) with a 750 GB WD7500AAVS HD w/7200 RPM, and - IIRC - w/3GB of RAM. If I open your attached Big book sample, it needs
<quote>
real    0m10.015s
user    0m8.497s
sys     0m0.304s
</quote>
just to open the file. If I use the PDF Export with your mentioned options, I get
<quote>
real    3m36.746s
user    3m30.457s
sys     0m1.536s
</quote>
... :(

With my installed LibreOffice 3.5.5.3 Build-ID: 7122e39-92ed229-498d286-15e43b4-d70da21, I get
<quote>
real    0m20.843s
user    0m13.001s
sys     0m1.468s
</quote>
just to open the file ... :( Exporting needs
<quote>
real    4m23.457s
user    4m29.125s
sys     0m3.068s
</quote>
, so in my case the older version is slower than the newer one ... :(
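
The figures quoted above look like the output of the shell's `time` command (an assumption; how they were produced is not stated). A small Python sketch for converting such values to seconds, so that the two versions can be compared directly:

```python
import re

# Matches durations such as "3m36.746s", "0m10.015s" or "50s".
_DURATION_RE = re.compile(r"^(?:(\d+)m)?(\d+(?:\.\d+)?)s$")


def parse_duration(text):
    """Convert a `time`-style duration string to seconds."""
    match = _DURATION_RE.match(text.strip())
    if match is None:
        raise ValueError("not a duration: %r" % text)
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# "real" export times quoted above.
export_36 = parse_duration("3m36.746s")   # LO 3.6.0.2
export_35 = parse_duration("4m23.457s")   # LO 3.5.5.3
print("3.6.0.2 export:", round(export_36, 3), "s")
print("3.5.5.3 export:", round(export_35, 3), "s")
print("ratio 3.6/3.5:", round(export_36 / export_35, 2))
```

That is roughly 217 s for 3.6.0.2 against roughly 263 s for 3.5.5.3, i.e. the newer version needs about 47 s less on this machine.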

Just out of interest: Which Java are you using? Mine is
<quote>
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.3) (6b24-1.11.3-2)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
</quote>
or openjdk-6-jre 6b24-1.11.3-2.
HTH
Thomas.
Comment 9 Roman Eisele 2012-08-03 13:14:24 UTC
Hello Thomas,

first of all: thank you very much for investigating this issue and taking all the time necessary to replicate my example (I know it is rather enervating to do such speed tests ;-)! Your results are, of course, very interesting:

> so in my case the older version is slower than the newer one ... :(

Wow! Well, this makes the issue much more complicated. What may be the reason? The most probable idea I have for now is that there is some difference between PDF export handling on MacOS X and Linux/Debian ... maybe some system/library calls changed between LibO 3.5 and 3.6?

I will try to do some further tests (maybe on Windows, if I get access to a Win machine) and then make additional remarks.

> Just out of interest: Which Java are you using?

Apple’s own version of Java, version 1.6.0_33, as supplied with the operating system.

Do you know if LibreOffice PDF export depends somehow on Java?
Comment 10 Thomas Hackert 2012-09-02 13:16:13 UTC
Hello Roman, *,
sorry for the delay, but I had too much to do in RL, so I failed to answer ... :(

(In reply to comment #9)
> first of all: thank you very much for investigating this issue and taking
> all the time necessary to replicate my example

you are welcome :)

<snip>
> > so in my case the older version is slower than the newer one ... :(
> 
> Wow! Well, this makes the issue much more complicated. What may be the reason?

I have not the faintest idea ... :(

> The most probable idea I have for now is that there is some difference between
> PDF export handling on MacOS X and Linux/Debian ... maybe some system/library
> calls changed between LibO 3.5 and 3.6?

This could be a reason. But I am not a developer, so I cannot be of any help here, sorry ... :(

> I will try to do some further tests (maybe on Windows, if I get access to a Win
> machine) and then make additional remarks.

And? Have you tested it any further?

> > Just out of interest: Which Java are you using?
> 
> Apple’s own version of Java, version 1.6.0_33, as supplied with the operating
> system.

O.K.

> Do you know if LibreOffice PDF export depends somehow on Java?

I am not sure ... :( I have started LO from konsole, but there was no indication that Java was involved in the creation of the PDF ... :(
HTH
Thomas.
Comment 11 Roman Eisele 2012-09-03 07:47:30 UTC
(In reply to comment #10)
> I am not sure ... :( I have started LO from konsole [...]

Wait a minute -- this is a very interesting point! I used the "normal" GUI way for my PDF export tests (i.e., open the .odt files with Writer, choose "File > Export as PDF ..."). Did you do your tests completely from the Terminal/Command line? This would well explain the difference of the results, because it is quite possible that LibO spends much time with UI update tasks ...
Comment 12 Uwe Altmann 2012-10-11 07:36:12 UTC
IMHO, opening a big document and immediately doing a PDF export does not say much about the PDF export time. The document has to be formatted first (i.e. page breaks have to be inserted), independently of exporting; you may also use printing for that. If you want to measure PDF export, then export twice, one directly after the other, and take the second timing. Or print the document first (abort when the printing dialog appears). You will see similar timings for that.
Maybe the problem is not the export, but the formatting of the document?
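
The measuring scheme suggested here (run the export several times on the already opened document and treat the first run separately) could be sketched in Python as follows. `action` is a placeholder for whatever triggers one export of the open document, e.g. a macro or a script; it is hypothetical and not part of LibreOffice.

```python
import time


def time_passes(action, passes=3):
    """Call action() `passes` times; return the wall-clock seconds of each call."""
    durations = []
    for _ in range(passes):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def split_first_pass(durations):
    """Separate the first (cold) pass from the mean of the later (warm) passes."""
    first, later = durations[0], durations[1:]
    if not later:
        raise ValueError("need at least two passes")
    warm_mean = sum(later) / len(later)
    return {"first": first,
            "warm_mean": warm_mean,
            "one_time_cost": first - warm_mean}
```

The `one_time_cost` value is then an estimate of the work that is done only once per document (such as formatting), while `warm_mean` estimates the export itself.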
Comment 13 Roman Eisele 2012-10-11 08:27:09 UTC
@ Uwe Altmann:
Thank you very much for your hints! I will try to do new tests, following your suggestions, when I find some time.

@ All:
I set the status of this bug report to NEEDINFO to remind myself that I (the reporter) have to input more information. If I forget to do so for more than a month, please do not close this bug report immediately, but ping me first ;-)
Comment 14 Roman Eisele 2012-10-11 16:14:28 UTC
(In reply to comment #12)
> Maybe the problem is not the export, but the formatting of the document?

Uwe seems to be right here. Following his suggestion, I have tried to export the same document two or three times successively to PDF format, in order to separate the preparation (“Umbruch”: page breaking etc.) from the real PDF export. To make the comparison easier, I have used again my “big book sample” from comment #7. The results, given in the same table as in comment #7, but with current LibO versions and in minute/seconds format (easier to read):

                         3.5.7.1    3.5.7.1       3.6.2.2    3.6.2.2
                         1st pass   2nd/3rd pass  1st pass   2nd/3rd pass
=========================================================================
Stage I                  50s        3s            50s        3s
(spinning cursor)
-------------------------------------------------------------------------
Stage II                 5s         2s            4m50s      5s
(progress bar visible,
but disabled)
-------------------------------------------------------------------------
Stage III                1m50s      1m50s         1m54s      1m42s
(progress bar
showing progress)
=========================================================================
Total                    2m45s      1m55s         7m34s      1m50s

So, while it is still true that the 1st pass of the PDF export is far slower in LibO 3.6.x than in 3.5.x, the 2nd and 3rd passes are equally fast in both versions -- I consider the difference of ca. 5 seconds to be measurement error; maybe 3.6.2.2 is even a little bit faster, as Thomas has suggested in comment #8. Nice!

Now it seems reasonable (the LibO documentation is bad here; a developer could tell us for sure) that stages I and II are related to the document preparation, i.e. inserting page breaks, maybe collecting font information etc., and that the actual PDF export corresponds to stage III. So what has become slower in LibO 3.6.x is very probably not the PDF export, but the formatting of the document. This _is_ a problem, of course, but a different one, which needs more testing (*); the current bug report, which assumed that the PDF export has become slower, can be closed now.
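
Under the assumption that everything the first pass needs beyond a later pass is one-time document preparation, the size of that overhead can be estimated from the table in this comment. A short Python sketch (values converted to seconds; names are illustrative only):

```python
# Seconds per stage from the table in this comment:
# (first pass, second/third pass).
MEASURED = {
    "3.5.7.1": {"I": (50, 3), "II": (5, 2),   "III": (110, 110)},
    "3.6.2.2": {"I": (50, 3), "II": (290, 5), "III": (114, 102)},
}


def totals(version):
    """Return (first-pass total, later-pass total) in seconds."""
    stages = MEASURED[version].values()
    first = sum(first_pass for first_pass, _ in stages)
    later = sum(later_pass for _, later_pass in stages)
    return first, later


def formatting_overhead(version):
    """Estimated one-time preparation cost: first pass minus later pass."""
    first, later = totals(version)
    return first - later


for version in MEASURED:
    first, later = totals(version)
    print(version, "first:", first, "s, later:", later, "s, overhead:",
          formatting_overhead(version), "s")
```

This gives an estimated one-time overhead of 50 s for 3.5.7.1 and 344 s for 3.6.2.2, while the later passes differ by only 5 s.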

Thank you all very much for your interest and testing!


(*) If somebody else is interested in testing this document formatting time slowdown, we could collaborate in designing a nice test, executing it on several platforms and submitting a new bug report for that; just let me know if you are interested!
Comment 15 David Ronis 2013-03-06 17:04:36 UTC
I'm running 4.0 on a Linux box (2 CPUs, 3 GB memory). I have an .odm file that contains ca. 1400 pages, with graphics, text, tables, etc. Everything is glacially slow. In particular, exporting as PDF (using the PDF icon on the toolbar) takes 2-3 hours! I've upped the memory settings to 256 MB for LO, 20 MB per object, and 128 objects, but this doesn't help.

Just scrolling through the document is jerky and unresponsive. Worse, there are times when the entire desktop freezes (AFAIK there is little swapping going on in the system, but who knows what LO is doing).