Bug 7808

Summary: Speedup PDF loading by ~19%
Product: poppler Reporter: Krzysztof Kowalczyk <kkowalczyk>
Component: generalAssignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: high CC: dkelly
Version: unspecified   
Hardware: x86 (IA32)   
OS: Windows (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Patch for the faster string implementation
GooString optimizations
Lexer and UGooString optimization
Additional fix for previous Lexer caching patch
Tentative patch to fix Gates_direct.pdf problems

Description Krzysztof Kowalczyk 2006-08-07 22:56:51 UTC
poppler allocates and frees a lot of objects, especially GooString objects. In
my profiling malloc() and free() are at the top of functions that take most of
the time.

I instrumented the code to record sizes of GooStrings at the time of
destruction. It turns out that ~90% of them are 16 bytes or less.

Currently allocating GooString() requires 2 malloc()s:
* one from new to allocate the object itself
* another one to allocate s pointer to a string

A very simple optimization is to keep a small buffer within GooString itself and
only allocate new space if the string gets bigger. This halfs the number of
malloc() calls for string that entirely fit in the buffer. In case of poppler, a
buffer of size 16 would eliminate 45% malloc()/free() calls that are caused by
GooString().

Another good part is that it doesn't even use more memory. malloc() rounds
allocation sizes anyway (by 16 bytes on Ubuntu, try printf("allocation rounding:
%d\n", -(int)((char*)malloc(1) - (char*)malloc(1)));) and the OS needs
book-keeping information (BOOK_KEEP_SIZE) which is implementation dependent but
usually not less than 8 bytes. Assuming those parameters, memory used by current
implementation of GooString is:
* BOOK_KEEP_SIZE + round_16(sizeof(GooString)) + round_16($str_size) +
BOOK_KEEP_SIZE = 2*BOOK_KEEP_SIZE + round_16(8) + round_16($str_size) =
2*8+16+round_16($str_size) = 32+round_16($str_size)
In implementation with static buffer, assuming buffer size 16:
* for $str_size <= 16: BOOK_KEEP_SIZE + round_16(sizeof(GooString) +
round_16($str_size) = 8 + round_16(24) = 8 + 32 = 40 so it's actually less real
memory used
For $str_size > 16 a bit more is used.

How to choose size of static space: currently I use 16 but it might be better to
use 24 (so that sizeof(GooString) is 32 because it'll probably be anyway due to
rounding to 16 bytes).

Does it give any real speedup? My test consisted of just loading (no rendering)
of PDFReference16.pdf file (PDF 1.6 spec available from Adobe website) which is
about 8.72 in size.

I used a release build and recorded user time averaged over 4 runs. I got a 10%
speedup (from 1453.095 milliseconds to 1303.7425 milliseconds).

On top of that, the implementation is dead simple.

Attached patch has a #define FAST_GOO_STRING in GooString.h so that interested
parties can easily compile both versions and compare the speeds.

It also has #define DO_HIST that adds code to collect string sizes at the time
of deletion. This shows that majority of strings in poppler is <=16 bytes.

Howerver, I do not recommend keeping those #defines in final version. It's
trivial to remove them.
Comment 1 Krzysztof Kowalczyk 2006-08-07 22:58:20 UTC
Created attachment 6496 [details] [review]
Patch for the faster string implementation
Comment 2 Krzysztof Kowalczyk 2006-08-13 19:51:48 UTC
Created attachment 6540 [details] [review]
GooString optimizations
Comment 3 Krzysztof Kowalczyk 2006-08-13 19:52:23 UTC
Created attachment 6541 [details] [review]
Lexer and UGooString optimization
Comment 4 Krzysztof Kowalczyk 2006-08-13 20:03:16 UTC
I've created two new patches that, when combined, provide ~19% speed improvement
when loading PDFReference16.pdf document (PDF reference from Adobe website).

It cleans up previous patch and adds additional improvements.

Brief overview of changes:
* make GooString use internal buffer for short strings; re-factor GooString to
remove code duplication
* gfree() doesn't have to check for NULL pointer (C library does it anyway, it's
in the C ISO standard). gfree() is called so often that removing that check
improves the speed by 1%
* make UGooString use internal buffer as well; refactor the code to make it more
like GooString
* Parser::getObj(): make 'key' to be UGooString to avoid creating temporary
objects since dictAdd() uses UGooString as the argument
* Lexer::lookChar() and Lexer::getChar() - getChar() is often called right after
lookChar() (for about 30% of all getChar()s). Currently it has to re-do all the
work that lookChar() did. A very simple optimization is to cache the last value
of lookChar() and return it in getChar() if available.
* PageLabelInfo.cc: #include <config.h> since it's needed for compilation on Windows

Most of those changes reduce the number of malloc()/free() calls. There are
still ways to go.
Comment 5 Krzysztof Kowalczyk 2006-09-02 18:32:40 UTC
Created attachment 6798 [details] [review]
Additional fix for previous Lexer caching patch

This patch fixes a bug introduces in patch 6541 (Lexer and UGooString
optimization) by taking into account the fact that we might have one value
cached when creating a substream from current stream.

They should be applied together.
Comment 6 Nickolay V. Shmyrev 2006-09-03 01:52:32 UTC
Krzysztof, although patch aren't commented yet, I think they are just great,
thanks for your work, it should certainly go into poppler.
Comment 7 Albert Astals Cid 2006-12-28 07:51:59 UTC
commited to the head branch. Thanks for the work.
Comment 8 Dwight Kelly 2007-01-31 15:24:14 UTC
I found at least 3 files that now fail to parse correctly because of the lookChar() caching patch.

http://www.apago.com/~dkelly/bugzilla_7808.zip
Comment 9 Albert Astals Cid 2007-02-01 11:58:40 UTC
I can see poppler failing in the Gates_direct.pdf but not on the other 2, what's the problem there?
Comment 10 Albert Astals Cid 2007-02-01 12:40:00 UTC
Created attachment 8562 [details] [review]
Tentative patch to fix Gates_direct.pdf problems

Does this patch fix the problems you are having?
Comment 11 Albert Astals Cid 2007-02-03 16:18:01 UTC
fixed

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.