Summary: | Speedup PDF loading by ~19% | ||
---|---|---|---|
Product: | poppler | Reporter: | Krzysztof Kowalczyk <kkowalczyk> |
Component: | general | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | high | CC: | dkelly |
Version: | unspecified | ||
Hardware: | x86 (IA32) | ||
OS: | Windows (All) | ||
Whiteboard: | |||
Attachments: |
Patch for the faster string implementation
GooString optimizations
Lexer and UGooString optimization
Additional fix for previous Lexer caching patch
Tentative patch to fix Gates_direct.pdf problems |
Description
Krzysztof Kowalczyk
2006-08-07 22:56:51 UTC
Created attachment 6496: Patch for the faster string implementation

Created attachment 6540: GooString optimizations

Created attachment 6541: Lexer and UGooString optimization

I've created two new patches that, when combined, provide a ~19% speed improvement when loading the PDFReference16.pdf document (the PDF reference from the Adobe website). They clean up the previous patch and add further improvements.

Brief overview of changes:

* make GooString use an internal buffer for short strings; refactor GooString to remove code duplication
* gfree() doesn't have to check for a NULL pointer (the C library does it anyway, as required by the ISO C standard). gfree() is called so often that removing that check improves the speed by 1%
* make UGooString use an internal buffer as well; refactor the code to make it more like GooString
* Parser::getObj(): make 'key' a UGooString to avoid creating temporary objects, since dictAdd() takes a UGooString as its argument
* Lexer::lookChar() and Lexer::getChar(): getChar() is often called right after lookChar() (for about 30% of all getChar() calls). Currently it has to redo all the work that lookChar() did. A very simple optimization is to cache the last value of lookChar() and return it in getChar() if available.
* PageLabelInfo.cc: #include <config.h>, since it's needed for compilation on Windows

Most of those changes reduce the number of malloc()/free() calls. There is still room for improvement.

Created attachment 6798: Additional fix for previous Lexer caching patch

This patch fixes a bug introduced in patch 6541 (Lexer and UGooString optimization) by taking into account the fact that we might have one value cached when creating a substream from the current stream. They should be applied together.

Krzysztof, although the patches aren't commented yet, I think they are just great. Thanks for your work, it should certainly go into poppler.

Committed to the head branch. Thanks for the work.
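The lookChar()/getChar() caching idea from the overview can be illustrated with a minimal sketch. This is not the code from attachment 6541: the classes `CountingSource` and `CachingLexer`, the `logicalPos()` helper, and the `kEOF` constant are all hypothetical names invented for illustration. The sketch assumes that peeking is implemented by consuming one byte from the source and parking it, which is also why anything that reads the source position directly (such as creating a substream) has to account for a parked byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

constexpr int kEOF = -1;  // hypothetical end-of-input marker

// Hypothetical byte source standing in for an expensive stream read.
// It counts calls so the saved work can be observed.
class CountingSource {
public:
    explicit CountingSource(std::string data) : data_(std::move(data)) {}

    // Consumes and returns the next byte, or kEOF at end of input.
    int next() {
        ++calls_;
        if (pos_ >= data_.size()) return kEOF;
        return static_cast<unsigned char>(data_[pos_++]);
    }

    std::size_t pos() const { return pos_; }
    int calls() const { return calls_; }

private:
    std::string data_;
    std::size_t pos_ = 0;
    int calls_ = 0;
};

// Hypothetical lexer front end that caches the last peeked value.
class CachingLexer {
public:
    explicit CachingLexer(CountingSource& src) : src_(src) {}

    // Peek: read one byte from the source and park it until getChar() takes it.
    int lookChar() {
        if (!hasCached_) {
            cached_ = src_.next();
            hasCached_ = true;
        }
        return cached_;
    }

    // Return the parked byte if there is one; otherwise read from the source.
    int getChar() {
        if (hasCached_) {
            hasCached_ = false;
            return cached_;
        }
        return src_.next();
    }

    // Position as seen by the lexer's caller. While a real byte is parked,
    // the source is one byte ahead of this position.
    std::size_t logicalPos() const {
        if (hasCached_ && cached_ != kEOF) {
            assert(src_.pos() >= 1);
            return src_.pos() - 1;
        }
        return src_.pos();
    }

private:
    CountingSource& src_;
    int cached_ = kEOF;
    bool hasCached_ = false;
};
```

In this sketch a lookChar() followed by getChar() costs one source read instead of two. The trade-off is the gap between `src.pos()` and `logicalPos()` while a byte is parked, which is the kind of state a substream-creation path would need to handle.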
I found at least 3 files that now fail to parse correctly because of the lookChar() caching patch. http://www.apago.com/~dkelly/bugzilla_7808.zip

I can see poppler failing on Gates_direct.pdf but not on the other 2. What's the problem there?

Created attachment 8562: Tentative patch to fix Gates_direct.pdf problems

Does this patch fix the problems you are having?

fixed