Bug 89097

Summary: [patch] inline some frequently called stream functions
Product: poppler Reporter: William Bader <williambader>
Component: generalAssignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium CC: williambader
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: patch to inline some functions in Stream.cc and Stream.h

Description William Bader 2015-02-12 04:25:34 UTC
Created attachment 113384 [details] [review]
patch to inline some functions in Stream.cc and Stream.h

The various Stream classes have getChar() functions that are called once for each byte in the stream.
This patch makes some of the simpler functions inline.
Most of the functions are virtual, so they probably can't be inlined, but a few are not virtual, and the change seems to make a small difference.

The profiles are from running "pdftops 2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf x.ps" three times on the test file from https://bugs.freedesktop.org/show_bug.cgi?id=89076

pdftops before the patch
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 15.25      0.93     0.93  8905797     0.00     0.00  GooString::formatDouble(double, char*, int, int, bool, char**, int*)
  9.34      1.50     0.57  6172503     0.00     0.00  GooString::appendfv(char const*, __va_list_tag*)
  7.87      1.98     0.48 34515024     0.00     0.00  FlateStream::getHuffmanCodeWord(FlateHuffmanTab*)
  7.05      2.41     0.43 22730064     0.00     0.00  FlateStream::readSome()
  6.48      2.81     0.40 18514623     0.00     0.00  Lexer::getObj(Object*, int)
  5.90      3.17     0.36 16607355     0.00     0.00  Parser::getObj(Object*, bool, unsigned char*, CryptAlgorithm, int, int, int, int, bool)
  5.90      3.53     0.36 106901274     0.00     0.00  JBIG2Segment::~JBIG2Segment()
  4.92      3.83     0.30 111594801     0.00     0.00  FlateStream::getChar()
  4.34      4.09     0.27 38266749     0.00     0.00  Object::free()
  3.69      4.32     0.23 87202734     0.00     0.00  Lexer::lookChar()
  3.61      4.54     0.22 19199391     0.00     0.00  GooString::append(char const*, int)
  2.79      4.71     0.17      438     0.39    12.47  Gfx::go(bool)
  2.62      4.87     0.16    38745     0.00     0.00  FlateStream::compHuffmanCodes(int*, int, FlateHuffmanTab*)
  1.39      4.95     0.09 10967463     0.00     0.00  gmallocn

pdftops after the inline patch
 12.99      0.73     0.73  8905797     0.00     0.00  GooString::formatDouble(double, char*, int, int, bool, char**, int*)
  9.96      1.29     0.56  6172503     0.00     0.00  GooString::appendfv(char const*, __va_list_tag*)
  8.10      1.75     0.46 18514623     0.00     0.00  Lexer::getObj(Object*, int)
  7.38      2.16     0.42 22730064     0.00     0.00  FlateStream::readSome()
  6.23      2.51     0.35 106901274     0.00     0.00  JBIG2Segment::~JBIG2Segment()
  6.14      2.86     0.35 34515024     0.00     0.00  FlateStream::getHuffmanCodeWord(FlateHuffmanTab*)
  4.98      3.14     0.28 16607355     0.00     0.00  Parser::getObj(Object*, bool, unsigned char*, CryptAlgorithm, int, int, int, int, bool)
  4.72      3.40     0.27 111594801     0.00     0.00  FlateStream::getChar()
  3.02      3.57     0.17      438     0.39    11.37  Gfx::go(bool)
  2.67      3.72     0.15    38745     0.00     0.00  FlateStream::compHuffmanCodes(int*, int, FlateHuffmanTab*)
  2.40      3.86     0.14 38266749     0.00     0.00  Object::free()
  2.14      3.98     0.12 87202734     0.00     0.00  Lexer::lookChar()
  2.05      4.09     0.12 43505223     0.00     0.00  FileStream::getChar()
  1.96      4.20     0.11  5449956     0.00     0.00  Gfx::findOp(char*)
  1.87      4.31     0.11 18343965     0.00     0.00  Parser::shift(int)
  1.78      4.41     0.10 10967463     0.00     0.00  gmallocn
  1.78      4.51     0.10  5449956     0.00     0.00  Gfx::execOp(Object*, Object*, int)
  1.51      4.59     0.09 19199391     0.00     0.00  GooString::append(char const*, int)
  1.07      4.65     0.06     2904     0.02     0.07  SampledFunction::SampledFunction(Object*, Dict*)
  0.98      4.71     0.06 13247493     0.00     0.00  FlateStream::getCodeWord(int)
  0.89      4.76     0.05  1208397     0.00     0.00  RunLengthStream::getChars(int, unsigned char*)
  0.71      4.80     0.04   579486     0.00     0.00  GooString::formatInt(long long, char*, int, bool, int, int, char**, int*, bool)
  0.62      4.83     0.04    72426     0.00     0.00  Parser::~Parser()
  0.62      4.87     0.04    19782     0.00     0.00  BaseStream::getDict()
  0.62      4.90     0.04      654     0.05     0.05  Lexer::getChar(bool)
  0.62      4.94     0.04                             LZWStream::getRawChar()
Comment 1 Albert Astals Cid 2015-02-12 19:24:01 UTC
This doesn't really make much sense, makes the code somewhat harder to read and makes future patching from xpdf harder for no win since those calls can't be inlined, see http://www.parashift.com/c++-faq-lite/inline-virtuals.html

So i'd prefer not accepting this patch.
Comment 2 William Bader 2015-02-12 21:02:43 UTC
OK.
It didn't seem to make much of a performance difference since the getChar() functions are virtual and can't be inlined.
I think that a better solution is making the low-level Stream interface based on filling buffers rather than passing characters one at a time.
The profile above has 111M calls to one of the getChar() functions.
Would it be possible to have only one non-virtual getChar() in Stream that calls a virtual function to fill a buffer as needed?
At worst the buffer fill function could just load the one character that getChar() would have returned, so it would not require a redesign of streams that can provide only one character at a time.
Comment 3 Albert Astals Cid 2015-02-12 22:11:02 UTC
there is doGetChars, use it in the places you think that may be needed.
Comment 4 GitLab Migration User 2018-08-21 10:50:12 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/389.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.