Bug 89076 - [patch] fix "length bigger than vheaTab size" and "length bigger than vmtxTab size"
Summary: [patch] fix "length bigger than vheaTab size" and "length bigger than vmtxTab...
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: poppler-bugs
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-02-11 06:14 UTC by William Bader
Modified: 2019-05-15 06:34 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
patch to fix a FoFiTrueType test and to use alloca in GooString::appendfv() (4.05 KB, patch)
2015-02-11 06:14 UTC, William Bader
Details | Splinter Review

Description William Bader 2015-02-11 06:14:42 UTC
Created attachment 113337 [details] [review]
patch to fix a FoFiTrueType test and to use alloca in GooString::appendfv()

I looked at a report of slowness on the PDF at https://www.gnu.org/software/autoconf/manual/autoconf-2.60/html_node/Particular-Functions.html

It gets two warnings due to a recently added test in fofi/FoFiTrueType.cc that is < and should be <=
Syntax Warning: length bigger than vheaTab size
Syntax Warning: length bigger than vmtxTab size
The attached patch fixes the test.

This file also gets the warnings below for each page because the page offset table has a length of 0.  See Hints::readPageOffsetTable().  ghostscript does not complain, but I think that the page offset table should have at least 1 item, so the PDF might be bad. The Creator is "PDFCreator 2.0.2.0".
Syntax Warning: Invalid least number of objects reading page offset hints table
Syntax Warning: Failed to get object num from hint tables for page 1
Syntax Warning: Failed parsing page 1 using hint tables
Syntax Warning: Failed to get object num from hint tables for page 1
Syntax Warning: Failed parsing page 1 using hint tables

Then I did a build with profiling. I have some of the results below.  pdftops makes a lot of use of GooString::formatDouble() and GooString::appendfv() to write numbers to the output postscript file.  appendfv() is called a lot and allocates a buffer each time.  I changed the allocation from gmallocn() to alloca() (including updating configure.ac to check for AC_FUNC_ALLOCA).  alloca() is a little risky because it does not do overflow checking, but I think that appendfv() is never fed huge strings.

After that, it would be nice to be able to inline FlateStream::getChar(), but I think that it would require a lot of special case code for each type of stream.

On my Fedora 20, 2.5 GHz i5-2450M laptop, running on a RAM disk, "pdftops 2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf x" takes 3.6s and "pdftocairo -jpeg 2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf x" takes 4.6s.  This document has 68 pages, so that is about 15 pages per second.

On page 11 and some other pages, valgrind with pdftocairo complains
jpeg_write_scanlines (in /usr/lib64/libjpeg.so.62.1.0)
JpegWriter::writeRow(unsigned char**) (JpegWriter.cc:166)
writePageImage(GooString*) (pdftocairo.cc:418)
main (pdftocairo.cc:660)
In pdftocairo, height = 1755, stride = 4960, width = 1240, transp = 0, tiff = 0, gray = 0, mono = 0, so it is in the case for RGB.
pdftoppm -jpeg is clean, so it might be an interaction between cairo and libjpeg.


pdftops 2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf xxx
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 14.97      0.28     0.28  2968599     0.00     0.00  GooString::formatDouble(double, char*, int, int, bool, char**, int*)
 12.83      0.52     0.24  2057501     0.00     0.00  GooString::appendfv(char const*, __va_list_tag*)
  7.22      0.66     0.14  7576688     0.00     0.00  FlateStream::readSome()
  7.22      0.79     0.14 37198267     0.00     0.00  FlateStream::getChar()
  5.35      0.89     0.10 35633758     0.00     0.00  JBIG2Segment::~JBIG2Segment()
  5.35      0.99     0.10  6171541     0.00     0.00  Lexer::getObj(Object*, int)
  4.28      1.07     0.08  5535785     0.00     0.00  Parser::getObj(Object*, bool, unsigned char*, CryptAlgorithm, int, int, int, int, bool)
  3.74      1.14     0.07 29067578     0.00     0.00  Lexer::lookChar()
  3.74      1.21     0.07 11505008     0.00     0.00  FlateStream::getHuffmanCodeWord(FlateHuffmanTab*)
  3.48      1.28     0.07    12915     0.01     0.01  FlateStream::compHuffmanCodes(int*, int, FlateHuffmanTab*)
  2.94      1.33     0.06  4415831     0.00     0.00  FlateStream::getCodeWord(int)
  2.67      1.38     0.05 14501741     0.00     0.00  FileStream::getChar()
  2.67      1.43     0.05      146     0.34    11.52  Gfx::go(bool)
  2.14      1.47     0.04 12755583     0.00     0.00  Object::free()
  2.14      1.51     0.04  6114655     0.00     0.00  Parser::shift(int)
  1.60      1.54     0.03  1816652     0.00     0.00  Gfx::execOp(Object*, Object*, int)
  1.60      1.57     0.03      218     0.14     0.14  Lexer::getChar(bool)

pdftocairo -jpeg 2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf xxx
 32.43      0.36     0.36                             writePageImage(GooString*)
  9.01      0.46     0.10  3027722     0.00     0.00  Lexer::getObj(Object*, int)
  8.11      0.55     0.09  5304419     0.00     0.00  FlateStream::getHuffmanCodeWord(FlateHuffmanTab*)
  5.41      0.61     0.06  3436913     0.00     0.00  FlateStream::readSome()
  5.41      0.67     0.06  2762912     0.00     0.00  Parser::getObj(Object*, bool, unsigned char*, CryptAlgorithm, int, int, int, int, bool)
  2.70      0.70     0.03 18211014     0.00     0.00  FlateStream::getChar()
  2.70      0.73     0.03 17502359     0.00     0.00  GlobalParams::getSecurityHandler(char*)
  2.70      0.76     0.03  6246744     0.00     0.00  Object::free()
  2.70      0.79     0.03    18126     0.00     0.00  GfxDeviceCMYKColorSpace::getRGBLine(unsigned char*, unsigned int*, int)
  2.70      0.82     0.03     5172     0.01     0.01  FlateStream::compHuffmanCodes(int*, int, FlateHuffmanTab*)
  1.80      0.84     0.02 14309886     0.00     0.00  Lexer::lookChar()

pdftoppm -jpeg 2-DESCR_648853-IT-EN-DE_MANITOU_RIGENERATO.pdf xxx
  9.35      0.29     0.29 12022517     0.00     0.00  Splash::pipeRunAARGB8(SplashPipe*)
  9.35      0.58     0.29      275     1.05     1.61  Splash::scaleImageYuXuBilinear(bool (*)(void*, unsigned char*, unsigned char*), void*, SplashColorMode, int, bool, int, int, int, int, SplashBitmap*)
  7.42      0.81     0.23       68     3.38     3.38  Splash::compositeBackground(unsigned char*)
  6.77      1.02     0.21   260902     0.00     0.00  SplashXPathScanner::clipAALine(SplashBitmap*, int*, int*, int)
  4.19      1.15     0.13  8111163     0.00     0.00  void std::make_heap<SplashXPathSeg*, cmpXPathSegsFunctor>(SplashXPathSeg*, SplashXPathSeg*, cmpXPathSegsFunctor)
  4.19      1.28     0.13   209068     0.00     0.00  SplashXPathSeg* std::__unguarded_partition<SplashXPathSeg*, SplashXPathSeg, cmpXPathSegsFunctor>(SplashXPathSeg*, SplashXPathSeg*, SplashXPathSeg const&, cmpXPathSegsFunctor)
  4.19      1.41     0.13   154405     0.00     0.01  Splash::fillWithPattern(SplashPath*, bool, SplashPattern*, double)
  3.23      1.51     0.10   582666     0.00     0.00  SplashXPathScanner::renderAALine(SplashBitmap*, int*, int*, int, bool)
  3.23      1.61     0.10   126376     0.00     0.00  SplashXPathScanner::computeIntersections()
  2.90      1.70     0.09 14488586     0.00     0.00  Splash::pipeRunSimpleRGB8(SplashPipe*)
  2.90      1.79     0.09   229187     0.00     0.00  SplashClip::clipAALine(SplashBitmap*, int*, int*, int, bool)
  2.26      1.86     0.07 18211014     0.00     0.00  FlateStream::getChar()
  2.26      1.93     0.07  6248717     0.00     0.00  Object::free()
  2.26      2.00     0.07    20483     0.00     0.00  expandRow(unsigned char*, unsigned char*, int, int, int)
  2.26      2.07     0.07  3028636     0.00     0.00  Lexer::getObj(Object*, int)
Comment 2 Adrian Johnson 2015-02-11 10:52:49 UTC
These three changes are unrelated and should be split into three separate commits.

My only concern with the alloca change is ensuring the stack usage is not too large.

>  argsSize = 8;
> +#if HAVE_ALLOCA_H
> +  args = (GooStringFormatArg *)alloca(argsSize * sizeof(GooStringFormatArg));
> +#else

This is a fixed size allocation that I estimate to be up to 544 bytes so it should be fine.

>  	    argsSize *= 2;
> +#if HAVE_ALLOCA_H
> +	    {
> +	      GooStringFormatArg *new_args = (GooStringFormatArg
>      *)alloca(argsSize * sizeof(GooStringFormatArg));
> +	      memcpy(new_args, args, argsLen);
> +	      args = new_args;
> +	    }
> +#else

It looks like the allocation is increased if there are more than 8 format args. I have not searched the code to see if there are more than 8 format args used. I'm guessing this is unlikely so maybe when there are more than 8 args we could fallback to using gmallocn/greallocn. If > 8 args is rare there will be little performance impact but it does put an upper bound on the stack usage.

> +#if HAVE_ALLOCA_H
> +#else
>   gfree(args);
> +#endif

#ifndef HAVE_ALLOCA_H
   gfree(args);
#endif

would be better.

> +#if 0
> int FlateStream::getChar() {
>   if (pred) {
>     return pred->getChar();
>   }
>   return doGetRawChar();
> }
> +#endif

I don't see any need for keeping this. It will always be in the git history.


I had trouble understanding what the timings you provided are referring to. Could you provide separate times for master, master + alloca change, and master + alloc + inline change.
Comment 3 William Bader 2015-02-11 15:00:50 UTC
Thanks for looking at it.

>These three changes are unrelated and should be split into three separate commits.

Later today, I will make a patch with just the FoFiTrueType patch here and open separate reports for the alloca change and the Flate change.

>#ifndef HAVE_ALLOCA_H

I did that intentionally for symmetry with the allocation to make it clear that the other case was not forgotten.  I do that in my own code, but I will make the new patch match the style in poppler.

>I had trouble understanding what the timings

The timings are after the change.  The alloca change made appendfv() about 5% faster.

The FlateStream change had no effect.  Some of the other streams inline their getChar() also, but I suspect that all of the stream getChar() functions are called from places that gcc can not tell the Stream type at compile time.

I ran several pdftoxxx utilities because the person who reported the performance issues did not say what output device he was using.  On my laptop, pdftocairo is significantly faster than pdftoppm because cairo uses assembly code that takes advantage of vector instructions while splash is pure C++.

The person who reported the issue had an ARM CPU, and the coding of the output method and the display hardware could make a big difference.  For example, we have a RaspberryPi running Fedora, and moving windows on the console is so slow that you can see the pixels paint, like the first time that I ran X on a 20MHz 386 nearly 30 years ago.
Comment 4 Albert Astals Cid 2015-02-11 16:36:26 UTC
I've pushed the off by one fix to FoFiTrueType, thanks for catching!

Please open different bugs for the rest.
Comment 5 William Bader 2015-02-12 04:30:13 UTC
Thanks.

The gmallocn() GooString::appendfv() patch is in https://bugs.freedesktop.org/show_bug.cgi?id=89096

The Stream inline patch is in https://bugs.freedesktop.org/show_bug.cgi?id=89097
Comment 6 bholeshankar1992crax@gmail.com (Spammer; Account disabled) 2019-05-15 06:34:37 UTC Comment hidden (spam)


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.