Bug 90596 - pdftoppm: out of memory; tries to allocate huge bitmap
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: splash backend
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: poppler-bugs
Depends on: 94053
Reported: 2015-05-22 15:35 UTC by Jakub Wilk
Modified: 2016-02-17 22:03 UTC

Attachments
Fall back to Gfx implementation of tiling pattern if repetition rate is small (565 bytes, patch)
2016-02-16 09:54 UTC, Thomas Freitag

Description Jakub Wilk 2015-05-22 15:35:31 UTC
pdftoppm runs out of memory when trying to convert this PDF file:

$ wget -q https://bitbucket.org/jwilk/pdf2djvu/issue-attachment/106/jwilk/pdf2djvu/1432051577.18/106/Page156.pdf
$ ulimit -v 500000  # 500MB
$ pdftoppm -r 300 Page156.pdf > /dev/null
Out of memory


Apparently it's because it tries to allocate memory for a huge bitmap (2146x61483, whereas the output image size is only 2280x3071):

#0  0x00007ffff5d4d620 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007ffff5ce8473 in _IO_new_file_write (f=0x7ffff6018060 <_IO_2_1_stderr_>, data=0x7ffff7b54770, n=14) at fileops.c:1253
#2  0x00007ffff5ce7b33 in new_do_write (fp=fp@entry=0x7ffff6018060 <_IO_2_1_stderr_>, data=data@entry=0x7ffff7b54770 "Out of memory\n", to_do=to_do@entry=14) at fileops.c:530
#3  0x00007ffff5ce8a86 in _IO_new_file_xsputn (f=0x7ffff6018060 <_IO_2_1_stderr_>, data=<optimized out>, n=14) at fileops.c:1335
#4  0x00007ffff5cdeb0d in __GI__IO_fwrite (buf=<optimized out>, size=1, count=14, fp=0x7ffff6018060 <_IO_2_1_stderr_>) at iofwrite.c:43
#5  0x00007ffff7ac376e in gmalloc (size=131942518, checkoverflow=false) at gmem.cc:111
#6  0x00007ffff7ac395b in gmallocn (nObjs=2146, objSize=61483, checkoverflow=false) at gmem.cc:192
#7  0x00007ffff7ac397f in gmallocn (nObjs=2146, objSize=61483) at gmem.cc:196
#8  0x00007ffff7af5399 in SplashBitmap::SplashBitmap (this=0x67e9a0, widthA=2146, heightA=61483, rowPadA=1, modeA=splashModeRGB8, alphaA=true, topDown=true, separationListA=0x663350) at SplashBitmap.cc:119
#9  0x00007ffff7aeaac7 in Splash::scaleImage (this=0x6665e0, src=0x7ffff799da4c <SplashOutputDev::tilingBitmapSrc(void*, unsigned char*, unsigned char*)>, srcData=0x7fffffffdab0, srcMode=splashModeRGB8, nComps=3, srcAlpha=true, srcWidth=2146, srcHeight=2148, scaledWidth=2146, scaledHeight=61483, interpolate=false, tilingPattern=false) at Splash.cc:4133
#10 0x00007ffff7ae99eb in Splash::arbitraryTransformImage (this=0x6665e0, src=0x7ffff799da4c <SplashOutputDev::tilingBitmapSrc(void*, unsigned char*, unsigned char*)>, srcData=0x7fffffffdab0, srcMode=splashModeRGB8, nComps=3, srcAlpha=true, srcWidth=2146, srcHeight=2148, mat=0x7fffffffdae0, interpolate=false, tilingPattern=true) at Splash.cc:3934
#11 0x00007ffff7ae8f03 in Splash::drawImage (this=0x6665e0, src=0x7ffff799da4c <SplashOutputDev::tilingBitmapSrc(void*, unsigned char*, unsigned char*)>, srcData=0x7fffffffdab0, srcMode=splashModeRGB8, srcAlpha=true, w=2146, h=2148, mat=0x7fffffffdae0, interpolate=false, tilingPattern=true) at Splash.cc:3799
#12 0x00007ffff79a2aef in SplashOutputDev::tilingPatternFill (this=0x6497f0, state=0x67e180, gfxA=0x6601b0, catalog=0x6496a0, str=0x6d03f8, ptm=0x6d03c8, paintType=1, resDict=0x6709d0, mat=0x7fffffffdd20, bbox=0x6d0388, x0=0, y0=0, x1=1, y1=1, xStep=20, yStep=20) at SplashOutputDev.cc:4361
#13 0x00007ffff79fe3a3 in Gfx::doTilingPatternFill (this=0x6601b0, tPat=0x6d0370, stroke=false, eoFill=true, text=false) at Gfx.cc:2283
#14 0x00007ffff79fcc3b in Gfx::doPatternFill (this=0x6601b0, eoFill=true) at Gfx.cc:2020
#15 0x00007ffff79fc674 in Gfx::opEOFill (this=0x6601b0, args=0x7fffffffe020, numArgs=0) at Gfx.cc:1906
#16 0x00007ffff79f8000 in Gfx::execOp (this=0x6601b0, cmd=0x7fffffffe230, args=0x7fffffffe020, numArgs=0) at Gfx.cc:904
#17 0x00007ffff79f7931 in Gfx::go (this=0x6601b0, topLevel=true) at Gfx.cc:763
#18 0x00007ffff79f7765 in Gfx::display (this=0x6601b0, obj=0x7fffffffe340, topLevel=true) at Gfx.cc:729
#19 0x00007ffff7a5d765 in Page::displaySlice (this=0x650240, out=0x6497f0, hDPI=300, vDPI=300, rotate=0, useMediaBox=true, crop=false, sliceX=0, sliceY=0, sliceW=2280, sliceH=3071, printing=false, abortCheckCbk=0x0, abortCheckCbkData=0x0, annotDisplayDecideCbk=0x0, annotDisplayDecideCbkData=0x0, copyXRef=false) at Page.cc:599
#20 0x00007ffff7a61088 in PDFDoc::displayPageSlice (this=0x648e90, out=0x6497f0, page=1, hDPI=300, vDPI=300, rotate=0, useMediaBox=true, crop=false, printing=false, sliceX=0, sliceY=0, sliceW=2280, sliceH=3071, abortCheckCbk=0x0, abortCheckCbkData=0x0, annotDisplayDecideCbk=0x0, annotDisplayDecideCbkData=0x0, copyXRef=false) at PDFDoc.cc:504
#21 0x00000000004018e0 in savePageSlice (doc=0x648e90, splashOut=0x6497f0, pg=1, x=0, y=0, w=2280, h=3071, pg_w=2279.5250000000001, pg_h=3070.8625000000002, ppmFile=0x0) at pdftoppm.cc:225
#22 0x0000000000402778 in main (argc=2, argv=0x7fffffffe6c8) at pdftoppm.cc:532
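
As a rough check of the numbers in the backtrace, the arithmetic below estimates the size of the intermediate bitmap. It is only a sketch: the 3 bytes per pixel for splashModeRGB8 and the extra one-byte-per-pixel plane (presumably alpha, since alphaA=true) are assumptions, row padding is ignored, and all variable names are made up for illustration.

```python
# Dimensions from frame #8 (SplashBitmap::SplashBitmap: widthA, heightA).
width, height = 2146, 61483
# Output page size from frame #19 (sliceW, sliceH).
out_width, out_height = 2280, 3071

# One byte per pixel; this matches gmalloc(size=131942518) in frame #5.
plane_bytes = width * height
# Assumption: RGB8 colour data takes 3 bytes per pixel (row padding ignored).
rgb_bytes = 3 * width * height
total_bytes = rgb_bytes + plane_bytes

# "ulimit -v 500000" is expressed in KiB.
limit_bytes = 500000 * 1024
# Colour data of the final page image, for comparison.
output_rgb_bytes = 3 * out_width * out_height

print(f"intermediate bitmap: {total_bytes / 1e6:.0f} MB")
print(f"address space limit: {limit_bytes / 1e6:.0f} MB")
print(f"output image (RGB):  {output_rgb_bytes / 1e6:.0f} MB")
print("intermediate bitmap exceeds limit:", total_bytes > limit_bytes)
```

Under these assumptions the intermediate bitmap alone is larger than the whole address space limit, while the final page image is a small fraction of it.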

Tested with Poppler 0.33.0.
Comment 1 Thomas Freitag 2016-02-12 14:37:38 UTC
The huge bitmap is a tiling pattern including its repetitions, before it is scaled to the resulting bitmap. This algorithm was introduced in poppler 0.17.0 and dramatically increased rendering speed when the repetition rate is high.

In this case, however, the tiling pattern itself is huge, while the repetition count in the x and y directions is almost always small: between 1 x 1 and 2 x 2 for nearly all of the many tiling patterns in the file.

So I tried falling back to the old algorithm, where the tiling pattern is rendered again for every repetition, but directly to the resulting bitmap, and measured the time:

a) Without my changes:

time ./utils/pdftoppm -png -cropbox -r 300 90596.open/Page156.pdf output/90596

real	0m29.039s
user	0m28.395s
sys	0m0.275s  

b) With the fallback:

time ./utils/pdftoppm -png -cropbox -r 300 90596.open/Page156.pdf output/90596-new

real	0m25.754s
user	0m25.592s
sys	0m0.170s

So it's even faster to use the fallback here.
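
To put a number on the difference, the relative saving can be computed from the two "real" times reported above. This is just a worked calculation on those two single measurements; the helper name is hypothetical.

```python
def relative_saving(before_s: float, after_s: float) -> float:
    """Fraction of wall-clock time saved when going from before_s to after_s."""
    return (before_s - after_s) / before_s

# "real" times from the two pdftoppm runs above, in seconds.
without_fallback = 29.039
with_fallback = 25.754

saving = relative_saving(without_fallback, with_fallback)
print(f"wall-clock time saved: {saving:.1%}")
```

A single run per variant says little about variance, so the result is best read as an indication rather than a benchmark.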

So I created a patch that falls back to the old algorithm if repeatX * repeatY <= 4, and measured the time of my regression test:

a) Without changes:

Refs created in 44 minutes and 32 seconds

b) With limit and fallback:

Refs created in 41 minutes and 4 seconds
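
The fallback condition can be sketched roughly as follows. This is an illustration of the rule stated above (repeatX * repeatY <= 4), not the actual patch: the function and constant names are hypothetical, and the real check lives in poppler's C++ code.

```python
# Threshold quoted above: fall back if repeatX * repeatY <= 4.
MAX_REPEATS_FOR_FALLBACK = 4

def use_old_algorithm(repeat_x: int, repeat_y: int) -> bool:
    """Return True if the pattern should be rendered once per repetition,
    directly into the output bitmap, instead of building one large
    pre-tiled bitmap that is scaled afterwards."""
    return repeat_x * repeat_y <= MAX_REPEATS_FOR_FALLBACK

# Repetition counts like those seen in this file, plus one highly
# repeated pattern for contrast.
for rx, ry in [(1, 1), (2, 2), (10, 10)]:
    choice = "old per-repetition rendering" if use_old_algorithm(rx, ry) else "pre-tiled bitmap"
    print(f"{rx} x {ry}: {choice}")
```

With this rule, patterns that repeat only a few times avoid the large intermediate bitmap, while highly repeated patterns keep the faster pre-tiled path.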

I will upload the patch once a fix for bug 94053 has been committed.
Comment 2 Thomas Freitag 2016-02-16 09:54:00 UTC
Created attachment 121780
Fall back to Gfx implementation of tiling pattern if repetition rate is small

Here is the announced patch.
Comment 3 Albert Astals Cid 2016-02-17 22:03:26 UTC
Patch committed.

