I recently encountered some PDF files that cause all poppler-based tools (pdfinfo, pdftotext, evince) to allocate a large amount of memory (usually 3GB) and hang for several minutes. Acrobat Reader does not exhibit either problem.

The cause is corrupted linearization hint tables: the program that wrote the PDFs did not properly align the start of the shared objects hint table on a byte boundary. So its header looks like:

    firstSharedObjectNumber   00 00 00 00
    firstSharedObjectOffset   00 00 00 00
    nSharedGroupsFirst        00 00 00 01
    nSharedGroups             10 00 00 01
    nBitsNumObjects           10 00
    groupLengthLeast          00 00 00 02
    nBitsDiffGroupLength      80 01

Hints::readSharedObjectsTable allocates several giant arrays, and then spends ages trying to populate them (without checking whether it has reached the end of the stream). Since nBits* can't be more than 32, this hint table should just be rejected as invalid immediately.
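To make the proposed check concrete, here is a minimal sketch that decodes the header bytes quoted above as big-endian fields and rejects the table when either bit width exceeds 32. The struct and the functions readBE, parseHeader and bitWidthsValid are hypothetical names used for illustration only; this is not poppler's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical struct; field names follow the header dump in this report.
struct SharedObjectHintHeader {
    std::uint32_t firstSharedObjectNumber;
    std::uint32_t firstSharedObjectOffset;
    std::uint32_t nSharedGroupsFirst;
    std::uint32_t nSharedGroups;
    std::uint16_t nBitsNumObjects;
    std::uint32_t groupLengthLeast;
    std::uint16_t nBitsDiffGroupLength;
};

// Read `count` bytes (at most 4) starting at `pos` as a big-endian integer.
static std::uint32_t readBE(const std::vector<std::uint8_t>& buf,
                            std::size_t& pos, std::size_t count) {
    assert(count <= 4 && pos + count <= buf.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < count; ++i) {
        v = (v << 8) | buf[pos++];
    }
    return v;
}

// Decode the 24-byte header; returns false if the buffer is too short.
bool parseHeader(const std::vector<std::uint8_t>& buf, SharedObjectHintHeader& h) {
    if (buf.size() < 24) {
        return false;
    }
    std::size_t pos = 0;
    h.firstSharedObjectNumber = readBE(buf, pos, 4);
    h.firstSharedObjectOffset = readBE(buf, pos, 4);
    h.nSharedGroupsFirst = readBE(buf, pos, 4);
    h.nSharedGroups = readBE(buf, pos, 4);
    h.nBitsNumObjects = static_cast<std::uint16_t>(readBE(buf, pos, 2));
    h.groupLengthLeast = readBE(buf, pos, 4);
    h.nBitsDiffGroupLength = static_cast<std::uint16_t>(readBE(buf, pos, 2));
    return true;
}

// The proposed sanity check: a "number of bits" field must be in 0..32.
bool bitWidthsValid(const SharedObjectHintHeader& h) {
    return h.nBitsNumObjects <= 32 && h.nBitsDiffGroupLength <= 32;
}
```

With the bytes from this report, nBitsNumObjects decodes to 0x1000 (4096) and nBitsDiffGroupLength to 0x8001 (32769), so the table would be rejected before any array is allocated.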
The PDFs were produced by "Aspose.Pdf for .NET 8.9.0", a library which is apparently quite widely used.
Created attachment 122979: Example broken PDF file
You seem to know what you're talking about; maybe you can produce a patch?
Created attachment 125492: Hints.cc patch

I really don't know much about linearization, but here's a patch to try to fix a couple of problems that stand out:

1. If nBitsNumObjects or nBitsDiffGroupLength is greater than 32, bail out early.
2. Improve readBits efficiency (replace recursion with iteration; fix EOF detection to work on any bit, not just those where n is equal to 1 modulo 32).
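For readers without the attachment, the second point can be sketched roughly as follows. This is not the attached patch and not poppler's Stream API: BitReader is a hypothetical stand-in that reads from an in-memory buffer, and it assumes bits are packed high-order bit first. It reads bits in a loop rather than recursively, and it reports end of data for any read that would run past the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical bit-level reader over a hint stream's bytes (illustration only).
class BitReader {
public:
    explicit BitReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    // Number of unread bits remaining in the buffer.
    std::size_t bitsLeft() const { return data_.size() * 8 - bitPos_; }

    // Read n bits (0..32), most significant bit first, into `out`.
    // Returns false and consumes nothing if fewer than n bits remain,
    // so the caller can stop populating its tables at end of stream.
    bool readBits(unsigned n, std::uint32_t& out) {
        assert(n <= 32);
        if (n > bitsLeft()) {
            return false;
        }
        std::uint32_t value = 0;
        for (unsigned i = 0; i < n; ++i) {
            const std::uint8_t byte = data_[bitPos_ / 8];
            const unsigned bit = (byte >> (7 - bitPos_ % 8)) & 1u;
            value = (value << 1) | bit;
            ++bitPos_;
        }
        out = value;
        return true;
    }

private:
    std::vector<std::uint8_t> data_;
    std::size_t bitPos_ = 0;
};
```

A caller filling a per-group array would check the return value on every read and abandon the table as soon as it is false, instead of looping over the full declared group count.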
Where in the spec does it say that those values have to be smaller than 33?
Right before the table of fields in the Page Offset Hint Table header, there's a note: "All the items in Table F.3 that specify a number of bits needed, such as item 3, have values in the range 0 through 32. Although that range requires only 6 bits, 16-bit numbers shall be used." It doesn't explicitly say this about the Shared Object Hint Table header (described in Table F.5), but there's no indication that it's different, nor can I think of any reason for it to be.
Pushed the first part; the second part didn't apply (and was unrelated anyway).