Bug 48679 - "Error: Invalid XRef entry" messages for a valid PDF
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: poppler-bugs
Reported: 2012-04-14 00:53 UTC by Thomas Freitag
Modified: 2012-04-17 11:39 UTC

Attachments
Patch to suppress wrong "Error: Invalid XRef entry" messages (1.15 KB, patch)
2012-04-14 00:53 UTC, Thomas Freitag

Description Thomas Freitag 2012-04-14 00:53:11 UTC
Created attachment 59961
Patch to suppress wrong "Error: Invalid XRef entry" messages

I have a valid PDF for which I get several "Error: Invalid XRef entry" messages when I call, e.g., pdftoppm. The PDF is too large to attach to this bug report, so I will try to describe the problem:
The PDF has an object with a big string:
56 0 obj <</CharSet (þÿ^@\(^@/^@S^@/^@t^@/^@r^@/^@a^@/egmrnabdsl^@/^@e^@/^@n^@/^@b^@/^@u^@/psca^@e^@/yhhpne^@/^@T^@/^@i^@/^@f^@/^@c^@/^@h^@/^@o^@/^@w^@/^@k^@/^@N^@/^@s^@/^@C^@/^@l^@/^@D^@/inen^@/^@A^@/no^@e^@/hter^@e^@/ifev^@/^@x^@/wt^@o^@/^@m^@/ezor^@/ofru^@/^@E^@/epirdo^@/^@d^@/^@G^@/^@H^@/maepsrna^@d^@/^@K^@/dueieris^@s^@/^@g^@\)) /CapHeight 500 /Ascent 728 /Flags 32 /FontFile 58 0 R /ItalicAngle 0 /Descent -210 /XHeight 250 /FontName /ZJIGIZ+ArialMT,Bold /Leading 150 /FontBBox [-628 -376 2000 1010 ] /MaxWidth 2628 /AvgWidth 479 /Type /FontDescriptor /StemV 0 >> endobj
While Lexer::getObj parses this string, the string exceeds the token buffer size (128 bytes), so xref->getNumEntry(curStr.streamGetPos()) is called to check that the document is not malformed and that we are not growing the buffer too much.
XRef::getNumEntry walks over every xref entry to find the object number for the current stream position, so it also calls XRef::getEntry with parameter 2. But in this PDF the object numbers 2 to 4 are not used:
xref
0 2
0000000002 65535 f
0000000015 00000 n
5 145
0000000236 00000 n
:  :  :
As a result, XRef::getEntry, e.g. with parameter 2, rescans the xref section, finds that the object number is not used, and emits the error message.
Because getNumEntry is called only to check for malformed documents and returns the object number for a given stream position, it shouldn't check xrefEntryNone entries. The attached patch fixes this.
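
The situation described above can be modeled with a small, self-contained sketch. This is not poppler's code: Entry, EntryType, buildTable and numEntryForOffset are hypothetical names, and the lookup rule (pick the in-use entry with the largest offset that does not exceed the position) is an assumption about what the check is meant to compute. It only illustrates that the two subsections never define objects 2 to 4, and that a position lookup can skip such entries without reporting anything.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of an xref table; not poppler's actual data structures.
enum class EntryType { None, Free, InUse };

struct Entry {
    EntryType type; // None: never defined by any xref subsection
    long offset;
};

// Build the table from the report: subsection "0 2" defines objects 0 and 1,
// subsection "5 145" starts at object 5. Objects 2..4 stay EntryType::None.
// Only the first entry of the second subsection is modeled here.
std::vector<Entry> buildTable() {
    std::vector<Entry> table(6, Entry{EntryType::None, 0});
    table[0] = Entry{EntryType::Free, 2};    // 0000000002 65535 f
    table[1] = Entry{EntryType::InUse, 15};  // 0000000015 00000 n
    table[5] = Entry{EntryType::InUse, 236}; // 0000000236 00000 n
    return table;
}

// Return the number of the in-use object with the largest offset that does
// not exceed pos, or -1 if there is none. Undefined and free entries are
// skipped silently: for this check they are not treated as an error.
int numEntryForOffset(const std::vector<Entry>& table, long pos) {
    int result = -1;
    long best = -1;
    for (std::size_t i = 0; i < table.size(); ++i) {
        const Entry& e = table[i];
        if (e.type != EntryType::InUse) {
            continue;
        }
        if (e.offset <= pos && e.offset > best) {
            best = e.offset;
            result = static_cast<int>(i);
        }
    }
    return result;
}
```

With this table, a position inside object 1 (for example offset 100) resolves to object 1 even though the loop passes over the undefined entries 2 to 4 on the way.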
Comment 1 Albert Astals Cid 2012-04-16 10:38:00 UTC
Wouldn't it be better just wrapping the
error(errSyntaxError, -1, "Invalid XRef entry");
with that new if? And call the if "complainAboutMissingEntry" ?
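
A minimal sketch of this suggestion, under stated assumptions: MiniXRef, its members and the shape of getEntry are hypothetical and do not reproduce poppler's XRef class. Only the flag name complainAboutMissingEntry and the message text come from this thread. The idea is that the lookup path stays the same and only the error report becomes conditional.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not poppler's actual types.
enum class EntryType { None, Free, InUse };

struct Entry {
    EntryType type;
    long offset;
};

class MiniXRef {
public:
    explicit MiniXRef(std::vector<Entry> entries)
        : entries_(std::move(entries)) {}

    // Look up entry i. If the entry was never defined, the error is
    // recorded only when the caller asked for it via the flag.
    const Entry& getEntry(std::size_t i, bool complainAboutMissingEntry = true) {
        if (i >= entries_.size() || entries_[i].type == EntryType::None) {
            if (complainAboutMissingEntry) {
                // Stands in for error(errSyntaxError, -1, "Invalid XRef entry")
                errors_.push_back("Invalid XRef entry");
            }
            return dummy_;
        }
        return entries_[i];
    }

    const std::vector<std::string>& errors() const { return errors_; }

private:
    std::vector<Entry> entries_;
    std::vector<std::string> errors_;
    Entry dummy_{EntryType::None, 0};
};
```

A caller that only wants to map a stream position to an object number would pass false, while a real fetch of the object would keep the default and still report the missing entry.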
Comment 2 Thomas Freitag 2012-04-16 23:52:32 UTC
(In reply to comment #1)
> Wouldn't it be better just wrapping the
> error(errSyntaxError, -1, "Invalid XRef entry");
> with that new if? And call the if "complainAboutMissingEntry" ?

Possible, but I don't think it's really better: why is it necessary to rescan the complete xref section when just checking whether the current stream position still belongs to that object? I think this scanning code exists because a "real" fetch of that object is done and the code then tries to locate the "missing" object, which is not necessary in this case IMHO.
Comment 3 Albert Astals Cid 2012-04-17 11:39:13 UTC
I know what you mean, but there's a reconstructXRef that I find kind of scary, so I went the "really secure way". Sorry if I sound like a coward sometimes :D

