6500 – page display failure for some JBIG2 PDFs

Status:	RESOLVED FIXED

Alias:	None

Product:	poppler
Classification:	Unclassified
Component:	general
Version:	unspecified
Hardware:	x86 (IA32) Linux (All)

Importance:	high normal
Assignee:	poppler-bugs

Reported:	2006-04-05 08:47 UTC by paul walmsley
Modified:	2006-04-05 11:20 UTC
CC List:	0 users

Attachments
A PDF demonstrating the bug (2.03 MB, application/pdf) 2006-04-05 08:50 UTC, paul walmsley
patch for this bug (1.81 KB, patch) 2006-04-05 08:51 UTC, paul walmsley

Description paul walmsley 2006-04-05 08:47:10 UTC

Some PDF generation software packages produce PDFs with JBIG2 pages that fail to
render with poppler, xpdf, and the Apple PDF Previewer, but display correctly
with Adobe Acrobat Reader.

The pages which fail cause poppler to generate these messages:

Error (....): Unknown segment type in JBIG2 stream
Error (....): Unexpected EOF in JBIG2 stream

It turns out that some of the JBIG2 images embedded in the PDF have symbol
dictionary segments (segment type 0) with an extraneous NULL byte at the end of
the segment.  This extra byte is not consumed by the symbol dictionary segment
handler, and it prevents poppler from reading the next segment header correctly. 

The PDF generator seems to create these types of segments when it's compressing
large amounts of whitespace.  It generates an arithmetic-coded symbol dictionary
segment with SDNUMNEWSYMS set to 0, and then stores an extra NULL after the end
of the arithmetic coder data.  Note that the segment's length is "correct" -- it
includes the NULL byte -- but poppler, quite reasonably, expects the segment to
end immediately after the arithmetic-coded data is exhausted.  An example of a
PDF with this problem is attached to this bug.  

The attached patch works around this problem for all segment types by reading
and discarding any bytes left in the segment after the handler returns to
JBIG2Stream::readSegments().  It also warns the user if a segment handler
reads more bytes than the segment length.
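
For readers without the attachment, the idea can be sketched as below.  This is
not the patch itself and not poppler code: ByteReader, SegmentStats, Handler and
readSegmentsSketch() are hypothetical names, and the two-byte header (type,
length) is a deliberate simplification of the real JBIG2 segment header.  It
only shows the bookkeeping: count the bytes the handler consumed, discard any
that remain, and warn on an overrun.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iostream>
#include <utility>
#include <vector>

// Hypothetical stand-in for the underlying stream; tracks how far we have read.
class ByteReader {
public:
    explicit ByteReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}
    // Returns the next byte, or -1 once the data is exhausted.
    int get() { return pos_ < data_.size() ? data_[pos_++] : -1; }
    std::size_t pos() const { return pos_; }
    bool atEnd() const { return pos_ >= data_.size(); }
private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};

// Counters returned so the behaviour can be inspected by a caller.
struct SegmentStats {
    int segments = 0;         // segments processed
    std::size_t skipped = 0;  // leftover bytes discarded after handlers returned
    int overruns = 0;         // handlers that read past their segment length
};

// A segment handler receives the reader, the segment type and its length.
using Handler = std::function<void(ByteReader&, int, std::size_t)>;

// Simplified segment loop: header is one type byte plus one length byte
// (an assumption for illustration; real headers are longer).
SegmentStats readSegmentsSketch(ByteReader& in, const Handler& handler) {
    SegmentStats stats;
    while (true) {
        int type = in.get();
        int rawLength = in.get();
        if (type < 0 || rawLength < 0) break;  // no complete header left
        std::size_t length = static_cast<std::size_t>(rawLength);

        std::size_t start = in.pos();
        handler(in, type, length);
        std::size_t used = in.pos() - start;

        if (used < length) {
            // Discard whatever the handler left behind (e.g. a trailing NULL)
            // so that the next header is read from the right offset.
            for (std::size_t i = used; i < length; ++i) {
                if (in.get() < 0) break;
                ++stats.skipped;
            }
        } else if (used > length) {
            std::cerr << "Warning: segment handler read " << (used - length)
                      << " byte(s) past the segment length\n";
            ++stats.overruns;
        }
        ++stats.segments;
    }
    return stats;
}
```

In this sketch, if the discard loop is removed, a leftover 0x00 after a type 0
segment is taken as the first byte of the next header, which corresponds to the
misaligned read described above.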

The same issue also exists in the xpdf 3.01 code base, and a similar patch is
being forwarded to its author.

This patch was developed collaboratively with Raj Kumar of the Internet Archive
<rkumar@archive.org>.


- Paul

Comment 1 paul walmsley 2006-04-05 08:50:20 UTC

Created attachment 5199
A PDF demonstrating the bug

Also available at

   http://ia311040.us.archive.org/~rkumar/test1_opt.pdf

Comment 2 paul walmsley 2006-04-05 08:51:19 UTC

Created attachment 5200
patch for this bug

Comment 3 Albert Astals Cid 2006-04-06 04:20:03 UTC

Thanks for the patch :-)

It has been committed to CVS.