Bug 104468

Summary: "/OCRPageInfo <</L /eng >> BDC" marked content causes abort
Product: poppler
Reporter: jmmorlan
Component: general
Assignee: poppler-bugs
Status: RESOLVED FIXED
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: All   
OS: All   
Attachments:
  136556: skip remainder of OC parsing code after "Unexpected MC Type" error
  136578: example PDF file to demonstrate bug

Description jmmorlan 2018-01-03 01:49:04 UTC
We encountered a PDF file (produced by "Nuance PDF Create") with a content stream containing marked content that looks like this:

/OCRPageInfo <</L /eng >> BDC
...
 EMC

The file causes poppler to crash with the following messages:

Syntax Error (546164): Unexpected MC Type: 7
Internal Error (0): Call to Object where the object was type 7, not the expected type 4
Aborted

The problem is in Gfx::opBeginMarkedContent. Any marked content whose tag starts with "OC" is assumed to be optional content. When args[1].isName() is false, the function reports "Unexpected MC Type" but then calls args[1].getName() anyway, which triggers the internal error and the abort.
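
A minimal, self-contained sketch of the failure mode described above. It is not poppler's code: `Obj`, `propsNameUnguarded` and `propsNameGuarded` are hypothetical stand-ins, and the sketch throws an exception where poppler aborts so that the failure can be observed. In the log above, "type 7" and "type 4" presumably correspond to a dictionary and a name, which matches the `<</L /eng >>` operand being passed where a name is expected.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for poppler's Object; only the pieces needed here.
enum class ObjType { Name, Dict };

class Obj {
public:
    static Obj name(std::string s) { return Obj(ObjType::Name, std::move(s)); }
    static Obj dict() { return Obj(ObjType::Dict, ""); }

    bool isName() const { return type_ == ObjType::Name; }

    // Checked accessor: calling it on the wrong type is a hard error.
    // (poppler aborts here; this sketch throws instead.)
    const std::string &getName() const {
        if (!isName()) {
            throw std::logic_error("getName() called on a non-name object");
        }
        return value_;
    }

private:
    Obj(ObjType t, std::string v) : type_(t), value_(std::move(v)) {}
    ObjType type_;
    std::string value_;
};

// Unguarded: every tag starting with "OC" takes the optional-content path,
// and the operand is read as a name without checking its type first.
std::string propsNameUnguarded(const std::string &tag, const Obj &props) {
    if (tag.compare(0, 2, "OC") == 0) {
        return props.getName();  // throws for a dictionary operand
    }
    return "";
}

// Guarded: the accessor is only reached when the operand really is a name.
std::string propsNameGuarded(const std::string &tag, const Obj &props) {
    if (tag.compare(0, 2, "OC") == 0 && props.isName()) {
        return props.getName();
    }
    return "";
}
```

With a dictionary operand and the tag "OCRPageInfo", the unguarded variant fails while the guarded one simply ignores the operand.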
Comment 1 Albert Astals Cid 2018-01-03 22:50:12 UTC
Can you please attach the document so we can make sure the change we make fixes the crash completely?
Comment 2 jmmorlan 2018-01-04 20:58:58 UTC
I don't think I can disclose the document (confidentiality). If you attach a patch I can test it out though.
Comment 3 jmmorlan 2018-01-04 21:02:35 UTC
Created attachment 136556
skip remainder of OC parsing code after "Unexpected MC Type" error

Here's the patch we've been using. I don't know if it really makes sense to be trying to parse "OCRPageInfo"-tagged marked content as optional content, since the PDF spec says the tag should just be "OC". But at least this prevents poppler from crashing.
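
To illustrate the point about the tag, here is a small sketch contrasting a prefix test with an exact match on "OC". Both helper names are hypothetical, and nothing in this thread says which test poppler ends up using; the sketch only shows why "OCRPageInfo" is caught by the prefix test.

```cpp
#include <cassert>
#include <string>

// Prefix test: treats any tag beginning with "OC" as optional content.
bool isOptionalContentTagPrefix(const std::string &tag) {
    return tag.compare(0, 2, "OC") == 0;
}

// Exact test: only the tag "OC" itself counts as optional content.
bool isOptionalContentTagExact(const std::string &tag) {
    return tag == "OC";
}
```

Under the prefix test "OCRPageInfo" is classified as optional content; under the exact test it is not.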
Comment 4 Albert Astals Cid 2018-01-04 21:31:26 UTC
I'm sorry, but without a file I can't test whether the fix is good or not.
Comment 5 jmmorlan 2018-01-05 22:20:57 UTC
Created attachment 136578
example PDF file to demonstrate bug

Here's an example PDF file that demonstrates the bug.
Comment 6 Albert Astals Cid 2018-01-08 22:49:58 UTC
Thanks for the file.

I went with a different solution from the one you proposed, since we're trying to have fewer gotos, not more :)
