Bug 97156

Summary: astral characters not handled in document outline title
Product: poppler Reporter: Jason Crain <jason>
Component: general Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
See Also: http://bugs.debian.org/702082
Attachments: astral.pdf
qq.pdf
Fix-UTF16-decoding-of-document-outline-title.patch

Description Jason Crain 2016-07-31 16:53:33 UTC
Created attachment 125447
astral.pdf

Forwarding from https://bugs.debian.org/702082

Both of the attached documents, astral.pdf and qq.pdf, include astral characters (code points above U+FFFF, which UTF-16 encodes as surrogate pairs) in the document outline. poppler doesn't decode the surrogate pairs correctly. In evince, this leads to the outline not being shown correctly and "Invalid UTF-8" warning messages being printed to the terminal.
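
For reference, a minimal sketch of how a UTF-16 surrogate pair combines into a single code point. This is not poppler code: decodeUtf16 is a hypothetical helper, and replacing unpaired surrogates with U+FFFD is an assumption made for the example. Converting each 16-bit unit on its own instead would leave lone surrogates in the output, which cannot be encoded as valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of poppler): decode UTF-16 code units into
// Unicode code points, combining surrogate pairs into astral code points.
std::vector<uint32_t> decodeUtf16(const std::vector<uint16_t> &units)
{
    const uint32_t kReplacement = 0xFFFD; // assumption: U+FFFD for unpaired surrogates
    std::vector<uint32_t> out;
    for (std::size_t i = 0; i < units.size(); ++i) {
        const uint32_t u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            // High surrogate: only valid when followed by a low surrogate.
            if (i + 1 < units.size() && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
                const uint32_t lo = units[i + 1];
                const uint32_t cp = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
                assert(cp <= 0x10FFFF); // the largest value a pair can encode
                out.push_back(cp);
                ++i; // the low surrogate has been consumed
            } else {
                out.push_back(kReplacement);
            }
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            // Low surrogate without a preceding high surrogate.
            out.push_back(kReplacement);
        } else {
            // Basic Multilingual Plane: the unit is the code point.
            out.push_back(u);
        }
    }
    return out;
}
```

With this sketch, the units 0xD83D 0xDE00 decode to the single code point U+1F600 rather than to two separate values.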
Comment 1 Jason Crain 2016-07-31 16:54:26 UTC
Created attachment 125448
qq.pdf
Comment 2 Jason Crain 2016-07-31 16:58:09 UTC
Created attachment 125449
Fix-UTF16-decoding-of-document-outline-title.patch

This patch changes the OutlineItem constructor to use the TextStringToUCS4 function instead of doing the conversion itself.
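
To illustrate the kind of conversion involved, here is a rough standalone sketch of decoding a PDF text string into UCS-4. It is not the patch and not poppler's TextStringToUCS4: decodeTextString is a hypothetical name, and two simplifications are assumed. Strings without the FE FF byte order mark are mapped byte by byte (real PDFDocEncoding differs for some byte values), and unpaired surrogates become U+FFFD.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch (not poppler's implementation): decode a PDF text
// string, given as raw bytes, into Unicode code points.
std::vector<uint32_t> decodeTextString(const std::string &s)
{
    std::vector<uint32_t> out;
    auto byteAt = [&s](std::size_t i) {
        return static_cast<uint32_t>(static_cast<unsigned char>(s[i]));
    };

    // A leading FE FF byte order mark selects UTF-16BE.
    const bool hasBom = s.size() >= 2 && byteAt(0) == 0xFE && byteAt(1) == 0xFF;
    if (!hasBom) {
        // Assumption: treat each byte as one code point. Real PDFDocEncoding
        // maps some byte values to other code points.
        for (std::size_t i = 0; i < s.size(); ++i) {
            out.push_back(byteAt(i));
        }
        return out;
    }

    // Read big-endian 16-bit units after the BOM; a trailing odd byte is ignored.
    for (std::size_t i = 2; i + 1 < s.size(); i += 2) {
        uint32_t u = (byteAt(i) << 8) | byteAt(i + 1);
        if (u >= 0xD800 && u <= 0xDBFF && i + 3 < s.size()) {
            const uint32_t lo = (byteAt(i + 2) << 8) | byteAt(i + 3);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                // Combine the surrogate pair into one astral code point.
                u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
                i += 2; // skip the low surrogate
            }
        }
        if (u >= 0xD800 && u <= 0xDFFF) {
            u = 0xFFFD; // assumption: replace unpaired surrogates
        }
        out.push_back(u);
    }
    return out;
}
```

Keeping this logic in one shared function, rather than repeating it in each caller, means a decoding fix such as surrogate handling only has to be made once.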
Comment 3 Carlos Garcia Campos 2016-09-04 09:13:03 UTC
Pushed, thanks!
