Bug 13573

Summary: Poppler does not support ActualText
Product: poppler
Reporter: Adrian Johnson <ajohnson>
Component: general
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED
QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard:
Attachments: ActualText patch

Description Adrian Johnson 2007-12-09 02:22:29 UTC
Poppler does not support ActualText. The ActualText entry is used to specify replacement text for content that does translate to text but is represented in a non-standard way (e.g. glyphs for ligatures). ActualText support is required for text to be extracted correctly from the PDF.
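
To illustrate the behaviour being requested, here is a minimal sketch of ActualText substitution during text extraction. It is not Poppler code and not the attached patch. The content stream is modelled as a hypothetical list of (operator, operand) pairs, and the function name extract_text is an assumption made for this example. In a real PDF the ActualText entry appears in the property list of a marked-content sequence (BDC ... EMC) or in a structure element.

```python
def extract_text(ops):
    """Extract text from a simplified, hypothetical content-stream model.

    ("BDC", {"ActualText": str}) opens a marked-content sequence,
    ("Tj", str) shows glyphs, and ("EMC", None) closes the sequence.
    """
    out = []
    stack = []    # one flag per open marked-content sequence
    suppress = 0  # > 0 while inside a sequence that carries ActualText

    for op, arg in ops:
        if op == "BDC":
            actual = (arg or {}).get("ActualText")
            has_actual = actual is not None
            # Emit the replacement text once, at the outermost level only.
            if has_actual and suppress == 0:
                out.append(actual)
            stack.append(has_actual)
            if has_actual:
                suppress += 1
        elif op == "EMC":
            if stack and stack.pop():
                suppress -= 1
        elif op == "Tj":
            # Glyph text inside an ActualText span is replaced, so skip it.
            if suppress == 0:
                out.append(arg)
    return "".join(out)


# A ligature glyph (U+FB03) wrapped in a span whose ActualText is "ffi".
sample_ops = [
    ("Tj", "o"),
    ("BDC", {"ActualText": "ffi"}),
    ("Tj", "\ufb03"),
    ("EMC", None),
    ("Tj", "ce"),
]
extracted = extract_text(sample_ops)
```

Without the substitution, the extracted string would contain the single ligature code point instead of the three letters, which breaks searching and copy/paste.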

Some examples of PDFs that use ActualText are at http://www.unicode.org/udhr/
One of the PDFs that I tested is http://www.unicode.org/udhr/d/udhr_san.pdf

A patch to implement ActualText support is attached.
Comment 1 Adrian Johnson 2007-12-09 02:24:14 UTC
Created attachment 13005
ActualText patch

Patch to implement ActualText
Comment 2 Albert Astals Cid 2007-12-09 09:08:52 UTC
Patch committed, thanks a lot. Are you subscribed to the poppler mailing list? If not, we would be happy to have people like you there :-)
