Summary: | Poppler fails to extract Turkish characters correctly | ||
---|---|---|---|
Product: | poppler | Reporter: | İsmail Dönmez <ismail> |
Component: | general | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED INVALID | QA Contact: | |
Severity: | normal | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | x86 (IA32) | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | Sample pdf file extracted from a longer file |
Description
İsmail Dönmez
2009-07-23 06:22:24 UTC
Created attachment 27949
Sample pdf file extracted from a longer file
Adobe can't extract the text correctly either, so I'm leaning toward the file being faulty.

How do you extract with Adobe, btw? The file might well be faulty; is there any way to debug what might be wrong with it? Thanks!

File -> Save as Text ;-)

That was easy.

Probably the font mapping/encoding is not set correctly.

Yeah, looks like they didn't use CP1254 but some other Latin variant. Interesting bug (on the PDF creator side) :-)
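To illustrate the kind of mismatch suspected above, here is a minimal Python sketch. It assumes, purely for illustration, that the "other Latin variant" is CP1252; the thread does not say which encoding the PDF creator actually used, and the helper names `misread` and `repair` are hypothetical. It shows how the six Turkish letters that differ between CP1254 and CP1252 turn into unrelated Latin letters when the same bytes are read with the wrong code page.

```python
# Illustrative only: CP1252 is an assumed stand-in for "some other Latin
# variant"; this is not a diagnosis of attachment 27949.

# The six letters whose byte values mean something else in CP1252.
TURKISH_LETTERS = "ĞİŞğış"


def misread(text: str, actual: str = "cp1254", assumed: str = "cp1252") -> str:
    """Encode text with its real code page, then decode with the wrong one."""
    return text.encode(actual).decode(assumed)


def repair(garbled: str, assumed: str = "cp1252", actual: str = "cp1254") -> str:
    """Reverse the mix-up: recover the bytes, then decode them correctly."""
    return garbled.encode(assumed).decode(actual)


if __name__ == "__main__":
    garbled = misread(TURKISH_LETTERS)
    print(garbled)          # prints ÐÝÞðýþ
    print(repair(garbled))  # prints ĞİŞğış
```

If extracted text shows Ý, ý, Þ, þ, Ð or ð where İ, ı, Ş, ş, Ğ or ğ were expected, that pattern is at least consistent with a font encoding declared as a Western Latin code page instead of the Turkish one.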