Bug 34300

Summary: Text obtained from some PDFs with Cyrillic encoding is unreadable
Product: poppler Reporter: Jose Aliste <jose.aliste>
Component: general Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED INVALID QA Contact:
Severity: normal    
Priority: medium CC: freedesktop
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:

Description Jose Aliste 2011-02-15 09:11:37 UTC
If you run pdftotext on the file
http://zelmanov.ptep-online.com/ctan/lshort_russian.pdf you get unreadable text.

The cause seems to be related to encoding. If I use iconv to convert the text, assuming it comes from CP1251, it becomes readable.
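
The CP1251 reinterpretation described above can be sketched in Python. This is only an illustration, not the exact iconv command used: it assumes the extracted text consists of Latin-1 characters whose byte values are really CP1251 code points, which the report does not confirm. The function name recover_cp1251 is hypothetical.

```python
def recover_cp1251(garbled: str) -> str:
    """Reinterpret mojibake text as CP1251 (assumption: each garbled
    character is a Latin-1 character standing in for one CP1251 byte)."""
    # Map each character back to the single byte it was extracted from.
    # This raises UnicodeEncodeError if the text contains characters
    # above U+00FF, in which case the assumption does not hold.
    raw = garbled.encode("latin-1")
    # Decode those bytes as CP1251; bytes that CP1251 leaves undefined
    # are replaced with U+FFFD instead of raising an error.
    return raw.decode("cp1251", errors="replace")


if __name__ == "__main__":
    # "Ïðèâåò" is what the CP1251 bytes of "Привет" look like as Latin-1.
    print(recover_cp1251("Ïðèâåò"))
```

If the text comes out readable after this step, that supports the guess that the document's fonts map character codes as CP1251 without telling the extractor so.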
Comment 1 Albert Astals Cid 2011-02-15 10:56:52 UTC
Adobe Reader returns the same unreadable text. Complain to whoever created the document.
