Bug 34300 - Text obtained from some pdfs with cyrillic encoding is unreadable
Status: RESOLVED INVALID
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: poppler-bugs
 
Reported: 2011-02-15 09:11 UTC by Jose Aliste
Modified: 2011-02-16 02:04 UTC

Description Jose Aliste 2011-02-15 09:11:37 UTC
Running pdftotext on the file
http://zelmanov.ptep-online.com/ctan/lshort_russian.pdf produces unreadable text.

The cause seems to be related to the encoding. If I use iconv to convert the output, assuming it comes from CP1251, it becomes readable.
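
The iconv workaround can be sketched in Python. This is only an illustration under a stated assumption: that each character in the extracted text carries the code point of a raw CP1251 byte, as if the bytes had been read as Latin-1. The report does not confirm that this is exactly what happens in this PDF, and the helper name `recover_cp1251` is made up for the example.

```python
def recover_cp1251(garbled: str) -> str:
    """Reinterpret mis-decoded text as CP1251.

    Assumption (not confirmed by the report): every character in
    `garbled` has a code point in 0..255 that is really a CP1251 byte.
    """
    # Turn each character back into the single byte it stands for.
    raw = garbled.encode("latin-1")
    # Decode those bytes with the encoding the text presumably used.
    return raw.decode("cp1251")


# Build a garbled sample the same way: CP1251 bytes read as Latin-1.
sample = "Привет".encode("cp1251").decode("latin-1")
print(sample)                  # unreadable accented Latin letters
print(recover_cp1251(sample))  # readable Cyrillic again
```

If the text contains characters outside the 0..255 range, `encode("latin-1")` raises `UnicodeEncodeError`, which is a useful sign that the assumption does not hold for that input.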
Comment 1 Albert Astals Cid 2011-02-15 10:56:52 UTC
Adobe Reader returns the same unreadable text. Complain to whoever created the document.

