Bug 28076

Summary: pdftohtml: RTL text generated backwards
Product: poppler
Component: cairo backend
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: RESOLVED MOVED
Severity: normal
Priority: medium
Reporter: Nezmer <me>
Assignee: poppler-bugs <poppler-bugs>
CC: Fahad.alsaidi

Description Nezmer 2010-05-12 05:04:58 UTC
"pdftohtml" seems to generate RTL text backwards: (abc) comes out as (cba). You can still read the generated text left to right, but that's not convenient ;)

"pdftotext" is behaving correctly.
Comment 1 kirillkh 2010-12-17 09:09:11 UTC
I'm seeing the same issue with poppler-utils 0.12.4 (Ubuntu 10.04.1).

Workaround for Hebrew: convert with "-enc ISO-8859-8". However, that discards all non-Hebrew Unicode characters (such as those used in math).
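
The loss of non-Hebrew characters follows from the target encoding itself: ISO-8859-8 only covers ASCII plus the Hebrew block, so a math symbol has no byte to map to. A minimal sketch of that effect, using Python's own "iso8859_8" codec rather than poppler's encoding tables (the sample string and the errors="ignore" policy are assumptions chosen to mimic the reported behaviour):

```python
# Hebrew word "shalom", then N-ARY SUMMATION (U+2211), then a Latin letter.
text = "\u05e9\u05dc\u05d5\u05dd \u2211 x"

# ISO-8859-8 has no code point for U+2211, so it is silently dropped here.
# errors="ignore" is an assumption standing in for what pdftohtml appears to do.
encoded = text.encode("iso8859_8", errors="ignore")
round_trip = encoded.decode("iso8859_8")

print("\u2211" in text)        # True: the symbol is in the input
print("\u2211" in round_trip)  # False: it did not survive the encoding
```

The Hebrew letters and the ASCII "x" survive the round trip; only the character outside the ISO-8859-8 repertoire is lost.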

Simply reversing the Hebrew words in the output doesn't help, since the order of words in a sentence is also backwards.
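
To illustrate why per-word reversal is not enough, here is a small sketch using ASCII placeholders in place of Hebrew letters. The function names are hypothetical, and it assumes the whole line is RTL text with no embedded LTR runs (digits, Latin words); handling mixed lines properly would need the Unicode bidirectional algorithm rather than a plain reversal.

```python
def reverse_each_word(line: str) -> str:
    """Reverse the characters of every word but keep the word order."""
    return " ".join(word[::-1] for word in line.split(" "))


def reverse_line(line: str) -> str:
    """Reverse the whole line, restoring both letter and word order."""
    return line[::-1]


logical = "abc def ghi"   # the order the text should be stored in
visual = logical[::-1]    # what the backwards output looks like: "ihg fed cba"

print(reverse_each_word(visual))  # ghi def abc  (letters fixed, words still backwards)
print(reverse_line(visual))       # abc def ghi  (matches the logical order)
```

Reversing each word repairs the letters inside a word but leaves the sentence reading from the wrong end, which matches the behaviour described above.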
Comment 2 GitLab Migration User 2018-08-21 11:07:21 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/520.
