Bug 28076 - pdftohtml: RTL text generated backwards
Alias: None
Product: poppler
Classification: Unclassified
Component: cairo backend
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: poppler-bugs
QA Contact:
Depends on:
Reported: 2010-05-12 05:04 UTC by Nezmer
Modified: 2018-08-21 11:07 UTC

Description Nezmer 2010-05-12 05:04:58 UTC
"pdftohtml" seems to generate RTL text backwards: text that should read "abc" is emitted as "cba". You can still read the generated text left to right, but that's not convenient ;)

"pdftotext" is behaving correctly.
Comment 1 kirillkh 2010-12-17 09:09:11 UTC
I'm seeing the same issue with poppler-utils 0.12.4 (Ubuntu 10.04.1).

Workaround for Hebrew: convert with "-enc ISO-8859-8". However, that discards all non-Hebrew Unicode characters (such as those used in math).
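
To see why the non-Hebrew characters go missing, here is a minimal Python sketch of what a lossy conversion to ISO-8859-8 does. It uses Python's own codec with `errors="ignore"` as a stand-in and is an assumption about the effect, not a description of how poppler handles the "-enc" option internally.

```python
# "x", a math symbol (U+2208, element-of), and two Hebrew letters (alef, bet).
text = "x \u2208 \u05d0\u05d1"

# ISO-8859-8 covers ASCII and Hebrew letters but not U+2208;
# errors="ignore" silently drops anything the charset cannot represent.
encoded = text.encode("iso-8859-8", errors="ignore")
round_tripped = encoded.decode("iso-8859-8")

# The Hebrew letters and ASCII survive; the math symbol is gone.
print("\u2208" in round_tripped)  # False
```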

Simply reversing the Hebrew words in the output doesn't help, since the order of words in a sentence is also backwards.
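
The difference between the two repairs can be sketched in Python. This is only an illustration with Latin stand-in letters, assuming the emitted line is a plain character-by-character reversal of the intended text as described in this report; the function names are hypothetical and not part of poppler.

```python
def reverse_each_word(line: str) -> str:
    """Reverse the characters of every word but keep the word order."""
    return " ".join(word[::-1] for word in line.split(" "))


def reverse_whole_line(line: str) -> str:
    """Reverse the entire line, flipping characters and word order together."""
    return line[::-1]


logical = "abc def ghi"   # intended reading order (stand-in for RTL text)
emitted = logical[::-1]   # "ihg fed cba", the backwards output

print(reverse_each_word(emitted))   # ghi def abc  (words fixed, order still backwards)
print(reverse_whole_line(emitted))  # abc def ghi
```

Real documents that mix RTL text with numbers, Latin words, or combining marks would need proper bidirectional handling rather than a plain reversal, so this only covers the simple case.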
Comment 2 GitLab Migration User 2018-08-21 11:07:21 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/520.