Bug 92509

Summary: pdftotext transposes words and groups of words.
Product: poppler Reporter: xxxxxxxxxxxxx
Component: general Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED NOTABUG QA Contact:
Severity: major    
Priority: medium    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:

Description xxxxxxxxxxxxx 2015-10-17 17:14:53 UTC
pdftotext my.pdf

is transposing words in the text output

Here is the output from:

pdftohtml my.pdf

green<br/>
Hey jiff, rockit! &#160;Barfy, stick&#160;<br/>
to decimals!<br/>

and here is the output from:

pdftotext my.pdf

green
Hey jiff, rockit!
to decimals!

Barfy, stick

This is poppler built from a current git clone.
Comment 1 xxxxxxxxxxxxx 2015-10-17 17:24:13 UTC
This can be closed: using the -layout option keeps the text in the correct order.

(Okay, reminder to self: look at the command-line options.)
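
For anyone hitting the same thing, here is a rough sketch of how the two modes could be compared side by side. The function name compare_layout and the temporary files are made up for illustration; the only pdftotext features assumed are the -layout option mentioned above and "-" as the output file, which sends the text to stdout.

```shell
#!/bin/sh
# Hypothetical helper: show how pdftotext's default output differs
# from its -layout output for one PDF.
compare_layout() {
    pdf="$1"

    # Bail out early if the tool or the input file is missing.
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found in PATH" >&2
        return 127
    fi
    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 2
    fi

    default_out=$(mktemp) || return 1
    layout_out=$(mktemp) || { rm -f "$default_out"; return 1; }

    # "-" as the text file writes the extracted text to stdout.
    pdftotext "$pdf" - > "$default_out"
    pdftotext -layout "$pdf" - > "$layout_out"

    # diff exits with 0 if the outputs match and 1 if they differ.
    diff "$default_out" "$layout_out"
    status=$?

    rm -f "$default_out" "$layout_out"
    return "$status"
}
```

Running compare_layout my.pdf after defining the function prints the lines that differ between the two modes; with the file from this report, that should include the "Barfy, stick" line.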
