Bug 71038

Summary: pdftops generated postscript has missing letters (in 64bit processor)
Product: poppler
Reporter: Binaria <ingenieria>
Component: general
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: christian.mattar
Version: unspecified
Hardware: Other
OS: Linux (All)
Whiteboard:
Attachments:
  Comparison of the results
  The pdf file for testing
  Modified pdf file for testing
  ps result with missing letters
  ps result ok in 32bit centos
  ps result ok in 32bit centos - ps level 3
  PDF, all chars visible
  PS, last char from PDF is missing

Description Binaria 2013-10-30 08:23:53 UTC
When converting the attached pdf to postscript using pdftops the result has missing letters.
Command line: "pdftops testFile.pdf" (No options)
pdftops version 0.24.3 (latest)
Environment: Centos 6.3 64bit 

We have tried on two different machines (both with CentOS 6.3 64bit) and the result is the same. pdftops versions 0.22.2, 0.24.1 and 0.24.3 produce the same output, in which the letters N, R and A are missing. Earlier versions of pdftops (0.12.4 and 0.18.4) produce a PostScript file in which U is also missing (see attached image).

The same command on a 32bit CentOS 6.3 outputs a correct PostScript file, with no missing letters (with all versions of pdftops tested, from 0.12.4 to 0.24.3).
Comment 1 Binaria 2013-10-30 08:25:02 UTC
Created attachment 88344 [details]
Comparison of the results
Comment 2 Binaria 2013-10-30 08:36:51 UTC
Created attachment 88345 [details]
The pdf file for testing
Comment 3 kurt.pfeifle 2013-11-07 21:07:01 UTC
I used Poppler 0.24.3 on a 64bit Mac to run

   pdftops -level3 testFile.pdf testFile.ps

which outputs a 25 MByte file (from a 514 kByte source PDF) with lots of complaints (virtually hundreds of lines) about:

  [....]
  Syntax Warning: SOT marker inconsistency in tile 0: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 1: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 2: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 3: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 0: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 1: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 2: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 3: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 0: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 1: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 2: tile-part index greater (5) than number of tile-parts (5)<0a>
  Syntax Warning: SOT marker inconsistency in tile 3: tile-part index greater (5) than number of tile-parts (5)<0a>
  [....]

However, using gv and Ghostscript, the resulting PS file displays correctly.
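To condense hundreds of near-identical warnings like the ones quoted above, the stderr output can be tallied per tile. This is only a rough sketch: the function name `warnings_per_tile` is hypothetical, and it assumes the warning text has exactly the form shown in this comment.

```python
import re
from collections import Counter

# Matches the warning format quoted above and captures the tile index.
SOT_WARNING = re.compile(r"Syntax Warning: SOT marker inconsistency in tile (\d+):")


def warnings_per_tile(stderr_text):
    """Count SOT marker warnings per tile index in captured stderr text."""
    return Counter(int(m.group(1)) for m in SOT_WARNING.finditer(stderr_text))
```

The text to pass in would be whatever pdftops wrote to stderr, for example after redirecting it to a file.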
Comment 4 Binaria 2013-11-08 12:46:07 UTC
Thanks for your reply, Kurt.

I think the size issue is due to the background image of the original pdf. I have created a new pdf with the text only to avoid it.
Could you try with this new pdf (testFile2.pdf) and tell us whether it also throws complaints? (I see no complaints at all, neither with testFile.pdf nor with testFile2.pdf.)
I also attach the resulting incorrect ps file.

If the result is correct on your machine, maybe CentOS has something to do with it... Can anyone try in a CentOS 64bit environment? Thanks.
Comment 5 Binaria 2013-11-08 12:46:54 UTC
Created attachment 88889 [details]
Modified pdf file for testing
Comment 6 Binaria 2013-11-08 12:47:41 UTC
Created attachment 88890 [details]
ps result with missing letters
Comment 7 Albert Astals Cid 2013-12-12 21:19:26 UTC
That is weird, it works perfectly fine here with poppler 0.24.3 on a 64 bit machine.

Can you test that the two .ps files produced are indeed different? i.e. use the diff command on them.
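Where a scriptable check is handier than the diff command, the same comparison can be sketched in Python. The names `read_ps` and `first_differences` are hypothetical; the files are read as Latin-1 on the assumption that PostScript output may contain bytes that are not valid UTF-8.

```python
import difflib
from itertools import islice


def read_ps(path):
    """Read a PostScript file as a list of lines without trailing newlines."""
    # Latin-1 maps every byte to a character, so decoding cannot fail.
    with open(path, encoding="latin-1") as handle:
        return handle.read().splitlines()


def first_differences(lines_a, lines_b, limit=20):
    """Return up to `limit` added/removed lines between two line lists."""
    diff = difflib.unified_diff(lines_a, lines_b, lineterm="", n=0)
    body = islice(diff, 2, None)  # skip the "---" and "+++" file header lines
    changed = (line for line in body if line.startswith(("+", "-")))
    return list(islice(changed, limit))
```

An empty list from `first_differences(read_ps(a), read_ps(b))` would mean the two files are identical line by line.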
Comment 8 Binaria 2013-12-13 09:39:41 UTC
Created attachment 90706 [details]
ps result ok in 32bit centos
Comment 9 Binaria 2013-12-13 09:43:34 UTC
Hi Albert, I have added an attachment with the ps we get on CentOS 32bit. This one has no missing letters and, as you can see, both the size and the diff output differ.

Was the test you ran on a CentOS 64bit machine? Thanks.
Comment 10 Albert Astals Cid 2013-12-15 21:58:29 UTC
Hmmm, the diff says

-%%LanguageLevel: 2
+%%LanguageLevel: 3

at the very top, are you sure you're using exactly the same options to generate both .ps files?
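Because the language level is recorded in the header comments, it can be checked before diffing two files so that a mismatch in options is caught early. A minimal sketch; the helper name `language_level` is hypothetical, and it assumes the `%%LanguageLevel:` comment appears before `%%EndComments`, as the position of the diff lines above suggests.

```python
def language_level(ps_lines):
    """Return the integer after '%%LanguageLevel:' in the header, or None."""
    for line in ps_lines:
        if line.startswith("%%LanguageLevel:"):
            value = line.split(":", 1)[1].strip()
            return int(value) if value.isdigit() else None
        if line.startswith("%%EndComments"):
            break  # header comments are over; stop scanning
    return None
```

Two files whose levels differ were most likely generated with different options and are not directly comparable.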
Comment 11 Binaria 2013-12-16 09:08:00 UTC
Created attachment 90827 [details]
ps result ok in 32bit centos - ps level 3
Comment 12 Binaria 2013-12-16 09:12:54 UTC
Yes, I forgot the "-level3" option in my last attachment; I have added a new one for better comparison. But the problem remains the same (refer to Comment 9).
Comment 13 Albert Astals Cid 2013-12-16 21:13:26 UTC
does running pdftops under valgrind give you any warning?
Comment 14 Binaria 2014-01-20 08:48:28 UTC
(In reply to comment #13)
> does running pdftops under valgrind give you any warning?

Hi again,

Here is the valgrind output for our CentOS 64bit machine
(unimportant lines trimmed out).

valgrind pdftops testFile2.pdf:
==16029==
==16029== HEAP SUMMARY:
==16029==     in use at exit: 68,754 bytes in 868 blocks
==16029==   total heap usage: 27,187 allocs, 26,319 frees, 248,213,759 bytes allocated
==16029==
==16029== LEAK SUMMARY:
==16029==    definitely lost: 0 bytes in 0 blocks
==16029==    indirectly lost: 0 bytes in 0 blocks
==16029==      possibly lost: 0 bytes in 0 blocks
==16029==    still reachable: 68,754 bytes in 868 blocks
==16029==         suppressed: 0 bytes in 0 blocks
==16029== Rerun with --leak-check=full to see details of leaked memory
==16029==
==16029== For counts of detected and suppressed errors, rerun with: -v
==16029== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 6)

valgrind --tool=helgrind
==17953== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

valgrind --tool=drd
==22923== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

To get a unified testing environment, I have also tried Vagrant + VirtualBox running this image (CentOS 6.5 64bit):

https://github.com/2creatives/vagrant-centos/releases/download/v0.1.0/centos64-x86_64-20131030.box

Minimal setup for Vagrant: just the name of the box plus the URL, with default options for everything else.

I ran it with CentOS 32bit as the host OS (on this machine pdftops outputs correct ps files).

Inside the CentOS 6.5 VM, just:
yum install poppler-utils (v 0.12.4 in repo)
pdftops /pathTo/testFile2.pdf /pathTo/testFile2.ps -> letters are missing in this testFile2.ps file again (as viewed with evince).

This way you can test in the "same" environment.

Inside the VM, valgrind outputs the same:

[vagrant@vagrant-centos65 ~]$ valgrind --leak-check=yes pdftops /vagrant/Descargas/testFile2.pdf
==7699== Memcheck, a memory error detector
==7699== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==7699== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==7699== Command: pdftops /vagrant/Descargas/testFile2.pdf
==7699==
==7699==
==7699== HEAP SUMMARY:
==7699==     in use at exit: 55,370 bytes in 709 blocks
==7699==   total heap usage: 26,720 allocs, 26,011 frees, 247,867,541 bytes allocated
==7699==
==7699== LEAK SUMMARY:
==7699==    definitely lost: 0 bytes in 0 blocks
==7699==    indirectly lost: 0 bytes in 0 blocks
==7699==      possibly lost: 0 bytes in 0 blocks
==7699==    still reachable: 55,370 bytes in 709 blocks
==7699==         suppressed: 0 bytes in 0 blocks
==7699== Reachable blocks (those to which a pointer was found) are not shown.
==7699== To see them, rerun with: --leak-check=full --show-reachable=yes
==7699==
==7699== For counts of detected and suppressed errors, rerun with: -v
==7699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 6)

[vagrant@vagrant-centos65 ~]$ valgrind --tool=drd pdftops /vagrant/Descargas/testFile2.pdf
==1463== drd, a thread error detector
==1463== Copyright (C) 2006-2012, and GNU GPL'd, by Bart Van Assche.
==1463== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==1463== Command: pdftops /vagrant/Descargas/testFile2.pdf
==1463==
==1463==
==1463== For counts of detected and suppressed errors, rerun with: -v
==1463== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

etc...
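When several valgrind tools are run in a row, the ERROR SUMMARY lines can be collected from the logs instead of being read by eye. This is a hedged sketch: the function name `error_summaries` is hypothetical, and the pattern assumes the summary line format shown in the logs above (thousands separators are tolerated as a precaution).

```python
import re

# Matches e.g. "==16029== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 6)"
SUMMARY = re.compile(
    r"^==\d+== ERROR SUMMARY: ([\d,]+) errors from ([\d,]+) contexts "
    r"\(suppressed: ([\d,]+) from ([\d,]+)\)"
)


def error_summaries(log_text):
    """Return an (errors, contexts, suppressed) tuple per ERROR SUMMARY line."""
    results = []
    for line in log_text.splitlines():
        match = SUMMARY.match(line.strip())
        if match:
            errors, contexts, suppressed, _ = (
                int(group.replace(",", "")) for group in match.groups()
            )
            results.append((errors, contexts, suppressed))
    return results
```

A run is clean in valgrind's terms when every tuple starts with zero errors, which is the case for all the logs in this comment.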
Comment 15 christian.mattar 2015-01-08 09:30:06 UTC
Valgrind info was provided.
Comment 16 christian.mattar 2015-01-08 10:25:22 UTC
Created attachment 111947 [details]
PDF, all chars visible
Comment 17 christian.mattar 2015-01-08 10:26:20 UTC
Created attachment 111948 [details]
PS, last char from PDF is missing
Comment 18 christian.mattar 2015-01-08 10:28:14 UTC
I've encountered a similar issue (small testcase attached). The page uses a Chinese font, but all characters except for one are working.
Comment 19 Binaria 2015-06-17 08:33:59 UTC
In CentOS 7 (64 bits), tested with pdftops 0.30.0 and 0.32.0, the resulting ps displays correctly for the original testFile (attachment 88345 [details]), but with the same warnings as in comment #3.
