Bug 17497 - Poppler cannot handle certain font encodings or font types correctly
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: cairo backend
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: poppler-bugs
Reported: 2008-09-09 05:14 UTC by Simon
Modified: 2008-11-01 06:18 UTC
Attachments
The LaTeX source file (1.05 KB, text/plain)
2008-09-09 05:14 UTC, Simon
pdflatex creates this PDF file, if I omit the fontenc package (7.84 KB, application/file)
2008-09-09 05:17 UTC, Simon
The printing output of Evince for good.pdf (6.28 KB, application/file)
2008-09-09 05:18 UTC, Simon
pdflatex creates this PDF file, if I use the T1 fonts (7.10 KB, application/file)
2008-09-09 05:19 UTC, Simon
The printing output of Evince for wrong.pdf (292.27 KB, application/file)
2008-09-09 05:20 UTC, Simon
pdftops wrong.pdf <- version from poppler-utils 0.6.4-1ubuntu1 amd64 (33.19 KB, application/postscript)
2008-09-18 10:55 UTC, Brandon Moore
pdftops -level1 wrong.pdf <- version from poppler-utils 0.6.4-1ubuntu1 amd64 (34.69 KB, application/postscript)
2008-09-18 10:56 UTC, Brandon Moore

Description Simon 2008-09-09 05:14:28 UTC
Created attachment 18774
The LaTeX source file

If I use the T1 fonts in LaTeX, Evince can no longer print the resulting PDF files properly: printing takes a very long time and the output is a rasterized image. With Adobe Reader everything works fine. In the related Ubuntu bug report (https://bugs.launchpad.net/ubuntu/+source/evince/+bug/227186), Sebastian Bacher asked me to open a bug report against poppler. I'm not sure whether this is a bug in poppler or in cairo. There is another related bug report in Launchpad (https://bugs.launchpad.net/ubuntu/+source/cupsys/+bug/150187).

Here are the details:

I created a small sample LaTeX file (test.tex) that demonstrates the bug. If I compile it with pdflatex without the "fontenc" package, everything works fine in Evince; I named this PDF file "good.pdf". If I compile it with the "fontenc" package and the option "T1" (which selects fonts in the T1 LaTeX font encoding), pdflatex creates a PDF file ("wrong.pdf") that Evince cannot print correctly. I have also attached Evince's output when printing each file to the PDF printer; I named these files "good.printed.pdf" and "wrong.printed.pdf".

The following graphic illustrates the described setting.

              test.tex

pdflatex:     /       \

      good.pdf          wrong.pdf

Evince:   |                  |

 good.printed.pdf    wrong.printed.pdf


Since the problem seems to be related to fonts, I have also collected the font information reported by pdffonts and by Adobe Reader below:

pdffonts on good.pdf:
  name                                 type              emb sub uni object ID
  ------------------------------------ ----------------- --- --- --- ---------
  FYHDIM+CMSSBX10                      Type 1            yes yes no       4  0
  PTMTFG+CMR10                         Type 1            yes yes no       5  0

pdffonts on good.printed.pdf:
  name                                 type              emb sub uni object ID
  ------------------------------------ ----------------- --- --- --- ---------
  UFQSLH+f-1-0                         Type 1C           yes yes yes     10  0
  ZRSKVS+f-0-0                         Type 1C           yes yes yes      8  0

pdffonts on wrong.pdf:
  name                                 type              emb sub uni object ID
  ------------------------------------ ----------------- --- --- --- ---------
  [none]                               Type 3            yes no  no       4  0
  [none]                               Type 3            yes no  no       5  0

pdffonts on wrong.printed.pdf:
  name                                 type              emb sub uni object ID
  ------------------------------------ ----------------- --- --- --- ---------

Adobe Reader on good.pdf:
  CMR10 (Embedded Subset)
    Type: Type 1
    Encoding: Built-in
  CMSSBX10 (Embedded Subset)
    Type: Type 1
    Encoding: Built-in

Adobe Reader on good.printed.pdf:
  f-0-0 (Embedded Subset)
    Type: Type 1
    Encoding: Custom
  f-1-0 (Embedded Subset)
    Type: Type 1
    Encoding: Custom

Adobe Reader on wrong.pdf:
  F16
    Type: Type 3
    Encoding: Custom
    Actual Font: F16
    Actual Font Type: Type 3
  F19
    Type: Type 3
    Encoding: Custom
    Actual Font: F19
    Actual Font Type: Type 3

Adobe Reader on wrong.printed.pdf:
  (Empty widget)


Evince reports version 2.22.2 with poppler 0.6.4 (cairo). It's the Ubuntu 8.04 package. If you need any more information, I'd be glad to help.

Sincerely, Simon
Comment 1 Simon 2008-09-09 05:17:56 UTC
Created attachment 18775
pdflatex creates this PDF file, if I omit the fontenc package
Comment 2 Simon 2008-09-09 05:18:46 UTC
Created attachment 18776
The printing output of Evince for good.pdf
Comment 3 Simon 2008-09-09 05:19:49 UTC
Created attachment 18777
pdflatex creates this PDF file, if I use the T1 fonts
Comment 4 Simon 2008-09-09 05:20:24 UTC
Created attachment 18778
The printing output of Evince for wrong.pdf
Comment 5 James Cloos 2008-09-11 12:39:25 UTC
> pdffonts on wrong.pdf:
>   name         type      emb sub uni object ID
>   ------------ --------- --- --- --- ---------
>   [none]       Type 3    yes no  no       4  0
>   [none]       Type 3    yes no  no       5  0

Note the Type3 fonts.  That means that you ended up with metafont-
rendered bitmap fonts rather than type1 fonts.  Specifically the
ec fonts which are typically found at:

    /usr/share/texmf-dist/fonts/source/jknappen/ec/

You may prefer to get the LatinModern fonts, outline fonts which cover
the subset of ec text-font glyphs which LaTeX uses.  (The math fonts,
when using lm, are the same CM Type1 fonts your good.pdf example would
have used, had it had any math.)

That said, it is indeed a bug that evince cannot correctly handle Type3
bitmap fonts.  And probably means that it also cannot handle Type3 vector
fonts.

Whether the bug is in evince, poppler, or cairo is a good question.

Cairo has had some recent changes which may be relevant.  Since you point
at launchpad, I presume you use Ubuntu; you, therefore, are probably using
rather old versions of poppler and cairo....
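
A quick way to check whether a PDF is affected is to look for Type 3 entries in the pdffonts table. The following is only a minimal sketch: it assumes the column layout shown in this report (later pdffonts versions may print additional columns, which this pattern does not handle), and the names ROW and type3_fonts are made up for illustration.

```python
import re

# pdffonts output for wrong.pdf, copied from this report.
SAMPLE = """\
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
[none]                               Type 3            yes no  no       4  0
[none]                               Type 3            yes no  no       5  0
"""

# One table row: name, type, three yes/no flags, object number, generation.
# Assumption: the column layout matches the tables quoted in this report.
ROW = re.compile(
    r"^\s*(?P<name>\S.*?)\s{2,}(?P<type>\S.*?)\s+"
    r"(?P<emb>yes|no)\s+(?P<sub>yes|no)\s+(?P<uni>yes|no)\s+"
    r"(?P<obj>\d+)\s+(?P<gen>\d+)\s*$"
)


def type3_fonts(pdffonts_output):
    """Return (object, generation) pairs for every font listed as 'Type 3'."""
    hits = []
    for line in pdffonts_output.splitlines():
        m = ROW.match(line)  # header and separator lines simply do not match
        if m and m.group("type") == "Type 3":
            hits.append((int(m.group("obj")), int(m.group("gen"))))
    return hits


if __name__ == "__main__":
    print(type3_fonts(SAMPLE))
```

In practice the text would come from running pdffonts on the file in question rather than from a pasted sample.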
Comment 6 Simon 2008-09-12 02:51:29 UTC
I agree: the main problem is that evince/poppler/cairo cannot print Type 3 fonts. But 

 - I regularly receive or download some PDF files with Type 3 fonts

 - Evince _can show_ them correctly on the screen 
   (with each letter markable on its own)

 - Adobe Reader can show them correctly on the screen and 
   print them to PostScript

 - Evince cannot print them.

And judging by the many bug reports on the Internet, I'm not the only one receiving such documents. (There are actually many more bug reports that I consider related than the two linked in my original posting.)

So, thanks for pointing out the LaTeX behaviour. Indeed, the EC package can solve it as it switches to Type 1 fonts. But this only helps if I have the LaTeX sources.

Again thanks James and best regards,

Simon
Comment 7 James Cloos 2008-09-12 11:11:13 UTC
xpdf/poppler (via Gentoo’s ebuild) can print wrong.pdf to PS in an
optimal way (the fonts are converted to PS Type3 fonts and the strings
are shown).

Ghostscript 8.63’s cairo backend handles wrong.pdf as well as expected:
all of the glyphs are output as filled paths.  (The cairo backend is
advertised to do just that with any fonts).  The filled paths, though,
look like squares making up the individual pixels of the bitmaps.

GS 8.63’s pdfwrite does a much better job than you got from 8.61:

  :; pdfinfo wrong.printed.pdf
  Producer:       GPL Ghostscript 8.61
  CreationDate:   Tue Sep  9 12:03:47 2008
  ModDate:        Tue Sep  9 12:03:47 2008
  Tagged:         no
  Pages:          1
  Encrypted:      no
  Page size:      595 x 842 pts (A4)
  File size:      299281 bytes
  Optimized:      no
  PDF version:    1.4

  :; pdfinfo wrong_pdfwrite.pdf 
  Producer:       GPL Ghostscript 8.63
  CreationDate:   Fri Sep 12 13:54:23 2008
  ModDate:        Fri Sep 12 13:54:23 2008
  Tagged:         no
  Pages:          1
  Encrypted:      no
  Page size:      595.28 x 841.89 pts (A4)
  File size:      7804 bytes
  Optimized:      no
  PDF version:    1.4

Said output also looks better when viewed on screen.

Also, poppler’s pdftops seems to work well on wrong.pdf (as of poppler
commit 217c0d1f80a78713977a7bfbe680fce90f1c6b36), as does xpdf/poppler.

This suggests that if the bug is in poppler it has been fixed, or that
the bug is in evince.

Unfortunately, I cannot test in evince because I seem to have updated
poppler and evince but not poppler-bindings, which prevented evince’s
configure from finding libpoppler.  [SIGH]

I’ll have to update again and try then.  Probably not until Sunday,
though.
Comment 8 Brandon Moore 2008-09-18 10:53:54 UTC
pdftops alone fails on wrong.pdf.
(Version 3.02, from poppler-utils 0.6.4-1ubuntu1 in Ubuntu 8.04).
Instead of characters I see squares of garbage.
With the -level1 option the output seems fine.
I haven't tried a more recent release yet.
Comment 9 Brandon Moore 2008-09-18 10:55:31 UTC
Created attachment 18985
pdftops wrong.pdf <- version from poppler-utils 0.6.4-1ubuntu1 amd64
Comment 10 Brandon Moore 2008-09-18 10:56:05 UTC
Created attachment 18986
pdftops -level1 wrong.pdf <- version from poppler-utils 0.6.4-1ubuntu1 amd64

This one looks ok.
Comment 11 James Cloos 2008-09-18 14:09:57 UTC
I am out of X right now, but I am pretty sure that pdftops/poppler from
poppler git master of a few days ago gave reasonable output.  I can say
that, for each of levels 1, 2 and 3, it generates Type3 PostScript fonts
for each of the Type3 PDF fonts and outputs the text using show-type ops.

The biggest difference between the level1 and level2 output is that level1
uses hex and level2 uses ascii85 encoding for the images which make up
the type3 fonts.  Level3 only adds better support for CID-keyed fonts
over the level2 output.
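
To give a feel for what the hex versus ascii85 difference means for file size, here is a small sketch using Python's standard library. It is not pdftops code, and the helper name encoded_sizes is invented for this example. Hex spends two characters per byte, while ascii85 spends five characters per four bytes.

```python
import base64
import binascii


def encoded_sizes(data):
    """Return (hex_length, ascii85_length) for the same binary payload."""
    hex_len = len(binascii.hexlify(data))
    # adobe=False leaves out the <~ ~> framing, so only the payload is
    # counted; a PostScript stream would additionally end with "~>".
    a85_len = len(base64.a85encode(data, adobe=False))
    return hex_len, a85_len


if __name__ == "__main__":
    payload = bytes(range(1, 101))  # 100 arbitrary non-zero bytes
    print(encoded_sizes(payload))
```

Note that ascii85 abbreviates an all-zero 4-byte group to a single "z", so bitmap data with large empty areas can come out even smaller than the 5:4 ratio suggests.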

[Time passes...]

I ran those ps files thru gs -sDEVICE=pbm -r75 (which creates ascii PBM
files), opened them in emacs and removed the newlines after each 64-
character long line, thereby having one line of 0s and 1s for each pixel
row of the image.  That shows that gs 8.63, at least, can handle the PS
output by pdftops/poppler (of git master) just fine.
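
The manual emacs step above (joining the wrapped lines so that each pixel row ends up on one line) can also be scripted. This is a rough sketch that assumes a plain-text PBM (magic number P1) as input; the function name pbm_rows and the tiny sample image are made up for illustration and are not taken from the attachments.

```python
def pbm_rows(text):
    """Reflow plain (P1) PBM text into one string of 0s and 1s per pixel row."""
    tokens = []
    for line in text.splitlines():
        # Everything after '#' on a line is a comment in the PBM format.
        tokens.extend(line.split("#", 1)[0].split())
    if len(tokens) < 3 or tokens[0] != "P1":
        raise ValueError("not a plain PBM (P1) file")
    width, height = int(tokens[1]), int(tokens[2])
    # Pixels may be separated by whitespace or run together; join them all.
    bits = "".join(tokens[3:])
    if len(bits) < width * height:
        raise ValueError("truncated pixel data")
    return [bits[r * width:(r + 1) * width] for r in range(height)]


# Hypothetical 5x3 image whose pixel data is wrapped at arbitrary points.
SAMPLE = """\
P1
# tiny example
5 3
10001
0101
0
00100
"""

if __name__ == "__main__":
    for row in pbm_rows(SAMPLE):
        print(row)
```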

I suspect, then, that this part of your problem is due to the outdated
version of poppler in the version of Ubuntu you have installed.
(poppler 0.6.4 is OLD.)

Git master poppler and cairo do not fully fix the problem when using
evince to print.

Evince generates new PDF or PS using cairo when printing.  The resulting
files are ugly.  

I found that evince needed tens of minutes to print wrong.pdf to a ps
file on my 1GHz PIIIM.  The resulting ps has 847 fallback images of 
the form:

  % Fallback Image: x=149, y=129, w=3, h=2 res=300dpi size=351
  [ 0.24 0 0 0.24 149 660.839983 ] concat
  /DeviceRGB setcolorspace
  8 dict dup begin
    /ImageType 1 def
    /Width 13 def
    /Height 9 def
    /BitsPerComponent 8 def
    /Decode [ 0 1 0 1 0 1 ] def
    /DataSource currentfile /ASCII85Decode filter /LZWDecode filter def
    /ImageMatrix [ 1 0 0 -1 0 9 ] def
  end
  image
  J3Vsg3$]7K#D>EP:q1$o*=mro@So+\<\5,H7Uo7+~>Q
  q 0 0 612 792 rectclip

If you use something like pdftk(1) to uncompress the PDF evince
generates you find similar output there, too.
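
For anyone who wants to reproduce the count of fallback images, the comment lines can be tallied with a short script. This is only a sketch: the pattern is derived from the single "% Fallback Image" line quoted above, so the exact format in other cairo versions is an assumption, and summarize_fallbacks is an invented name.

```python
import re

# Pattern modelled on the one header line quoted in this comment.
FALLBACK = re.compile(
    r"^\s*% Fallback Image: x=(?P<x>\d+), y=(?P<y>\d+), "
    r"w=(?P<w>\d+), h=(?P<h>\d+) res=(?P<res>[\d.]+)dpi size=(?P<size>\d+)\s*$"
)


def summarize_fallbacks(ps_text):
    """Return (number of fallback images, sum of their size= fields)."""
    count = 0
    total_size = 0
    for line in ps_text.splitlines():
        m = FALLBACK.match(line)
        if m:
            count += 1
            total_size += int(m.group("size"))
    return count, total_size


# The quoted header line, repeated twice purely so that the counter has
# something to count; a real file would be read from disk instead.
SAMPLE = """\
  % Fallback Image: x=149, y=129, w=3, h=2 res=300dpi size=351
  [ 0.24 0 0 0.24 149 660.839983 ] concat
  % Fallback Image: x=149, y=129, w=3, h=2 res=300dpi size=351
"""

if __name__ == "__main__":
    print(summarize_fallbacks(SAMPLE))
```

A PostScript file that embeds the text as fonts would be expected to report zero fallback images here.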

I'm not sure where the fault lies in terms of evince, poppler or
cairo.  Evince probably should use cairo's userfont API.  It
probably should also use poppler's pdftops code for generating
PS from PDF, and pass PDF thru as is when generating PDF from PDF.

IE, evince probably ought to have document-backend-specific routines
for generating PS and PDF for printing.

I *think*, then, that -- at least when using current tip versions --
the bug is in evince, not in poppler.
Comment 12 Adrian Johnson 2008-09-21 04:39:06 UTC
(In reply to comment #11)
> I found that evince needed tens of minutes to print wrong.pdf to a ps
> file on my 1GHz PIIIM.  The resulting ps has 847 fallback images of 
> the form:
> 
>   % Fallback Image: x=149, y=129, w=3, h=2 res=300dpi size=351
>   [ 0.24 0 0 0.24 149 660.839983 ] concat
>   /DeviceRGB setcolorspace
>   8 dict dup begin
>     /ImageType 1 def
>     /Width 13 def
>     /Height 9 def
>     /BitsPerComponent 8 def
>     /Decode [ 0 1 0 1 0 1 ] def
>     /DataSource currentfile /ASCII85Decode filter /LZWDecode filter def
>     /ImageMatrix [ 1 0 0 -1 0 9 ] def
>   end
>   image
>   J3Vsg3$]7K#D>EP:q1$o*=mro@So+\<\5,H7Uo7+~>Q
>   q 0 0 612 792 rectclip
> 
> If you use something like pdftk(1) to uncompress the PDF evince
> generates you find similar output there, too.
> 
> I'm not sure where the fault lies in terms of evince, poppler or
> cairo.  Evince probably should use cairo's userfont API.  It
> probably should also use poppler's pdftops code for generating
> PS from PDF, and pass PDF thru as is when generating PDF from PDF.

It is a limitation of the cairo backend of poppler that it can't print Type 3 fonts nicely. When cairo 1.8 is released with the new user-font API it will be possible to fix this in poppler. Cairo user-fonts are embedded in PS/PDF as a Type 3 font.

Comment 13 James Cloos 2008-09-21 15:13:52 UTC
> It is a limitation of the cairo backend of poppler that it can't print
> Type 3 fonts nicely. When cairo 1.8 is released with the new user-font
> API it will be possible to fix this in poppler. Cairo user-fonts are
> embedded in PS/PDF as a Type 3 font.

Although I generally use xpdf/poppler for reading PDFs and evince only
for forms, I am available to help test if you are willing to increase
poppler master’s required cairo version to either cairo/master or the
current pre-release (1.7.6) and start using the user-font API on master
now rather than after cairo makes its 1.8.0 release.

(I find this to be an intriguing bug to quash.)
Comment 14 Albert Astals Cid 2008-09-21 15:25:58 UTC
Not right now; we are in the process of converting trunk into stable, so you'll have to wait until October 9, when trunk will be open again for non-bugfix changes.
Comment 15 MM 2008-10-13 05:27:32 UTC
Maybe this bug is related to https://bugs.freedesktop.org/show_bug.cgi?id=12769 ?
Comment 16 Adrian Johnson 2008-11-01 06:18:22 UTC
Fix in git.

