Bug 41104

Summary: Xpdf (pdfedit/poppler) renders specific PDFs differently than, e.g., Acrobat/Sumatra
Product: poppler Reporter: misutkajunior
Component: general Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED NOTABUG QA Contact:
Severity: major    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
Attachments: test file
viewers/editors comparison

Description misutkajunior 2011-09-21 23:00:30 UTC
Created attachment 51504
test file

Copy of 
http://pdfedit.petricek.net/bt/view.php?id=372

GfxFont.cc

int GfxCIDFont::getNextChar(
...

  *code = (CharCode)(cid = cMap->getCID(s, len, &n));
  if (ctu) {
    ...
  } else {
    *uLen = 0;
  }

If I am not mistaken, ctu == NULL means an identity mapping, so it should be
  } else {
    u[0] = *code;
    *uLen = 1;
  }

At least this is the Acrobat/Sumatra behaviour.
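
The proposed change can be illustrated in isolation with a minimal, self-contained sketch. It is not poppler code: `ToUnicodeTable` and `mapToUnicode` are hypothetical stand-ins for the ctu object and for the tail of GfxCIDFont::getNextChar, and the assumption that a missing table means identity is the reporter's hypothesis, not something the PDF format is shown to guarantee here.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using CharCode = std::uint32_t;
using Unicode = std::uint32_t;

// Hypothetical stand-in for a ToUnicode table; not poppler's own class.
using ToUnicodeTable = std::unordered_map<CharCode, std::vector<Unicode>>;

// Writes the Unicode values for `code` into `u` and returns how many were
// written (at most `size`). A null `ctu` models a font without a table.
int mapToUnicode(const ToUnicodeTable *ctu, CharCode code, Unicode *u, int size) {
    assert(u != nullptr && size > 0);
    if (ctu) {
        auto it = ctu->find(code);
        if (it == ctu->end()) {
            return 0; // table present, but no entry for this code
        }
        int n = 0;
        for (Unicode c : it->second) {
            if (n == size) {
                break;
            }
            u[n++] = c;
        }
        return n;
    }
    // Proposed fallback (assumption): treat the missing table as identity
    // instead of reporting zero Unicode values.
    u[0] = code;
    return 1;
}
```

With this fallback a caller always receives one value when no table exists, which only changes text extraction; it does not by itself decide which glyph is drawn.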

Steps to reproduce:
Open the attached file in pdfedit/xpdf/okular/kpdf and in sumatra/acrobat
Compare the first letter
Comment 1 Albert Astals Cid 2011-09-22 03:19:34 UTC
My Acrobat on Linux complains that the PDF is malformed and renders two small balls for the first character. Can you attach a rendering of what you get?
Comment 2 misutkajunior 2011-09-22 03:34:52 UTC
Created attachment 51508
viewers/editors comparison
Comment 3 Thomas Freitag 2016-03-18 16:54:38 UTC
You are not right: the problem is NOT a charcode and/or map-to-unicode problem; the problem is that the fonts are not embedded in this PDF!
Therefore poppler asks fontconfig to find a suitable font, in this case for ArialMT, and gets one that has no glyph for 'ě', so the replacement character is rendered.
'ě' is only in the Unicode version of Arial, which is used (and delivered!!) by Acrobat. You can therefore also render the file correctly with poppler on Windows platforms, but not on Unix platforms.

Poppler doesn't ship any fonts because poppler has no rights to any fonts!
Comment 4 Jason Crain 2016-03-20 16:26:14 UTC
I do not consider this a poppler bug either, because this PDF does not embed its fonts and it looks up glyphs by glyph ID. That is a bad idea because glyph IDs are only meaningful for one specific font. The PDF spec even has a line forbidding this behavior.
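
To make the glyph ID point concrete, here is a toy sketch. The two glyph orders are invented for illustration and do not come from Arial or from any real substitute font; `ToyFont`, `originalFont` and `substituteFont` are hypothetical names. It only shows that the same glyph ID can select a different glyph, or none at all, once the font is swapped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a font: the glyph ID is simply an index into a
// font-specific glyph order.
struct ToyFont {
    std::vector<std::string> glyphOrder;

    // Returns the glyph name stored at `gid`, or ".notdef" when the font
    // has no glyph at that index.
    std::string glyphAt(std::size_t gid) const {
        return gid < glyphOrder.size() ? glyphOrder[gid] : ".notdef";
    }
};

// Invented glyph order for the font the PDF was authored against.
ToyFont originalFont() {
    ToyFont f;
    f.glyphOrder = std::vector<std::string>{".notdef", "A", "ecaron", "B"};
    return f;
}

// Invented glyph order for a substitute that lacks "ecaron".
ToyFont substituteFont() {
    ToyFont f;
    f.glyphOrder = std::vector<std::string>{".notdef", "A", "B"};
    return f;
}
```

In this toy setup glyph ID 2 names "ecaron" in the original font but "B" in the substitute, and glyph ID 3 falls off the end of the substitute entirely.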

As a workaround, you can install Microsoft's Arial font.  You might also need to add something like this to your ~/.fonts.conf to make sure that ArialMT (the font referenced in the PDF) gets matched to Arial:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match target="pattern">
    <test name="family">
      <string>ArialMT</string>
    </test>
    <edit name="family" mode="prepend">
      <string>Arial</string>
    </edit>
  </match>
</fontconfig>
