Bug 75178

Summary: font substitution fails if font name contains a dash
Product: poppler
Reporter: joachim.schwender
Component: general
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED
Severity: normal
Priority: medium
CC: akira, freedesktop
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments:
Example PDF with such a font, which is not embedded and is not substituted properly
The font file needed for font substitution
This example uses both Code128 and Code-128 fonts, not embedded
Code128.ttf font for testing
Code-128 for testing

Description joachim.schwender 2014-02-18 22:38:47 UTC
Created attachment 94316 [details]
Example PDF with such a font, which is not embedded and is not substituted properly

I have PDFs that contain a font named Code-128, which is not embedded. Just installing this font does not help: it is not picked up properly, and evince uses DejaVu for rendering, which is bad for a barcode font!
fc-list does show the installed font, but fc-match does not work and always returns "DejaVu Sans" "Book".

Expected: fontconfig picks the installed font and uses it for rendering in evince. fc-match returns the font name if it matches, no matter which characters the name contains.

Tested on Ubuntu 12.04 and 13.10 (x86_64) with fontconfig 2.10.93.

Temporary workaround I currently use:
I used fontforge to remove the dash from the font name. Then I added a substitution rule that replaces Code-128 with Code128.

Result: evince displays the font properly, but fc-match still does not work.

Unfortunately I have no influence on the PDF generation. The files are generated by a German Ministry of the Interior, and I have been struggling for three years now to get them to create only PDF/A (fonts embedded), but they refuse to implement that without giving meaningful arguments. We, like many other companies, are forced(!) to use these buggy PDFs. Most companies probably use Windows and Adobe Reader and don't have this problem...
Comment 1 joachim.schwender 2014-02-18 22:40:51 UTC
Created attachment 94317 [details]
The font file needed for font substitution
Comment 2 Akira TAGOH 2014-02-19 03:19:23 UTC
I'm speaking without having looked at the evince/poppler code. There are two ways for an application to ask fontconfig for the best font. One is to call FcNameParse() with a query string; the other is to add values to an FcPattern directly. The former, which the fontconfig command-line tools also use, has a limitation in its syntax: "-", "," and ":" are reserved characters, so if a font name contains any of them, each one has to be preceded by "\" to escape it. In this case, fc-match Code\-128 should work if the font is installed properly. The latter should work without any escaping.

To demonstrate:

% fc-list -f "%{family}\n" | grep -- -
Emmentaler-11
Emmentaler-14
Emmentaler-13
Emmentaler-18
Emmentaler-Brace
Emmentaler-26
Emmentaler-20
Emmentaler-23
Emmentaler-16
[...]

% fc-match Emmentaler-11
emmentaler.ttf: "Emmentaler" "26"
% fc-match "Emmentaler\-11"
emmentaler-11.otf: "Emmentaler-11" "11"
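
The escaping rule described above can be sketched in Python. This is only an illustration, assuming the reserved set is the three characters named in this comment plus the backslash itself; escape_fc_name is a hypothetical helper, not part of fontconfig.

```python
# Hypothetical helper: escape a family name for use in a fontconfig
# name string such as the argument given to fc-match.
# Assumption: the reserved characters are "-", "," and ":" (as named in
# this comment) plus the backslash itself.
RESERVED = set("\\-,:")


def escape_fc_name(family):
    """Prefix every reserved character in `family` with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in family)


print(escape_fc_name("Code-128"))       # Code\-128
print(escape_fc_name("Emmentaler-11"))  # Emmentaler\-11
```

The escaped string still has to reach fontconfig intact, so on a command line it needs to be quoted as in the fc-match example above.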

Speaking of your issue:

(In reply to comment #0)
> Created attachment 94316 [details]
> Example PDF with such a font, which is not embedded and is not substituted
> properly
> 
> I have PDFs that contain a font named Code-128, which is not embedded.

Does it? It looks like the attached PDF requires a font named OCRABT instead.
Since you mention that the workaround works for you, I guess something went wrong with this report. Anyway, adding the following recipe makes the font substitution work:

<fontconfig>
  <alias binding="strong">
    <family>OCRABT</family>
    <prefer><family>Code-128</family></prefer>
  </alias>
</fontconfig>

Please let me know if you still have any issues or the above assumption was wrong.
Comment 3 Behdad Esfahbod 2014-02-19 17:26:09 UTC
That fc-match Code-128 doesn't work is expected, you need to escape the dash in there.  It wouldn't happen with the API though, so that can't explain the issue you are seeing.
Comment 4 joachim.schwender 2014-02-19 21:09:32 UTC
Created attachment 94385 [details]
This example uses both Code128 and Code-128 fonts not embedded

I am very sorry about the first PDF, which does not contain the right examples.
Comment 5 joachim.schwender 2014-02-19 21:13:41 UTC
(In reply to comment #3)
> That fc-match Code-128 doesn't work is expected, you need to escape the dash
> in there.  It wouldn't happen with the API though, so that can't explain the
> issue you are seeing.

I found that it works only by using escaping PLUS double quotes around the match string. Unfortunately, escaping alone does _NOT_ work. This is not mentioned in the man page; would it be an idea to add that?
Comment 6 joachim.schwender 2014-02-19 21:18:58 UTC
Created attachment 94386 [details]
Code128.ttf font for testing
Comment 7 joachim.schwender 2014-02-19 21:19:27 UTC
Created attachment 94387 [details]
Code-128 for testing
Comment 8 joachim.schwender 2014-02-19 21:21:08 UTC
My workaround is to install Code128 in addition and use the following config:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<match target="pattern" >
      <test name="family" qual="any" >
        <string>Code\-128</string>
      </test>
      <edit name="family" mode="assign" binding="same">
        <string>Code128</string>
      </edit>
</match>
</fontconfig>
Comment 9 Behdad Esfahbod 2014-02-19 21:40:16 UTC
(In reply to comment #5)
> (In reply to comment #3)
> > That fc-match Code-128 doesn't work is expected, you need to escape the dash
> > in there.  It wouldn't happen with the API though, so that can't explain the
> > issue you are seeing.
> 
> I found that it works only by using escaping PLUS double quotes around the
> match string. Unfortunately, escaping alone does _NOT_ work. This is not
> mentioned in the man page; would it be an idea to add that?

You sure?!

behdad:~ 0$ fc-match 'Code-128'
DejaVuSans.ttf: "DejaVu Sans" "Book"
behdad:~ 0$ fc-match 'Code\-128'
Code-128.ttf: "Code-128" "Normal"
behdad:~ 0$ fc-match '"Code-128"'
DejaVuSans.ttf: "DejaVu Sans" "Book"

Oh, you mean this doesn't work:

$ fc-match Code\-128
DejaVuSans.ttf: "DejaVu Sans" "Book"

That has nothing to do with fontconfig.  Bash / your shell will consume that escaping.  You want the backslash to reach fontconfig.
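
To see which argument actually reaches fc-match, the shell's word splitting can be approximated with Python's shlex module. This is a sketch only: shlex follows POSIX quoting rules and is not bash itself, and received_argument is a hypothetical helper name.

```python
import shlex


def received_argument(command):
    """Return the first argument the program would receive after
    POSIX-style quote and backslash processing of `command`."""
    return shlex.split(command)[1]


commands = [
    r"fc-match Code\-128",    # unquoted: the backslash is consumed
    r"fc-match 'Code\-128'",  # single quotes: the backslash is kept
    r'fc-match "Code\-128"',  # double quotes: a backslash before "-" is kept
]

for command in commands:
    print(f"{command:24} -> {received_argument(command)}")
```

Only the quoted forms deliver Code\-128 to fontconfig, which is consistent with the observation that escaping works only together with quotes.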
Comment 10 Akira TAGOH 2014-02-20 02:18:48 UTC
Well, this is a poppler issue. see:
http://cgit.freedesktop.org/poppler/poppler/tree/poppler/GlobalParams.cc#n1021

They replace the dash in a font name, and fontconfig treats a dash and a space in the name differently.
This change was made based on an older version of fontconfig:

commit 82638babe89e402c0348619ec3205059b977c7e9
Author: Albert Astals Cid <aacid@kde.org>
Date:   Thu Jul 28 17:34:19 2005 +0000

    Fontconfig patch is here, rejoice

So I'm not surprised something went wrong. It may be time to get rid of this hack. Reassigning to poppler.
Comment 11 joachim.schwender 2014-02-20 15:21:01 UTC
Dear Akira, thank you very much for your help! I will verify whether a patch to poppler works, and hopefully it will find its way into a release.

greetings :-)
Comment 12 Behdad Esfahbod 2014-02-20 20:19:32 UTC
(In reply to comment #10)
> Well, this is a poppler issue. see:
> http://cgit.freedesktop.org/poppler/poppler/tree/poppler/GlobalParams.cc#n1021
> 
> They replace the dash in a font name, and fontconfig treats a dash and a
> space in the name differently.
> This change was made based on an older version of fontconfig:
> 
> commit 82638babe89e402c0348619ec3205059b977c7e9
> Author: Albert Astals Cid <aacid@kde.org>
> Date:   Thu Jul 28 17:34:19 2005 +0000
> 
>     Fontconfig patch is here, rejoice
> 
> So I'm not surprised something went wrong. It may be time to get rid of
> this hack. Reassigning to poppler.

Wait.  I *think* Poppler is chopping a PS font name and passing to fontconfig.  So, if the real font name has space, Poppler has to convert dash to space, but if the real font name actually has dash, the conversion is bogus.  Back to a known issue: we need matching by PS font name.  That, or make fontconfig ignore dash in family name matching as it does for space.  Or, make poppler remove the dash instead of replacing it with space.
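
A rough Python model of the mismatch described here, assuming (as stated above) that family matching ignores spaces but not dashes, and additionally assuming a case-insensitive comparison. to_query_name and family_matches are hypothetical names; this reproduces neither Poppler's nor fontconfig's actual code.

```python
def to_query_name(ps_name):
    """Model of the hack under discussion: replace each dash with a space."""
    return ps_name.replace("-", " ")


def family_matches(requested, installed):
    """Compare two family names ignoring spaces and case, but not dashes."""
    def normalize(name):
        return name.replace(" ", "").lower()
    return normalize(requested) == normalize(installed)


query = to_query_name("Code-128")              # "Code 128"
print(family_matches(query, "Code128"))        # True: the space is ignored
print(family_matches(query, "Code-128"))       # False: the dash is significant
print(family_matches("Code-128", "Code-128"))  # True without the conversion
```

Under these assumptions the converted name can only find a font whose family has no dash, which would explain why the reporter's renamed Code128 font is displayed while the original Code-128 font is not.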
Comment 13 Albert Astals Cid 2014-02-20 21:24:56 UTC
I guess we can always remove the - and then not remove it and see if it gives us something better. Honestly my fontconfig skills are not ultra awesome and i've never been able to understand the fontconfig API much so any help you guys can give us is very much welcome.
Comment 14 Akira TAGOH 2014-02-25 02:11:59 UTC
(In reply to comment #12)
> Wait.  I *think* Poppler is chopping a PS font name and passing to
> fontconfig.  So, if the real font name has space, Poppler has to convert
> dash to space, but if the real font name actually has dash, the conversion
> is bogus.  Back to a known issue: we need matching by PS font name.  That,
> or make fontconfig ignore dash in family name matching as it does for space.
> Or, make poppler remove the dash instead of replacing it with space.

Oh yes, that's true. That said, there is nothing more we can do in fontconfig; we did that already. Poppler just needs to match on FC_POSTSCRIPT_NAME rather than FC_FAMILY.
Comment 15 Albert Astals Cid 2014-05-06 22:37:34 UTC
This is interesting, if i remove the code that removes the "-", the example for the Code-128 file works.

Now if i pass FC_POSTSCRIPT_NAME rather than FC_FAMILY to FcPatternBuild as Akira suggests it stops working.

Any idea of why that might happen?
Comment 16 Akira TAGOH 2014-05-13 03:29:04 UTC
(In reply to comment #15)
> This is interesting, if i remove the code that removes the "-", the example
> for the Code-128 file works.
> 
> Now if i pass FC_POSTSCRIPT_NAME rather than FC_FAMILY to FcPatternBuild as
> Akira suggests it stops working.
> 
> Any idea of why that might happen?

Because that PostScript name doesn't contain a dash? See fc-query /path/to/font.
Comment 17 Albert Astals Cid 2014-07-21 22:24:46 UTC
So then using FC_POSTSCRIPT_NAME as suggested is wrong, no?
Comment 18 Akira TAGOH 2014-07-22 07:42:35 UTC
(In reply to comment #17)
> So then using FC_POSTSCRIPT_NAME as suggested is wrong, no?

Yes and no. Since the font itself uses the name without a dash, one could argue that the document is wrong to request the font with a dash in its name. Checking both would perhaps be ideal.
Comment 19 Albert Astals Cid 2014-07-22 23:18:12 UTC
So how would you do both? When do you decide to do the second query? I.e. how do you decide the first query "failed"?
Comment 20 Behdad Esfahbod 2014-07-22 23:19:33 UTC
Do one query, but add two items for FC_POSTSCRIPT_NAME.
Comment 21 Albert Astals Cid 2014-07-22 23:22:41 UTC
One with the - and one without? I thought you guys were saying removing the - was bad?
Comment 22 Behdad Esfahbod 2014-07-22 23:29:46 UTC
Well, apparently you have a document that asks for PS name "code-128", but the font has PS name "code128".  So there's no logic left.  It's back to brute-force now :(.
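
The brute-force idea could look roughly like this in Python: build the list of PostScript names to try, with the name as requested first and the dash-less variant second. postscript_name_candidates is a hypothetical helper; in C, each candidate would presumably be added to the same pattern as a separate FC_POSTSCRIPT_NAME value, as suggested in comment 20.

```python
def postscript_name_candidates(requested):
    """Return the PostScript names to try, exact name first, no duplicates."""
    candidates = [requested, requested.replace("-", "")]
    unique = []
    for name in candidates:
        if name not in unique:
            unique.append(name)
    return unique


print(postscript_name_candidates("code-128"))  # ['code-128', 'code128']
print(postscript_name_candidates("OCRABT"))    # ['OCRABT']
```

Keeping the requested name first is a design choice made for this sketch, so that a font whose PostScript name really contains the dash would be tried before the dash-less variant.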
Comment 23 GitLab Migration User 2018-08-20 21:34:52 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/25.