Summary: Font info not getting properly into html when using pdftohtml
Product: poppler
Reporter: Sushant Sinha <sushant354>
Component: utils
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED
Severity: normal
Priority: medium
CC: koleygr
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Attachments: the example document
Created attachment 43145: the example document

I have attached a PDF document that mixes English and Hindi. The Hindi text uses the Aryan2 font.

When I run pdftohtml on this document, the HTML file contains no font information. With "-xml" or "-c", the Aryan2 font is still output as Times. So there seems to be a problem with how embedded fonts are handled. The PDF is attached for your analysis.

$ pdffonts 2211.pdf
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
CFFEEL+TimesNewRoman                 TrueType          yes yes no    1852  0
CFFEGM+TimesNewRoman,Bold            TrueType          yes yes no    1854  0
CFFFEJ+TimesNewRoman,Italic          TrueType          yes yes no      93  0
CFFFHI+SymbolMT                      CID TrueType      yes yes yes     94  0
CFFGDG+Aryan2-Bold                   TrueType          yes yes no      95  0
CFFGEI+Aryan2-Normal                 TrueType          yes yes no      97  0
CFFGEH+Aryan2-Normal                 CID TrueType      yes yes yes     96  0
CFFGII+Tahoma,Bold                   TrueType          yes yes no      98  0
CFFGLJ+Tahoma                        TrueType          yes yes no      99  0
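
For anyone triaging this, the pdffonts listing above can be checked mechanically. The following is a minimal Python sketch, not part of poppler: `parse_pdffonts` and `ROW` are hypothetical names, and the regular expression assumes that font names contain no whitespace, which holds for this listing but not necessarily for every PDF. It strips the six-letter subset prefix from each name and reports the embedding and ToUnicode ("uni") status of the Aryan2 entries.

```python
import re

# pdffonts output copied from the report above.
PDFFONTS_OUTPUT = """\
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
CFFEEL+TimesNewRoman                 TrueType          yes yes no    1852  0
CFFEGM+TimesNewRoman,Bold            TrueType          yes yes no    1854  0
CFFFEJ+TimesNewRoman,Italic          TrueType          yes yes no      93  0
CFFFHI+SymbolMT                      CID TrueType      yes yes yes     94  0
CFFGDG+Aryan2-Bold                   TrueType          yes yes no      95  0
CFFGEI+Aryan2-Normal                 TrueType          yes yes no      97  0
CFFGEH+Aryan2-Normal                 CID TrueType      yes yes yes     96  0
CFFGII+Tahoma,Bold                   TrueType          yes yes no      98  0
CFFGLJ+Tahoma                        TrueType          yes yes no      99  0
"""

# Assumption: the font name has no spaces; the type column may ("CID TrueType").
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<type>.+?)\s+"
    r"(?P<emb>yes|no)\s+(?P<sub>yes|no)\s+(?P<uni>yes|no)\s+"
    r"(?P<obj>\d+)\s+(?P<gen>\d+)\s*$"
)


def parse_pdffonts(text):
    """Return one dict per font row; header and ruler lines are skipped."""
    fonts = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, dashed ruler, or blank line
        record = match.groupdict()
        # Subset fonts carry a six-letter tag and "+" before the real name.
        record["family"] = re.sub(r"^[A-Z]{6}\+", "", record["name"])
        fonts.append(record)
    return fonts


fonts = parse_pdffonts(PDFFONTS_OUTPUT)
aryan = [f for f in fonts if f["family"].startswith("Aryan2")]
for f in aryan:
    print(f["family"], "|", f["type"], "| embedded:", f["emb"], "| ToUnicode:", f["uni"])
```

On this listing, all three Aryan2 entries are embedded subsets, and the two plain TrueType ones have no ToUnicode map. Whether the missing map is related to the Times fallback is not established by the listing alone.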