Created attachment 43145 [details]
the example document

I have attached a PDF document that mixes English and Hindi text. The Hindi text uses the Aryan2 font.

When I run pdftohtml on this document, the HTML file contains no font information. With the "-xml" or the "-c" option, the Aryan2 font is still output as Times. So there seems to be a problem with how embedded fonts are handled. I have attached the PDF for your analysis.

$ pdffonts 2211.pdf
name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
CFFEEL+TimesNewRoman                 TrueType          yes yes no    1852  0
CFFEGM+TimesNewRoman,Bold            TrueType          yes yes no    1854  0
CFFFEJ+TimesNewRoman,Italic          TrueType          yes yes no      93  0
CFFFHI+SymbolMT                      CID TrueType      yes yes yes     94  0
CFFGDG+Aryan2-Bold                   TrueType          yes yes no      95  0
CFFGEI+Aryan2-Normal                 TrueType          yes yes no      97  0
CFFGEH+Aryan2-Normal                 CID TrueType      yes yes yes     96  0
CFFGII+Tahoma,Bold                   TrueType          yes yes no      98  0
CFFGLJ+Tahoma                        TrueType          yes yes no      99  0
You can use -fontfullname once poppler 0.22 gets released
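
As a rough illustration of the suggestion above, the following Python sketch builds a pdftohtml command line that passes -fontfullname together with -xml. The helper names build_command and convert, and the output base name "out", are hypothetical and only for illustration. It assumes a pdftohtml binary on PATH from a poppler release that includes the option (0.22 or later, per the comment above).

```python
import shutil
import subprocess


def build_command(pdf_path, out_base, xml=True):
    """Return a pdftohtml argument list that asks for full font names.

    build_command is a hypothetical helper, not part of poppler.
    """
    cmd = ["pdftohtml"]
    if xml:
        cmd.append("-xml")  # XML output, as tried in the report
    # Option named in the comment above; assumed available from poppler 0.22.
    cmd.append("-fontfullname")
    cmd += [pdf_path, out_base]
    return cmd


def convert(pdf_path, out_base):
    """Run pdftohtml if it is installed; raise otherwise."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found on PATH")
    subprocess.run(build_command(pdf_path, out_base), check=True)
```

Calling convert("2211.pdf", "out") would run the conversion on the attached example. Whether the font names in the result then show Aryan2 instead of Times depends on the installed poppler version.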