Bug 54436

Summary: pdftohtml - Issues when converting PDF to HTML
Product: poppler Reporter: Nitesh G. <nitesh.golchha>
Component: pdftohtml Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED INVALID QA Contact:
Severity: critical    
Priority: medium CC: koleygr
Version: unspecified   
Hardware: All   
OS: Windows (All)   
Whiteboard:
Attachments: input PDF

Description Nitesh G. 2012-09-03 12:37:38 UTC
Created attachment 66540
input PDF

Hi,

I tried to convert the attached PDF to HTML (using pdftohtml.exe) and found several issues,
as follows:
Page 1 -> All bullets are converted into another character, rendered as the
letter 'n'
Page 2 -> The text "Media Services" is shown horizontally instead of vertically
Page 3 -> There is extra spacing between the word and the hyperlink, and the
underlining extends a bit too far
Page 4 -> Japanese characters inside the table are garbled
Page 5 -> Bullets are lost
Page 6 -> The style and font differ from those in the original PDF.

I am attaching the reference PDF.

Thanks,
Nitesh
Comment 1 Albert Astals Cid 2012-09-03 21:37:37 UTC
Please open a separate bug for each of the problems. Otherwise you make it very hard for us to keep track of what has been fixed and what has not.
