Bug 65889 - Bullets are converted to other characters when converting PDF to HTML
Status: RESOLVED MOVED
Alias: None
Product: poppler
Classification: Unclassified
Component: pdftohtml
Version: unspecified
Hardware: Other Windows (All)
Importance: medium major
Assignee: poppler-bugs
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-18 10:30 UTC by Nitesh G.
Modified: 2018-08-20 21:43 UTC

See Also:


Attachments
Input PDF (1.16 MB, text/plain)
2013-06-18 10:30 UTC, Nitesh G.

Description Nitesh G. 2013-06-18 10:30:49 UTC
Created attachment 80993
Input PDF

Hi,

I tried to convert the attached PDF to HTML using pdftohtml.exe and found that every bullet is converted into another character: the letter 'n'.
I am attaching the reference PDF.

Thanks,
Nitesh
Comment 1 Dan Small 2013-08-07 17:31:39 UTC
I'm seeing something similar on Ubuntu: with the file http://www.cityplym.ac.uk/sites/default/files/docs/jobs/Safeguarding_changes_to_DBS.pdf, bullets are converted to ï·   (ï(0082)  ).
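
One possible explanation for the characters above, offered only as a guess: the sequence ï, (0082), · is what the UTF-8 bytes of a single private-use code point look like when each byte is displayed as a separate Latin-1 character. The sketch below assumes the bullet is emitted as U+F0B7 (a code point where Symbol-font bullets often end up); that assumption is not confirmed anywhere in this report.

```python
# Assumption: the bullet is written out as the private-use code point U+F0B7.
# Nothing in this report confirms that; it is only consistent with the symptom.
bullet = "\uf0b7"

# Encode the bullet as UTF-8, then misread every byte as one Latin-1 character.
utf8_bytes = bullet.encode("utf-8")
misread = utf8_bytes.decode("latin-1")

# Show the raw bytes and the code points a viewer would then display.
print(utf8_bytes.hex(" "))
print([f"U+{ord(ch):04X}" for ch in misread])
```

If the output file is in fact UTF-8 and the viewer reads it as a single-byte encoding, this would produce the three characters reported, with the middle one being the unprintable control character U+0082.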
Comment 2 Dan Small 2013-08-07 18:24:09 UTC
I forgot to mention the version: poppler-0.24.0.
Comment 3 Dan Small 2013-08-07 19:25:32 UTC
The problem goes away if I set the output encoding with -enc Windows-1255.
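
For anyone scripting the conversion, the workaround can be sketched as a small wrapper around the pdftohtml command line. The helper names and file names here are hypothetical; only the -enc option and the Windows-1255 value come from this comment, and whether that encoding suits other documents is untested.

```python
import subprocess


def build_pdftohtml_cmd(pdf_path, html_path, encoding="Windows-1255"):
    """Return the argument list for pdftohtml with an explicit output encoding.

    Hypothetical helper; the default encoding is the one reported to work here.
    """
    return ["pdftohtml", "-enc", encoding, pdf_path, html_path]


def convert(pdf_path, html_path, encoding="Windows-1255"):
    """Run pdftohtml; raises CalledProcessError if the tool exits with an error."""
    return subprocess.run(build_pdftohtml_cmd(pdf_path, html_path, encoding), check=True)


# Example file names are placeholders, not files from this report.
print(build_pdftohtml_cmd("input.pdf", "output.html"))
```

Calling convert("input.pdf", "output.html") requires pdftohtml to be installed and on the PATH.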
Comment 4 GitLab Migration User 2018-08-20 21:43:29 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new issue through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/45.

