Bug 63909 - conversion breaks on certain unicode errors
Summary: conversion breaks on certain unicode errors
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: pdftohtml
Version: unspecified
Hardware: Other Windows (All)
Importance: medium blocker
Assignee: poppler-bugs
 
Reported: 2013-04-25 08:55 UTC by Calistophes
Modified: 2013-04-26 10:59 UTC

Description Calistophes 2013-04-25 08:55:35 UTC
Hi

An error occurs within calibre while converting a non-DRM PDF book. I filed a bug report there, but a maintainer rejected it (https://bugs.launchpad.net/calibre/+bug/1171012/) as being an error in poppler and directed me to submit a report here. Unfortunately he didn't give any further information, such as which version of poppler is used in calibre.

When converting a PDF book I get this stack trace (the full error message is in the calibre report):

Converting file to html...
Python function terminated unexpectedly
   (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 189, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 1009, in run
  File "site-packages\calibre\customize\conversion.py", line 239, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\pdf_input.py", line 50, in convert
  File "site-packages\calibre\ebooks\pdf\pdftohtml.py", line 92, in pdftohtml
calibre.ebooks.ConversionError
Comment 1 Albert Astals Cid 2013-04-25 09:02:23 UTC
We don't have any Python in our code, so it can hardly be our problem given the kind of trace you are attaching.

Have you run pdftohtml directly on the pdf file? What's the result?
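
For anyone wanting to try this, here is a minimal sketch of running pdftohtml directly and capturing what it prints to stderr, which is the information the Python-only traceback above does not carry. It assumes pdftohtml is reachable on the PATH and is called with an input PDF and an output base name as positional arguments; the helper names and the file names are hypothetical and not taken from calibre or poppler.

```python
import subprocess
import sys


def pdftohtml_command(pdf_path, out_base, exe="pdftohtml"):
    """Build the argument list: executable, input PDF, output base name."""
    return [exe, pdf_path, out_base]


def run_tool(cmd):
    """Run cmd and return (exit_code, stderr_text).

    exit_code is None when the executable itself could not be found.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, errors="replace"
        )
    except FileNotFoundError as exc:
        return None, str(exc)
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # "book.pdf" and "book_out" are placeholder names, not files from this report.
    code, err = run_tool(pdftohtml_command("book.pdf", "book_out"))
    print("exit code:", code)
    print(err, file=sys.stderr)
```

The exit code and stderr text from a direct run are what would be useful to attach to a report here.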
Comment 2 Calistophes 2013-04-25 09:18:38 UTC
No, I ran it within calibre. To be honest with you, I don't have much of a clue how this works. I just fired up calibre (for the first time), hit the convert button and got that message. Until recently I didn't even know pdf->html was a third-party integration.

Trying to start pdftohtml.exe produces a complaint about a missing freetype.dll. The file exists in a subfolder, but there is no hint as to how I can tell pdftohtml where to look.
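
One possible workaround, sketched here under assumptions: Windows searches the directories listed in PATH when loading DLLs, so launching the executable with the DLL subfolder prepended to PATH may let it find freetype.dll. Whether this is enough for the copy bundled with calibre is not confirmed in this report; the helper name and the folder paths are hypothetical.

```python
import os


def env_with_extra_path(dll_dir, base_env=None):
    """Return a copy of the environment with dll_dir prepended to PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    env["PATH"] = dll_dir + os.pathsep + current if current else dll_dir
    return env


if __name__ == "__main__":
    # Hypothetical folder; substitute the subfolder that actually holds freetype.dll.
    env = env_with_extra_path(r"C:\path\to\dll\subfolder")
    print(env["PATH"].split(os.pathsep)[0])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting pdftohtml.exe, which leaves the system-wide PATH untouched.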
Comment 3 Albert Astals Cid 2013-04-25 09:24:13 UTC
Well, you can't report a bug against pdftohtml if you can't even use pdftohtml :D

The calibre guy has never been very friendly, sadly. Just attach the PDF file and I'll give it a go myself in pdftohtml.
Comment 4 Calistophes 2013-04-25 09:42:21 UTC
This is indeed sad to hear. Thanks a lot for your effort on this, though!

I found the pdf here: https://code.google.com/p/cunruiwang-se/downloads/detail?name=Agile%20Principles%2C%20Patterns%2C%20and%20Practices%20in%20C%23.pdf

I'm not particularly in need of this conversion, as this book is on my wishlist anyway. I see it more as a special case for other books I really want on my eReader that were never published as epub.
Comment 5 Albert Astals Cid 2013-04-25 18:30:51 UTC
OK, I fixed a crash in pdftohtml when generating the outlines for that file; the fix will be in poppler 0.22.4.

Not sure if that's the problem you were having in calibre, but with the info you gave me that's all I can do.
Comment 6 Calistophes 2013-04-26 10:59:42 UTC
Thanks! I'll relay the info to the calibre guy(s).
