Bug 50002 - pdftohtml writes invalid HTML
Summary: pdftohtml writes invalid HTML
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: pdftohtml
Version: unspecified
Hardware: All All
Importance: medium normal
Assignee: poppler-bugs
 
Reported: 2012-05-16 06:05 UTC by geralds
Modified: 2012-05-26 08:47 UTC

Attachments
Patch for utils/HtmlOutputDev.cc (8.30 KB, patch)
2012-05-16 06:05 UTC, geralds
patch for invalid xhtml output (12.36 KB, patch)
2012-05-23 02:25 UTC, geralds
patch for invalid xhtml output (12.36 KB, patch)
2012-05-23 02:44 UTC, geralds

Description geralds 2012-05-16 06:05:59 UTC
Created attachment 61713
Patch for utils/HtmlOutputDev.cc

Patch against r0.20.0 is attached.

The element names output by pdftohtml are in upper case, which is not valid under the DTD, so the output is rejected by epubcheck and other downstream tools.

The <hr> and <frame> elements have neither closing tags nor the abbreviated empty-element notation (<hr/>, <frame/>).

These errors are fixed by patch.txt applied to utils/HtmlOutputDev.cc.
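
For illustration only, here is a minimal sketch of the two corrections described above, applied to a single tag string. This is not the attached patch and not code from HtmlOutputDev.cc: the helper name normalizeTag is hypothetical, and it assumes each input is one complete tag.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of poppler: lower-cases the element name of a
// single tag and self-closes the two empty elements named in this report.
std::string normalizeTag(const std::string &tag)
{
    // Only handle strings of the form "<...>"; return anything else unchanged.
    if (tag.size() < 3 || tag.front() != '<' || tag.back() != '>')
        return tag;

    std::string out = tag;
    const bool isClosingTag = out[1] == '/';

    // Lower-case the element name only; attributes and their values are kept as-is.
    std::size_t i = isClosingTag ? 2 : 1;
    const std::size_t nameStart = i;
    while (i < out.size() && std::isalnum(static_cast<unsigned char>(out[i]))) {
        out[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(out[i])));
        ++i;
    }
    const std::string name = out.substr(nameStart, i - nameStart);

    // Use the abbreviated empty-element notation for <hr> and <frame>.
    const bool alreadyEmpty = out[out.size() - 2] == '/';
    if (!isClosingTag && !alreadyEmpty && (name == "hr" || name == "frame"))
        out.insert(out.size() - 1, "/");

    return out;
}
```

With this sketch, normalizeTag("<HR>") returns "<hr/>" and normalizeTag("</BODY>") returns "</body>". A tag that is already in the abbreviated form, such as "<hr/>", is returned unchanged.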

Tested on CentOS Linux server against source built from r0.20.0 tarball.
Comment 1 Albert Astals Cid 2012-05-21 15:02:15 UTC
Can you please attach the "diff -u" output? It's much easier to read.
Comment 2 geralds 2012-05-23 02:25:21 UTC
Created attachment 62006
patch for invalid xhtml output

diff -u output attached
Comment 3 geralds 2012-05-23 02:44:29 UTC
Created attachment 62007
patch for invalid xhtml output

-u patch v2
Comment 4 Albert Astals Cid 2012-05-23 13:14:19 UTC
Hi, I'm going to need your full name so I can credit you correctly when committing the patch.
Comment 5 geralds 2012-05-25 03:03:50 UTC
Thanks Albert, it's Gerald Schmidt.

(In reply to comment #4)
> Hi, I'm going to need your full name so I can credit you correctly when
> committing the patch.
Comment 6 Albert Astals Cid 2012-05-26 08:47:24 UTC
Pushed it to master; it will be part of poppler 0.22.

