Bug 107419 - Allow page ranges in pdftohtml
Summary: Allow page ranges in pdftohtml
Alias: None
Product: poppler
Classification: Unclassified
Component: utils
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: poppler-bugs
QA Contact:
Depends on:
Reported: 2018-07-29 16:16 UTC by ulatekh
Modified: 2018-08-21 11:19 UTC

Patch to add functionality (5.54 KB, patch)
2018-07-29 16:16 UTC, ulatekh
Patch to add functionality (15.36 KB, patch)
2018-08-04 18:50 UTC, ulatekh

Description ulatekh 2018-07-29 16:16:23 UTC
Created attachment 140875
Patch to add functionality

I'm using pdftohtml to extract information from PDFs and organize the results into a database, so I had a chance to dig through the code.

The patch adds a "-pg" command-line option to pdftohtml, to allow noncontiguous ranges of pages to be specified.

I don't know what the policy is on using Boost inside of poppler, but I can hand-write a simple integer interval-set if it's a problem.

The "-pg" command-line option may be useful in other utilities, e.g. pdfseparate.
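For reference, a hand-written integer interval-set of the kind mentioned above could look roughly like the following. This is only a minimal sketch and is not taken from the attached patch: the class name `IntervalSet` and its members `add`, `contains`, and `intervals` are hypothetical, and it assumes closed intervals of small positive page numbers, so integer overflow at the extremes is not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not from the attached patch): a set of integers
// stored as sorted, non-overlapping, non-adjacent closed intervals.
class IntervalSet {
public:
    // Add the closed interval [first, last], merging it with any stored
    // interval that overlaps it or touches it directly (e.g. 1-3 and 4-6).
    void add(int first, int last) {
        assert(first <= last);
        std::vector<std::pair<int, int>> merged;
        bool placed = false;
        for (const auto &iv : intervals_) {
            if (iv.second < first - 1) {
                // Stored interval ends before the new one, with a gap.
                merged.push_back(iv);
            } else if (iv.first > last + 1) {
                // Stored interval starts after the new one, with a gap.
                if (!placed) {
                    merged.emplace_back(first, last);
                    placed = true;
                }
                merged.push_back(iv);
            } else {
                // Overlapping or adjacent: absorb it into the new interval.
                first = std::min(first, iv.first);
                last = std::max(last, iv.second);
            }
        }
        if (!placed) {
            merged.emplace_back(first, last);
        }
        intervals_ = std::move(merged);
    }

    // True if `value` lies inside any stored interval.
    bool contains(int value) const {
        for (const auto &iv : intervals_) {
            if (value >= iv.first && value <= iv.second) {
                return true;
            }
        }
        return false;
    }

    const std::vector<std::pair<int, int>> &intervals() const {
        return intervals_;
    }

private:
    std::vector<std::pair<int, int>> intervals_;
};
```

A page loop could then skip any page for which `contains(page)` is false. A linear scan is used here for brevity; a page-range list is normally short enough that this should not matter.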
Comment 1 Albert Astals Cid 2018-07-29 18:01:52 UTC
Please don't use boost.
Comment 2 ulatekh 2018-08-04 18:50:24 UTC
That's too bad. I was hoping boost was considered one of the "standard components of modern Unix desktop environments", as mentioned in the README file.

In any case, the new patch has a custom-written interval-set, and uses strtok_r() instead of boost::split.
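To illustrate the strtok_r() approach, here is a rough sketch of splitting a page-range argument without Boost. It is not the code from the attachment: the function name `parsePageRanges` is hypothetical, and the comma-and-dash syntax ("1-3,7,10-12") is an assumption, since the exact format accepted by "-pg" is not spelled out in this report.

```cpp
#include <string.h>   // strtok_r (POSIX, not ISO C++)
#include <climits>    // INT_MAX
#include <cstdlib>    // std::strtol
#include <string>
#include <utility>
#include <vector>

// Hypothetical parser (not from the attached patch). Assumed syntax:
// comma-separated items, each either "N" or "N-M" with 1 <= N <= M.
// Returns false on malformed or empty input; on success `out` holds
// one (first, last) pair per item, in the order given.
bool parsePageRanges(const std::string &spec,
                     std::vector<std::pair<int, int>> &out) {
    out.clear();
    std::string buf = spec;  // strtok_r modifies its input, so use a copy
    char *save = nullptr;
    // Note: strtok_r silently skips empty tokens, so "1,,2" parses as "1,2".
    for (char *tok = strtok_r(&buf[0], ",", &save); tok != nullptr;
         tok = strtok_r(nullptr, ",", &save)) {
        char *end = nullptr;
        long first = std::strtol(tok, &end, 10);
        if (end == tok) {
            return false;  // no digits at the start of the item
        }
        long last = first;
        if (*end == '-') {
            const char *rest = end + 1;
            last = std::strtol(rest, &end, 10);
            if (end == rest) {
                return false;  // nothing usable after the dash
            }
        }
        // Reject trailing junk, non-positive pages, and reversed ranges.
        if (*end != '\0' || first < 1 || last < first || last > INT_MAX) {
            return false;
        }
        out.emplace_back(static_cast<int>(first), static_cast<int>(last));
    }
    return !out.empty();
}
```

The resulting pairs could then be fed into whatever interval-set the patch uses, which would take care of merging overlapping or out-of-order ranges.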
Comment 3 ulatekh 2018-08-04 18:50:46 UTC
Created attachment 140966
Patch to add functionality
Comment 4 GitLab Migration User 2018-08-21 11:19:25 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/621.
