pdftoppm has a -cropbox command-line option that restricts output to the CropBox. It would be nice if pdftotext had the same -cropbox option, so that it extracts only the text inside the printed/displayed CropBox.
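Until such an option exists, a possible workaround is to pass the CropBox to pdftotext by hand through its existing crop-area options (-x, -y, -W, -H, with -r for the resolution). The following is only a rough sketch: the helper name `cropbox_to_pdftotext_args` is made up for illustration, the box values have to be obtained separately (for example from `pdfinfo -box`), and it assumes unrotated pages and that pdftotext measures the crop area in pixels from the top-left corner of the MediaBox. I have not verified that last assumption for every poppler version.

```python
import math


def cropbox_to_pdftotext_args(media_box, crop_box, dpi=72):
    """Translate a CropBox into pdftotext crop-area arguments (hypothetical helper).

    Both boxes are (x_min, y_min, x_max, y_max) in PDF points with the
    origin at the bottom-left, as written in the PDF file.
    Assumption: pdftotext's -x/-y are pixel offsets from the top-left
    corner of the MediaBox at the resolution given with -r.
    """
    mx0, my0, mx1, my1 = media_box
    cx0, cy0, cx1, cy1 = crop_box

    # The effective CropBox is its intersection with the MediaBox.
    cx0, cy0 = max(cx0, mx0), max(cy0, my0)
    cx1, cy1 = min(cx1, mx1), min(cy1, my1)

    scale = dpi / 72.0  # PDF points are 1/72 inch

    # Flip the y axis (PDF: bottom-left origin, pdftotext: top-left)
    # and round outwards so that no part of the CropBox is cut off.
    left = math.floor((cx0 - mx0) * scale)
    top = math.floor((my1 - cy1) * scale)
    right = math.ceil((cx1 - mx0) * scale)
    bottom = math.ceil((my1 - cy0) * scale)

    return [
        "-r", str(dpi),
        "-x", str(left),
        "-y", str(top),
        "-W", str(right - left),
        "-H", str(bottom - top),
    ]


# US Letter MediaBox with a CropBox inset by half an inch on every side.
args = cropbox_to_pdftotext_args((0, 0, 612, 792), (36, 36, 576, 756))
```

The resulting list could then be spliced into a call such as `pdftotext <args> -f 1 -l 1 input.pdf -`, one page at a time, since each page may have a different CropBox. This does not replace a real -cropbox option; it only approximates it.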
I have submitted a broader feature request as bug 45108. It covers this, but asks for the option in all of the utils, and for support for the other bounding boxes as well.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/481.