Hi all,

I use pdftohtml 0.37.0 on Ubuntu.

When I run `pdftohtml -xml -fmt png`, some images are extracted as .jpg (all with inverted colors) and some as .png (all with normal colors).

When I run `pdfimages -all test.pdf test`, I get the same result (inverted .jpg and normal .png).

But when I run `pdfimages -png test.pdf test`, I get only .png images, and all of them have normal colors.

Questions:

1. Is it possible to convert a PDF to HTML/XML with pdftohtml and have all images exported as .png? Or at least to get non-inverted .jpg images? Right now I have to run two different commands on the same PDF page to get a correct result. It seems that the `-fmt` option doesn't work.

2. With `pdfimages -all test.pdf test`, the first image is extracted as .jpg and the second as .png. Does that mean the first image is actually stored in JPG format in the PDF, and likewise the second one as PNG?

3. Is it OK for an image exported via `pdftohtml -xml` to have one resolution (width and height) as a file but a different one in the generated XML? For example, the file has width=145, height=145, but in the XML it has width=105, height=105.

PS: I can attach the PDF file if needed.

Thanks in advance,
I've stumbled upon the same bug with a recent version (September 2016) compiled from git master (poppler 0.47).

    pdfimages -j test.pdf out

produces an inverted grayscale JPEG.

    pdfimages -png test.pdf out

produces a normal grayscale image.

    pdfimages -list test.pdf

    page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
    --------------------------------------------------------------------------------------------
       1     0 image    2817  1981  sep     1   8  jpeg   no        16  0   343   170  112K 2.1%

I've tested several PDFs (sorry, I can't share them) and found that only JPEGs with the colorspace 'sep' (csSeparation) produce an inverted grayscale image.
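To check whether a given PDF is likely to hit this, one option is to scan the `pdfimages -list` output for rows whose `color` column is `sep` and whose `enc` column is `jpeg`. The sketch below is only an illustration of that check: the function names `find_suspect_images` and `run_pdfimages_list` are hypothetical, the column order is assumed to match the listing shown above, and the "sep + jpeg means inverted" rule is the observation from this comment, not a confirmed diagnosis.

```python
import subprocess

# Sample listing copied from the report above (column order is assumed stable).
SAMPLE = """\
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    2817  1981  sep     1   8  jpeg   no        16  0   343   170  112K 2.1%
"""


def find_suspect_images(listing):
    """Return the rows whose color is 'sep' and whose encoding is 'jpeg'."""
    suspects = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with a page number; the header and dashed rule do not.
        if len(fields) < 9 or not fields[0].isdigit():
            continue
        page, num, _kind, width, height, color, _comp, _bpc, enc = fields[:9]
        if color == "sep" and enc == "jpeg":
            suspects.append({
                "page": int(page),
                "num": int(num),
                "width": int(width),
                "height": int(height),
            })
    return suspects


def run_pdfimages_list(pdf_path):
    """Hypothetical helper: run `pdfimages -list` and return its text output."""
    result = subprocess.run(
        ["pdfimages", "-list", pdf_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `find_suspect_images(run_pdfimages_list("test.pdf"))` would list the images to re-extract with `pdfimages -png`, which, per the reports in this thread, yields normal colors.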
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/151.