Created attachment 37478 [details] [review] Fix binary I/O on Windows when using redirection If you use the poppler utilities (pdffonts, pdfimages, pdfinfo, pdftoabw, pdftohtml, pdftoppm, pdftops, pdftotext) on Windows feed a PDF binary as input via stdin, the program reads it in text mode instead of binary mode. Similarly if you output a binary to stdout and redirect to a file, it will write it in text mode. This results in the binary data being corrupted and it is a result of Windows using text mode by default for stdin, stdout, stderr and file streams. The attached patch resolves this by changing the default mode to binary for these utilities.
The problem with this patch is it changes all of stdin, stdout, and stderr to binary without checking if binary mode is actually required. For example all the utilities use stderr to output text error messages. pdfinfo and pdffonts use stdout to output text. Where text is output the mode should not be set to binary. Instead of this patch I would suggest putting the call to _setmode() at the point where we know stdin will be used for reading a PDF. ie put a #if _WIN32 _setmode(_fileno(stdin), _O_BINARY); #endif in either StdinCachedFile.cc or StdinPDFDocBuilder.cc. The other place I can see where binary mode is not currently set on win32 is when pdftoppm writes images to stdout. The call to _setmode() could be done in writeImgFile(). The "_fmode = _O_BINARY" line is not necessary as it has no effect on files already open. See also the use of setmode in TextOutputDev.cc where the file mode is set to binary at the point where we know we need binary output.
Created attachment 37483 [details] [review] Fix binary I/O on Windows when using redirection Revised patch based on feedback from Adrian Johnson.
I've commited the fix for poppler 0.14.3
#if _WIN32 should be changed to #ifdef _WIN32 for: * poppler/StdinCachedFile.cc * utils/pdftoppm.cc
Created attachment 38547 [details] [review] Fix checking whether _WIN32 is defined
Commited.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.