Bug 29329

Summary: Binary I/O on Windows doesn't work when using redirection
Product: poppler Reporter: Jonathan Liu <net147>
Component: generalAssignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: All   
OS: Windows (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Fix binary I/O on Windows when using redirection
Fix binary I/O on Windows when using redirection
Fix checking whether _WIN32 is defined

Description Jonathan Liu 2010-07-30 19:01:54 UTC
Created attachment 37478 [details] [review]
Fix binary I/O on Windows when using redirection

If you use the poppler utilities (pdffonts, pdfimages, pdfinfo, pdftoabw, pdftohtml, pdftoppm, pdftops, pdftotext) on Windows feed a PDF binary as input via stdin, the program reads it in text mode instead of binary mode. Similarly if you output a binary to stdout and redirect to a file, it will write it in text mode. This results in the binary data being corrupted and it is a result of Windows using text mode by default for stdin, stdout, stderr and file streams.

The attached patch resolves this by changing the default mode to binary for these utilities.
Comment 1 Adrian Johnson 2010-07-31 01:37:05 UTC
The problem with this patch is it changes all of stdin, stdout, and stderr to binary without checking if binary mode is actually required. For example all the utilities use stderr to output text error messages. pdfinfo and pdffonts use stdout to output text. Where text is output the mode should not be set to binary.

Instead of this patch I would suggest putting the call to _setmode() at the point where we know stdin will be used for reading a PDF. ie put a 

  #if _WIN32
    _setmode(_fileno(stdin), _O_BINARY);
  #endif

in either StdinCachedFile.cc or StdinPDFDocBuilder.cc.

The other place I can see where binary mode is not currently set on win32 is when pdftoppm writes images to stdout. The call to _setmode() could be done in writeImgFile().

The "_fmode = _O_BINARY" line is not necessary as it has no effect on files already open.

See also the use of setmode in TextOutputDev.cc where the file mode is set to binary at the point where we know we need binary output.
Comment 2 Jonathan Liu 2010-07-31 04:59:08 UTC
Created attachment 37483 [details] [review]
Fix binary I/O on Windows when using redirection

Revised patch based on feedback from Adrian Johnson.
Comment 3 Albert Astals Cid 2010-08-10 11:44:31 UTC
I've commited the fix for poppler 0.14.3
Comment 4 Jonathan Liu 2010-09-08 04:20:02 UTC
#if _WIN32 should be changed to #ifdef _WIN32 for:
 * poppler/StdinCachedFile.cc
 * utils/pdftoppm.cc
Comment 5 Jonathan Liu 2010-09-08 04:20:34 UTC
Created attachment 38547 [details] [review]
Fix checking whether _WIN32 is defined
Comment 6 Albert Astals Cid 2010-09-09 12:34:44 UTC
Commited.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.