Bug 29329 - Binary I/O on Windows doesn't work when using redirection
Summary: Binary I/O on Windows doesn't work when using redirection
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: All Windows (All)
: medium normal
Assignee: poppler-bugs
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-30 19:01 UTC by Jonathan Liu
Modified: 2010-09-09 12:34 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
Fix binary I/O on Windows when using redirection (5.38 KB, patch)
2010-07-30 19:01 UTC, Jonathan Liu
Details | Splinter Review
Fix binary I/O on Windows when using redirection (1.74 KB, patch)
2010-07-31 04:59 UTC, Jonathan Liu
Details | Splinter Review
Fix checking whether _WIN32 is defined (1.04 KB, patch)
2010-09-08 04:20 UTC, Jonathan Liu
Details | Splinter Review

Description Jonathan Liu 2010-07-30 19:01:54 UTC
Created attachment 37478 [details] [review]
Fix binary I/O on Windows when using redirection

If you use the poppler utilities (pdffonts, pdfimages, pdfinfo, pdftoabw, pdftohtml, pdftoppm, pdftops, pdftotext) on Windows feed a PDF binary as input via stdin, the program reads it in text mode instead of binary mode. Similarly if you output a binary to stdout and redirect to a file, it will write it in text mode. This results in the binary data being corrupted and it is a result of Windows using text mode by default for stdin, stdout, stderr and file streams.

The attached patch resolves this by changing the default mode to binary for these utilities.
Comment 1 Adrian Johnson 2010-07-31 01:37:05 UTC
The problem with this patch is it changes all of stdin, stdout, and stderr to binary without checking if binary mode is actually required. For example all the utilities use stderr to output text error messages. pdfinfo and pdffonts use stdout to output text. Where text is output the mode should not be set to binary.

Instead of this patch I would suggest putting the call to _setmode() at the point where we know stdin will be used for reading a PDF. ie put a 

  #if _WIN32
    _setmode(_fileno(stdin), _O_BINARY);
  #endif

in either StdinCachedFile.cc or StdinPDFDocBuilder.cc.

The other place I can see where binary mode is not currently set on win32 is when pdftoppm writes images to stdout. The call to _setmode() could be done in writeImgFile().

The "_fmode = _O_BINARY" line is not necessary as it has no effect on files already open.

See also the use of setmode in TextOutputDev.cc where the file mode is set to binary at the point where we know we need binary output.
Comment 2 Jonathan Liu 2010-07-31 04:59:08 UTC
Created attachment 37483 [details] [review]
Fix binary I/O on Windows when using redirection

Revised patch based on feedback from Adrian Johnson.
Comment 3 Albert Astals Cid 2010-08-10 11:44:31 UTC
I've commited the fix for poppler 0.14.3
Comment 4 Jonathan Liu 2010-09-08 04:20:02 UTC
#if _WIN32 should be changed to #ifdef _WIN32 for:
 * poppler/StdinCachedFile.cc
 * utils/pdftoppm.cc
Comment 5 Jonathan Liu 2010-09-08 04:20:34 UTC
Created attachment 38547 [details] [review]
Fix checking whether _WIN32 is defined
Comment 6 Albert Astals Cid 2010-09-09 12:34:44 UTC
Commited.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.