Bug 62735 - Should not try to fopen a file we already have open
Summary: Should not try to fopen a file we already have open
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: poppler-bugs
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-25 19:05 UTC by Albert Astals Cid
Modified: 2013-04-04 22:00 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
prototype patch to clarify intended solution (3.20 KB, patch)
2013-03-25 20:33 UTC, Adam Reichold
Details | Splinter Review
prototype patch to clarify intended solution (5.03 KB, patch)
2013-03-25 21:04 UTC, Adam Reichold
Details | Splinter Review
patch to keep using old-style I/O on Win32 (3.58 KB, patch)
2013-03-25 21:42 UTC, Adam Reichold
Details | Splinter Review
add a GooFile class to encapsulate file read access using offsets (11.17 KB, patch)
2013-03-26 15:58 UTC, Adam Reichold
Details | Splinter Review
add a GooFile class to encapsulate file read access using offsets (11.58 KB, patch)
2013-03-27 19:30 UTC, Adam Reichold
Details | Splinter Review
add a GooFile class to encapsulate file read access using offsets (13.30 KB, patch)
2013-04-01 19:52 UTC, Adam Reichold
Details | Splinter Review

Description Albert Astals Cid 2013-03-25 19:05:12 UTC
Because the file may have disappeared.

See http://lists.freedesktop.org/archives/poppler/2013-March/010115.html
Seems the conclusion is to use pread
Comment 1 Adam Reichold 2013-03-25 20:33:07 UTC
Created attachment 77016 [details] [review]
prototype patch to clarify intended solution

This is a minimal changeset to help clarify the intended solution using "pread". The interface of FileStream is currently unchanged, using "fileno" internally, with the FILE objects internal structure being overhead.
Comment 2 Albert Astals Cid 2013-03-25 21:04:18 UTC
Note, Windows doesn't have pread, so someone that cares about windows probably needs to write a windows version (or we can keep the old code around for windows, it should work since afaik you can't delete files in use in windows)
Comment 3 Adam Reichold 2013-03-25 21:04:35 UTC
Created attachment 77019 [details] [review]
prototype patch to clarify intended solution

A cleaner changeset discarding the FILE object and file name earlier, using only the file descriptor in the FileStream class.
Comment 4 Adam Reichold 2013-03-25 21:19:56 UTC
(In reply to comment #2)
> Note, Windows doesn't have pread, so someone that cares about windows
> probably needs to write a windows version (or we can keep the old code
> around for windows, it should work since afaik you can't delete files in use
> in windows)

I see. It may be worth it to replace the call to "pread" using Windows' ReadFile with "overlapped I/O" since it would allow us to remove the UniqueFileStream class. But we would have to carry around an OVERLAPPED structure and use CreateFile to open the file, so it is probably easier to keep both ways of doing things and just #ifdef accordingly...

In any case, I have to admit that I do not have a Windows machine around to test any of this...
Comment 5 Adam Reichold 2013-03-25 21:42:51 UTC
Created attachment 77021 [details] [review]
patch to keep using old-style I/O on Win32

This slightly convoluted patch uses #ifdefs to keep the old-style I/O, and UniqueFileStream as an implementation detail, around for Win32. *Not tested on an actual Win32 host.*

(This is so ugly... Couldn't Poppler depend on Cygwin or some other form of POSIX layer... :-()
Comment 6 Albert Astals Cid 2013-03-25 21:55:24 UTC
Please give me a shout when you think stuff needs to be reviewed (you've been posting patches so fast that not sure when/if you've consider it finished)
Comment 7 Adam Reichold 2013-03-26 05:26:17 UTC
(In reply to comment #6)
> Please give me a shout when you think stuff needs to be reviewed (you've
> been posting patches so fast that not sure when/if you've consider it
> finished)

Sorry for not being clear enough: I consider this by no means finished. The patches are rather for having something concrete to discuss with Thomas. I'll tell when we settled on something...
Comment 8 Adrian Johnson 2013-03-26 08:56:00 UTC
(In reply to comment #3)
> Created attachment 77019 [details] [review] [review]
> prototype patch to clarify intended solution
> 
> A cleaner changeset discarding the FILE object and file name earlier, using
> only the file descriptor in the FileStream class.

You will need to ensure the large file support is not broken. It looks like the use of off_t in pread and lseek will require a similar solution as used in 48ed05d9 (bug 60095).

There is a large file test case at bug 44085 comment 18.
Comment 9 Adrian Johnson 2013-03-26 09:09:54 UTC
(In reply to comment #5)
> Created attachment 77021 [details] [review] [review]
> patch to keep using old-style I/O on Win32
> 
> This slightly convoluted patch uses #ifdefs to keep the old-style I/O, and
> UniqueFileStream as an implementation detail, around for Win32. *Not tested
> on an actual Win32 host.*
> 
> (This is so ugly... Couldn't Poppler depend on Cygwin or some other form of
> POSIX layer... :-()

It would be nice to have a solution that didn't require all the #ifdefs. I know this is what we have done in the past but it would be better if the #ifdefs were confined to the header files and win32 specific code is put in a separate file. This also avoids polluting the namespace with all the junk in windows.h.
Comment 10 Adam Reichold 2013-03-26 15:58:54 UTC
Created attachment 77064 [details] [review]
add a GooFile class to encapsulate file read access using offsets

Hello Adrian,

To address your first concern and to begin addressing your second remark, I would like to propose to add a GooFile class that encapsulates file read access with explicit offsets. What do you think?

There two problems for me now:
1. Could someone tell me where to touch the build system to get the HAVE_FSEEK64 and HAVE_PREAD64 definitions?
2. The _WIN32 part is gobbled together using MSDN examples. Could someone having access to a Win32 machine have a look at it since I can't even compile this.

Thanks!
Comment 11 Adam Reichold 2013-03-27 19:30:10 UTC
Created attachment 77122 [details] [review]
add a GooFile class to encapsulate file read access using offsets

Thomas was so kind to test and correct the patch on a Win32 host and fix a missing PDFDoc constructor, c.f. the updated patch which I would now consider fit for regression testing.
Comment 12 Adam Reichold 2013-03-28 16:40:53 UTC
(In reply to comment #11)
> Created attachment 77122 [details] [review] [review]
> add a GooFile class to encapsulate file read access using offsets
> 
> Thomas was so kind to test and correct the patch on a Win32 host and fix a
> missing PDFDoc constructor, c.f. the updated patch which I would now
> consider fit for regression testing.

I put the patch through Albert's test suite today and at least the POSIX part seems fine. That leaves Windows and the build system questions...
Comment 13 Adam Reichold 2013-04-01 19:52:00 UTC
Created attachment 77289 [details] [review]
add a GooFile class to encapsulate file read access using offsets

Hopefully, I have made the necessary additions for autotools and CMake to generate the 'pread64' and 'lseek64' definitions. I also removed the - as far as I can see - by now superfluous calls to 'stat' where file streams are constructed since we can just use 'GooFile::size'.

@Albert: I will run regtest on this version overnight (looking good so far), but I think this is basically ready for review.

@Thomas: I used Poppler's default regtest which AFAIK is non-threaded since the default pdftoppm is. I did do a bit of smoke testing using qpdfview and multi-threading.
Comment 14 Thomas Freitag 2013-04-02 07:38:21 UTC
(In reply to comment #13)
> Created attachment 77289 [details] [review] [review]
> add a GooFile class to encapsulate file read access using offsets
> 
> Hopefully, I have made the necessary additions for autotools and CMake to
> generate the 'pread64' and 'lseek64' definitions. I also removed the - as
> far as I can see - by now superfluous calls to 'stat' where file streams are
> constructed since we can just use 'GooFile::size'.
> 
> @Albert: I will run regtest on this version overnight (looking good so far),
> but I think this is basically ready for review.
> 
> @Thomas: I used Poppler's default regtest which AFAIK is non-threaded since
> the default pdftoppm is. I did do a bit of smoke testing using qpdfview and
> multi-threading.

I just ran Your patch through my regression test with the multi-threaded version of pdftoppm, and I just got the same failed tests than in https://bugs.freedesktop.org/show_bug.cgi?id=50992#c124 (beside the failed test caused by tiling patterns, because we solved that in the meantime).
This also helps on systems where a virus scanner is installed: because we open the PDF now only once, the virus scanner scans it only one time.
Also it seems to be a little bit faster, but this could just be a subjective feeling.
I also compiled it under Windows and made a few tests without any problems.

Congratulation!!!!
Comment 15 Albert Astals Cid 2013-04-04 22:00:37 UTC
Good work :-)


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.