Created attachment 68945 [details] Sample PDF document illustrating the problem Section 3.4.7 of the PDF 1.5 Reference doesn't specify any limit on the size of entries in a cross-reference stream. So the 'W' entry in a cross-reference stream dictionary should be allowed to have a value of e.g. [1 8 2]. When trying to display the attached document, evince throws the following error: Error: PDF file is damaged - attempting to reconstruct xref table... Error: Couldn't find trailer dictionary Error: Couldn't read xref table Ghostscript is able, however, to convert the document to PostScript (or back to PDF, for that matter), and Adobe Reader displays it fine. There are the following lines of code, starting from line 654 in poppler/XRef.cc from Poppler 0.20.5 (the latest stable version downloaded from the Poppler website): if (w[i] < 0 || w[i] > 4) { goto err1; } The w[i] > 4 seems unwarranted and there should be no upper limit (or, at least, a limit of 8).
I think, You're true, even if You refer to an outdated specification. But also the actual specification (ISO 32000, PDF version 1.7) doesn't say anything else in chapter 7.5.8.2. Are You able to create a patch?
Created attachment 69219 [details] [review] allow a limit of 8 for 'W' entries Okay, seems as if You're not able or willing, so here a patch to limit the 'W' entries to 8.
(In reply to comment #2) > Created attachment 69219 [details] [review] [review] > allow a limit of 8 for 'W' entries > > Okay, seems as if You're not able or willing, so here a patch to limit the > 'W' entries to 8. Thanks for your patch. I thought more work would be needed to store the offset in a 64bit integer, which would be way beyond my non-existent capabilities of C++ developer. Can a Guint store 8-byte integers? Lacking of a C/C++ dev environment, I can't test this patch. Vincent
(In reply to comment #3) > (In reply to comment #2) > > Created attachment 69219 [details] [review] [review] [review] > > allow a limit of 8 for 'W' entries > > > > Okay, seems as if You're not able or willing, so here a patch to limit the > > 'W' entries to 8. > > Thanks for your patch. I thought more work would be needed to store the > offset in a 64bit integer, which would be way beyond my non-existent > capabilities of C++ developer. Can a Guint store 8-byte integers? It depends on the machine: in a 32 bit environment Guint has only 32 bit, so the upper 32 bit will be moved out during reading the xref stream section. So in a 32 bit enviroment a PDF with size greater than 4 GByte wouldn't be properly read :-). In a 64 bit environment Guint has the necessary 64 bit, but even there: has anyone a PDF with more than 4 GByte? Than we should implement the LARGEFILE_ACCESS :-) Cheers, Thomas > > Lacking of a C/C++ dev environment, I can't test this patch. > > Vincent
Thomas are you sure Guint is 64 bits? That's not true on my my Linux 64 bits box, is that a windows thingie? Anyway yes, we need support for big files, there's already one bug for that #44085
(In reply to comment #5) > Thomas are you sure Guint is 64 bits? That's not true on my my Linux 64 bits > box, is that a windows thingie? Oh, oh, of course You're true: int has 32 bits in 32 bit AND 64 bis environments (with the compiler and architecture I know). I mixed that already up (gaving int the wrong size) when had to change 16-bit-libraries so that they could run in a 32-bit environment. Long long ago, but really a déjà vu. So what to do with this PDF, where the high 32 bits are zero? Should we test for overflow with an error message "Large files not supported" but accept it when the high 32 bit are zero? Wait that somebody implements large file support? Or just commit the patch knowing that we run into problems with really large files?
* test for overflow with an error message "Large files not supported" but accept it when the high 32 bit are zero? This seems the easy short-term solution for at least being able to open this file, do you want to give it a try?
Created attachment 73668 [details] [review] Read 8 byte offsets This patch fixes the problem by reading 8 byte offsets and printing an error if the offset is too large for Goffset.
Makes sense, i hope this situation is not too common though. Do you know which systems wouldn't have a 64 bits goffset?
(In reply to comment #9) > Makes sense, i hope this situation is not too common though. Do you know > which systems wouldn't have a 64 bits goffset? I expect all 64-bit systems would support large files. On 32-bit systems configure checks for large file support unless --disable-largefile is specified. I don't know which systems don't support large files. There may be 32-bit systems (eg embedded) that are unlikely to ever see large pdfs where it makes sense to disable large file support to save a few bytes of memory and cycles of cpu time.
Ok, as said makes sense, so feel free to commit (i guess the important part of this bug was actually fixed by your large file support)
Pushed.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.