Created attachment 77269 [details] [review] endstream detection and scanSpecialFlags correction 1. During porting poppler to Java I made a mistake in the "<objnum> 0 obj<length>" pattern detection so that it fails. Therefore it ran into the endstream search, and at least with bug-poppler16579.pdf this doesn't work correctly: the shift(-1) with the used token mechanism in Lexer isn't correct for a binary data stream. If there is i.e. a "(" without corresponding ")" in the binary data, which of course can happen and happens in that pdf, shift(-1) skips the searched endstream and can therefore in worst case reach the end-of-file. Therefore I implemented a shift("endstream") in Java, which I now port back to C++, or in other words "There and Back Again" :-) You can test it with bug-poppler16579.pdf if You just change temporary if (longNumber <= INT_MAX && longNumber >= INT_MIN && *end_ptr == '\0') { in XRef.cc to if (gFalse && longNumber <= INT_MAX && longNumber >= INT_MIN && *end_ptr == '\0') { 2. The small change in XRef.cc was another point I detected during the Java port: if You save a PDF with defect xref offsets, the readXRefUntil(-1 /* read all xref sections */, &xrefStreamObjNums) in XRef::scanSpecialFlags() will destroy the already reconstructed entries table, but this means that any modification which the user did in the meantime get lost. This can be tested i.e. with bug168518.pdf. The attached patch solves this two issues.
Not sure i understand what to test for point 2, saving or opening?
(In reply to comment #1) > Not sure i understand what to test for point 2, saving or opening? Saving: Open the PDF in okular and modify it. All changes should be lost if You save it then (if okular doesn't save the PDF automatically after opening because it detects that the PDF is broken).
So i see this as two separate patches, the first one that improves the shifting seems to me like an extra layer of security and would be enough in master while the second fixes saving of some files and should go to 0.22 too. Agreed?
(In reply to comment #3) > So i see this as two separate patches, the first one that improves the > shifting seems to me like an extra layer of security and would be enough in > master while the second fixes saving of some files and should go to 0.22 too. > > Agreed? Even if I don't think that shift(-1) delivers correct results in a binary stream and so it's not only an improvement, I agree to put it not into 0.22 because we don't have any bug reports which this would fix. So it's okay for me.
Pushed
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.