Summary: | pdfunite: embedded files discarded during merge | ||
---|---|---|---|
Product: | poppler | Reporter: | Vovodroid <qa23d-vvd> |
Component: | utils | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Test files
insert embedded files in result pdf insert embedded files in result pdf two embedded files insert embedded files in result pdf insert embedded files in result pdf insert embedded files in result pdf |
Created attachment 115160 [details] [review] insert embedded files in result pdf This patch solves the issue with pdfunite Created attachment 115161 [details] [review] insert embedded files in result pdf This patch solves the issue with pdfseparate Created attachment 115163 [details]
two embedded files
I testes patch for pdfunite, it better now, thanks!
But attachment in first file is preserved, attachments in next files are discarded.
I tested patch for pdfunite, it better now, thanks! But only attachment in first file is preserved, attachments in second file is discarded. Created attachment 115174 [details] [review] insert embedded files in result pdf This patch merges the name trees in the catalog dicts correctly, so all embedded files remains in the result pdf created by pdfunite. One hint to pdfseparate: since pdfseparate is not able to detect which embedded files are used by the extracted page, all embedded files are inserted in every separated pdf. Great, thanks Thomas! Can this fix be committed to git? Comment on attachment 115174 [details] [review] insert embedded files in result pdf Review of attachment 115174 [details] [review]: ----------------------------------------------------------------- ::: utils/pdfunite.cc @@ +42,5 @@ > + Object srcNameArray; > + mergeNameTree->lookup("Names", &mergeNameArray); > + srcNameTree->lookup("Names", &srcNameArray); > + if (mergeNameArray.isArray() && srcNameArray.isArray()) { > + Object *newNameArray = new Object(); You are leaking newNameArray in a couple places. srcNameTree->set/add will make a shallow copy so you shouldn't call newNameArray->free, but you still need to delete the Object. Or use a stack variable. @@ +61,5 @@ > + if (mkey.getString()->cmp(key.getString()) < 0) { > + Object *newKey = new Object(); > + newKey->initString(new GooString(mkey.getString()->getCString())); > + newNameArray->arrayAdd(newKey); > + Object *newValue = new Object(); same, newKey and newValue are being leaked in a few places. Object::arrayAdd makes a shallow copy. @@ +249,5 @@ > } > + catDict->lookup("Names", &names); > + if (!names.isNull() && names.isDict()) { > + docs[0]->markPageObjects(names.getDict(), yRef, countRef, 0, refPage->num, refPage->num); > + } You are processing the first document twice, first here and again below. (In reply to Jason Crain from comment #7) Thanks for reviewing. I hadn't done a memory leak check, sorry, normally I do that. The next patch will remove the memory leaks. BUT: > @@ +249,5 @@ > > } > > + catDict->lookup("Names", &names); > > + if (!names.isNull() && names.isDict()) { > > + docs[0]->markPageObjects(names.getDict(), yRef, countRef, 0, refPage->num, refPage->num); > > + } > > You are processing the first document twice, first here and again below. That's not true. The problem is probably the name "markPageObjects". But the first parameter is a dictionary, so probably a better name for this method should be markDictObjects. So here it marks every object referenced by the "Names"-dict in the catalog dict, and this is the only place where markPageObjects is called with it (in doMergeNameTree it will be called for the "Names"-dict of the othere pages!). The name for the method "markPageObjects" has historical reasons. When I developped it first it was just called with the page dictionary! Created attachment 115244 [details] [review] insert embedded files in result pdf This patch removes the memory leaks by my first patch. In additions it closes also two other small leaks in pdfunite. Tested with valgrind This patch works well, so any plans to merge patch to trunk? Thanks. Thomas, don't GooString *newKey = new GooString("Names"); and GooString *newKey = new GooString(key); leak too? Also is "This patch solves the issue with pdfseparate" unneeded? (In reply to Albert Astals Cid from comment #11) > Thomas, don't > GooString *newKey = new GooString("Names"); > and > GooString *newKey = new GooString(key); > > leak too? Sorry for the delay. I just looked into it again, and yes, in case of keys in a Dict the choice of "new GooString" is definitely wrong, it could not cause only a memory leak, in case of the static version of GooString it could also cause memory corruption. I have to use copyString in this cases. I wonder a little bit why valgrind didn't show me that problem. I will upload a new patch in a few minutes. (In reply to Albert Astals Cid from comment #12) > Also is "This patch solves the issue with pdfseparate" unneeded? Yes. At the beginning I uploaded two patches. These were combined in one patch after the first correction. Created attachment 117257 [details] [review] insert embedded files in result pdf Use copyString where necessary Pushed |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
Created attachment 115149 [details] Test files When merging pdf file with embedded attachment it not present in resulted pdf (I use Acrobat Reader to see it). Two files attached in zip: attach.pdf with embedded file, and merged without