Created attachment 32644 [details] [review] patch From downstream bug report: "When I open a particular PDF document with evince, page link doesn't work. Page Number of Index is not correct. English Language is ok but Russian an Simplified Chinese are not ok. When I open it with Acrobat Reader, it works fine."
Please attach the pdf with the problem.
I'll attach one as soon as I can - the one from the downstream bug isn't publicly releasable afaik.
I confirm this bug and that the patch fixes it. Look at the example in http://www.dim.uchile.cl/~jaliste/named_dests.pdf if you look into the Named destinations Nametree, the tree keys should be ordered. Note that the names are not in plain ascii, but in unicode (it's russian text)... If you look at the DestNamedTree in Catalog, you will see that with the current cmp function, the tree it's not ordered, and therefore, the bsearch for the link fails and the link can't be activated. More precisely, the current comparison only works when working with proper ascii characters, for the names in the example this is not true. Hope that this makes some sense.
Ok, so i understand that our cmp function is not ordering stuff as whatever created that PDF and so the bsearch fails, but this patch as a fix seems totally fragile, what we should IMHO do is sort the entries field in NameTree::parse using GooString::cmp so that we are sure that the bsearch that uses GooString::cmp will work. Am I making sense? Any taker for the work?
Albert I thought about your proposal and actually I tried to do something along these lines, but since the PDF spec says that the NameTree IS ordered lexically, and acrobat is parsing the nametree right, then I concluded (i might be wrong) that the patch in this bug is the right thing to do because then things in the test case are ordered (assuming that the PDF really respect the spec)... Of course, I am concluding based on only one example... And also, we could "fix" broken PDFs by ordering the name tree as you suggest, but in my opinion we should still apply this patch.
I mean, the pdf spec says the comparison should be made byte-wise, and to my understanding, that is to compare using unsigned chars.
"acrobat is parsing the nametree right" -> acrobat may as well be doing a linear search (which would also work for us if we use it instead of a bsearch) and you would not know it. Honestly I don't see a reason why we compare all the GooStrings everywhere, just because in one place it may not be what we want it to do.
I know my conclusion is only a feeling. Btw, if the patch would just change the Entry::cmp function, this would probably be better. Anyway, already understood I won't convince you, so will try to cook a patch that sorts the array after constructing it.
Created attachment 89431 [details] [review] sort entries of NameTrees to ensure lookup works So this patch should be more robust than previous one.
Commited, thanks for the patch!
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.