StructTreeRoot::parse reports the error "K has a child of wrong type for a tagged PDF" if the StructTreeRoot K entry holds a dictionary object instead of an array of dictionaries. However, Table 322 of PDF 32000-1 states that the K entry may hold a dictionary representing a single structure element.
Please attach a PDF to reproduce this.
Created attachment 135252: test case
(In reply to novalazy+freedesktop from comment #0)
> However, Table 322 of PDF 32000-1
> states that the K entry hold a dictionary representing a single structure
> element.

That is for the general Logical Structure. Tagged PDF imposes additional restrictions. In section 14.8.4.2: "In a tagged PDF document, the structure tree shall contain a single top-level element; that is, the structure tree root (identified by the StructTreeRoot entry in the document catalogue) shall have only one child in its K (kids) array."
Thanks for the reference. I wonder whether that is just bad wording or the true intention of the spec writers. Many (possibly most) of the tagged PDFs I have come across have this problem, including those in the PDF/UA-1 reference suite. You can close this bug.
I'm not opposed to removing the check. Even if it is against the spec, a lone dictionary is effectively the same as a single-element array containing a dict.
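To illustrate the equivalence, here is a minimal sketch of how a reader could normalize the K entry before checking it. This is not poppler's code: plain Python dicts and lists stand in for PDF dictionary and array objects, and the names normalize_root_kids and has_single_top_level_element are hypothetical.

```python
def normalize_root_kids(k):
    """Return the StructTreeRoot K entry as a list of children.

    Plain Python values stand in for PDF objects here (an assumption):
    a dict models a PDF dictionary, a list models a PDF array.
    """
    if k is None:
        # No K entry at all: the structure tree root has no children.
        return []
    if isinstance(k, dict):
        # A lone dictionary is treated like a one-element array.
        return [k]
    if isinstance(k, list):
        return list(k)
    raise TypeError("K must be a dictionary or an array of dictionaries")


def has_single_top_level_element(k):
    """Check the tagged PDF rule of exactly one top-level element."""
    kids = normalize_root_kids(k)
    return len(kids) == 1 and isinstance(kids[0], dict)


document = {"S": "Document"}
print(has_single_top_level_element(document))    # dictionary form
print(has_single_top_level_element([document]))  # array form
```

With this normalization both forms pass the single-child check, while an array with two or more children still fails it.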
Adrian, did you include that fix/workaround in the patches for bug 103912? Would it make sense to do that?
I've pushed the fix, as it is a trivial change.