Bug 103587 - StructTreeRoot K value may be dictionary
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium trivial
Assignee: poppler-bugs
Reported: 2017-11-06 01:03 UTC by novalazy+freedesktop
Modified: 2018-01-04 05:15 UTC


Attachments
test case (12.53 KB, application/pdf)
2017-11-06 04:45 UTC, novalazy+freedesktop

Description novalazy+freedesktop 2017-11-06 01:03:09 UTC
StructTreeRoot::parse reports the error "K has a child of wrong type for a tagged PDF" if the StructTreeRoot K entry holds a dictionary object instead of an array of dictionaries. However, Table 322 of PDF 32000-1 states that the K entry may hold a dictionary representing a single structure element.
Comment 1 Albert Astals Cid 2017-11-06 04:36:56 UTC
Please attach a pdf to reproduce this.
Comment 2 novalazy+freedesktop 2017-11-06 04:45:11 UTC
Created attachment 135252
test case
Comment 3 Adrian Johnson 2017-11-06 07:27:28 UTC
(In reply to novalazy+freedesktop from comment #0)
> However, Table 322 of PDF 32000-1
> states that the K entry may hold a dictionary representing a single structure
> element.

That is for the general Logical Structure. Tagged PDF imposes additional restrictions. In section 14.8.4.2:

"In a tagged PDF document, the structure tree shall contain a single top-level element; that is, the structure tree root (identified by the StructTreeRoot entry in the document catalogue) shall have only one child in its K (kids) array."
Comment 4 novalazy+freedesktop 2017-11-06 08:07:33 UTC
Thanks for the reference. I wonder whether that is just bad wording or the true intention of the spec writers. Many (possibly most) of the tagged PDFs I have come across have this problem, including those in the PDF/UA-1 reference suite.

You can close this bug.
Comment 5 Adrian Johnson 2017-11-06 08:15:10 UTC
I'm not opposed to removing the check. Even if it is against the spec, a lone dictionary is effectively the same as a single-element array containing a dict.
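
To illustrate the equivalence, here is a minimal C++17 sketch of treating a lone dictionary in K like a one-element array. It is not poppler's actual code: `Dict`, `Array`, `Null`, `KValue` and `rootChildren` are hypothetical stand-ins invented for this example, and real PDF objects carry far more than a type name.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for PDF objects; these are NOT poppler's types.
struct Dict {
    std::string type; // structure type of the element, e.g. "Document"
};
using Array = std::vector<Dict>;
struct Null {}; // models a missing K entry
using KValue = std::variant<Null, Dict, Array>;

// Return the children of the structure tree root, whether K holds a
// single dictionary or an array of dictionaries.
std::vector<Dict> rootChildren(const KValue &k)
{
    if (const Dict *d = std::get_if<Dict>(&k)) {
        // Lone dictionary: handle it like a one-element array.
        return std::vector<Dict>{ *d };
    }
    if (const Array *a = std::get_if<Array>(&k)) {
        return *a;
    }
    return {}; // K is absent: no children
}
```

With this normalization, a caller that wants to enforce the tagged PDF rule from section 14.8.4.2 could check that the returned vector has exactly one element, regardless of which form the file used.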
Comment 6 Albert Astals Cid 2018-01-03 22:59:20 UTC
Adrian, did you include that fix/workaround in the patches for bug 103912?

Would it make sense to do that?
Comment 7 Adrian Johnson 2018-01-04 05:15:20 UTC
I've pushed out the fix as it is a trivial change.

