Bug 13679

Summary: Error-prone eexec stream in PDF
Product: cairo Reporter: Alex Cherepanov <alexcher>
Component: pdf backend Assignee: Kristian Høgsberg <krh>
Status: RESOLVED FIXED QA Contact: cairo-bugs mailing list <cairo-bugs>
Severity: minor    
Priority: medium CC: alexcher
Version: 1.4.10   
Hardware: All   
OS: All   
Attachments: foo.pdf - sample file with '\r' after eexec

Description Alex Cherepanov 2007-12-15 15:14:54 UTC
I have a PDF file generated by cairo 1.4.10 that contains a Type 1 font
with an eexec stream that is likely to cause problems for PDF consumers.
The file works in Acrobat and prompted a fix in Ghostscript, but it doesn't
work on Adobe Level 3 printers with a built-in PDF interpreter.

The cause of the problem is a difference in how the whitespace
characters '\t', '\r', '\n', ' ' are handled after the eexec operator.
Adobe PostScript interpreters always skip them, and bug-compatible clones
do the same. Adobe Acrobat doesn't skip whitespace characters.

The sample PDF file contains a '\r' character in the stream "5 0" that must be
interpreted as data. When the PDF is converted to PS without completely
rewriting the font stream, this character is skipped, resulting in an invalid font.

To generate robust PDF one has to modify the original stream (for instance,
prepend spaces) to avoid any of the skipped space characters in the 1st
position of the eexec-encoded stream. See seexec.c in current Ghostscript
sources for details.
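
The failure mode described above can be sketched in a few lines of Python.
This is an illustration only, not cairo or Ghostscript code: the function
names and the sample payload are made up. The key 55665 and the constants
52845 and 22719 are the eexec values from the Adobe Type 1 Font Format. The
lead byte 0xD4 is picked deliberately so that the first ciphertext byte comes
out as '\r'.

```python
EEXEC_KEY = 55665          # initial eexec key from the Type 1 font format
C1, C2 = 52845, 22719      # eexec update constants from the same spec
WHITESPACE = b"\t\r\n "    # characters a PostScript-style consumer skips


def eexec_encrypt(plain: bytes, key: int = EEXEC_KEY) -> bytes:
    """Encrypt plaintext with the eexec stream cipher."""
    out = bytearray()
    r = key
    for p in plain:
        c = p ^ (r >> 8)
        r = ((c + r) * C1 + C2) & 0xFFFF
        out.append(c)
    return bytes(out)


def eexec_decrypt(cipher: bytes, key: int = EEXEC_KEY) -> bytes:
    """Decrypt eexec ciphertext; the result still includes the 4 lead bytes."""
    out = bytearray()
    r = key
    for c in cipher:
        out.append(c ^ (r >> 8))
        r = ((c + r) * C1 + C2) & 0xFFFF
    return bytes(out)


# Hypothetical Private dictionary fragment, preceded by 4 lead bytes.
# 0xD4 ^ 0xD9 == 0x0D, so the first ciphertext byte is '\r'.
payload = b"dup /Private 8 dict dup begin"
cipher = eexec_encrypt(bytes([0xD4, 0x00, 0x00, 0x00]) + payload)

# Acrobat-style consumer: every byte after eexec is ciphertext.
strict = eexec_decrypt(cipher)

# PostScript-style consumer: leading whitespace is skipped first, so the
# cipher starts one byte late and the decrypted text no longer matches.
lenient = eexec_decrypt(cipher.lstrip(WHITESPACE))
```

Here strict[4:] recovers the payload, while lenient is decrypted from a
shifted stream and cannot be used as a font.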
Comment 1 Alex Cherepanov 2007-12-15 15:16:48 UTC
Created attachment 13132
foo.pdf - sample file with '\r' after eexec
Comment 2 Alex Cherepanov 2007-12-15 15:48:13 UTC
eexec encoding has special means to avoid undesirable characters in the 1st
position; there's no need to add spaces. I forgot about this feature.
Comment 3 Adrian Johnson 2007-12-15 21:33:11 UTC
(In reply to comment #2)
> eexec encoding has special means to avoid undesirable characters in the 1st
> position; there's no need to add spaces. I forgot about this feature.

So just to check that I understand this problem correctly:

  The '\r' character is the first ciphertext byte of the Private
  dictionary.  In PDF files where the eexec-encrypted text is
  embedded in binary, some PDF interpreters treat the '\r' as part
  of the white space between the 'eexec' and the start of
  the eexec-encrypted text.

Is that correct?

When cairo subsets Type 1 fonts it just copies (after decrypting then
encrypting) all of the original Private dictionary from the start up
to "/CharStrings" to the subsetted font before it starts filtering out
unused glyphs. I don't have the exact same version of that font to
check but unless there is a bug in cairo it appears that the '\r'
resulted from the choice of the four random bytes required to be
inserted at the start of the Private dictionary in the original font.
However, section 7.2 of Adobe Type 1 Font Format requires the first four
plaintext bytes to be chosen so that the first ciphertext byte is not
whitespace.

Should cairo be checking and if necessary modifying the four random
bytes of plaintext to ensure that the first ciphertext byte is not
white space?  Or is this a bug in the font?
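
As a rough sketch of what such a check could look like (this is not the code
from cairo; pick_lead_bytes and lead_bytes_ok are hypothetical names), the
lead bytes can be redrawn until the ciphertext meets the constraints. The
cipher constants are the eexec values from the Type 1 font format. The second
condition, that the first four ciphertext bytes must not all be ASCII hex
digits, is my reading of the same section 7.2 and should be checked against
the spec.

```python
import os

EEXEC_KEY = 55665          # initial eexec key from the Type 1 font format
C1, C2 = 52845, 22719      # eexec update constants from the same spec
WHITESPACE = b" \t\r\n"
HEX_DIGITS = b"0123456789abcdefABCDEF"


def eexec_encrypt(plain: bytes, key: int = EEXEC_KEY) -> bytes:
    """Encrypt plaintext with the eexec stream cipher."""
    out = bytearray()
    r = key
    for p in plain:
        c = p ^ (r >> 8)
        r = ((c + r) * C1 + C2) & 0xFFFF
        out.append(c)
    return bytes(out)


def lead_bytes_ok(lead: bytes) -> bool:
    """Check the ciphertext produced by 4 candidate lead bytes."""
    cipher = eexec_encrypt(lead)
    first_not_space = cipher[0] not in WHITESPACE
    # Assumed second condition: not all four bytes look like hex digits.
    not_all_hex = any(c not in HEX_DIGITS for c in cipher[:4])
    return first_not_space and not_all_hex


def pick_lead_bytes(rng=os.urandom) -> bytes:
    """Draw random lead bytes until the ciphertext constraints hold."""
    while True:
        lead = rng(4)
        if lead_bytes_ok(lead):
            return lead


# Usage: prepend the chosen lead bytes before encrypting the Private dict.
lead = pick_lead_bytes()
encrypted = eexec_encrypt(lead + b"dup /Private 8 dict dup begin")
```

The first ciphertext byte is just the first plaintext byte XORed with the
high byte of the initial key (0xD9), so only the first lead byte decides the
whitespace condition. Redrawing the random bytes is one simple way to satisfy
it; fixing up a single byte of an existing font would work as well.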
Comment 4 Alex Cherepanov 2008-05-30 18:25:15 UTC
PDF and PS handle whitespace characters after eexec differently.
At least, Adobe interpreters do and Ghostscript follows the lead.
Conversion from PDF to PS may create an invalid PS file unless
the font is parsed and re-created. I suggest avoiding whitespace characters
after eexec.

The exact logic (as I understand it) is documented in the comments on the
eexec operator in the Ghostscript sources.
Comment 5 Adrian Johnson 2008-06-03 06:07:19 UTC
Fixed with this commit:

http://gitweb.freedesktop.org/?p=cairo;a=commit;h=0dbb5c9f6222660b1083420419d0eaa71c809ac5
