Created attachment 134244: screen shot from PDF screen

Attached are two screenshots: one shows the text of the PDF file, the other shows the text generated by pdftops, apparently using poppler. Most likely the bar code causes the trouble. On request the PDF file can be made available, but only to the developer.
Created attachment 134245: screen shot from ps file
Please share the document with me, though I can't guarantee a quick fix.
William, this regression is caused by https://cgit.freedesktop.org/poppler/poppler/commit/?id=2cf901c817fc99e1fa57745c11aa79cdfb4e8c99 i.e. the fix for bug https://bugs.freedesktop.org/show_bug.cgi?id=63963

Freek, would you mind if I share the document with William too, if he's willing to have a look at why his patch caused this regression?
I can look at it if someone can send me the PDF. The basic idea of my change is the line

    } else if (maxGlyph > 0 && code > maxGlyph) {

Some fonts define only the non-empty glyphs but reference zero-size glyphs way off the end of the list. Without the code > maxGlyph test, the generated PostScript will be invalid. Without seeing the PDF, I suppose that somehow maxGlyph isn't correct.

William
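A minimal sketch of the check described above, for illustration only. The helper name filterCodes and its signature are hypothetical and are not poppler's API; the sketch assumes that maxGlyph == 0 means "glyph count unknown", as the maxGlyph > 0 part of the condition suggests.

```cpp
#include <vector>

// Hypothetical helper, not part of poppler: keep only the character codes
// that are safe to emit for a font whose highest defined glyph is maxGlyph.
// Assumption: maxGlyph == 0 means "unknown", so nothing is filtered.
std::vector<int> filterCodes(const std::vector<int> &codes, int maxGlyph)
{
    std::vector<int> kept;
    for (int code : codes) {
        if (maxGlyph > 0 && code > maxGlyph) {
            continue; // references a glyph past the end of the list: skip it
        }
        kept.push_back(code);
    }
    return kept;
}
```

With a font whose last defined glyph is under 200, a code in the 1000's is dropped while the defined codes pass through unchanged.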
(In reply to Albert Astals Cid from comment #3)
> William this regression is caused by
> https://cgit.freedesktop.org/poppler/poppler/commit/?id=2cf901c817fc99e1fa57745c11aa79cdfb4e8c99
> i.e. the fix for bug https://bugs.freedesktop.org/show_bug.cgi?id=63963
>
> Freek would you mind if I share the document with William too if he's
> willing to have a look why he's patch caused this regression?

It's OK to share the file with William.
Sent
When I run pdffonts -f 1 -l 1 bug.pdf it lists ArialBold twice. The first copy has 19 glyphs, and the second copy has 15 glyphs. The missing characters are the glyphs in positions 16 to 19 of the first copy: after reading the second copy, it thinks that the font has only 15 glyphs and avoids accessing positions 16 to 19.

The question is how to handle it. In my test files, the last defined glyph is under 200, the referenced glyphs are in the 1000's, and each font is in the PDF only once.

1. When I look at the PDF, object 59 starts /BaseFont/ArialBold/ and objects 57 and 58 start /DescendantFonts ... /BaseFont/ArialBold, so maybe there is a way to differentiate the copy with 19 glyphs from the copy with 15 glyphs. However, I think that information is lost by the time it gets into PSOutputDev::drawString(), which is why I had to create the hash and couldn't just add a maxGlyphs field to GfxFont.

2. I could change the code in PSOutputDev::setupExternalCIDTrueTypeFont() and PSOutputDev::setupEmbeddedCIDTrueTypeFont() so that when it sees a font for the second time, instead of always updating the hash mapping font names to glyph counts, it updates the glyph count only if the new number is larger. That should fix this file without breaking my test files.

If this is OK, I can submit a patch within a day or two.

William
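The failure mode described above can be reproduced in a few lines. This is a sketch under stated assumptions, not poppler's code: the type GlyphCounts and the functions recordLastSeen and isSkipped are hypothetical names, and the map keyed by font name stands in for the hash mentioned in the comment.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the hash mapping font names to glyph counts.
using GlyphCounts = std::map<std::string, int>;

// Last-write-wins update keyed by font name only: a later copy of the
// font silently replaces the count recorded for an earlier copy.
void recordLastSeen(GlyphCounts &counts, const std::string &fontName, int glyphCount)
{
    counts[fontName] = glyphCount;
}

// True if a glyph position would be skipped by a "code > maxGlyph" test.
// A font with no recorded count is treated as "unknown" and never skipped.
bool isSkipped(const GlyphCounts &counts, const std::string &fontName, int position)
{
    auto it = counts.find(fontName);
    int maxGlyph = (it == counts.end()) ? 0 : it->second;
    return maxGlyph > 0 && position > maxGlyph;
}
```

Recording ArialBold first with 19 glyphs and then with 15 leaves 15 in the map, so positions 16 to 19 are skipped even when the text is drawn with the 19-glyph copy.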
The choice of solution is up to you. As a user I am only interested in a correct printout. If the PDF was improperly generated, I can report back to the organization I got the PDF from, with a description of what is at fault.
We could try #2 if that's easier, but this means that your patch may have broken many more files, since it's quite common to have duplicated names in PDF files. Let's try patch #2, and then I'll run a full regtest of pdftops without your original patch and then with the original patch plus this one, to see if there are still regressions. Does that sound OK?
Created attachment 134321: proposed patch

This patch moves the code to update the max valid glyph hash into its own function and updates the max valid glyph only if the new value is higher than the previous value. This fixes a problem with pages that have multiple copies of the same font with different glyph counts. If poppler processed the font with the smaller count last, and then the PDF wrote text in the font with the larger count, pdftops would not show the glyphs above the maximum of the smaller font.

I suspect that this issue is rare because the original issue https://bugs.freedesktop.org/show_bug.cgi?id=63963 was reported in 2013, not touched for several years in poppler, and still exists in the recently released xpdf-4.00. Also, my patch was applied in December 2016 (9 months ago), and this is the first reported regression.

Regards,
William
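The update rule the patch describes could look roughly like the following. This is a sketch only: the function name updateMaxValidGlyph, its signature, and the use of std::map are assumptions made for illustration and may not match the attached patch.

```cpp
#include <map>
#include <string>

// Hypothetical helper illustrating the rule "update only if the new value
// is higher". The real patch may use a different container and signature.
void updateMaxValidGlyph(std::map<std::string, int> &maxValidGlyph,
                         const std::string &fontName, int glyphCount)
{
    auto it = maxValidGlyph.find(fontName);
    if (it == maxValidGlyph.end()) {
        maxValidGlyph[fontName] = glyphCount; // first time this name is seen
    } else if (glyphCount > it->second) {
        it->second = glyphCount;              // keep the larger count
    }
    // A smaller count for an already known name is ignored.
}
```

The stored value no longer depends on the order in which the copies are processed: seeing counts 19 then 15, or 15 then 19, both leave 19 in the map. The trade-off is that text drawn with the smaller copy is no longer filtered between its own count and the larger one.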
Pushed.