Bug 102290

Summary: [PATCH] Replace Splash font rendering by Qt font rendering
Product: poppler Reporter: oliver.sander
Component: arthur backendAssignee: poppler-bugs <poppler-bugs>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: Patch that replaces Splash font rendering by Qt font rendering
Patch that replaces Splash font rendering by Qt font rendering
Patch that replaces Splash font rendering by Qt font rendering
Patch that replaces Splash font rendering by Qt font rendering
Patch that replaces Splash font rendering by Qt font rendering
Patch that replaces Splash font rendering by Qt font rendering
Patch that replaces Splash font rendering by Qt font rendering

Description oliver.sander 2017-08-18 10:08:42 UTC
Created attachment 133600 [details] [review]
Patch that replaces Splash font rendering by Qt font rendering

Previously, the Arthur backend would use Splash code to do its font rendering.  That was not a satisfactory solution: Qt can do font rendering directly.  Also, the Splash font rendering in the Arthur code had a few bugs, which lead to legible-but-not-pretty fonts.
    
This patch replaces the Splash font rendering by Qt font rendering.  It only adds the new code, in order to make the diff short and easy to review.  A subsequent patch will then remove the Splash code in Arthur (which is quite a bit).

I took a lot of inspiration from Mihai Niculescu's patch at
    
  https://lists.freedesktop.org/archives/poppler/2013-June/010370.html
    
That's why the patch adds Mihai's name in the copyright list.

Bear with me if this patch is doing something really stupid: I am not a font expert.
Comment 1 Albert Astals Cid 2017-08-20 22:17:23 UTC
if (!(render & 1))

seems highly suspicious

And how is this thing loading fonts?
Comment 2 oliver.sander 2017-08-21 12:55:46 UTC
> if (!(render & 1))
> seems highly suspicious

I took this from the Splash-based font rendering code, which has the same text.  I guess the code does only 'fill' text rendering for now, but that is not a regression.  At least a few others of the mode seem reasonably straightforward, but I don't have any documents to test that with.

> And how is this thing loading fonts?

The updateFont method does (e.g.)

  const char* fontData = font->readEmbFontFile(xref, &fontDataLen);
  m_rawFont = std::unique_ptr<QRawFont>(new QRawFont(QByteArray(fontData, fontDataLen), fontSize));

Isn't that good enough?  Am I missing something?
Comment 3 Albert Astals Cid 2017-08-21 21:59:57 UTC
i totally missed the updateFont part of the diff.
Comment 4 Albert Astals Cid 2017-08-21 22:11:35 UTC
Have you tested this?

First pdf i tried causes text to go from "not great but is text" to "can't see anything". 

http://svn.ghostscript.com/ghostscript/tests/Ghent_V3.0/010_CMYK_OP_x3.pdf
Comment 5 oliver.sander 2017-08-22 12:23:00 UTC
Of course I tested, but not with that file. :-)  I'll have a look.
Comment 6 oliver.sander 2017-08-22 20:48:35 UTC
It's a CID font, and right now I don't know how to handle that.

Seems like I'm not the only one.  Quoting Mihai from https://lists.freedesktop.org/archives/poppler/2013-June/010338.html :

"I only have a problem with CID font with which I didn't advance enough- this is why I cannot make it display CJK."

Any ideas?

In the worst case I could make the code fall back to the Splash font renderer for all CID fonts.  Would that be acceptable?

BTW: My life would be much simpler if I knew what the parameters of the drawChar method are.  Right now I operate mainly by educated guesswork.  Could somebody please take the time and write down descriptions of what they are?  I'll gladly prepare a patch from any kind of description.  Thanks!
Comment 7 Albert Astals Cid 2017-08-22 21:07:10 UTC
(In reply to oliver.sander from comment #6)
> It's a CID font, and right now I don't know how to handle that.
> 
> Seems like I'm not the only one.  Quoting Mihai from
> https://lists.freedesktop.org/archives/poppler/2013-June/010338.html :
> 
> "I only have a problem with CID font with which I didn't advance enough-
> this is why I cannot make it display CJK."
> 
> Any ideas?
> 
> In the worst case I could make the code fall back to the Splash font
> renderer for all CID fonts.  Would that be acceptable?

That kind of defeats the point of not using Splash anymore?

> BTW: My life would be much simpler if I knew what the parameters of the
> drawChar method are.  Right now I operate mainly by educated guesswork. 
> Could somebody please take the time and write down descriptions of what they
> are?  I'll gladly prepare a patch from any kind of description.  Thanks!

There's exactly one call to drawChar in Gfx.cc i hope you'll be able to kind of reverse-engineer the params from there. I would have to reverse-engineer it too at this point, kind of forgot all i knew of it.
Comment 8 Adrian Johnson 2017-08-22 21:11:50 UTC
(In reply to oliver.sander from comment #6)
> BTW: My life would be much simpler if I knew what the parameters of the
> drawChar method are.  Right now I operate mainly by educated guesswork. 
> Could somebody please take the time and write down descriptions of what they
> are?  I'll gladly prepare a patch from any kind of description.  Thanks!

Have a look at the cairo backend code (CairoOutputDev and CairoFontEngine) as it is doing a similar thing to what you are trying to do. ie drawing to an external drawing API that has its own font support.

CharCode code, int nBytes, Unicode *u, int uLen

code is the character code in the content stream. This needs to be mapped back to a glyph index. GfxFont has the code to do this. u is the UCS-4 mapping used for text extraction (TextOutputDev). nBytes and ULen are the number of character codes and unicode respectively.  You can ignore the unicode. Also the unicode is not always valid so you can't use it for drawing.

I can't recall how all the x/y params work. Just copy what splash or cairo backends are doing.
Comment 9 oliver.sander 2017-08-23 09:18:56 UTC
Thanks Adrian, the information on the method parameters is very helpful.

Can you be a bit more explicit about nBytes?  CharCode is simply an 'unsigned int'.  Is nBytes the number of bytes of that unsigned int that are actually used?
Comment 10 Adrian Johnson 2017-08-23 09:49:55 UTC
Have a look in Gfx.cc where drawChar() is called. nBytes comes from the call to getNextChar() in GfxFont. There is a comment explaining it in GfxFont.h. The text strings in the content stream can consists of either 8-bit or 16-bit character codes depending on the font. nBytes is the number of bytes in the character code. You can ignore it.

If you look at CairoOutputDev::drawChar() it essentially is:

    glyphs[glyphCount].index = currentFont->getGlyph (code, u, uLen);
    glyphs[glyphCount].x = x - originX;
    glyphs[glyphCount].y = y - originY;
    glyphCount++;

ie it builds up a run of glyphs to show when endString() is called. index is the glyph index. x, y is the location.

The hard part is loading the fonts and finding the code to glyph index mapping. See CairoFreeTypeFont::create() in CairoFontEngine.cc for how it handles all the different font types.
Comment 11 oliver.sander 2017-08-23 12:57:26 UTC
The documentation of QRawFont says[1]:

"The [font data must be] a TrueType or OpenType font."

Does that rule out CID fonts?  Because QRawFont claims to construct glyph indices from unicode for me[2].


[1] http://doc.qt.io/qt-5/qrawfont.html#QRawFont-2
[2] method glyphIndexesForChars
Comment 12 Adrian Johnson 2017-08-23 13:14:10 UTC
CID fonts use CID encoding. The font type can be Truetype or CFF. If QRawFont expects complete TrueType or OpenType fonts it is unlikely to be useful. Fonts in PDF files are subsetted to remove unused tables. PDF also uses Type 1 fonts and bare CFF fonts.

You may have to write you own font rendering class that uses FreeType to render the glyph bitmaps.
Comment 13 oliver.sander 2017-08-24 08:33:33 UTC
Okay, I figured it out, at least for Albert's test file.

For that particular file, the CharCode 'code' _is_ the glyph index.  No mapping is needed at all.  I do not know under what conditions this property holds.  The font in the file has encoding 'Identity-H'.  Maybe it only holds for such encodings.

My patch also ignored the text matrix -- that's why the glyphs were so tiny.  The font in the file claims to have font size 1, and then scales the glyphs using the text matrix.

It is not clear to me whether the text matrix may change with each glyph, or only with each font.  To be on the safe side I update it at each glyph now.  That's not a lot of additional work, as I need to do a per-glyph transformation anyway.

A new patch is attached.  Let's see where it breaks next. :-)
Comment 14 oliver.sander 2017-08-24 08:34:09 UTC
Created attachment 133738 [details] [review]
Patch that replaces Splash font rendering by Qt font rendering
Comment 15 Albert Astals Cid 2017-08-24 21:07:45 UTC
Have a look at https://bugs.freedesktop.org/attachment.cgi?id=41459

The "Patch 1.0 - CMYK Overprint" sub-title should be pink-ish, your patch regresses that, though OTOH it makes the top title better.
Comment 16 oliver.sander 2017-08-25 09:53:07 UTC
That one was easy: we store fill colors in the QPainter's brush, and they are set correctly.  However, in what seems like a small inconsistency to me, QPainter uses its _pen_ (what's otherwise used for stroking) also for the text fill. See, e.g., http://doc.qt.io/qt-5/qpainter.html#setPen .  Therefore, we need to overwrite the pen color temporarily.

New patch is attached. However, a second bug remains: a few glyphs are missing.  I'll have a look.

Also, if you look very closely to the letter 't' in the title, you'll notice that Arthur still renders it a tiny bit differently than Splash does.  I have to assume that this is a Qt bug, because the patched Arthur code never touches the actual glyph shape.
Comment 17 oliver.sander 2017-08-25 09:53:32 UTC
Created attachment 133768 [details] [review]
Patch that replaces Splash font rendering by Qt font rendering
Comment 18 oliver.sander 2017-08-29 19:47:22 UTC
Here is a new version of the patch.  It now renders all glyphs in Albert's test files correctly.  The patch didn't really get any prettier, though.

In fact, I completely ripped out the Qt-based code-to-glyph-index conversion code, and replaced it with what Splash does.  That works, but I am a bit surprised that Qt does not offer more help here.  QRawFont has the glyphIndexesForChars method, but that only takes QChar (aka unicode) as input, and we don't generally have reliable unicode.

Unfortunately, to use the Splash font engine to do the code-to-gid decoding, I had to add a new method getCodeToGID to the SplashFTFontFile class.  Is that okay?  I tried to do without that, but I couldn't.

During my last tests I noticed that there is still a problem with font colors.  I'll look into that, but I'd like to get an opinion on the new SplashFTFontFile method while I look.  Thanks!
Comment 19 oliver.sander 2017-08-29 19:47:45 UTC
Created attachment 133867 [details] [review]
Patch that replaces Splash font rendering by Qt font rendering
Comment 20 Albert Astals Cid 2017-08-29 20:45:56 UTC
Adding a getter is ok.
Comment 21 oliver.sander 2017-09-05 14:30:35 UTC
Okay, I got the color thing sorted out as well.  Apparently, the value of state->getFillRGB() can change without OutputDev::updateFillColor being called.  Then the local brush color of ArthurOutputDev (in m_currentBrush) does not always contain the same color as state->getFillRGB().  Using the latter for the font colors solves the problem.

I recognize that this may be a bug in ArthurOutputDev.  Maybe there is a second way besides updateFillColor to update the fill color?  ArthurOutputDev only implements that single one.  I don't know.

Anyway, new patch attached.
Comment 22 oliver.sander 2017-09-05 14:30:56 UTC
Created attachment 133970 [details] [review]
Patch that replaces Splash font rendering by Qt font rendering
Comment 23 Albert Astals Cid 2017-09-05 19:52:38 UTC
There's a severe regression in the title_font_compresses.pdf file.

Without this patch
qt5/tests/test-poppler-qt5 ~/okularfiles/pdf/title_font_compresses.pdf -arthur
Page 1 size:  48 inches x  36 inches
Displaying page using "Arthur" backend:  0
Rendering took 258 msecs
Displaying page using "Arthur" backend:  1
Rendering took 941 msecs
Displaying page using "Arthur" backend:  2
Rendering took 418 msecs
Displaying page using "Arthur" backend:  3
Rendering took 216 msecs


With this patch
Page 1 size:  48 inches x  36 inches
Displaying page using "Arthur" backend:  0
Rendering took 1094 msecs
Displaying page using "Arthur" backend:  1
Rendering took 1714 msecs
Displaying page using "Arthur" backend:  2
Rendering took 5370 msecs
Displaying page using "Arthur" backend:  3
Rendering took 9409 msecs

Please try to see if you can reproduce (in case my build was broken) and fix.
Comment 24 Albert Astals Cid 2017-09-05 19:53:49 UTC
The file is too big to attach here so i'll send it to you via email.
Comment 25 oliver.sander 2017-09-06 07:15:45 UTC
That's probably because the QRawFont objects are not cached yet. I deliberately omitted caching because I wanted to get correct results first, without the patch getting too complicated.

Personally I'd put caching in a follow-up patch, but if you insist I include it right away.
Comment 26 Albert Astals Cid 2017-09-06 18:56:47 UTC
Given that we release every month, master always has to be good, and commiting something that is 40x slower, doesn't seem "good" to me.
Comment 27 oliver.sander 2017-09-11 21:19:37 UTC
Here is a new version, which now also implements caching of the QRawFont objects.  It certainly is a lot faster than the old one. :-)

I had to implement a Compare class for 'Ref' objects (in ArthurOutputDev.h).  I shameless stole that from pdfinfo.cc.  Given that we now have the same Compare class twice, should we unify the two somehow?  Simply implement operator< for 'Ref'?
Comment 28 oliver.sander 2017-09-11 21:20:09 UTC
Created attachment 134170 [details] [review]
Patch that replaces Splash font rendering by Qt font rendering
Comment 29 Albert Astals Cid 2017-09-12 21:53:59 UTC
(In reply to oliver.sander from comment #27)
> Given that we now have the same
> Compare class twice, should we unify the two somehow?  Simply implement
> operator< for 'Ref'?

I guess, yeah
Comment 30 Albert Astals Cid 2017-09-12 21:54:42 UTC
(In reply to oliver.sander from comment #28)
> Created attachment 134170 [details] [review] [review]
> Patch that replaces Splash font rendering by Qt font rendering

Any chance you can cleanup the code you're not using? (and if you need to use all the code remove those weird markers about "here new code", "here old code")
Comment 31 oliver.sander 2017-09-14 15:21:40 UTC
New patch that now has all obsolete Splash code removed.

Not nearly as much as I originally hoped, but still...

The patch also fixes a regression wrt the previous version: Using only the ID for caching is not sufficient.  The font size needs to be taken into account, too.
Comment 32 oliver.sander 2017-09-14 15:23:03 UTC
Created attachment 134226 [details] [review]
Patch that replaces Splash font rendering by Qt font rendering
Comment 33 Albert Astals Cid 2017-09-14 22:18:40 UTC
Pushed

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.