Bug 5168 - Text rendering problem
Summary: Text rendering problem
Status: RESOLVED MOVED
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium minor
Assignee: poppler-bugs
QA Contact:
URL: http://wdc.custhelp.com/cgi-bin/wdc.c...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-11-26 05:50 UTC by Evert Verhellen
Modified: 2018-08-20 21:43 UTC
CC List: 2 users

See Also:


Attachments
Screenshot displaying the difference between fonts in poppler and xpdf (82.94 KB, image/png)
2007-06-15 08:41 UTC, Sven Arvidsson
Another file with rendering problems (660.23 KB, application/pdf)
2008-10-28 15:14 UTC, Philip Ganchev
Patch to check Panose values in GlobalParams::getDisplayFont() (4.28 KB, patch)
2010-08-31 09:13 UTC, suzuki toshiya
pdftoppm result of previous patch (38335) (72.33 KB, image/png)
2010-08-31 09:15 UTC, suzuki toshiya
Patch to check Panose values in GlobalParams::getDisplayFont(), revision 2 (4.86 KB, patch)
2010-08-31 10:34 UTC, suzuki toshiya
pdftoppm result of previous patch (38339) (73.15 KB, image/png)
2010-08-31 10:35 UTC, suzuki toshiya

Description Evert Verhellen 2005-11-26 05:50:17 UTC
1. Download the following PDF:

http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/fattach_get.php?p_sid=SOUT9CAh&p_tbl=9&p_id=1219&p_created=1088113257

2. Open this PDF in Adobe Reader 7.0 and Evince 0.4.0 and compare the text
rendering.

Expected results:
Same text rendering.

Actual results:
Evince employs different fonts. Garamond and Univers Condensed are respectively
replaced with Times New Roman and something else. Page 2 of the PDF is hard to
read due to the tight character spacing.

Additional information:
Xpdf 3.01 doesn't use Garamond either (it uses Times New Roman instead), although
Univers Condensed seems to be displayed correctly there.

Version details:
Evince 0.4.0
Poppler 0.4.2
Xpdf 3.01
Comment 1 Sven Arvidsson 2007-06-15 08:40:30 UTC
I have a similar problem with this file:
http://www.dnv.se/mou/rapport_rfid.pdf

The font used and the narrow spacing make it almost unreadable with poppler. It renders fine in Xpdf.

I'm using poppler 0.5.4 (cairo).
Comment 2 Sven Arvidsson 2007-06-15 08:41:23 UTC
Created attachment 10323
Screenshot displaying the difference between fonts in poppler and xpdf
Comment 3 Philip Ganchev 2008-10-28 15:14:21 UTC
Created attachment 19918
Another file with rendering problems

The font appears similar to the one in comment #1.
Comment 4 Philip Ganchev 2008-11-03 12:19:15 UTC
In particular, the letters run together and this makes the document very hard to read.  It also maxes out the CPU (but not the memory) while trying to render the page (Evince bug http://bugzilla.gnome.org/show_bug.cgi?id=523372).

By the way, my comment above was transferred here from a bug filed against Evince: http://bugzilla.gnome.org/show_bug.cgi?id=558180 .  nshmyrnev@yandex.ru asked me to transfer it.
Comment 5 Albert Astals Cid 2009-06-17 13:58:15 UTC
Can you check with a newer poppler?
Comment 6 Sven Arvidsson 2009-06-17 15:46:32 UTC
How new? The problem is still present with Poppler 0.10.6, using this PDF: 
http://www.dnv.se/mou/rapport_rfid.pdf

The screenshot I made in attachment 10323 is still valid.
Comment 7 suzuki toshiya 2010-08-31 07:40:27 UTC
I suspect this issue is related to buildFcPattern(), which
does not add a serif/sans-serif preference to the
fontconfig search pattern. As I recall, even the font descriptor
of a non-embedded font in a PDF can hold a Panose value that gives
such a hint. I will try to improve this. I apologize in advance
if I have misunderstood.
Comment 8 suzuki toshiya 2010-08-31 09:13:22 UTC
Created attachment 38335
Patch to check Panose values in GlobalParams::getDisplayFont()
Comment 9 suzuki toshiya 2010-08-31 09:15:06 UTC
Created attachment 38336
pdftoppm result of previous patch (38335)
Comment 10 suzuki toshiya 2010-08-31 09:18:52 UTC
Oops, this is not such a simple issue. fontconfig neither provides
serif/sans-serif info nor takes it into account when searching.
I had to extract that info from the font itself by opening it
via FreeType, loading the OS/2 table, and checking the Panose values. But even
if I can reflect the serif/sans-serif flag in the font selection,
that does not mean the best-matching font will be chosen.

I attached my patch to use Panose values in the font selection,
along with its result. You can see that the regular roman text is
drawn with fixed-pitch Latin glyphs from CJK fonts. It's very
ugly, worse than the original result. More evaluation of metric
matching is needed.

The question would be: why does xpdf show a good result? Does xpdf
do more complicated work for better font substitution?
The answer is that xpdf's font substitution is simpler.
It does not use fontconfig. It seems that xpdf prefers PS
Type1 fonts and so finds conventional typefaces (because the variety
of PS Type1 fonts on a free Unix system is not very wide).

I have to reconsider how to find better-matching fonts.
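
The Panose check described in this comment can be illustrated with a minimal sketch. It is not the attached patch: the function name is hypothetical, the byte sequences are illustrative rather than read from real font files, and grouping the flared and rounded serif styles (14 and 15) with sans-serif is an assumption. It relies only on the Panose layout for Latin Text faces, where byte 0 is the family type and byte 1 is the serif style.

```python
# Panose is a 10-byte classification stored in a font's OS/2 table.
FAMILY_LATIN_TEXT = 2  # value of byte 0 for Latin Text faces


def serif_class(panose):
    """Return 'serif', 'sans' or 'unknown' for a 10-byte Panose sequence."""
    if len(panose) != 10 or panose[0] != FAMILY_LATIN_TEXT:
        return "unknown"
    serif_style = panose[1]
    if 2 <= serif_style <= 10:
        return "serif"  # cove, square, thin, oval, exaggerated, triangle
    if 11 <= serif_style <= 15:
        # 11-13 are the sans styles; treating 14 (flared) and
        # 15 (rounded) as sans is an assumption of this sketch.
        return "sans"
    return "unknown"  # 0 = any, 1 = no fit


# Illustrative byte sequences (assumed, not read from font files).
print(serif_class([2, 2, 6, 3, 5, 4, 5, 2, 3, 4]))
print(serif_class([2, 11, 6, 4, 2, 2, 2, 2, 2, 4]))
```

A classification like this only narrows the candidates by style; as noted above, it says nothing about whether the glyph metrics of the chosen font match the original.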
Comment 11 suzuki toshiya 2010-08-31 10:34:30 UTC
Created attachment 38339
Patch to check Panose values in GlobalParams::getDisplayFont(), revision 2

This revision also considers the Panose "proportion" value.
Comment 12 suzuki toshiya 2010-08-31 10:35:53 UTC
Created attachment 38340
pdftoppm result of previous patch (38339)

Slightly better (because fixed-width Latin glyphs in CJK fonts
can be excluded), but the regular roman text is now drawn
with Myanmar3_ship.ttf!
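
The effect reported here can be reproduced in miniature. The sketch below is an assumption-laden illustration, not the code from the patch: the candidate names and Panose bytes are hypothetical, and the only fact it relies on is that Panose byte 3 (proportion) is 9 for monospaced Latin Text faces. Filtering on that byte removes the fixed-width candidate, but whichever font comes next in the candidate order is picked, even if it was designed for another script.

```python
PROPORTION_MONOSPACED = 9  # Panose byte 3 for Latin Text faces


def is_monospaced(panose):
    """True if a 10-byte Latin Text Panose sequence marks the face monospaced."""
    return (
        len(panose) == 10
        and panose[0] == 2
        and panose[3] == PROPORTION_MONOSPACED
    )


def drop_monospaced(candidates):
    """Keep the candidate order, dropping faces marked as monospaced."""
    return [name for name, panose in candidates if not is_monospaced(panose)]


# Hypothetical candidates in the order a font matcher might return them.
candidates = [
    ("HypotheticalCJKFixed", [2, 2, 6, 9, 0, 0, 0, 0, 0, 0]),
    ("HypotheticalOtherScript", [2, 2, 6, 3, 0, 0, 0, 0, 0, 0]),
    ("HypotheticalRoman", [2, 2, 6, 3, 5, 4, 5, 2, 3, 4]),
]

# The fixed-width face is gone, but the first survivor is still not
# the conventional roman face one would want.
print(drop_monospaced(candidates))
```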
Comment 13 suzuki toshiya 2010-08-31 10:36:47 UTC
I think further improvement should be made on the fontconfig side.
Or should poppler have its own font substitution mapping database?
Comment 14 Albert Astals Cid 2010-08-31 10:50:11 UTC
suzuki, I'm happy you are trying to help, but please do not suggest silly things. FontConfig *is* the system font database; adding one to poppler is just wrong.
Comment 15 suzuki toshiya 2010-08-31 10:59:49 UTC
Albert, thank you for your comment. My English is very poor, so
please let me confirm: are you saying that a "font substitution
mapping database in poppler" is a silly idea? I was thinking of
OpenOffice.org's font substitution database: OOo uses fontconfig
but also has a family-name-based substitution map. Sorry for bothering
you.

--

Some accented characters are missing from my pdftoppm results.
It seems that buildFcPattern() cannot detect the language
of AGaramond, so we cannot force the selection of fonts that include
them.
Comment 16 Albert Astals Cid 2010-08-31 11:17:05 UTC
Yes, I am saying that fontconfig is a database for fonts and lets the user define family-name-based substitution rules, so there is no need for poppler to do the same.
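
As a rough illustration of such a user-defined rule, the sketch below builds a fontconfig `<alias>` element that prefers one family when another is requested. The helper name is hypothetical, and "EB Garamond" is only an assumed stand-in for whatever Garamond-like font happens to be installed; "AGaramond" is the family name mentioned earlier in this thread. A real configuration file would also need the XML declaration and the fontconfig DOCTYPE line.

```python
import xml.etree.ElementTree as ET


def alias_rule(requested, preferred):
    """Build a fontconfig <alias> rule as an XML string.

    `requested` is the family name asked for by the document;
    `preferred` lists the families fontconfig should try first.
    """
    root = ET.Element("fontconfig")
    alias = ET.SubElement(root, "alias")
    ET.SubElement(alias, "family").text = requested
    prefer = ET.SubElement(alias, "prefer")
    for name in preferred:
        ET.SubElement(prefer, "family").text = name
    return ET.tostring(root, encoding="unicode")


# "EB Garamond" is an assumed replacement, chosen only for illustration.
print(alias_rule("AGaramond", ["EB Garamond"]))
```

A rule of this shape would typically go in the user's fontconfig configuration, for example ~/.config/fontconfig/fonts.conf.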
Comment 17 Albert Astals Cid 2010-08-31 11:19:07 UTC
Also, if you know that a patch does not solve the problem, please do not post it here, since it's not really useful to anyone.
Comment 18 suzuki toshiya 2010-08-31 11:31:09 UTC
OK, I will stop posting until I can summarize my patch.
Comment 19 James Cloos 2010-09-01 06:25:47 UTC
> I think further improvement should be done in fontconfig side.

There has been interest for some time to have fontconfig cache the
panose info and permit matching against it.

How long it will take — even if patches are provided — is an
/interesting/ question....

(My proposal/patch to cache the psname languishes on the fc list
and ’zilla w/o any comment. ☹)
Comment 20 Dotan Cohen 2015-11-26 13:49:10 UTC
I can confirm that the issue still exists in Evince 3.16.1 and Okular 0.23.2, which are the latest versions as of this writing. Tested using the OP's example PDF in comment #1 dated 2005-11-26.
Comment 21 GitLab Migration User 2018-08-20 21:43:18 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/44.