Bug 65956 - okular does not show non-latin characters in inline notes
Status: RESOLVED MOVED
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other Linux (All)
Importance: medium normal
Assignee: poppler-bugs
Reported: 2013-06-20 08:34 UTC by Serge Gavrilov
Modified: 2018-08-21 11:18 UTC

Attachments
preview (3.55 KB, image/png)
2014-07-22 15:24 UTC, Jiri Slaby
patch no. 1 (1022 bytes, patch)
2014-07-24 15:05 UTC, Jiri Slaby
patch no. 1 fixed (1023 bytes, text/plain)
2014-07-24 15:13 UTC, Jiri Slaby
patch no. 2 (2.55 KB, patch)
2014-07-25 09:17 UTC, Jiri Slaby
Annotations with CJK characters example (10.96 KB, application/pdf)
2016-04-18 20:08 UTC, V字龍(Vdragon)
Okular inline note with Chinese and English characters (15.52 KB, image/png)
2018-01-31 19:37 UTC, H Zeng

Description Serge Gavrilov 2013-06-20 08:34:38 UTC
This is generally okular bug https://bugs.kde.org/show_bug.cgi?id=310154. 

If I enter non-Latin characters into an inline note (a kind of PDF annotation), okular does not show those characters in the note.
Comment 1 Jiri Slaby 2014-07-22 15:24:05 UTC
Created attachment 103286 [details]
preview

(In reply to comment #0)
> This is generally okular bug https://bugs.kde.org/show_bug.cgi?id=310154. 

As was noted there, this is not an okular bug. It's a bug in poppler.

I am also attaching a preview of the bug: it should contain all of "ěščřžýáíé", but contains only a few of them.
Comment 2 Jiri Slaby 2014-07-24 15:05:12 UTC
Created attachment 103401 [details] [review]
patch no. 1

Ok, when one turns on debugging, they see:
okular(15836) PDFGeneratorPopplerDebugFunction: [Poppler] "Error: AnnotWidget::layoutText, cannot convert U+011B"


There is a bug in Annot::layoutText. We also have 2-wide chars, but pass 1 to mapToCharCode. This patch fixes that. But there is still another problem: the characters are still not in smap...
Comment 3 Jiri Slaby 2014-07-24 15:13:36 UTC
Created attachment 103402 [details]
patch no. 1 fixed
Comment 4 Jiri Slaby 2014-07-25 09:17:44 UTC
Created attachment 103432 [details] [review]
patch no. 2

Ok, the annotation font is only 8-bit AFAIU. But I cannot make it work with this patch and have no more time to play with it ATM.

If anyone can continue the investigation and fix it, that would be awesome :).
Comment 5 Jiri Slaby 2014-07-25 09:18:33 UTC
BTW the error is now this:

okular(5623) PDFGeneratorPopplerDebugFunction: [Poppler] "Error: AnnotWidget::layoutText, cannot convert U+010D"
okular(5623) PDFGeneratorPopplerDebugFunction: [Poppler] "Error: AnnotWidget::layoutText, cannot convert U+0159"
okular(5623) PDFGeneratorPopplerDebugFunction: [Poppler] "Error: AnnotWidget::layoutText, cannot convert U+0159"
okular(5623) PDFGeneratorPopplerDebugFunction: [Poppler] "Error: AnnotWidget::layoutText, cannot convert U+010D"
okular(5623) PDFGeneratorPopplerDebugFunction: [Poppler] "Error: Couldn't find a font for 'Helvetica'"
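
For readers who do not have the code points memorized, the U+XXXX values in these messages can be mapped back to characters. A minimal Python sketch follows; the helper name `unconverted_chars` is made up for illustration, and the sample lines are shortened copies of the messages above.

```python
import re

# Two of the messages from the debug output above (process prefix omitted).
LOG_LINES = [
    '[Poppler] "Error: AnnotWidget::layoutText, cannot convert U+010D"',
    '[Poppler] "Error: AnnotWidget::layoutText, cannot convert U+0159"',
]

def unconverted_chars(lines):
    """Return the characters named by 'cannot convert U+XXXX' messages."""
    pattern = re.compile(r"cannot convert U\+([0-9A-Fa-f]{4,6})")
    found = []
    for line in lines:
        for hex_code in pattern.findall(line):
            # Turn the hexadecimal code point back into the character itself.
            found.append(chr(int(hex_code, 16)))
    return found

print(unconverted_chars(LOG_LINES))
```

With these two lines the helper reports č and ř, the same letters that are missing from the rendered note.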
Comment 6 Albert Astals Cid 2014-07-25 22:15:26 UTC
Can you please explain what patch 2 does versus patch 1?
Comment 7 Jiri Slaby 2014-07-26 07:35:22 UTC
(In reply to comment #6)
> Can you please explain what patch 2 does versus patch 1?

Sure, it tries to use a cidtype0 font instead of type0, but obviously fails to do so. The idea was to have a font that can handle Unicode characters, not only Latin-1.

mapLen is currently 256 for the type0 font, smapLen is 0. So there are only 256 codes.
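
To illustrate why a 256-entry map cannot cover the test string, here is a rough Python sketch of a reverse lookup against an 8-bit table. It does not use poppler's code: Latin-1 is only assumed as a stand-in for the font's real encoding, and the names `CODE_TO_UNICODE`, `to_char_code` and `split_by_support` are hypothetical.

```python
# ASSUMPTION: Latin-1 stands in for the 8-bit font's encoding; the table
# poppler actually uses for the annotation font may differ.
CODE_TO_UNICODE = [ord(bytes([code]).decode("latin-1")) for code in range(256)]

# Reverse table: Unicode code point -> 8-bit char code.
UNICODE_TO_CODE = {cp: code for code, cp in enumerate(CODE_TO_UNICODE)}

def to_char_code(ch):
    """Return the 8-bit code for ch, or None if the table has no entry."""
    return UNICODE_TO_CODE.get(ord(ch))

def split_by_support(text):
    """Split text into (convertible, unconvertible) character lists."""
    ok = [ch for ch in text if to_char_code(ch) is not None]
    missing = [ch for ch in text if to_char_code(ch) is None]
    return ok, missing

ok, missing = split_by_support("ěščřžýáíé")
print("convertible:  ", ok)
print("unconvertible:", ["U+%04X" % ord(ch) for ch in missing])
```

Which letters end up in each list depends on the encoding chosen for the table, so the exact split may not match the attached preview; the point is only that any table with 256 entries leaves some of these letters without a code.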
Comment 8 Albert Astals Cid 2014-08-02 23:04:07 UTC
I've committed patch 1 since it does improve the rendering of some files, not patch 2 since I don't see the point.
Comment 9 Jiri Slaby 2014-08-19 08:46:58 UTC
(In reply to comment #8)
> I've committed patch 1 since it does improve the rendering of some files, not
> patch 2 since I don't see the point.

The point of patch 2 was to also render letters like ř (U+0159). But the patch is incomplete and obviously does not work. It was meant to load a font with maps for 2-byte usize, as smap is empty for the current font.

With patch 1 only, some letters like č (U+010D) are no longer rendered. They happened to be rendered by coincidence previously, but I never figured out why.

Just a note that letters like á (U+00E1) are always rendered (they are in Latin-1 and usize is 1 for them).

Can you help with fixing č and ř properly? (To reproduce, just open okular, press F6, and add an inline note containing these letters; they are not rendered.)
Comment 10 Albert Astals Cid 2014-08-19 21:29:26 UTC
I know how to reproduce it; that is not the problem ;)

One thing you may want to try is using some font other than Helvetica.
Comment 11 antmak 2015-08-04 11:37:27 UTC
I confirm this bug. Fedora 22.

poppler-0.30.0-3.fc22.x86_64
okular-15.04.0-1.fc22.x86_64
Comment 12 V字龍(Vdragon) 2016-04-18 20:08:06 UTC
Note that this bug prevents all CJK characters from being rendered in annotations, not just some rarely used characters.

Example attached.
Comment 13 V字龍(Vdragon) 2016-04-18 20:08:29 UTC
Created attachment 123031 [details]
Annotations with CJK characters example
Comment 14 devel 2016-07-13 13:20:26 UTC
Apparently the font configured to be used in the annotation is ignored in general. Maybe this should be fixed and the rest might work then?
Comment 15 Dainius Masiliūnas 2017-10-12 08:51:35 UTC
Still an issue in Poppler(-Qt5) 0.43.0. Indeed, font changes made in the inline note properties are also ignored; only the size changes.
Comment 16 H Zeng 2018-01-31 19:37:52 UTC
Created attachment 137093 [details]
Okular inline note with Chinese and English characters

In addition to Vdragon's attachment in Comment #13, I'd like to add a more intuitive example here with mixed Latin and Chinese characters. The shown characters are "Note has 中文 and English". Obviously, "中文" appears in the editor but not in the inline note.
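
A rough way to picture this symptom is to drop every character that does not fit an 8-bit encoding. The Python sketch below assumes Latin-1 as that encoding, which is a simplification rather than poppler's actual logic, and the function names `fits_8bit` and `simulate_note` are invented for illustration.

```python
def fits_8bit(ch):
    """ASSUMPTION: a character is drawable only if it encodes to one Latin-1 byte."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

def simulate_note(text):
    """Return (text with unsupported characters dropped, dropped characters)."""
    shown = "".join(ch for ch in text if fits_8bit(ch))
    dropped = [ch for ch in text if not fits_8bit(ch)]
    return shown, dropped

shown, dropped = simulate_note("Note has 中文 and English")
print(repr(shown))
print(dropped)
```

Under that assumption only the two Chinese characters are dropped and the Latin text survives, which is consistent with what the screenshot is described as showing.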
Comment 17 H Zeng 2018-01-31 19:41:37 UTC
I forgot to say that I am using Okular version 1.3.1 with poppler 0.62.0 in the following environment:
```
KDE Frameworks 5.42.0
Qt 5.10.0 (built against 5.10.0)
The xcb windowing system
```
Comment 18 Philipp Marek 2018-02-22 19:38:06 UTC
Still an issue with libpoppler-qt5-1=0.62.0-1 and okular=1.2.3.

Please fix this issue, it makes reviewing technical papers really hard - I can't enter any Greek characters (αβγ...), superscripts (σ²) or other "special" characters.


Thank you so much!
Comment 19 Drew Parsons 2018-04-26 05:31:17 UTC
This bug is dreadful.  Please fix.
Comment 20 GitLab Migration User 2018-08-21 10:46:08 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/362.

