Distribution/Version: Fedora Core 3

Steps to reproduce:
1) Open http://www.aetat.no/data//f/0/24/53/1_702_0/rapporten_hele2004.pdf in evince.
2) Go to page 5.
3) Select the first line.
4) Try to paste the text into gedit (or some other app).

Actual results:
No text is pasted into gedit. Evince prints the following message to stdout:

(evince:11622): Gdk-WARNING **: Error converting from UTF-8 to STRING: Invalid byte sequence in conversion input

Expected results:
The selected text should be pasted into gedit.

This happens only with text that contains non-ASCII characters such as ø and æ. Here are two more examples:

http://www.stud.uni-karlsruhe.de/~udatk/evince/oowriter1.pdf
http://www.zuv.uni-heidelberg.de/studsekr/rechtsgrundlagen/ordnungen/11/1103901.pdf

Try to copy "Prüfungsordnung" (page 1, first word) or any other word that contains an umlaut.

xpdf 3.00 and acroread 7.0.0 don't have this problem.

I originally reported this to Evince bugzilla:
http://bugzilla.gnome.org/show_bug.cgi?id=172846
Created attachment 2577
Change default text encoding to UTF-8

That's an easy one! Just change the default text encoding to UTF-8. Maybe GlobalParams::textEncoding should even be totally deprecated and unchangeable in poppler.

(To test with evince, use the EVINCE_0_2_1 tag; HEAD is broken wrt. clipboard stuff.)
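For illustration, here is a minimal Python sketch of why a non-UTF-8 text encoding would produce the "Invalid byte sequence" warning. It assumes the previous default was a single-byte encoding such as Latin-1, which this report does not confirm. The helper `is_valid_utf8` is hypothetical and not part of poppler, evince, or GDK.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


word = "Prüfungsordnung"

# Assumption: the extracted text was handed over in a single-byte encoding
# (Latin-1 here) while the clipboard code expected UTF-8.
latin1_bytes = word.encode("latin-1")
utf8_bytes = word.encode("utf-8")

# "ü" is the single byte 0xFC in Latin-1, which is not a valid UTF-8 sequence.
print(is_valid_utf8(latin1_bytes))

# The same word encoded as UTF-8 passes validation.
print(is_valid_utf8(utf8_bytes))

# Pure ASCII text is byte-identical in both encodings, which would explain
# why only words with characters such as ø, æ or ü are affected.
print("Pr".encode("latin-1") == "Pr".encode("utf-8"))
```

Under that assumption, emitting UTF-8 from the text extraction side means the consumer receives bytes in the encoding it already expects.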
Patch committed, closing bug.