Bug 81129 - Unicode Chars with 5-hex-digit Codes are Filtered Away in Pasting (Paste Special Option Doesn't Work; Works in WordPad)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.5.2 release
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
Duplicates: 85315
Reported: 2014-07-09 22:58 UTC by jburrill
Modified: 2015-01-17 00:45 UTC

Description jburrill 2014-07-09 22:58:58 UTC
I’ll be submitting this problem report to both OpenOffice and LibreOffice.

  To perform these steps, you will need to have the Tahoma and Lucida Sans Unicode fonts (common) and EITHER the Symbola font OR both the FreeSerif font and the Segoe UI Symbol font.  (While the latter are not as common, they are available for free.)

1. The six test lines below all begin with a Unicode character followed by text which describes the character.  Copy the six lines into a WordPad document and change the font for the lines to Tahoma.  Note that only the Unicode character in the second line (sharp sign) is displayed.

2. Next change the font of all six lines to Lucida Sans Unicode.  Note that now the first two Unicode characters (flat and sharp) are displayed.

3. Now -either- change the last four lines to Symbola -or- change the middle two lines to FreeSerif and the last two lines to Segoe UI Symbol.  Note that now all the Unicode characters are displayed correctly.

  The first three steps have demonstrated the correct behavior.  I would expect the same behavior in OpenOffice Writer and LibreOffice Writer.

4. But now repeat the same three steps in either version of “Writer.”  Note that the Unicode characters in the last four lines are never displayed correctly.

5. Now change the last four lines to OpenSymbol font.  Note that this still doesn’t help...

It seems that this happens with any 5-hex-digit Unicode character which is supported by a font like Symbola, FreeSerif or Segoe UI Symbol in WordPad.  (Related LibreOffice Bug 71603 appears to be just one instance of this general problem with 5-hex-digit Unicode characters.)

I haven’t tried it in MS Word, but I’d assume that if it works in WordPad, it will work in Word.
_   _   _   _   _

the six test lines:

♭  266D music flat sign
♯  266F music sharp sign
𝄋  1D10B segno
𝄌  1D10C coda
🎶  1F3B6 multiple musical notes
🎷  1F3B7 saxophone
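
For anyone triaging this, here is a minimal Python sketch (not part of the original report) that separates the six test characters into those inside the Basic Multilingual Plane and those above U+FFFF. The names `TEST_CHARS` and `is_supplementary` are made up for illustration; the code points are the ones listed in the test lines above.

```python
# The six test characters from the report, keyed by their description.
TEST_CHARS = {
    "music flat sign": "\u266D",
    "music sharp sign": "\u266F",
    "segno": "\U0001D10B",
    "coda": "\U0001D10C",
    "multiple musical notes": "\U0001F3B6",
    "saxophone": "\U0001F3B7",
}


def is_supplementary(ch: str) -> bool:
    """True if the code point lies above U+FFFF, i.e. outside the BMP."""
    return ord(ch) > 0xFFFF


# Print one line per character: code point, plane, description.
for label, ch in TEST_CHARS.items():
    plane = "supplementary" if is_supplementary(ch) else "BMP"
    print(f"U+{ord(ch):04X}  {plane:13}  {label}")
```

The four characters flagged as supplementary are exactly the ones with 5-hex-digit codes, which matches the lines that fail in Writer.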

I’ll switch to using and recommending whichever “Office” is either [a] quickest in showing me how to display 5-hex-digit Unicode characters in the Writer application the way it is -or- [b] quickest in fixing the problem.

jburrill@gmail.com
Comment 1 Adolfo Jayme 2014-07-10 00:53:52 UTC
I tested this issue under Linux (the operating system I use).

So I copied the test lines from this bug report, and the pasting mechanism threw away the special characters (except ♭ and ♯). But then I tried pasting the test lines by using the “Text Without Formatting” option (from the Paste Special dialog, Ctrl+Shift+V) and all of the special characters were pasted correctly.

Please let me know if using Paste Special > Text Without Formatting works for you under Windows. I’m adjusting this bug’s title a bit.
Comment 2 Jay Philips 2014-07-11 12:41:17 UTC
(In reply to comment #1)

Adolfo,

Regular paste gave the same two characters you mentioned on Linux, but Paste Special only offers 'HTML format' and 'HTML format without comments'. With 'HTML format without comments', the first entry sometimes showed as a blank box and sometimes as the b, the second entry always showed correctly, and the remaining four showed as question marks. This was on 4.2.4 and 4.3.0 on Windows 7.
Comment 3 jburrill 2014-07-11 22:38:37 UTC
Changing title again since Paste Special does not work.  Depending on the font, it can look as though the characters were dropped in the paste when they weren't.  It's just that they aren't displayed.  This might even be correct behavior, depending on the font, so it's important to mention the font you're attempting to have the characters rendered in.
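
One way to tell "not displayed" apart from "actually dropped" is to copy the text back out of Writer and list its code points. The following is a rough Python sketch under that assumption; `dump_code_points` and `classify_paste` are hypothetical helper names, and it only inspects a string, it does not talk to Writer or the clipboard.

```python
import unicodedata


def dump_code_points(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in the text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


def classify_paste(expected: str, pasted: str) -> str:
    """Rough verdict for one expected character (illustrative only)."""
    if expected in pasted:
        # The character survived; a missing glyph is then a font issue.
        return "preserved (a missing glyph would be a font or rendering issue)"
    if "?" in pasted:
        return "replaced by a literal question mark"
    return "dropped"


# Example: the segno line as it might come back after a round trip.
print(dump_code_points("\U0001D10B?"))
print(classify_paste("\U0001D10B", "?  1D10B segno"))
```

If the dump still shows the original code point, the paste kept the character and only the rendering in the chosen font is at issue.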

WordPad correctly renders all four of the 5-hex-digit Unicode characters (lines 3 - 6) in Symbola font.  It also correctly renders lines 3 and 4 in FreeSerif -- and lines 5 and 6 in Segoe UI Symbol.  Both LibreOffice Writer and OpenOffice Writer should be able to do as well.  And they should be exportable to PDF.

Since submitting this, I've discovered that KingSoft Writer does render these characters correctly, but they are lost when KingSoft tries to export them to PDF.  (But KingSoft also has some other bugs related to Unicode characters that LibreOffice/OpenOffice doesn't have).
Comment 4 Urmas 2014-07-12 02:39:53 UTC
The symbols do appear after application restart and reloading the document.
Comment 5 Matthew Francis 2014-09-27 15:41:11 UTC
Reproduced on OSX / LO 4.3.2.2 and 4.4 master:

"Paste Special" as "Unformatted text" does the right thing, and all characters are displayed.

Regular paste and paste as HTML appear to replace all the characters outside the Unicode Basic Multilingual Plane (i.e. >0xFFFF) with "?" (a literal question mark, not a placeholder for a non-rendered character). Given that, changing the font afterwards makes no difference.
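
As background on why the >0xFFFF boundary matters: in UTF-16, characters above U+FFFF need two code units (a surrogate pair), while BMP characters need one. The sketch below computes the pair by hand and cross-checks it against Python's own encoder. It is only an illustration of the encoding; it makes no claim about where in the paste path the characters are lost, and `utf16_code_units` is a made-up name.

```python
def utf16_code_units(ch: str) -> list[int]:
    """Return the UTF-16 code units for a single character."""
    cp = ord(ch)
    if cp <= 0xFFFF:
        return [cp]                      # BMP: one code unit
    offset = cp - 0x10000                # 20-bit value split in two halves
    high = 0xD800 + (offset >> 10)       # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)      # low (trail) surrogate
    return [high, low]


for ch in "\u266D\U0001D10B\U0001F3B7":
    units = " ".join(f"{u:04X}" for u in utf16_code_units(ch))
    # Cross-check against Python's built-in UTF-16 encoder.
    assert units.replace(" ", "") == ch.encode("utf-16-be").hex().upper()
    print(f"U+{ord(ch):04X} -> {units}")
```

Code that handles text one 16-bit unit at a time can mishandle such pairs, which is one plausible (unconfirmed) reason only the last four test lines are affected.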

-> Platform: All
-> NEW
Comment 6 Matthew Francis 2015-01-17 00:45:38 UTC
*** Bug 85315 has been marked as a duplicate of this bug. ***

