Bug 63927 - Export (HTML, DOC, DOCX): handles weak bidi characters as strong ones
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 3.5.0 release
Hardware: Other All
Importance: medium major
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: rtl-meta
Reported: 2013-04-25 16:04 UTC by Lior Kaplan
Modified: 2013-11-29 07:13 UTC

See Also:


Attachments
testdoc (11.48 KB, application/vnd.oasis.opendocument.text)
2013-04-25 16:06 UTC, Lior Kaplan

Description Lior Kaplan 2013-04-25 16:04:25 UTC
Exporting RTL text that contains weak bidi characters (commas, brackets, hyphens, etc.) to HTML produces markup that separates the weak characters from the surrounding RTL text.

Text:
"בית משפט" - לרבות בית דין לעבודה, בית דין דתי, ראש הוצאה לפועל לפי חוק ההוצאה לפועל, תשכ"ז-1967 (להלן - חוק ההוצאה לפועל), ולמעט בית דין צבאי כמשמעותו בחוק השיפוט הצבאי, תשט"ו– 1955;

HTML export:
<P DIR="RTL" ALIGN=RIGHT STYLE="margin-bottom: 0cm">	&quot;<FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">בית
משפט</SPAN></FONT>&quot; - <FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">לרבות
בית דין לעבודה</SPAN></FONT>, <FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">בית
דין דתי</SPAN></FONT>, <FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">ראש
הוצאה לפועל לפי חוק ההוצאה לפועל</SPAN></FONT>,
<FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">תשכ</SPAN></FONT>&quot;<FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">ז</SPAN></FONT>-1967
(<FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">להלן </SPAN></FONT>-
<FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">חוק ההוצאה
לפועל</SPAN></FONT>), <FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">ולמעט
בית דין צבאי כמשמעותו בחוק השיפוט הצבאי</SPAN></FONT>,
<FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">תשט</SPAN></FONT>&quot;<FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">ו–
</SPAN></FONT>1955;
</P>

Notice that the font and span tags end before each weak character and start again afterwards:
<SPAN LANG="he-IL">בית
משפט</SPAN></FONT>&quot; - <FONT FACE="Nachlieli CLM"><SPAN LANG="he-IL">לרבות
בית דין לעבודה</SPAN></FONT>,
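
To check which characters in the sample fall outside the strong classes, the Unicode bidi class of each character can be inspected. The following is a minimal sketch using Python's standard unicodedata module; it is not LibreOffice code, and the helper names is_strong and non_strong_classes are made up for this illustration. Note that UAX #9 classifies commas and hyphens as weak (CS, ES) and brackets and quotation marks as neutral (ON); this report refers to both groups as weak.

```python
import unicodedata

# Bidi classes that the Unicode Bidirectional Algorithm treats as strong.
STRONG_CLASSES = {"L", "R", "AL"}


def is_strong(ch):
    """Return True if the character has a strong bidi class (hypothetical helper)."""
    return unicodedata.bidirectional(ch) in STRONG_CLASSES


def non_strong_classes(text):
    """Map each non-strong character in text to its bidi class (hypothetical helper)."""
    return {ch: unicodedata.bidirectional(ch) for ch in text if not is_strong(ch)}


# A fragment of the text from this report.
sample = 'תשכ"ז-1967 (להלן), '
for ch, cls in non_strong_classes(sample).items():
    print(repr(ch), cls)
```

In the HTML export above, the FONT and SPAN elements close exactly at the characters this sketch reports as non-strong.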
Comment 1 Lior Kaplan 2013-04-25 16:06:03 UTC
Created attachment 78482
testdoc
Comment 2 Lior Kaplan 2013-04-25 16:21:25 UTC
This is a regression relative to 3.3.4, and it also affects export to Word formats. The report uses HTML because the problem is easiest to demonstrate there.

Setting importance to major, as the DOC/DOCX export problems prevent people using LibreOffice from working with Microsoft Office users (each save of the document alters the file drastically).

