Bug 77803 - Implement separate numbering styles for Chinese and Japanese (they're similar, but not the same)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Libreoffice
Version: Inherited From OOo
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: CJK-METABUG
 
Reported: 2014-04-23 10:31 UTC by Steph ZHANG
Modified: 2014-09-17 14:31 UTC

See Also:

Attachments
a odt file with Chinese-char page numbers (8.75 KB, application/vnd.oasis.opendocument.text)
2014-04-26 05:47 UTC, Kevin Suo
pdf file shows the current bug behaviour (9.58 KB, application/pdf)
2014-04-26 05:48 UTC, Kevin Suo
Numbering list numbered by Chinese numbers from 90 to 111 in odt (17.82 KB, application/vnd.oasis.opendocument.text)
2014-04-26 08:24 UTC, Steph ZHANG
Japanese and Chinese NatNum# native numbering (27.66 KB, application/vnd.oasis.opendocument.spreadsheet)
2014-09-16 16:40 UTC, Eike Rathke

Description Steph ZHANG 2014-04-23 10:31:02 UTC
I recently started a very long article, several hundred pages in length, and found that the page numbers between one hundred and two hundred are displayed incorrectly.

The correct expression of one hundred in Chinese is 「一百」, but the "one" in "one hundred" is missing and the number is displayed as 「百」. Furthermore, the "zero" between the hundreds and the ones is missing too. The correct expression of one hundred and one is 「一百零一」, but it is displayed as 「百一」, which literally means one hundred and ten and is quite confusing.

I recently wrote a Java program that converts integers to Chinese numerals; it is available at http://paste.ubuntu.com/7313644 . I know its structure and efficiency are poor, so it is for reference only. It also handles only numbers from 0 to 99,999,999, but I think that is enough.
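For reference, the conversion described above can be sketched in a few lines of Python. This is not the Java program from the paste link and not LibreOffice code; the names `to_chinese` and `_section` are hypothetical. It assumes the usual conventions: a single 零 stands in for any run of zeros inside the number, trailing zeros are not written, and a leading 一十 is shortened to 十. It covers the same 0 to 99,999,999 range.

```python
DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]  # ones, tens, hundreds, thousands


def _section(s):
    """Convert 1..9999 to Chinese numerals, e.g. 101 -> 一百零一."""
    out = ""
    pending_zero = False
    for power in (3, 2, 1, 0):
        digit = (s // 10 ** power) % 10
        if digit == 0:
            # Remember an interior zero, but only once something was written.
            pending_zero = out != ""
        else:
            if pending_zero:
                out += DIGITS[0]  # one 零 for the whole run of zeros
            out += DIGITS[digit] + UNITS[power]
            pending_zero = False
    return out  # a trailing pending zero is dropped on purpose


def to_chinese(n):
    """Convert 0..99,999,999 to Chinese numerals (assumed conventions)."""
    if not 0 <= n <= 99_999_999:
        raise ValueError("supported range is 0 to 99,999,999")
    if n == 0:
        return DIGITS[0]
    high, low = divmod(n, 10_000)  # Chinese groups digits by 万 (10^4)
    out = ""
    if high:
        out = _section(high) + "万"
        if 0 < low < 1000:
            out += DIGITS[0]  # e.g. 10001 -> 一万零一
    if low:
        out += _section(low)
    if out.startswith("一十"):
        out = out[1:]  # 10..19 are written 十, 十一, ... rather than 一十
    return out


if __name__ == "__main__":
    for n in (99, 100, 101, 110, 111):
        print(n, to_chinese(n))
```

With these assumptions, 100 and 101 come out as 一百 and 一百零一, the forms requested in this report.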
Comment 1 Kevin Suo 2014-04-26 05:43:31 UTC
Confirmed in LibreOffice 4.2.3.3, Build ID: 6c3586f855673fa6a1576797f575b31ac6fa0ba3

Set to new.

(However, I don't think it's a good idea to use "一,二,三..." as page numbers, and few people do that.)
Comment 2 Kevin Suo 2014-04-26 05:47:58 UTC
Created attachment 98003 [details]
a odt file with Chinese-char page numbers

I inserted a page break and set the page numbering to start from 100.

So the 2nd page number should be "一百" (100), rather than "百", and
the 3rd page number should be "一百零一" (101), rather than "百一".
Comment 3 Kevin Suo 2014-04-26 05:48:38 UTC
Created attachment 98004 [details]
pdf file shows the current bug behaviour
Comment 4 Steph ZHANG 2014-04-26 08:24:03 UTC
Created attachment 98009 [details]
Numbering list numbered by Chinese numbers from 90 to 111 in odt

This problem affects not only page numbers but also anything else that uses the same algorithm, such as numbered lists, as this attachment shows.
Comment 5 Kevin Suo 2014-05-24 05:31:03 UTC
I don't think this is a localization issue. Changing component to LibreOffice.
Comment 6 Matthew Francis 2014-08-26 04:19:22 UTC
The issue here appears to be that there are no separate numbering styles for Chinese and Japanese, whether for page numbers or for lists and other numbered items. The behaviour of the current 一, 二, 三, ... numbering is correct for Japanese, where one says "Hundred One" (百一) for 101 rather than "One Hundred Zero One" (一百零一) as in Chinese.

The precedent for this is that there are already separate entries for the visually similar Bulgarian, Russian and Serbian numbering, each tagged with the correct language.

If we follow this example, we would need to split the numbering styles into at least:

一, 二, 三, ... (Chinese)
一, 二, 三, ... (Japanese)


There are probably other implications, e.g. for import/export filters and forward/backward compatibility
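
To illustrate the difference, here is a minimal Python sketch of the Japanese-style rule as described above: a leading 一 is dropped before 十 and 百, and no zero is written. The function name `to_japanese` is hypothetical and this is not the LibreOffice implementation. The range is limited to 1..999 so that no assumption about larger units is needed, and the Chinese forms are taken as literals from this report rather than computed.

```python
DIGITS = "〇一二三四五六七八九"  # index 0 is never emitted by this sketch


def to_japanese(n):
    """Japanese-style numerals for 1..999, e.g. 101 -> 百一 (sketch only)."""
    if not 1 <= n <= 999:
        raise ValueError("this sketch only covers 1 to 999")
    out = ""
    for power, unit in ((2, "百"), (1, "十"), (0, "")):
        digit = (n // 10 ** power) % 10
        if digit == 0:
            continue  # zeros are simply skipped, no 零 or 〇
        if digit == 1 and unit:
            out += unit  # "hundred", not "one hundred"
        else:
            out += DIGITS[digit] + unit
    return out


# Chinese forms quoted in this report, for comparison.
CHINESE_EXPECTED = {100: "一百", 101: "一百零一"}

if __name__ == "__main__":
    for n, chinese in CHINESE_EXPECTED.items():
        print(n, "Japanese:", to_japanese(n), "Chinese:", chinese)
```

The output for 100 and 101 matches what the current numbering style produces (百, 百一), which is why a separate Chinese entry would be needed.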
Comment 7 Matthew Francis 2014-09-11 09:59:29 UTC
See also:
http://cgit.freedesktop.org/libreoffice/core/tree/filter/source/xslt/import/wordml/wordml2ooo_page.xsl#n345

...
            <xsl:when test="$number-format = 'chinese-counting-thousand' or $number-format = 'ideograph-digital' or $number-format = 'japanese-counting' or $number-format = 'japanese-digital-ten-thousand' or $number-format = 'taiwanese-counting-thousand' or $number-format = 'taiwanese-counting' or $number-format = 'taiwanese-digital' or $number-format = 'chinese-counting' or $number-format = 'korean-digital2' or $number-format = 'chinese-not-impl'">
                <xsl:attribute name="style:num-format">一, 二, 三, ...</xsl:attribute>
            </xsl:when>
...

If I read this correctly, we are folding numerous OOXML numbering formats into "一, 二, 三, ...". A thorough resolution of this issue should include consideration of precisely how these formats differ, and whether we are doing everything reasonably possible in terms of round trip compatibility.
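
The round-trip concern can be made concrete with a small Python model of the quoted `xsl:when` branch. The format names and the target value are copied from the snippet above; the function names `import_num_format` and `candidates_on_export` are hypothetical, and the real filter contains other branches that are not modelled here.

```python
# OOXML number formats tested by the quoted xsl:when condition.
FOLDED_FORMATS = [
    "chinese-counting-thousand",
    "ideograph-digital",
    "japanese-counting",
    "japanese-digital-ten-thousand",
    "taiwanese-counting-thousand",
    "taiwanese-counting",
    "taiwanese-digital",
    "chinese-counting",
    "korean-digital2",
    "chinese-not-impl",
]

ODF_STYLE = "一, 二, 三, ..."  # value written to style:num-format


def import_num_format(number_format):
    """Model of the quoted branch; None means 'handled elsewhere'."""
    return ODF_STYLE if number_format in FOLDED_FORMATS else None


def candidates_on_export(style):
    """All source formats that could have produced this style on import."""
    return sorted(f for f in FOLDED_FORMATS if import_num_format(f) == style)


if __name__ == "__main__":
    # Chinese and Japanese counting become indistinguishable after import.
    print(import_num_format("chinese-counting") == import_num_format("japanese-counting"))
    print(len(candidates_on_export(ODF_STYLE)), "formats map to", ODF_STYLE)
```

Because ten source formats share one target value, an exporter that sees only `一, 二, 三, ...` cannot tell which format the document originally used.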
Comment 8 Eike Rathke 2014-09-16 16:40:12 UTC
Created attachment 106383 [details]
Japanese and Chinese NatNum# native numbering

We have 8 different native number types implemented for each of Chinese and Japanese. You can check these, for example, in Calc by applying number formats to a cell value, e.g.

[NatNum1]General
[NatNum2]General
...
[NatNum8]General

for each language. I'm attaching a document illustrating this. See also offapi/com/sun/star/i18n/NativeNumberMode.idl or http://api.libreoffice.org/docs/idl/ref/namespacecom_1_1sun_1_1star_1_1i18n_1_1NativeNumberMode.html

However, it seems not all are available for native numbering in page numbers and numbering lists. See also offapi/com/sun/star/style/NumberingType.idl or http://api.libreoffice.org/docs/idl/ref/namespacecom_1_1sun_1_1star_1_1style_1_1NumberingType.html

But the exact numbering as mentioned in comment 2 ("一百" (100), "一百零一" (101)) is not present even as NatNum# numbering. The closest would be Chinese NatNum4 with "一百" (100) and "一百〇一" (101).
Comment 9 Kevin Suo 2014-09-17 14:31:03 UTC
(In reply to comment #8)

> But the exact numbering as mentioned in comment 2 ("一百" (100), "一百零一" (101))
> is not present even as NatNum# numbering. The closest would be Chinese
> NatNum4 with "一百" (100) and "一百〇一" (101).

"一百〇一、一百〇二..." are acceptable, not bad Chinese. "百一、百二..." are really bad.

