Bug 81272 - Libreoffice Is Very Slow Rendering Chinese Characters (because of font fallback?)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.5.2 release
Hardware: All Linux (All)
Importance: medium major
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard: target:4.4.0
Keywords:
Depends on:
Blocks: CJK-METABUG
Reported: 2014-07-12 18:22 UTC by carldong76
Modified: 2015-01-12 14:36 UTC (History)

Attachments
A Very Slow Chinese Document (29.13 KB, text/plain)
2014-07-12 18:22 UTC, carldong76
A Very Fast English Document (24.04 KB, application/vnd.oasis.opendocument.text)
2014-07-12 18:23 UTC, carldong76
A Very Slow Chinese Document (29.13 KB, application/vnd.oasis.opendocument.text)
2014-07-12 18:23 UTC, carldong76

Description carldong76 2014-07-12 18:22:43 UTC
Created attachment 102677
A Very Slow Chinese Document

I tried writing some long documents in Chinese. However, as the number of characters approaches 1000, scrolling becomes so slow that after a single scroll I have to wait several seconds for it to stop. I have created two simple comparison documents, one in Chinese and one in English. The English document scrolls quickly.

I am on Funtoo Linux, amd64. I use version 4.2.5.2, but the problem also appeared in earlier versions. The attachments are the two simple documents I used for the comparison.
Comment 1 carldong76 2014-07-12 18:23:14 UTC
Created attachment 102678
A Very Fast English Document
Comment 2 carldong76 2014-07-12 18:23:56 UTC
Created attachment 102679
A Very Slow Chinese Document
Comment 3 Kevin Suo 2014-07-13 04:55:55 UTC
Confirmed with LibreOffice 4.3.0.2, Ubuntu 14.04 x86.

Scrolling and editing the Chinese document is very slow compared with the English document.

It may be because of font fallback: a Western font was applied to the Chinese characters. If you set a Chinese font (for example, SimSun or WenQuanYi Micro Hei), it becomes fast again.

Set to NEW, changed summary to reflect the font fallback issue.
Comment 4 carldong76 2014-07-13 15:41:25 UTC
I can confirm that if I set the font explicitly, it becomes fast.
Comment 5 Matthew Francis 2014-08-17 14:55:29 UTC
OSX 10.9.4 / LO 4.3.0.4, 4.4 master:

The "slow" document seems sluggish to me whether or not the text is set explicitly to a Chinese font

If anything it also seems rather slower on 4.4 master than in 4.3.0.4 release

If I open the document and type some "a"s, it is initially slow but responsive as long as I keep typing. However, after stopping, the next letter typed is only processed after a long pause.

I wondered based on this if spellchecking might be involved, but setting the language of the text to "None" doesn't seem to make any difference - unless there is some processing that isn't disabled when this is done
Comment 6 Matthew Francis 2014-09-04 07:20:43 UTC
Poked at this a little with callgrind. There appear to be a number of villains in this case:

1) SwTxtFrm::CollectAutoCmplWrds
Called from beneath SwLayIdle::DoIdleJob()
Workaround: Disable Tools – Autocorrect Options... – Word Completion – Collect words

2) SwTxtFrm::_AutoSpell
Called from beneath SwLayIdle::DoIdleJob()
Workaround: Disable Tools – Automatic Spell Checking

3) SwTxtNode::CountWords
Called from beneath SwLayIdle::DoIdleJob()
Workaround: None found

4) SwTxtNode::CountWords
Called from beneath DocumentStatisticsManager::IncrementalDocStatCalculate
Workaround: None found


Each of these spends a long time dissecting text using SwScanner. In addition, (3) and (4) appear to be counting the same words twice, which compounds the fact that it's a slow operation on a long paragraph.

With all four disabled (commenting code out where necessary), editing the giant paragraph in the text document is merely slow rather than intolerable.
Comment 7 Matthew Francis 2014-09-04 10:54:40 UTC
For a paragraph with N continuous characters of Chinese text (e.g. N x "中"), iterating over the paragraph with SwScanner causes xdictionary::getWordBoundary() to be called N times. Each call invokes xdictionary::seekSegment(), which in turn iterates over each of the N characters.

-> N^2 operations

This needs refactoring so seekSegment() doesn't keep doing the same work over and over again
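
As a rough illustration of the N^2 behaviour described above, here is a minimal Python model. It is not LibreOffice code: `seek_segment`, `CachedSeeker` and `count_steps` are hypothetical names, and treating a segment as a whitespace-delimited run is an assumption standing in for the real xdictionary logic. The model only counts how many characters are visited, with and without remembering the last segment found.

```python
def seek_segment(text, pos):
    """Find the whitespace-delimited segment containing pos.

    Scans from the start of the text up to pos, then forward to the end
    of the segment. Returns (start, end, steps), where steps is the number
    of characters visited.
    """
    steps = 0
    start = 0
    for i in range(pos):
        steps += 1
        if text[i].isspace():
            start = i + 1
    end = pos
    while end < len(text) and not text[end].isspace():
        steps += 1
        end += 1
    return start, end, steps


class CachedSeeker:
    """Remembers the last segment so repeated lookups inside it cost nothing."""

    def __init__(self):
        self._text = None
        self._segment = None
        self.steps = 0  # total characters visited across all lookups

    def seek(self, text, pos):
        hit = (
            self._segment is not None
            and self._text == text
            and self._segment[0] <= pos < self._segment[1]
        )
        if not hit:
            start, end, steps = seek_segment(text, pos)
            self.steps += steps
            self._text, self._segment = text, (start, end)
        return self._segment


def count_steps(n):
    """Visit every position of an n-character run, uncached and cached."""
    text = "中" * n
    naive = sum(seek_segment(text, pos)[2] for pos in range(n))
    cached = CachedSeeker()
    for pos in range(n):
        cached.seek(text, pos)
    return naive, cached.steps
```

For a 100-character run, `count_steps(100)` returns `(10000, 100)`: N^2 character visits without the cache and N with it. Whether the real fix should cache in this way is a separate question; the sketch only shows where the repeated work comes from.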



For Chinese text, the path through from SwScanner to xdictionary goes like this:

    frame #1: 0x0000000110919243 libi18npoollo.dylib`com::sun::star::i18n::xdictionary::seekSegment(this=0x000000010c08a000, rText=0x00007fff5fbfa930, pos=1, segBoundary=0x000000010c08a028) + 115 at xdictionary.cxx:280
    frame #2: 0x00000001109199dd libi18npoollo.dylib`com::sun::star::i18n::xdictionary::getWordBoundary(this=0x000000010c08a000, rText=0x00007fff5fbfa930, anyPos=1, wordType=3, bDirection=true) + 173 at xdictionary.cxx:412
    frame #3: 0x00000001109073e7 libi18npoollo.dylib`com::sun::star::i18n::BreakIterator_CJK::getWordBoundary(this=0x000000011ff93ab8, text=0x00007fff5fbfa930, anyPos=1, nLocale=0x00000001206d7600, wordType=3, bDirection='\x01') + 119 at breakiterator_cjk.cxx:81
    frame #4: 0x000000011090753c libi18npoollo.dylib`non-virtual thunk to com::sun::star::i18n::BreakIterator_CJK::getWordBoundary(this=0x000000011ff93ae0, text=0x00007fff5fbfa930, anyPos=1, nLocale=0x00000001206d7600, wordType=3, bDirection='\x01') + 92 at breakiterator_cjk.cxx:88
    frame #5: 0x000000011090e0e4 libi18npoollo.dylib`com::sun::star::i18n::BreakIteratorImpl::getWordBoundary(this=0x0000000117072b78, Text=0x00007fff5fbfa930, nPos=1, rLocale=0x00000001206d7600, rWordType=3, bDirection='\x01') + 612 at breakiteratorImpl.cxx:182
    frame #6: 0x000000011090e1dc libi18npoollo.dylib`non-virtual thunk to com::sun::star::i18n::BreakIteratorImpl::getWordBoundary(this=0x0000000117072ba0, Text=0x00007fff5fbfa930, nPos=1, rLocale=0x00000001206d7600, rWordType=3, bDirection='\x01') + 92 at breakiteratorImpl.cxx:186
    frame #7: 0x00000001184e85d0 libswlo.dylib`SwScanner::NextWord(this=0x00007fff5fbfa918) + 1296 at txtedt.cxx:836
Comment 8 Commit Notification 2014-09-10 14:11:57 UTC
Matthew J. Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a34a8fca21c670c4e7ee147d05ed9e6e4136cbe1

fdo#81272 Speed up break iterators



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 9 Commit Notification 2014-09-10 15:49:52 UTC
Caolan McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=997d1387abcfa40eca8d15a2fe025edc4a1de040

Revert "fdo#81272 Speed up break iterators"



Comment 10 Commit Notification 2014-09-10 20:03:37 UTC
Matthew J. Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=44ead04eb5fc61a3f56f783adb1509fab440e212

fdo#81272 Speed up break iterators



Comment 11 Caolán McNamara 2014-09-10 20:04:25 UTC
caolanm->fdbugs: Can we call this fixed now?
Comment 12 Matthew Francis 2014-09-11 03:57:02 UTC
Two concerns with closing at this point:

1) The reporter had a slightly different symptom to the problem that's just been patched - that the slowdown was dependent on fonts. I never reproduced that on OSX; the reported issue may not be identical.

2) After the patch, performance with CJK text is much better in the 10,000 character range, but still poor with a few times that (30k or above is still appreciably slow on my local machine). This compares poorly with a paragraph of the same number of western text characters.


I have a few ideas on possibilities for further improving performance, but not much that's as simple as the first patch...
Comment 13 Kevin Suo 2015-01-12 14:36:42 UTC
(In reply to Matthew Francis from comment #12)
I confirm that this bug still exists in the following version:
Version: 4.4.0.2
Build ID: a3603970151a6ae2596acd62b70112f4d376b990
Locale: zh_CN
Fedora 22 X64.

Steps to reproduce:
1. Open attachment 102679 with Writer;
2. Try to delete some characters using BACKSPACE, or try to type in some text.
--> Very slow.
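
For testing at other sizes (comment 7 describes the test text as N x "中", and comment 12 mentions the 10,000 and 30,000 character ranges), a small script can generate a single-paragraph plain-text file to open in Writer. This is only a sketch: `write_repeated_paragraph` is a hypothetical helper, and the file name and default count are arbitrary choices, not something taken from the attachments.

```python
from pathlib import Path


def write_repeated_paragraph(path, char="中", count=10_000):
    """Write one paragraph made of `count` copies of `char`, encoded as UTF-8."""
    Path(path).write_text(char * count + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Example: a 30,000-character paragraph, the size reported as still slow.
    write_repeated_paragraph("cjk_30000.txt", count=30_000)
```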

