Created attachment 102677 [details] A Very Slow Chinese Document

I tried writing some long documents in Chinese. However, as the number of characters approaches 1000, scrolling becomes so slow that after a single scroll I have to wait several seconds for it to stop.

I have created two simple comparison documents, one in Chinese and one in English. The English document scrolls quickly.

I am on Funtoo Linux, amd64. The version I use is 4.2.5.2, but the problem also appeared in earlier versions.

The attachments are the two simple documents I used for the comparison.
Created attachment 102678 [details] A Very Fast English Document
Created attachment 102679 [details] A Very Slow Chinese Document
Confirmed with LibreOffice 4.3.0.2, Ubuntu 14.04 x86. Scrolling and editing the Chinese document is very slow compared with the English document.

It may be because of font fallback: a Western font was applied to the Chinese characters. If you set a Chinese font (for example SimSun, WenQuanYi Micro Hei, etc.) it becomes fast again.

Set to NEW, changed summary to reflect the font fallback issue.
I confirmed that if I set font explicitly, it becomes fast.
OSX 10.9.4 / LO 4.3.0.4, 4.4 master:

The "slow" document seems sluggish to me whether or not the text is set explicitly to a Chinese font. If anything, it seems rather slower on 4.4 master than in the 4.3.0.4 release.

If I open the document and type some "a"s, it is initially slow but responsive as long as I keep typing. However, after I stop, the next letter typed is only processed after a long pause.

Based on this I wondered whether spellchecking might be involved, but setting the language of the text to "None" doesn't seem to make any difference, unless there is some processing that isn't disabled when this is done.
Poked at this a little with callgrind. There appear to be a number of villains in this case:

1) SwTxtFrm::CollectAutoCmplWrds
   Called from beneath SwLayIdle::DoIdleJob()
   Workaround: Disable Tools – Autocorrect Options... – Word Completion – Collect words

2) SwTxtFrm::_AutoSpell
   Called from beneath SwLayIdle::DoIdleJob()
   Workaround: Disable Tools – Automatic Spell Checking

3) SwTxtNode::CountWords
   Called from beneath SwLayIdle::DoIdleJob()
   Workaround: None found

4) SwTxtNode::CountWords
   Called from beneath DocumentStatisticsManager::IncrementalDocStatCalculate
   Workaround: None found

Each of these spends a long time dissecting text using SwScanner. In addition, (3) and (4) appear to be counting the same words twice, which compounds the fact that it's a slow operation on a long paragraph.

With all four disabled (commenting code out where necessary), editing the giant paragraph in the text document is merely slow rather than intolerable.
For a paragraph with N continuous characters of Chinese text (e.g. N x "中"), iterating over the paragraph with SwScanner will cause xdictionary::getWordBoundary() to be called N times. Each call invokes xdictionary::seekSegment(), which in turn iterates over each of the N characters, for a total on the order of N^2 operations.

This needs refactoring so seekSegment() doesn't keep doing the same work over and over again.

For Chinese text, the path through from SwScanner to xdictionary goes like this:

frame #1: 0x0000000110919243 libi18npoollo.dylib`com::sun::star::i18n::xdictionary::seekSegment(this=0x000000010c08a000, rText=0x00007fff5fbfa930, pos=1, segBoundary=0x000000010c08a028) + 115 at xdictionary.cxx:280
frame #2: 0x00000001109199dd libi18npoollo.dylib`com::sun::star::i18n::xdictionary::getWordBoundary(this=0x000000010c08a000, rText=0x00007fff5fbfa930, anyPos=1, wordType=3, bDirection=true) + 173 at xdictionary.cxx:412
frame #3: 0x00000001109073e7 libi18npoollo.dylib`com::sun::star::i18n::BreakIterator_CJK::getWordBoundary(this=0x000000011ff93ab8, text=0x00007fff5fbfa930, anyPos=1, nLocale=0x00000001206d7600, wordType=3, bDirection='\x01') + 119 at breakiterator_cjk.cxx:81
frame #4: 0x000000011090753c libi18npoollo.dylib`non-virtual thunk to com::sun::star::i18n::BreakIterator_CJK::getWordBoundary(this=0x000000011ff93ae0, text=0x00007fff5fbfa930, anyPos=1, nLocale=0x00000001206d7600, wordType=3, bDirection='\x01') + 92 at breakiterator_cjk.cxx:88
frame #5: 0x000000011090e0e4 libi18npoollo.dylib`com::sun::star::i18n::BreakIteratorImpl::getWordBoundary(this=0x0000000117072b78, Text=0x00007fff5fbfa930, nPos=1, rLocale=0x00000001206d7600, rWordType=3, bDirection='\x01') + 612 at breakiteratorImpl.cxx:182
frame #6: 0x000000011090e1dc libi18npoollo.dylib`non-virtual thunk to com::sun::star::i18n::BreakIteratorImpl::getWordBoundary(this=0x0000000117072ba0, Text=0x00007fff5fbfa930, nPos=1, rLocale=0x00000001206d7600, rWordType=3, bDirection='\x01') + 92 at breakiteratorImpl.cxx:186
frame #7: 0x00000001184e85d0 libswlo.dylib`SwScanner::NextWord(this=0x00007fff5fbfa918) + 1296 at txtedt.cxx:836
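
To make the N^2 behaviour concrete, here is a minimal standalone sketch. It is not LibreOffice code: Segment, isCjk, seekNaive, CachedSeeker, scanNaive and scanCached are all hypothetical names, and the "segment" rule (a run of characters that are all CJK or all non-CJK) is a simplification of whatever xdictionary::seekSegment() really does. It only shows why rescanning the run on every call is quadratic, and how remembering the last segment could make a forward scan linear. The cache assumes the text does not change between calls; real code would need to invalidate it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a segment boundary; not the real xdictionary type.
struct Segment {
    std::size_t start;  // index of the first character in the segment
    std::size_t end;    // one past the last character in the segment
};

// Crude illustrative test: only the CJK Unified Ideographs block counts as CJK.
inline bool isCjk(char32_t c) { return c >= 0x4E00 && c <= 0x9FFF; }

// Naive lookup: walks outwards from pos over the whole run on every call.
// `steps` accumulates the number of neighbouring characters examined.
inline Segment seekNaive(const std::u32string& text, std::size_t pos,
                         std::size_t& steps) {
    assert(pos < text.size());
    const bool cjk = isCjk(text[pos]);
    Segment seg{pos, pos + 1};
    while (seg.start > 0 && isCjk(text[seg.start - 1]) == cjk) {
        --seg.start;
        ++steps;
    }
    while (seg.end < text.size() && isCjk(text[seg.end]) == cjk) {
        ++seg.end;
        ++steps;
    }
    return seg;
}

// Cached lookup: reuses the previous segment while pos stays inside it.
// Assumption: the caller always passes the same, unmodified text.
class CachedSeeker {
public:
    Segment seek(const std::u32string& text, std::size_t pos,
                 std::size_t& steps) {
        if (!valid_ || pos < cached_.start || pos >= cached_.end) {
            cached_ = seekNaive(text, pos, steps);
            valid_ = true;
        }
        return cached_;
    }

private:
    Segment cached_{0, 0};
    bool valid_ = false;
};

// Query every position in turn, as a word scanner would, without caching.
inline std::size_t scanNaive(const std::u32string& text) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        seekNaive(text, i, steps);
    }
    return steps;
}

// Same walk, but through the cache.
inline std::size_t scanCached(const std::u32string& text) {
    std::size_t steps = 0;
    CachedSeeker seeker;
    for (std::size_t i = 0; i < text.size(); ++i) {
        seeker.seek(text, i, steps);
    }
    return steps;
}
```

For a paragraph of N identical CJK characters, scanNaive examines N x (N - 1) neighbouring characters, while scanCached examines N - 1, because only the first query has to find the run boundaries. Whether the committed patch takes this exact approach is not something this sketch claims.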
Matthew J. Francis committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=a34a8fca21c670c4e7ee147d05ed9e6e4136cbe1 fdo#81272 Speed up break iterators The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Caolan McNamara committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=997d1387abcfa40eca8d15a2fe025edc4a1de040 Revert "fdo#81272 Speed up break iterators" The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Matthew J. Francis committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=44ead04eb5fc61a3f56f783adb1509fab440e212 fdo#81272 Speed up break iterators The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
caolanm->fdbugs: Can we call this fixed now ?
Two concerns with closing at this point:

1) The reporter had a slightly different symptom to the problem that's just been patched: the slowdown was dependent on fonts. I never reproduced that on OSX, so the reported issue may not be identical.

2) After the patch, performance with CJK text is much better in the 10,000 character range, but still poor with a few times that (30k or above is still appreciably slow on my local machine). This compares poorly with a paragraph containing the same number of Western characters.

I have a few ideas for further improving performance, but not much that's as simple as the first patch...
(In reply to Matthew Francis from comment #12)

I confirm that this bug still exists in the following version:

Version: 4.4.0.2
Build ID: a3603970151a6ae2596acd62b70112f4d376b990
Locale: zh_CN
Fedora 22 X64.

Steps to reproduce:
1. Open attachment 102679 [details] with Writer;
2. Try to delete some characters using BACKSPACE, or try to type in some text.
--> Very slow.