Bug 31480 - Find/replace non-printing characters easily
Summary: Find/replace non-printing characters easily
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Libreoffice
Version: unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 31509
Depends on:
Blocks:
 
Reported: 2010-11-08 16:01 UTC by David Nelson
Modified: 2014-12-25 11:17 UTC
CC List: 6 users

See Also:

Description David Nelson 2010-11-08 16:01:27 UTC
Hi, :-)

In Microsoft Office, when you do a find/replace, you have a dropdown list enabling you to easily include many special characters to search for, such as

- carriage return
- new line
- tab mark
- page break
- non-breaking spaces
- and various others.

In LibO, I think you can only do this via regular expressions. But your average user is incapable of using regular expressions.

Could you possibly add a similar dropdown box?

Thanks if so, and thanks very much for your work. :-)
Comment 1 Don't use this account, use tml@iki.fi 2010-11-09 01:47:33 UTC
But note that being able to search for "carriage return", "new line" and "page break" (and possibly the other ones you mention) depends on those being present in the internal representation of text. I am not at all sure that these "typographical concepts" exist in the internal representation of text in OpenOffice.org/LibreOffice. I think I have been told that OOo/LO uses a much more "structured" approach with separate objects for paragraphs etc., and maybe even stores forced line breaks as data structures rather than as actual embedded carriage return and/or new line characters.

So implementing this might be much more complex than it perhaps is in MS Office. That doesn't mean it wouldn't be useful, of course. Even if we keep the traditional OOo way of storing text in LibreOffice, we could present the user with the illusion that the formatting characters you mention are actually present. That might be useful for people migrating from MS Office.

On the other hand, for the (few...) people who actually prefer to think of documents in a structured fashion and not as a stream of characters including formatting characters, being able to search for, say, carriage returns would surely seem unnatural. In an ideal world, that is how one should conceptualize documents, no?

Of course, I might be totally misunderstanding stuff above, and in that case, feel free to correct me, and/or ignore my rambling.
Comment 2 David Nelson 2010-11-09 03:09:26 UTC
Hi Tor, 

Thank you for your comments. I think you've indeed understood what I was on about, but:

I understand what you're talking about as regards LibO/OOo's internal storage.

However, that is invisible to the end user.

I, the dumb end user, pressed the carriage return key while typing. I don't care how the software stores it. But I want to be able to search for that carriage return after.

Same thing when I press Shift-Enter (a "new line" or "soft return"). I want to be able to find those "new line" "characters" after.

Same thing for tab "characters". Etc.

Since I made those keystrokes and they have a result on-screen, they are obviously being stored in some form or other. Otherwise, next time I open it, my doc would look different from the way it looked when I typed it, no? ;-)

I sometimes need to search for the "new line" characters and replace them with a "carriage return" and thus create new paragraphs, etc. Or I need to search for 8 space characters and replace them with a "tab mark" instead.

In MS Office, I have a dropdown list of such "special characters" and it makes life very simple to use them in find/replaces.

Could we get that in LibO, too, please?

Thanks if so. ;-)

Please let me know if I haven't explained clearly. :-)
Comment 3 Kohei Yoshida (inactive) 2010-11-09 11:30:30 UTC
*** Bug 31509 has been marked as a duplicate of this bug. ***
Comment 4 David Nelson 2010-11-09 11:36:34 UTC
Please note that the term I meant was NON-PRINTING CHARACTERS, not "special characters"...
Comment 5 Gudmund 2011-04-16 09:49:45 UTC
(In reply to comment #3)
> *** Bug 31509 has been marked as a duplicate of this bug. ***

(In reply to comment #2)
> Hi Tor, 
> 
> Thank you for your comments. I think you've indeed understood what I was on
> about, but:
> 
> I understand what you're talking about as regards LibO/OOo's internal storage.
> 
> However, that is invisible to the end user.

Indeed, unlike plain text, where you actually can search for and replace newlines (LF), carriage returns (CR) or combinations (CRLF) if you use the right text handling tools.

Some Unicode pointers:
 LF:    Line Feed, U+000A
 CR:    Carriage Return, U+000D
 CR+LF: CR (U+000D) followed by LF (U+000A)
 NEL:   Next Line, U+0085
 LS:    Line Separator, U+2028
 PS:    Paragraph Separator, U+2029

(I wonder how LibreOffice handles plain text files internally, since those characters really *are* there then...)
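
For plain text, the code points listed above can be searched and replaced directly. The following is a minimal sketch in Python, not anything LibreOffice does: the function name `normalize_line_endings` and the sample string are made up for illustration.

```python
import re

# The line terminators listed above. CR+LF comes first in the alternation
# so that the pair is treated as one terminator rather than two.
LINE_ENDINGS = re.compile("\r\n|[\n\r\u0085\u2028\u2029]")

def normalize_line_endings(text: str, replacement: str = "\n") -> str:
    """Replace every recognised line terminator with `replacement`."""
    return LINE_ENDINGS.sub(replacement, text)

# One string mixing CR+LF, CR, LS, PS, NEL and LF.
sample = "one\r\ntwo\rthree\u2028four\u2029five\u0085six\nseven"
print(normalize_line_endings(sample).split("\n"))
```

Passing a different `replacement` (for example `"\r\n"`) converts in the other direction.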

> I, the dumb end user, pressed the carriage return key while typing. I don't
> care how the software stores it. But I want to be able to search for that
> carriage return after.
> 
> Same thing when I press Shift-Enter (a "new line" or "soft return"). I want to
> be able to find those "new line" "characters" after.

I can't see why LibreOffice couldn't give the user an easy way to *both* search for *and* replace arbitrary combinations of CR and LF by operating on the corresponding elements inside content.xml.

> Since I made those keystrokes and they have a result on-screen, they are
> obviously being stored in some form or other. Otherwise, next time I open it,
> my doc would look different from the way it looked when I typed it, no? ;-)


This is what it can look like:
"<text:p text:style-name="Standard">Two paragraphs starting with this line ending</text:p>
<text:p text:style-name="Standard"/>
<text:p text:style-name="Standard">Two newlines starting with this line ending<text:line-break/>
<text:line-break/>"

It looks like there may be a few cases to handle. Paragraphs seem to have an opening tag (<text:p text:style-name="Standard">) and a closing tag (</text:p>), which collapse into a single self-closing tag (<text:p text:style-name="Standard"/>) when the line has no text, while newlines are always a single self-closing tag (<text:line-break/>).

Writing a textutils script that can handle this simple example is a bit of work, but surely not too hard, even for a non-programmer like me, so pros like the LibreOffice developers shouldn't find it hard at all ;).
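
As a rough illustration of such a script, here is a sketch in Python (rather than textutils) that turns every <text:line-break/> inside a paragraph into a paragraph boundary. It assumes the standard ODF text namespace and a paragraph containing nothing but text and line breaks; the wrapper element `root`, the function name `split_on_line_breaks`, and the choice to copy the original paragraph's style onto each new paragraph are all assumptions made for the example, not LibreOffice behaviour.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ET.register_namespace("text", TEXT_NS)

P = f"{{{TEXT_NS}}}p"
LINE_BREAK = f"{{{TEXT_NS}}}line-break"
STYLE = f"{{{TEXT_NS}}}style-name"

def split_on_line_breaks(paragraph):
    """Return new <text:p> elements, one per line of `paragraph`.

    Only paragraphs whose children are all <text:line-break/> are
    supported; spans, fields and so on are out of scope for this sketch.
    """
    style = paragraph.get(STYLE)
    lines = [paragraph.text or ""]
    for child in paragraph:
        if child.tag != LINE_BREAK:
            raise ValueError(f"unsupported child element: {child.tag}")
        # Text following a line break is stored as that element's tail.
        lines.append(child.tail or "")
    result = []
    for line in lines:
        new_p = ET.Element(P)
        if style is not None:
            new_p.set(STYLE, style)  # assumption: reuse the original style
        new_p.text = line or None
        result.append(new_p)
    return result

# Hypothetical wrapper element standing in for the document body.
fragment = (
    f'<root xmlns:text="{TEXT_NS}">'
    '<text:p text:style-name="Standard">first<text:line-break/>second'
    '<text:line-break/>third</text:p>'
    '</root>'
)
root = ET.fromstring(fragment)
new_children = []
for old_p in list(root):
    new_children.extend(split_on_line_breaks(old_p))
root[:] = new_children  # swap the old paragraph for the new ones
print(ET.tostring(root, encoding="unicode"))
```

A real implementation would also have to deal with nested elements such as <text:span>, which this sketch deliberately rejects.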

The only potential problem I can see in this simple example is the "Standard" style-name bit inside the opening tag. Is there a policy for this in LibreOffice, like using the closest preceding one, or polling a standard template?

My guess at why LibreOffice handles it this way is that it helps it treat paragraphs, newlines etc. uniformly across platforms that have different ways of handling new lines.

> I sometimes need to search for the "new line" characters and replace them with
> a "carriage return" and thus create new paragraphs, etc. Or I need to search
> for 8 space characters and replace them with a "tab mark" instead.

You're not alone in this. It's a showstopper for me and many others: it restricts LibreOffice to a very limited number of tasks and forces me to keep MS Office, which I want to get rid of.
Comment 6 Björn Michaelsen 2011-12-23 11:34:04 UTC
[This is an automated message.]
This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it
started right out as NEW without ever being explicitly confirmed. The bug is
changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back
to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases.
Details on how to test the 3.5.0 beta1 can be found at:
http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1

more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
Comment 7 sasha.libreoffice 2012-03-23 07:28:49 UTC
Not implemented yet in 3.5.1.
> - carriage return
> - new line
> - tab mark
IMHO it would be easier to add information to the context Help and tooltips on how to search for these characters using regular expressions than to actually implement this.
The same goes for replacing them.

The problem is with these:
> - page break
> - non-breaking spaces
I do not know how to find them using regular expressions.
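
For what it's worth, a non-breaking space is an ordinary character (U+00A0, NO-BREAK SPACE), so a regex engine can in principle match it by code point. The sketch below shows this in Python; whether LibreOffice's Find & Replace dialog accepts an equivalent escape is not confirmed anywhere in this report, and a page break may not be a character at all, so it is not covered. The function names are made up for the example.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE

def count_nbsp(text: str) -> int:
    """Count the non-breaking spaces in `text`."""
    return len(re.findall(NBSP, text))

def replace_nbsp(text: str, replacement: str = " ") -> str:
    """Replace every non-breaking space in `text` with `replacement`."""
    return re.sub(NBSP, replacement, text)

sample = "10\u00a0km and 5\u00a0kg"
print(count_nbsp(sample))
print(replace_nbsp(sample) == "10 km and 5 kg")
```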
Comment 8 Gryllida 2012-04-26 17:43:22 UTC
Implementing a graphical user interface (drop-down list) for at least the existing regular expressions, such as \t, \n, $, ^, would be useful to novice users.
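
One way to picture such a drop-down is as a table from user-facing labels to regex tokens, with each menu choice appending a token to the search field. The sketch below uses Python's `re` module; the labels, the table `SPECIAL_TOKENS`, and both helper functions are hypothetical, and the tokens carry Python's meaning, which may differ from what LibreOffice's dialog does with the same text.

```python
import re

# Hypothetical label -> regex token table backing the drop-down.
SPECIAL_TOKENS = {
    "Tab": r"\t",
    "Line break": r"\n",
    "Start of line": "^",
    "End of line": "$",
}

def append_token(pattern: str, label: str) -> str:
    """Append the token for a drop-down entry to the search pattern."""
    return pattern + SPECIAL_TOKENS[label]

def append_literal(pattern: str, text: str) -> str:
    """Append user-typed text, escaped so it is matched literally."""
    return pattern + re.escape(text)

# "Start of line" followed by eight typed spaces, replaced by a tab.
pattern = append_token("", "Start of line")
pattern = append_literal(pattern, " " * 8)
converted = re.sub(pattern, "\t", "        indented\nplain", flags=re.MULTILINE)
print(repr(converted))
```

Keeping typed text and menu tokens separate means the user never has to know which characters need escaping.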

There is an add-on ("alternative find and replace" [1]) which does the job, probably including workarounds for the way LibreOffice stores text: it can actually handle \n in a way different from what the regular expressions page [2] describes. It could probably help with implementing this request.

[1] http://extensions.openoffice.org/en/project/AltSearch
[2] http://help.libreoffice.org/Common/List_of_Regular_Expressions
Comment 9 QA Administrators 2014-10-23 17:32:08 UTC
Please read this message in its entirety before responding.

Your bug was confirmed at least 1 year ago and has not had any activity on it for over a year. Your bug is still set to NEW which means that it is open and confirmed. It would be nice to have the bug confirmed on a newer version than the version reported in the original report to know that the bug is still present -- sometimes a bug is inadvertently fixed over time and just never closed.

If you have time please do the following:
1) Test to see if the bug is still present on a currently supported version of LibreOffice (preferably 4.2 or newer).
2) If it is present please leave a comment telling us your version of LibreOffice and your operating system.
3) If it is NOT present please set the bug to RESOLVED-WORKSFORME and leave a short comment telling us your version and operating system.

Please DO NOT
1) Update the version field
2) Reply via email (please reply directly on the bug tracker)
3) Set the bug to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
LibreOffice is powered by a team of volunteers, every bug is confirmed (triaged) by human beings who mostly give their time for free. We invite you to join our triaging by checking out this link:
https://wiki.documentfoundation.org/QA/BugTriage

There are also other ways to get involved including with marketing, UX, documentation, and of course developing -  http://www.libreoffice.org/get-help/mailing-lists/. 

Lastly, good bug reports help tremendously in making the process go more smoothly. Please always provide reproducible steps (even if it seems easy) and attach any and all relevant material.
Comment 10 sasha.libreoffice 2014-10-24 11:08:51 UTC
Not implemented yet in 4.3.1.2.
Comment 11 Adolfo Jayme 2014-12-25 11:17:11 UTC
*** Bug 87645 has been marked as a duplicate of this bug. ***