Bug 52504 - Regular Expression Search for circumflex by itself does not match anything
Summary: Regular Expression Search for circumflex by itself does not match anything
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.5.3 release
Hardware: All Linux (All)
Importance: medium enhancement
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-25 21:05 UTC by Jim Avera
Modified: 2014-11-05 16:52 UTC

Description Jim Avera 2012-07-25 21:05:07 UTC
Expected behavior: in Writer, Calc, or the Macro Editor, open the Find & Replace window, check the Regular Expression checkbox, enter a circumflex (^) in the Search for drop-down, and search. The beginning of every paragraph should be found. The LO Wiki and the built-in LO help imply that a circumflex by itself in the find field should match the beginning of a paragraph:
http://help.libreoffice.org/Common/List_of_Regular_Expressions

What happens instead is nothing is found.

NOTE: A dollar sign ($) by itself *does* work as expected, i.e., it matches the end of each line.
Comment 1 Cor Nouws 2012-10-06 20:51:47 UTC
Hi Jim,

Please use "^." (without the quotes) to find the first character of a paragraph.
I think ^ is only used in combination with other characters.
See the examples and explanation in the help.

Regards,
Cor
Comment 2 Jim Avera 2012-10-07 07:56:53 UTC
No.   ^. is not equivalent.  ^. means to match the first character on the line, and if doing a replace then the first character would be deleted.  ^ by itself matches the start of the line (not including any characters), and replacing it with something effectively inserts the "replacement" text at the start of the line.    You could use something ugly like replacing ^(.) with ${1}PREFIX to avoid deleting the first character, but that would fail on blank lines which don't have any characters in them.


In any case, ^ (by itself) is standard, well-defined regular expression syntax used everywhere else (Perl, Python, vim, etc.), and LibreOffice should not do something incompatible.
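
To illustrate the distinction drawn in this comment, here is a minimal sketch using Python's re module as a stand-in for a conventional regex engine. That choice is an assumption made for illustration only; it is not LibreOffice's search code, and the sample text and the "> " prefix are made up. Lines stand in for paragraphs.

```python
import re

# Hypothetical sample: five lines, two of them empty.
text = "a\nb\n\n\nc"

# ^ alone is a zero-width match at the start of every line (MULTILINE mode),
# so the replacement text is inserted and nothing is deleted.
inserted = re.sub(r"^", "> ", text, flags=re.MULTILINE)

# ^. consumes the first character, so a plain replacement deletes it,
# and empty lines are skipped because "." needs a character to match.
consumed = re.sub(r"^.", "> ", text, flags=re.MULTILINE)

# Capturing and re-emitting the first character avoids the deletion,
# but the empty lines are still skipped.
captured = re.sub(r"^(.)", r"> \1", text, flags=re.MULTILINE)

print(repr(inserted))
print(repr(consumed))
print(repr(captured))
```

In this sketch only the first call prefixes all five lines; the other two leave the empty lines untouched, which is the limitation described above.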
Comment 3 Jim Avera 2012-10-07 08:01:57 UTC
If you are unsure how regular expression syntax should work (in industry-wide practice), there are many books and online references, for example

http://en.wikipedia.org/wiki/Regular_expression#POSIX_Basic_Regular_Expressions
Comment 4 Cor Nouws 2012-10-07 22:20:08 UTC
Hi Jim,

OK, sorry, and thanks for the explanation. (In the meantime I understood that the same applies to $, which cannot be used on its own to find the end of a paragraph.)
Did it ever work as expected, or is it something that has to be implemented?
In that case, this would be an enhancement...
Comment 5 Jim Avera 2012-10-09 00:37:57 UTC
AFAIK ^ has never worked correctly.  I doubt anyone intentionally made Open Office regular expressions incompatible with industry practice, so I think this is a bug, not a missing feature.
 
-Jim
Comment 6 Jim Avera 2012-10-09 00:58:11 UTC
Incidentally, $ does match the end of paragraphs (as documented), but it seems to match the paragraph break itself (not just the -position- at the end of the paragraph), so paragraphs are merged into a single new paragraph. The exception is that only one of a group of successive empty paragraphs is matched.

Matching the para-break itself seems odd to me (as usually unhelpful), but might be intentional.  However the fact that only some empty paragraphs are matched is almost certainly a bug.

EXAMPLE: In the following 1-line paragraphs, there are two empty paras between b and c (<P> indicates the paragraph symbol which is shown when displaying non-printing characters):
a<P>
b<P>
<P>
<P>
c<P>
Find-and-replace of $ with X replaces the 5 paragraphs with 2 paragraphs:
aXbX<P>
Xc<P>
As you can see, the 5 paragraphs were collapsed into two paragraphs, except that the "paragraph break" was not removed for one of the empty paragraphs.
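
For comparison, a minimal sketch of how a conventional engine treats $ on the same input. Python's re module is used here as a stand-in, which is an assumption for illustration and not LibreOffice's search code. Newlines stand in for paragraph marks, and the final paragraph mark is omitted so that each line yields exactly one match.

```python
import re

# Hypothetical model of the example: a, b, two empty lines, c.
text = "a\nb\n\n\nc"

# In MULTILINE mode, $ is a zero-width match just before each newline
# and at the end of the string; it does not consume the newline.
result = re.sub(r"$", "X", text, flags=re.MULTILINE)

print(repr(result))
# The number of line breaks is unchanged, so no lines are merged.
print(result.count("\n") == text.count("\n"))
```

Under these assumptions every line, including both empty ones, gets an X appended and the line structure is preserved, unlike the merged result shown above.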
Comment 7 Jim Avera 2014-05-15 07:38:55 UTC
Any thoughts about fixing this?   It's still a problem in 4.3-alpha1

Note that searching for ^. is not a work-around because it will not match the start of empty paragraphs (the "." does not match).   So if you want to prepend something to every paragraph in a selection which includes empty paragraphs, then ^ alone is necessary.
Comment 8 Cor Nouws 2014-06-22 13:34:30 UTC
Isn't your case just covered by using
 & in search and
 \nFOO in replace?

For me that works in Writer
Comment 9 Jim Avera 2014-06-22 22:47:25 UTC
> Isn't your case just covered by using
> & in search and
> \nFOO in replace?

Maybe that was a typo.  The above does not work (does nothing--not matched).
Can you suggest a work-around which inserts some text at the start of every line in Calc's Basic macro editor (including empty lines)?   That's the problem this bug was originally about and which *should* be easy by replacing ^ with the desired text.  That is standard regex behavior everywhere else in computerdom.

^ on its own should work (just like $ on its own does).
Comment 10 Cor Nouws 2014-06-29 18:58:35 UTC
(In reply to comment #9)

> Can you suggest a work-around which inserts some text at the start of every
> line in Calc's Basic macro editor (including empty lines)? 

The component of this issue is Writer .. ?
Comment 11 Jim Avera 2014-06-30 15:56:09 UTC
Not sure where the regex code is. It manifests in Writer and in the Basic macro editor in Calc.
Comment 12 Cor Nouws 2014-11-04 20:53:06 UTC
Still a problem in 4.4.0alpha1.
Setting status to NEW.
Comment 13 Jim Avera 2014-11-05 16:52:58 UTC
Maybe Component should be changed to Spreadsheet, because the problem is more simply visible when editing Basic macro code.  It is common to want to insert spaces at the start of every line in a range (e.g. to "indent" the code one level), and replacing ^ with spaces does not work.

