How do I use the new regex? (2.0.9)



July 15, 2008 at 9:22 pm #447

jumpfroggy
Member

I just installed 2.0.9.806 and tried to search for \(.+\), which in the old PN would match an entire line. Now \( searches for a literal parenthesis instead of starting a group.

However, when I change it to (.+) it highlights from the cursor to the end of the file, and then the second search highlights the entire file. Same thing happens with ^.+$. Maybe the problem is with the multi-line flag not being set for the regex?

Is there any documentation for the regex used in PN? I’d be glad to help make the wiki page, if the wiki would let me into my account (see my other thread). Thanks.

July 16, 2008 at 9:20 am #16017

simon
Key Master

Hmm, the line-end marker does seem to be broken now. It was definitely working at one point, so I must have broken it again. The bol marker ^ still works fine, but the eol marker ‘$’ seems to be matching only the very last character in the file. I’ll fix it, and have opened this bug:

http://code.google.com/p/pnotepad/issues/detail?id=120

July 16, 2008 at 9:22 am #16018

simon
Key Master

Oh, and you’re right in that PN now uses standards-conformant syntax so grouping is done with plain parens:

Capture group: ()

Left paren: \(

Right paren: \)
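
For anyone who wants to try the difference outside PN, here is a minimal sketch using Python's re module, which follows the same convention for parentheses. Python is only a stand-in here; PN's own engine may differ in other details, and the sample text is made up.

```python
import re

text = "f(x)"

# Plain parentheses form a capture group: the matched text is stored as group 1.
grouped = re.search(r"(.+)", text)

# Escaped parentheses match literal '(' and ')' characters in the text.
literal = re.search(r"\(.+\)", text)

print(grouped.group(1))  # f(x)
print(literal.group(0))  # (x)
```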

July 17, 2008 at 8:13 am #16019

simon
Key Master

Ah, I see what’s happening. The whole file is matched in this case because regular expressions are greedy by default and the ‘.’ character is matching line breaks too. The expression therefore keeps gobbling characters until it can’t match any more, so you get everything from the first line start it encounters up to the last line end.

To make regex match in non-greedy mode you use the ? character after either * or +. Your expression, therefore, would be:

^.+?$

This then correctly matches just one line.
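
The same effect can be reproduced in Python as a rough illustration. This is a sketch, not PN's engine: the DOTALL flag is an assumption used to imitate '.' matching line breaks, and MULTILINE makes ^ and $ match at every line. The three-line sample text is invented.

```python
import re

text = "one\ntwo\nthree"

# DOTALL lets '.' match line breaks; MULTILINE lets ^ and $ match at each line.
flags = re.DOTALL | re.MULTILINE

# Greedy: '.+' runs to the end of the text before '$' is tried.
greedy = re.search(r"^.+$", text, flags).group(0)

# Non-greedy: '.+?' stops at the first position where '$' can match.
lazy = re.search(r"^.+?$", text, flags).group(0)

print(repr(greedy))  # 'one\ntwo\nthree'
print(repr(lazy))    # 'one'
```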

July 17, 2008 at 8:14 am #16020

simon
Key Master

The regular expressions engine can be configured to not match . against newline characters – perhaps that would be a good idea?
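
As a point of comparison, Python's re module works this way by default: '.' does not match a newline unless you ask for it. A small sketch of what the original ^.+$ search would then do, with made-up sample text:

```python
import re

text = "one\ntwo\nthree"

# Without DOTALL, '.' stops at line breaks, so even the greedy
# pattern matches one line at a time.
lines = re.findall(r"^.+$", text, re.MULTILINE)

print(lines)  # ['one', 'two', 'three']
```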

July 17, 2008 at 8:19 am #16021

simon
Key Master

I’ve just realised that replacing $ with something doesn’t do what I expected: it simply inserts after the end-of-line.

This is because end-of-line matches are zero-width meaning that they simply match the space between the CRLF and the first character on the next line.

What I’m not sure of is how regular expressions would normally be used to replace empty lines? Replacing ^$ just inserts after empty lines – the behaviour in PN is pretty close to what Visual Studio does.

Anyone any examples of expected behaviour?
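
One data point, from Python's re module rather than PN or Visual Studio: replacing ^$ only inserts text on the empty line, because the match is zero-width. To actually delete an empty line, the pattern has to consume the line break itself. The pattern ^\r?\n below is just one way to do that, and the sample text is invented.

```python
import re

text = "a\n\nb"

# '^$' is zero-width: the replacement is inserted on the empty line
# and the line break after it stays in place.
marked = re.sub(r"^$", "X", text, flags=re.MULTILINE)

# Consuming the line break (optionally preceded by CR) removes the empty line.
stripped = re.sub(r"^\r?\n", "", text, flags=re.MULTILINE)

print(repr(marked))    # 'a\nX\nb'
print(repr(stripped))  # 'a\nb'
```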

July 31, 2008 at 10:15 am #16022

kwantum
Member

Hi guys

I’m using version 2.0.9

Ok, I’m new to this whole search/replace pattern matching stuff. I did read the wiki pages and I now have a pretty good idea of how to use regex. I started with something simple, of course, like the “Fred2XXX” example, but I didn’t get the same results: nothing was replaced.

I’m trying to replace this:

*ORIENTATION 1.20

with this:

<Rotations y = "1.20" />

Would somebody mind showing me another example?

July 31, 2008 at 11:01 am #16023

simon
Key Master

Find: \*ORIENTATION ([0-9\.]+)

Replace: <Rotations y = "\1" />

Explanation of “Find”:

We start with \* because * is a special regular expressions character, and putting the \ in front says that you want the literal character, not the special one.

ORIENTATION matches exactly that, and then we come to the bit that captures the number. The parentheses around the expression () say that you want to use the bit of text that you find later. As it’s the first set of parentheses this will be capture group 1.

The bit inside the parentheses, [0-9\.]+, is a character set including the digits 0-9 and the . character. This is another special character so you need to escape it again (using the \). The set matches any one of those characters, and the plus sign afterwards says to include one or more of them.

Explanation of “Replace”:

Most of the replacement is just text; the only special bit is \1, which gets the text from group 1 that you captured in the find text.
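
If you want to check the pattern outside PN, the same find and replace can be sketched in Python, whose re module uses the same \* escape and \1 backreference. This is only an illustration; PN's dialog takes the two strings directly.

```python
import re

source = "*ORIENTATION 1.20"

# \* matches a literal asterisk; ([0-9.]+) captures the number as group 1.
find = r"\*ORIENTATION ([0-9.]+)"

# \1 in the replacement inserts whatever group 1 captured.
replace = r'<Rotations y = "\1" />'

result = re.sub(find, replace, source)
print(result)  # <Rotations y = "1.20" />
```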

Hope that helps!

July 31, 2008 at 11:03 am #16024

simon
Key Master

To update the conversation above about multi-line regular expressions, I found a bug in the Boost Xpressive regular expressions library and this has now been fixed so we should get some more sane results. You still can’t replace ^$ with a blank line though, and I don’t think that’s a valid use case for regular expressions. In VIM you have to combine that search with an action to run on each match that deletes the line.
