Discover repeating words

Posted: 2009-07-21 07:58:13
by js
To discover repeating words I use this Find expression:

Code:

 ((?:\m\w+\M)) \1 
The drawback is that this also matches when the second word only _begins_ with the first one, as in “be better” or “can candy”. How can I avoid these?
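
The false positive can be reproduced outside Nisus. Below is a minimal sketch in Python, assuming `\b` as a stand-in for the `\m` and `\M` word-boundary markers (Python's `re` module does not support those, so this approximates the expression rather than running the PowerFind Pro engine). The helper name `loose_hits` is made up for illustration.

```python
import re

# Approximation of the expression above: word boundaries surround the
# captured word, but nothing anchors the end of the backreference.
loose = re.compile(r"(\b\w+\b) \1")

def loose_hits(text):
    """Return the text matched by each hit of the loose pattern."""
    return [m.group(0) for m in loose.finditer(text)]

print(loose_hits("be better"))  # the hit ends inside "better"
print(loose_hits("can candy"))  # the hit ends inside "candy"
print(loose_hits("the the"))    # a genuine repeated word
```

The backreference only requires the same characters to appear again. It does not require the second word to end there, which is why a longer word starting with the first one also counts as a hit.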

Re: Discover repeating words

Posted: 2009-07-21 09:36:01
by Groucho
There's many a way to it. Try this:

Code:

(\<[\w]+\>) \1
I hope this is enough.

Greetings, Henry.

Re: Discover repeating words

Posted: 2009-07-21 09:37:11
by Groucho
Groucho wrote:There's many a way to it. Try this:

Code:

(\<[\w]+\>) \1
I hope this is enough. (But I'm sure now Kino will come out with a score of objections and a biblical macro. :) )

Greetings, Henry.

Re: Discover repeating words

Posted: 2009-07-21 10:13:53
by Hamid
js wrote: The drawback is that this also matches when the second word only _begins_ with the first one, as in “be better” or “can candy”. How can I avoid these?
You should have 'Whole Word' checked in the Find window.

Re: Discover repeating words

Posted: 2009-07-21 12:33:31
by js
I am afraid that Groucho's alternative does not work better than the one I had. Hamid is right: adding the -w makes both of them work. Thank you.

Re: Discover repeating words

Posted: 2009-07-21 13:06:03
by Groucho
This will work for sure.

Code:

\<(\w+) \1\>
And you don't need to check Whole Word.
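
For comparison, here is a small Python sketch of the same idea, again assuming `\b` in place of `\<` and `\>` (Python's `re` has neither, so this is an approximation). The point is that the closing boundary now sits after the backreference, so the second word has to end where the copy ends. `strict_hits` is a hypothetical helper name.

```python
import re

# Boundary before the first word and after the repeated copy.
strict = re.compile(r"\b(\w+) \1\b")

def strict_hits(text):
    """Return the text matched by each hit of the anchored pattern."""
    return [m.group(0) for m in strict.finditer(text)]

print(strict_hits("be better"))         # no hit expected
print(strict_hits("can candy"))         # no hit expected
print(strict_hits("at the the theatre"))
```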

Henry.

Re: Discover repeating words

Posted: 2009-07-22 04:20:11
by js
How about improving this macro so that one of the two items is selected, to easily delete if desired?

Re: Discover repeating words

Posted: 2009-07-22 07:21:06
by Groucho
Here it is:

Code:

Find and Replace '\<(\w+) \1\>', '\1', 'Ea'
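
A rough Python counterpart of this replace-all step, as a sketch only: `re.sub` stands in for the macro command, `\b` stands in for `\<` and `\>`, and `collapse_doubles` is a name invented here.

```python
import re

def collapse_doubles(text):
    """Replace every immediately repeated word with a single copy."""
    return re.sub(r"\b(\w+) \1\b", r"\1", text)

print(collapse_doubles("This is is a test of the the macro."))
```

In this Python sketch the matches do not overlap, so a triple such as "the the the" is reduced to a pair in one pass and would need a second pass to collapse fully.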
By the way (for Martin), I tried the macroize expression command and found it escapes already escaped elements. Like this:

Code:

Find and Replace '\\<(\\w+) \\1\>', '\\1', 'Ea'
This is the first time I have used the Macroize command, though, so maybe I did something wrong. I will look into it more deeply when I have time.

Greetings, Henry.

Re: Discover repeating words

Posted: 2009-07-22 08:53:23
by js
My idea was to have the macro show the next case of two repeating words, and have one of them already selected so that you can easily delete it. Sometimes you don't want to delete it, e.g. in "He knocked on the door: toc toc."
Of course this can be done in a two line macro. But I wondered if it is also possible in one line.

Re: Discover repeating words

Posted: 2009-07-22 11:30:40
by martin
Groucho wrote:By the way (for Martin), I tried the macroize expression command and found it escapes already escaped elements.
That's not a problem, because the backslashes are first interpreted in the context of the string literal, and then reinterpreted as a PowerFind/regex. For string literals a "\\" sequence means a single backslash. Perhaps the superfluous escaping is confusing, but both macro commands behave the same.
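
The two-stage interpretation can be illustrated by analogy with Python string literals, where `"\\"` also denotes a single backslash. This is only an analogy, not a description of the Nisus macro parser, and `\b` is used because Python's `re` lacks `\<` and `\>`.

```python
import re

# Stage 1: the string literal turns each "\\" into one backslash.
escaped = "\\b(\\w+) \\1\\b"
# A raw literal spells the same characters without doubling.
raw = r"\b(\w+) \1\b"

print(escaped)          # the text the regex engine actually receives
print(escaped == raw)   # both literals denote the same string

# Stage 2: the resulting text is interpreted as a regular expression.
print(bool(re.search(escaped, "the the")))
```

One difference from the behaviour described above: in a non-raw Python literal a single `\b` is a backspace character, so in Python the analogy holds for the doubled form and the raw form only.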
js wrote: My idea was to have the macro show the next case of two repeating words, and have one of them already selected so that you can easily delete it
Here's a command that does that:

Code:

Find '\b(\w+)\s(?=\1\b)', 'E'
I've replaced the literal space with "\s", which matches any single whitespace character, and use "\b" for word boundary (either start or end) instead of the less standard "\<" and "\>".
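
To see which part of the text such an expression covers, here is a short Python sketch using the same pattern (Python's `re` supports `\b`, `\s` and lookahead). Because the second copy sits inside the lookahead `(?=...)`, it is checked but not included in the match, so only the first word and the following whitespace would be selected. `first_copy_span` is a hypothetical helper.

```python
import re

first_of_pair = re.compile(r"\b(\w+)\s(?=\1\b)")

def first_copy_span(text):
    """Return (start, end, matched_text) for the first doubled word, or None."""
    m = first_of_pair.search(text)
    return None if m is None else (m.start(), m.end(), m.group(0))

text = "He knocked on the door: toc toc."
print(first_copy_span(text))
print(first_copy_span("be better"))
```

Deleting the matched span would leave a single "toc" followed directly by the full stop, which is the behaviour js asked for.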

Re: Discover repeating words

Posted: 2009-07-22 11:44:57
by Groucho
Ah, I thought you wanted to replace all double occurrences at once. I was sloppy, sorry.
Why, if you want to proceed step by step you don't need a macro. You had better use a Find/Replace expression.

PowerFind Pro enabled.
Find what: \<(\w+) \1\>
Replace with: \1

You can save the expression for future use.
1. Click on the gear next to the "Find what" field and select Save Expression… to save a Find expression.
2. Click on the gear next to the "Replace with" field and select Save Expression… to save a Replace expression.*
3. To find the next occurrence select Edit > Find > Find Next (or hit Command-G).
4. To delete the detested double select Edit > Find > Replace and Find. (I have set the menu key Command-Option-G for this in Preferences > Menu Keys.)

*It'd be great if NWP allowed saving find and replace expression pairs, like TextWrangler.
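
The step-by-step workflow above (skip with Find Next, delete with Replace and Find) can be imitated in Python with a replacement callback. This is a sketch under assumptions: `\b` replaces `\<` and `\>`, and the `keep` predicate is a hypothetical stand-in for the decision you make by hand at each occurrence.

```python
import re

double = re.compile(r"\b(\w+) \1\b")

def review_doubles(text, keep):
    """Collapse each doubled word unless keep(word) says to leave it."""
    def decide(m):
        # Returning the whole match leaves this occurrence untouched,
        # like Find Next; returning group 1 drops one copy,
        # like Replace and Find.
        return m.group(0) if keep(m.group(1)) else m.group(1)
    return double.sub(decide, text)

text = "He knocked on the the door: toc toc."
print(review_doubles(text, keep=lambda word: word == "toc"))
```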

Greetings, Henry.

Re: Discover repeating words

Posted: 2009-07-23 06:25:45
by js
Clever. Thank you.

Re: Discover repeating words

Posted: 2009-07-23 08:24:37
by Groucho
You're welcome, js.

And…
martin wrote:That's not a problem, because the backslashes are…
Thanks for the tip, Martin.

Greetings, Henry.