extracting numerical data from a text file

Get help using and writing Nisus Writer Pro macros.
r1234
Posts: 1
Joined: 2008-12-28 01:14:28

extracting numerical data from a text file

Post by r1234 »

Hi, would there be a simple way of selecting the numbers at the far right of each line in this data (i.e. the 45.20, 128.39, 200.00 etc.) and then either outputting those numbers to a new file or reformatting the text so that they appear in their own column, which could then be totalled? (I'd be doing this on much larger, but similarly structured, files, all extracted from PDFs.)
Thanks for any suggestions for solving this!

11/07/2008 AMAZON.COM AMZN.COM/BILL WA 45.20
11/07/2008 OTHER WORLD COMPUTINWOODSTOCK IL 128.39
11/06/2008 DIRECT RELIEF INTL OGOLETA CA 200.00
11/04/2008 COSTCO GAS #00479 94CULVER CITY CA 36.93
11/04/2008 COSTCO WHSE #00479 9LOS ANGELES CA 262.88
Groucho
Posts: 497
Joined: 2007-03-03 09:55:06
Location: Europe

Re: extracting numerical data from a text file

Post by Groucho »

Hi,
This macro extracts the data you need and pastes them into a new document. (Note: shamelessly adapted from a macro of Martin's).

Code:

# gather every amount found at the end of a line
Find All '(\d+\.\d+$)', 'E'
$doc = Document.active
$sels = $doc.textSelections

# create new document and type each amount into it
New
ForEach $sel in $sels
   $word = $sel.subtext
   Type Text $word
End
And this converts the text into a table and puts your data in a column.

Code:

Find and Replace '(\s)(\d+\.\d+$)', '\t\2', 'Ea'
Select All
Convert to Table
Any trouble let me know.
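
For anyone who wants to check what that find-and-replace does before converting to a table, here is a minimal sketch of the same substitution in Python rather than in the Nisus macro language. It reuses the expression from the macro and three of the sample lines from the first post; the function name `tab_before_amount` is made up for illustration, and Python's `re.MULTILINE` flag is assumed to play the role of `$` matching at each line end.

```python
import re

# Three of the sample lines from the original post.
SAMPLE = (
    "11/07/2008 AMAZON.COM AMZN.COM/BILL WA 45.20\n"
    "11/07/2008 OTHER WORLD COMPUTINWOODSTOCK IL 128.39\n"
    "11/06/2008 DIRECT RELIEF INTL OGOLETA CA 200.00"
)


def tab_before_amount(text: str) -> str:
    """Replace the whitespace character before each trailing amount with a tab.

    Hypothetical helper: same pattern as the macro, with re.MULTILINE so that
    `$` matches at the end of every line and not only at the end of the text.
    """
    return re.sub(r"(\s)(\d+\.\d+$)", r"\t\2", text, flags=re.MULTILINE)


converted = tab_before_amount(SAMPLE)
for line in converted.splitlines():
    # Each line now splits into a description and an amount at the tab.
    description, amount = line.split("\t")
    print(repr(description), repr(amount))
```

Only the last whitespace character on each line becomes a tab, so the result has two columns: everything up to the amount, and the amount itself.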

Greetings, Henry.
Groucho
Posts: 497
Joined: 2007-03-03 09:55:06
Location: Europe

Re: extracting numerical data from a text file

Post by Groucho »

Oops… there was an error in the first macro, sorry. This should work:

Code:

# gather every amount at the end of a line, together with its newline
Find All '(\d+\.\d+\n)', 'E'
$doc = Document.active
$sels = $doc.textSelections

# create new document and type each amount into it
New
ForEach $sel in $sels
   $word = $sel.subtext
   Type Text $word
End
Best Regards, Henry.
Kino
Posts: 400
Joined: 2008-05-17 04:02:32

Re: extracting numerical data from a text file

Post by Kino »

For that purpose, you don't need a foreach loop as NW Pro supports non-contiguous selections. The following is sufficient.

Code:

Find All '\d+\.\d+(?:\n|$)', 'E'
Copy
New
Paste
The find expression contains $ in order to get the last data even if it is not followed by a newline char.
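
To illustrate why the `(?:\n|$)` alternative matters, and to get the total asked for in the first post, here is a rough sketch in Python rather than in the Nisus macro language. It applies the same expression to the five sample lines, with the last line deliberately left without a trailing newline. The names `AMOUNT_AT_LINE_END` and `extract_amounts` are invented for this example, and using `Decimal` for the sum is my own choice to avoid floating-point rounding on currency values.

```python
import re
from decimal import Decimal

# The five sample lines from the original post; note that the last line
# has no trailing newline, which is the case the `$` alternative covers.
SAMPLE = (
    "11/07/2008 AMAZON.COM AMZN.COM/BILL WA 45.20\n"
    "11/07/2008 OTHER WORLD COMPUTINWOODSTOCK IL 128.39\n"
    "11/06/2008 DIRECT RELIEF INTL OGOLETA CA 200.00\n"
    "11/04/2008 COSTCO GAS #00479 94CULVER CITY CA 36.93\n"
    "11/04/2008 COSTCO WHSE #00479 9LOS ANGELES CA 262.88"
)

# Same expression as in the macro: an amount followed by a newline
# or by the end of the text.
AMOUNT_AT_LINE_END = re.compile(r"\d+\.\d+(?:\n|$)")


def extract_amounts(text: str) -> list[Decimal]:
    """Return the trailing amount of every line as a Decimal (hypothetical helper)."""
    # strip() drops the newline that is part of each match except the last.
    return [Decimal(match.strip()) for match in AMOUNT_AT_LINE_END.findall(text)]


amounts = extract_amounts(SAMPLE)
total = sum(amounts, Decimal("0"))
print(amounts)
print("Total:", total)
```

With the sample data this finds all five amounts, including the final 262.88, and the total comes to 673.40. With `\n` alone in the expression, the last amount would be missed whenever the text does not end in a newline.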
Groucho
Posts: 497
Joined: 2007-03-03 09:55:06
Location: Europe

Re: extracting numerical data from a text file

Post by Groucho »

That's what happens when I think to myself, "it can't be that easy, can it?"

Thank you
Henry