How to split text at occurrences of attribute

mmt
Posts: 5
Joined: 2007-12-06 21:00:34

Post by mmt »

I have a set of 26 long texts written in Nisus Classic (a specialized dictionary/encyclopedia, actually), each of which consists of many items. An item may be a line or several pages long, but each item starts with a line that has an attribute "item". There is one text per letter of the alphabet, holding the items that start with that letter. I'm trying to find a way of automatically extracting each item into a separate file in a folder, such that the file has the name and text of the item.

For example the text in the file "D" starts with (using pseudo XML to indicate the attribute):

<item>D,d</item>
lots of text in many paras, but with no instances of the "item" attribute
<item>dactylology</item>
a couple of lines
<item>dagger</item>
more text
<item>...

I would like to be able at least to separate these into Nisus open windows called "D,d" "dactylology", "dagger",..., (there would be a lot of windows), and preferably to write them out automagically to files with those names.

The first step would seem to be to select all text between one occurrence of the pattern <item>xxx</item> and the next. I haven't even been able to produce that selection for the first item in a manual find with either PowerFind or PowerFind Pro, which makes it rather hard to start writing a macro to do the job for all the items.
martin
Official Nisus Person
Posts: 5228
Joined: 2002-07-11 17:14:10
Location: San Diego, CA
Contact:

Post by martin »

What do you mean when you say attribute exactly? Are there literal character codes that surround the item? Or is there some sort of special formatting attribute applied? If so, what is the formatting?
mmt
Posts: 5
Joined: 2007-12-06 21:00:34

Post by mmt »

I mean that, in the example, "D,d" "dactylology" and "dagger" have paragraph style "item". Nothing in between has that paragraph style, though the intervening material, which could be anything from one line to many pages, may have other paragraph and character styles. I'd like to wind up eventually with a file called "D,d" with content
D,d
lots of text in many paras, but with no instances of the "item" attribute
another file called "dactylology" with content
dactylology
a couple of lines
etc. etc.

There are other things I want to do with this material, including hyperlinking across items, but that's a start.
martin
Official Nisus Person
Posts: 5228
Joined: 2002-07-11 17:14:10
Location: San Diego, CA
Contact:

Post by martin »

Here's a skeleton macro that should get you going:

# move to start of document
Set Selection 1, 0

# use attribute sensitive find to visit all items
While Find ".+", "Eu-W"
    $itemText = Read Selection

    # Calculate the text that follows the item. 
    Select End
    $entryStart = Selection Location
    If Find ".+", "Eu-W"
        $entryEnd = Selection Location
    Else
        $entryEnd = Selected Storage Length
    End
    $entryLen = $entryEnd
    $entryLen -= $entryStart
    Set Selection $entryStart, $entryLen
    $entryText = Read Selection

    # create a new document for the entry
    New
    Type Attributed Text $itemText
    Type Newline
    Type Attributed Text $entryText
    Save As "~/Desktop/$itemText.rtf"
    Close
End
After pasting this code into a new macro, it's important that you apply the proper attributes to both of the "Find" commands:

1. Select, in their entirety, both of the paragraphs that contain the Find commands, i.e. the ones starting with While and If.
2. Use the menu Format > Remove Attributes and Styles.
3. Apply your special "item" paragraph style.
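
For anyone who wants to check the splitting logic outside of Nisus, here is a minimal sketch in Python. It is not Nisus macro code and not the method above. It assumes the text has been exported as plain text with the pseudo-XML `<item>` markers from the first post standing in for the "item" paragraph style. The names `ITEM_PATTERN`, `split_items` and `write_items` are made up for this illustration.

```python
import re
from pathlib import Path

# Assumption: the "item" paragraph style is represented by the thread's
# pseudo-XML markers, one heading per line.
ITEM_PATTERN = re.compile(r"<item>(.+?)</item>\n?")


def split_items(text):
    """Return a list of (item_name, entry_text) pairs in document order."""
    matches = list(ITEM_PATTERN.finditer(text))
    entries = []
    for index, match in enumerate(matches):
        entry_start = match.end()
        # An entry runs up to the next item heading, or to the end of the text.
        if index + 1 < len(matches):
            entry_end = matches[index + 1].start()
        else:
            entry_end = len(text)
        entries.append((match.group(1), text[entry_start:entry_end]))
    return entries


def write_items(text, folder):
    """Write each entry to <folder>/<item name>.txt, heading line first."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for name, body in split_items(text):
        (folder / f"{name}.txt").write_text(f"{name}\n{body}", encoding="utf-8")


sample = (
    "<item>D,d</item>\n"
    "lots of text in many paras\n"
    "<item>dactylology</item>\n"
    "a couple of lines\n"
    "<item>dagger</item>\n"
    "more text\n"
)

for name, body in split_items(sample):
    print(repr(name), repr(body))
```

The start/end arithmetic mirrors the `$entryStart` and `$entryEnd` bookkeeping in the macro. Note that this sketch discards all styling, and an item name containing a character that is not allowed in file names (a slash, for instance) would need to be cleaned up before `write_items` could save it.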
mmt
Posts: 5
Joined: 2007-12-06 21:00:34

Post by mmt »

Wow, thanks a lot. That's above and beyond the call of duty!

I think I understand it except for "Eu-W", which I can doubtless find in the manual. I'll see if I can make it work.
mmt
Posts: 5
Joined: 2007-12-06 21:00:34

Post by mmt »

Well, I did find out what Eu-W means, and I tried to run the macro, having done as you said with the Format of the Find lines. It apparently didn't find anything, so I tried a bit of debugging, using a Prompt as a debugging tool (I imagine there's a better way to debug Macros, but I didn't find it in the Manual).

I tried putting before the "while" loop an attribute-free Find of a word I knew to be in the text. It found the word and presented the Prompt. I then substituted the attribute-free Find with the Styled one in the while loop, and redid the reformatting to make it look for the "item" Paragraph Style, but this still didn't Find anything. I checked that the Paragraph Style of the target items was the same as the Paragraph Style I applied to the Find line. At this point I couldn't think of what to check for next.

All this was made very difficult because NW Pro kept crashing every second or third time I saved a revised version of the Macro and tried to reload it. So I've given up for the time being after the fourth such crash.

(In case the crashing might be due to some bug, here's my setup)

NWPro 1.0.2 on Mac OS X 10.4.10, dual NewerTech 1.8 GHz 7448 PPC processors on what was a dual 500 MHz Gigabit Ethernet machine, 2 GB RAM, 2 internal and 3 FireWire disks, ATI Radeon 9000 graphics card.
martin
Official Nisus Person
Posts: 5228
Joined: 2002-07-11 17:14:10
Location: San Diego, CA
Contact:

Post by martin »

After you trigger a crash, please send in a feedback report using the menu Help > Send Feedback. In this case it would help to have a copy of your macro and target document; I can also take a look and see why your attribute-sensitive find is failing.
mmt
Posts: 5
Joined: 2007-12-06 21:00:34

Post by mmt »

Boy, are you on the ball! I'll do as you say.