Macro for SwordSearcher to USFM
Posted: 2013-05-01 01:03:23
Hi,
I'm trying to automate conversion of Bible text files from SwordSearcher (SS) format to USFM (http://paratext.org/about/usfm; same in PDF: http://paratext.org/system/files/usfmReference2_35.pdf). I don't need to include all USFM markers, of course (that would be very involved, and SS uses almost none of those markups anyway). I want to do this because I can then output the USFM to .rtf files and have embedded footnotes and nice text I can easily manipulate in Nisus Writer Pro.
The basic format for SS has "$$", the book name abbreviation with chapter number, then a colon and the verse number. The "¶ " [pilcrow + space] indicates beginning of a paragraph. The data between curly brackets {} is footnote data. SS's format is very straightforward, but more information is available in the help file that comes with Forge (module builder software) for SwordSearcher, which can be downloaded here: http://www.swordsearcher.com/forge/index.html. Unfortunately, it's a Windows-only app.
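For anyone who wants to experiment outside of a Nisus macro, the layout described above can be read with a few lines of Python. This is only a sketch: the function name `parse_ss` is made up for illustration, and it assumes that every reference line looks exactly like "$$ Ge 1:4" and that book abbreviations contain no spaces.

```python
import re

# Assumed shape of a reference line: "$$ <book abbreviation> <chapter>:<verse>"
REF_LINE = re.compile(r"^\$\$ (\S+) (\d+):(\d+)\s*$")

def parse_ss(text):
    # Hypothetical helper: returns a list of (book, chapter, verse, verse_text).
    verses = []
    current = None
    for line in text.splitlines():
        m = REF_LINE.match(line)
        if m:
            # A new reference line starts a new verse record.
            current = [m.group(1), int(m.group(2)), int(m.group(3)), ""]
            verses.append(current)
        elif current is not None and line.strip():
            # Any other non-blank line is verse text; join continuation lines.
            current[3] = (current[3] + " " + line.strip()).strip()
    return [tuple(v) for v in verses]

sample = (
    "$$ Ge 1:1\n"
    "¶ In the beginning GOD created the heaven and the earth.\n"
    "$$ Ge 1:3\n"
    "And GOD said, Let there be light: and there was light.\n"
)
for record in parse_ss(sample):
    print(record)
```

The pilcrow and the curly-bracket footnotes are left inside the verse text here, so later steps can decide what to do with them.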
The format for USFM is very different. Genesis 1:1-4 would look like the sample below. Each opening marker ends with a space and each closing marker ends with an asterisk (\nd …\nd* marks names of deity; the SS sample presents them merely in all caps, while USFM uses these tags).
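The all-caps to \nd conversion can be sketched in Python as a single regex substitution. The word list is an assumption: only "GOD" appears in the Genesis sample, so other names would have to be added by hand, and `mark_divine_names` is a hypothetical name.

```python
import re

# Only "GOD" occurs in the sample text; extend this tuple as needed (assumption).
DIVINE_NAMES = ("GOD",)

def mark_divine_names(text, names=DIVINE_NAMES):
    # Match any listed name as a whole word, e.g. "GOD" but not "GODS".
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    # Wrap the name in \nd ...\nd* and change "GOD" to "God".
    return pattern.sub(
        lambda m: "\\nd " + m.group(1).capitalize() + "\\nd*", text
    )

print(mark_divine_names("And GOD said, Let there be light: and there was light."))
```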
The 2nd USFM sample layout is better because it embeds in each footnote the chapter and verse (e.g., "1:4") to which the footnote refers (I don't know how to make a macro do this). It's marked up with the "\fr 1:4 \ft " tags and data.
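In a scripting language the reference can be filled in as long as the current chapter and verse are known when the footnote is rewritten. A minimal Python sketch, assuming footnotes never nest and never span more than one verse (`convert_footnotes` is a hypothetical name):

```python
import re

# A footnote is anything between one pair of curly brackets (no nesting assumed).
FOOTNOTE = re.compile(r"\{([^{}]*)\}")

def convert_footnotes(verse_text, chapter, verse):
    # Build the opening markers with the reference of the verse being processed.
    opening = f"\\f + \\fr {chapter}:{verse} \\ft "
    # Replace each {note} with: opening markers + note text + closing marker.
    return FOOTNOTE.sub(lambda m: opening + m.group(1) + "\\f*", verse_text)

text = "and GOD divided {the light from...: Heb. between the light and between the darkness}the light from the darkness."
print(convert_footnotes(text, 1, 4))
```

The chapter and verse arguments would come from the most recent "$$" reference line seen while walking through the file.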
I am learning how to do macros, and I've successfully done a very basic one that can renumber the verses in one chapter. My problem is that I need a macro that can do an entire book of the Bible at a time and add the chapter number (\c and the #) before verse #1 of each subsequent chapter. I don't know how to make that kind of macro. My Regex (PowerFind Pro) macro so far changes references without getting paragraphs right, and it marks up footnotes, but without putting the chapter and verse reference in the footnote as I would like (see the 2nd USFM sample).
Problems I'm having:
1. Getting the chapter number inserted properly for books with more than one chapter. I've attached a sample SS file for Romans 1:1-12:2 (unfortunately, this file was made before I started putting in pilcrows for beginnings of paragraphs, so if someone uses this, it would be good to insert some pilcrows randomly at the beginning of various verses to see if the \p marker is being converted correctly).
2. Getting the paragraph (prose) marker (\p ) on the line before the verse number in USFM (it follows a verse # in SS).
3. I would really like to get the reference of the verse into the footnote (as in the 2nd USFM example), so when one looks at a note at the bottom of the page (in NWP, after I export to .rtf), one can see immediately that footnote "a" is a comment on 1:4. To do this, "\fr 1:4 \ft " must be added to the footnote text. I don't know how to do that dynamically so the chapter and verse numbers are right.
4. It's not essential, but it'd be nice if the macro could convert the SS book names (e.g., "Ge") to the proper USFM book names (e.g., "\id GEN") at the top of the text. The abbreviations are in a .csv file attached.
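To make the four requirements concrete, here is a rough Python sketch (not a Nisus macro) that handles them in one pass. Everything named here is an assumption: `ss_to_usfm` is a made-up function, `BOOK_IDS` is a two-entry stand-in for the attached .csv (only "Ge" comes from this post; "Ro" for Romans is a guess), and each reference line is assumed to be followed by at least one line of verse text. The \nd markup is not handled.

```python
import re

REF_LINE = re.compile(r"^\$\$ (\S+) (\d+):(\d+)\s*$")
FOOTNOTE = re.compile(r"\{([^{}]*)\}")
# Hypothetical excerpt; the real mapping would be loaded from the .csv file.
BOOK_IDS = {"Ge": "GEN", "Ro": "ROM"}

def ss_to_usfm(text):
    out = []
    book = chapter = verse = None
    pending = False  # True while the current verse's \v marker is unwritten
    for line in text.splitlines():
        m = REF_LINE.match(line)
        if m:
            if m.group(1) != book:
                # Problem 4: emit \id once, when the book changes.
                book, chapter = m.group(1), None
                out.append("\\id " + BOOK_IDS.get(book, book))
            if int(m.group(2)) != chapter:
                # Problem 1: emit \c whenever the chapter number changes.
                chapter = int(m.group(2))
                out.append(f"\\c {chapter}")
            verse = int(m.group(3))
            pending = True
            continue
        body = line.strip()
        if book is None or not body:
            continue
        if body.startswith("¶"):
            # Problem 2: \p goes on its own line before the verse marker.
            out.append("\\p")
            body = body[1:].lstrip()
        if pending:
            out.append(f"\\v {verse}")
            pending = False
        # Problem 3: put the current chapter:verse into each footnote.
        opening = f"\\f + \\fr {chapter}:{verse} \\ft "
        out.append(FOOTNOTE.sub(lambda n: opening + n.group(1) + "\\f*", body))
    return "\n".join(out)

sample = (
    "$$ Ge 1:1\n¶ In the beginning GOD created the heaven and the earth.\n"
    "$$ Ge 1:4\nAnd GOD divided {Heb. between the light}the light from the darkness.\n"
    "$$ Ge 2:1\n¶ Thus the heavens and the earth were finished.\n"
)
print(ss_to_usfm(sample))
```

The key idea is that the \v marker is held back until the verse text has been seen, so a leading pilcrow can be turned into \p first.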
I realize this is quite a project (at least to me), and I'll be grateful for any help. Thanks in advance!
SS sample (Genesis 1:1-4):
$$ Ge 1:1
¶ In the beginning GOD created the heaven and the earth.
$$ Ge 1:2
And the earth was without form, and void; and darkness [was] upon the face of the deep. And the Spirit of GOD moved upon the face of the waters.
$$ Ge 1:3
And GOD said, Let there be light: and there was light.
$$ Ge 1:4
And GOD saw the light, that [it was] good: and GOD divided {the light from...: Heb. between the light and between the darkness}the light from the darkness.
USFM sample (Genesis 1:1-4):
\id GEN
\c 1
\p
\v 1
In the beginning \nd God\nd* created the heaven and the earth.
\v 2
And the earth was without form, and void; and darkness [was] upon the face of the deep. And the Spirit of \nd God\nd* moved upon the face of the waters.
\v 3
And \nd God\nd* said, Let there be light: and there was light.
\v 4
And \nd God\nd* saw the light, that [it was] good: and \nd God\nd* divided \f + the light from...: Heb. between the light and between the darkness\f*the light from the darkness.
2nd USFM sample (verse reference embedded in the footnote):
\id GEN
\c 1
\p
\v 1
In the beginning \nd God\nd* created the heaven and the earth.
\v 2
And the earth was without form, and void; and darkness [was] upon the face of the deep. And the Spirit of \nd God\nd* moved upon the face of the waters.
\v 3
And \nd God\nd* said, Let there be light: and there was light.
\v 4
And \nd God\nd* saw the light, that [it was] good: and \nd God\nd* divided \f + \fr 1:4 \ft the light from...: Heb. between the light and between the darkness\f*the light from the darkness.
My macro so far:
Find and Replace '\\$\\$ [[:upper:]][[:lower:]]+ [[:digit:]]+:', '\\\\v ', 'Ea'
Find and Replace '{', '\\f + ', 'a'
Find and Replace '}', '\\f*', 'a'