Why is this macro so slow?

Posted: 2009-07-11 18:52:28
by Nobumi Iyanaga
Hello,

I have two versions of a macro that lists all the style attributes of a given text. One of them [version a] runs fairly quickly, while the other [version b], which looks simpler, is VERY slow. Could you please tell me why the latter is so slow, and how I can make it run faster?

Here is version a:

# Begin macro 'nwp_1-2_to_utf8_test1'

Require Application Version "3.2"
$doc = Document.active
$text = $doc.text
$text = $text.copy
$footnotes = $doc.footnotes
$endnotes = $doc.endnotes
$section_notes = $doc.sectionnotes

$footnotes_ct = $footnotes.count
$endnotes_ct = $endnotes.count
$sectionnotes_ct = $section_notes.count

if $footnotes_ct
	while $footnotes_ct
		$this_footnote = $footnotes[$footnotes_ct - 1]
		$fn_text = $this_footnote.contentSubtext
		$fn_text.findAndReplace '^\s+', '', 'E'
		$fn_text = "<fn>" & $fn_text
		$fn_text &= "</fn>"
		$fn_ref = $this_footnote.documentTextRange
		$text.replaceInRange $fn_ref, $fn_text
		$footnotes_ct -= 1
	End
End

if $endnotes_ct
	while $endnotes_ct
		$this_endnote = $endnotes[$endnotes_ct - 1]
		$en_text = $this_endnote.contentSubtext
		$en_text.findAndReplace '^\s+', '', 'E'
		$en_text = "<en>" & $en_text
		$en_text &= "</en>"
		$en_ref = $this_endnote.documentTextRange
		$text.replaceInRange $en_ref, $en_text
		$endnotes_ct -= 1
	End
End

if $sectionnotes_ct
	while $sectionnotes_ct
		$this_sectionnote = $section_notes[$sectionnotes_ct - 1]
		$sn_text = $this_sectionnote.contentSubtext
		$sn_text.findAndReplace '^\s+', '', 'E'
		$sn_text = "<sn>" & $sn_text
		$sn_text &= "</sn>"
		$sn_ref = $this_sectionnote.documentTextRange
		$text.replaceInRange $sn_ref, $sn_text
		$sectionnotes_ct -= 1
	End
End

$charIndex = 0 
$limit = $text.length
$ranges = Array.new
$attribute_text = ""

# inspect the attributes applied to every character in the text 
While $charIndex < $limit 
	$range = $text.rangeOfDisplayAttributesAtIndex($charIndex) 
	$ranges.push ($range)
	$attributes = $text.displayAttributesAtIndex($charIndex)
	$attribute_text &= $attributes

	# move to the next area that has different attributes 
	$charIndex = $range.bound 
End 

$tmp_file = '/Users/me/Desktop/test_things/NWP_rtf_to_utf8/test_aby.txt'
$separator = Text.newWithCodepoint (2)
File.writeDataToPath "$text $separator $ranges $separator $attribute_text", $tmp_file
exit

And here is version b:

# Begin macro 'nwp_1-2_to_utf8_test2'

Require Application Version "3.2"
$doc = Document.active
$text = $doc.text
$text = $text.copy
$footnotes = $doc.footnotes
$endnotes = $doc.endnotes
$section_notes = $doc.sectionnotes

$footnotes_ct = $footnotes.count
$endnotes_ct = $endnotes.count
$sectionnotes_ct = $section_notes.count

if $footnotes_ct
	while $footnotes_ct
		$this_footnote = $footnotes[$footnotes_ct - 1]
		$fn_text = $this_footnote.contentSubtext
		$fn_text.findAndReplace '^\s+', '', 'E'
		$fn_text = "<fn>" & $fn_text
		$fn_text &= "</fn>"
		$fn_ref = $this_footnote.documentTextRange
		$text.replaceInRange $fn_ref, $fn_text
		$footnotes_ct -= 1
	End
End

if $endnotes_ct
	while $endnotes_ct
		$this_endnote = $endnotes[$endnotes_ct - 1]
		$en_text = $this_endnote.contentSubtext
		$en_text.findAndReplace '^\s+', '', 'E'
		$en_text = "<en>" & $en_text
		$en_text &= "</en>"
		$en_ref = $this_endnote.documentTextRange
		$text.replaceInRange $en_ref, $en_text
		$endnotes_ct -= 1
	End
End

if $sectionnotes_ct
	while $sectionnotes_ct
		$this_sectionnote = $section_notes[$sectionnotes_ct - 1]
		$sn_text = $this_sectionnote.contentSubtext
		$sn_text.findAndReplace '^\s+', '', 'E'
		$sn_text = "<sn>" & $sn_text
		$sn_text &= "</sn>"
		$sn_ref = $this_sectionnote.documentTextRange
		$text.replaceInRange $sn_ref, $sn_text
		$sectionnotes_ct -= 1
	End
End

$charIndex = 0 
$limit = $text.length
$ranges = Array.new
$attribute_text = ""
$black_text = 'textColor=<Red=0.000, Green=0.000, Blue=0.000, Opacity=1.000>, textBackgroundColor=<Red=1.000, Green=1.000, Blue=1.000, Opacity=1.000>'

$the_delimiter_a = Text.newWithCharacter(2)
$the_delimiter_b = Text.newWithCharacter(3)

# inspect the attributes applied to every character in the text 
While $charIndex < $limit
	$attributes = ''
	$range = $text.rangeOfDisplayAttributesAtIndex($charIndex) 
	$attributes &= $text.displayAttributesAtIndex($charIndex)
	$attributes.findAndReplace $black_text, ''
	$attribute_text &= $attributes
	$this_text = $text.subtextInRange ($range)
	$this_text = $the_delimiter_a & $this_text
	$this_text &= $the_delimiter_b
	$attribute_text &= $this_text

	# move to the next area that has different attributes 
	$charIndex = $range.bound 
End 


$tmp_file = '/Users/me/Desktop/test_things/NWP_rtf_to_utf8/test_abz.txt'
#$separator = Text.newWithCodepoint (2)
File.writeDataToPath $attribute_text, $tmp_file
exit

Thank you in advance.
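
For readers without Nisus Writer Pro at hand, the run-walking loop that both versions share can be sketched in Python. This is only an analogy, not macro code: `range_of_attributes_at` is a hypothetical stand-in for rangeOfDisplayAttributesAtIndex, and modeling attributes as one label per character is an assumption of this sketch, not a description of how Nisus stores them.

```python
def range_of_attributes_at(attrs, index):
    """Return (location, bound) of the run of equal attributes containing index.

    Hypothetical stand-in for rangeOfDisplayAttributesAtIndex; attrs holds one
    attribute label per character (an assumption made for this sketch).
    """
    location = index
    while location > 0 and attrs[location - 1] == attrs[index]:
        location -= 1
    bound = index + 1
    while bound < len(attrs) and attrs[bound] == attrs[index]:
        bound += 1
    return location, bound


def walk_runs(text, attrs):
    """Visit each attribute run once by jumping from run bound to run bound."""
    runs = []
    char_index = 0
    while char_index < len(text):
        location, bound = range_of_attributes_at(attrs, char_index)
        runs.append((attrs[char_index], text[location:bound]))
        char_index = bound  # move to the next area that has different attributes
    return runs


text = "plain bold"
attrs = ["regular"] * 6 + ["bold"] * 4
runs = walk_runs(text, attrs)
print(runs)  # [('regular', 'plain '), ('bold', 'bold')]
```

The loop body executes once per attribute run rather than once per character, so in a heavily formatted document whatever work is done inside the loop is repeated many times. The two versions differ only in that per-run work.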

Re: Why is this macro so slow?

Posted: 2009-07-11 19:48:01
by Kino
Use “$this_text = $text.substringInRange ($range)” instead of “$this_text = $text.subtextInRange ($range)”. Appending attributed text using & is very expensive.

By inserting “Debug.setCodeProfilingEnabled true” at the top of the macro, you can see which commands are slow. Martin very kindly made this new command available, probably because I had pestered him repeatedly and insistently with feedback about macro performance ;-)

Re: Why is this macro so slow?

Posted: 2009-07-11 20:53:50
by Nobumi Iyanaga
Hello Kino,

Thank you for your reply. I tried your advice and ran into some problems. The following code is still slow, but it is much faster than the previous version. I found that the most time-consuming commands are the string concatenations and findAndReplace...

# Begin macro 'nwp_1-2_to_utf8_test2'

Require Application Version "3.2"
# Debug.setCodeProfilingEnabled true
# Debug.setDestination "new"
$doc = Document.active
$text = $doc.text
$text = $text.copy
$footnotes = $doc.footnotes
$endnotes = $doc.endnotes
$section_notes = $doc.sectionnotes

$footnotes_ct = $footnotes.count
$endnotes_ct = $endnotes.count
$sectionnotes_ct = $section_notes.count

if $footnotes_ct
	while $footnotes_ct
		$this_footnote = $footnotes[$footnotes_ct - 1]
		$fn_text = $this_footnote.contentSubtext
		$fn_text.findAndReplace '^\s+', '', 'E'
		$fn_text = "<fn>" & $fn_text
		$fn_text &= "</fn>"
		$fn_ref = $this_footnote.documentTextRange
		$text.replaceInRange $fn_ref, $fn_text
		$footnotes_ct -= 1
	End
End

if $endnotes_ct
	while $endnotes_ct
		$this_endnote = $endnotes[$endnotes_ct - 1]
		$en_text = $this_endnote.contentSubtext
		$en_text.findAndReplace '^\s+', '', 'E'
		$en_text = "<en>" & $en_text
		$en_text &= "</en>"
		$en_ref = $this_endnote.documentTextRange
		$text.replaceInRange $en_ref, $en_text
		$endnotes_ct -= 1
	End
End

if $sectionnotes_ct
	while $sectionnotes_ct
		$this_sectionnote = $section_notes[$sectionnotes_ct - 1]
		$sn_text = $this_sectionnote.contentSubtext
		$sn_text.findAndReplace '^\s+', '', 'E'
		$sn_text = "<sn>" & $sn_text
		$sn_text &= "</sn>"
		$sn_ref = $this_sectionnote.documentTextRange
		$text.replaceInRange $sn_ref, $sn_text
		$sectionnotes_ct -= 1
	End
End

$charIndex = 0 
$limit = $text.length
$ranges = Array.new
$attribute_text = ""
$black_text = 'textColor=<Red=0.000, Green=0.000, Blue=0.000, Opacity=1.000>, textBackgroundColor=<Red=1.000, Green=1.000, Blue=1.000, Opacity=1.000>'

$the_delimiter_a = Text.newWithCharacter(2)
$the_delimiter_b = Text.newWithCharacter(3)

# inspect the attributes applied to every character in the text 
While $charIndex < $limit
	$attributes = ''
	$range = $text.rangeOfDisplayAttributesAtIndex($charIndex) 
	$attributes &= $text.displayAttributesAtIndex($charIndex)
#	$attributes.findAndReplace $black_text, ''
#	$attribute_text &= $attributes
	$this_text = $text.substringInRange ($range)
#	$this_text = $the_delimiter_a & $this_text
#	$this_text &= $the_delimiter_b
	$attribute_text &= "$attributes$the_delimiter_a$this_text$the_delimiter_b"

	# move to the next area that has different attributes 
	$charIndex = $range.bound 
End 

$tmp_file = '/Users/me/Desktop/test_things/NWP_rtf_to_utf8/test_abz.txt'
#$separator = Text.newWithCodepoint (2)
File.writeDataToPath $attribute_text, $tmp_file
exit

Re: Why is this macro so slow?

Posted: 2009-07-11 21:32:40
by Kino
Using an array is much faster because, if I don't misunderstand what I have heard from Martin, values in an array (and a hash) remain virtual until they are accessed.

[...]
# inspect the attributes applied to every character in the text 
$output = Array.new
while $charIndex < $limit
	$range = $text.rangeOfDisplayAttributesAtIndex($charIndex)
	$attribute_text = $text.displayAttributesAtIndex($charIndex)
	$attribute_text &= $the_delimiter_a & $text.substringInRange($range)
	$output.appendValue $attribute_text
	# move to the next area that has different attributes 
	$charIndex = $range.bound
end
$output = $output.join $the_delimiter_b
Document.newWithText $output

Re: Why is this macro so slow?

Posted: 2009-07-12 04:44:15
by Nobumi Iyanaga
Hello Kino,

Thank you for your reply. I tried your method. It is definitely the fastest way. I will use this code. Thank you very much!

Re: Why is this macro so slow?

Posted: 2009-07-13 12:22:47
by martin
Kino wrote: Using an array is much faster because, if I don't misunderstand what I have heard from Martin, values in an array (and a hash) remain virtual until they are accessed.
NWP does do some macro optimizations where text objects aren't created (as you say, remain "virtual") until they are actually used. However, that doesn't help here, because all the text objects Nobumi creates in the loop are used. The "join" command is faster because it does some internal work that is essentially the most efficient way a macro can concatenate multiple text objects.

All this said, there's always room to speed up the macro language. Certainly the "&=" operator should be faster.
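
The difference between the two concatenation strategies discussed in this thread can be sketched in Python. This is an analogy only, not Nisus macro code: the function names are hypothetical, `DELIM_B` merely stands in for $the_delimiter_b, and nothing here measures or implies how fast either approach is in Nisus Writer Pro or in Python.

```python
DELIM_B = "\x03"  # stands in for $the_delimiter_b (assumption of this sketch)


def concat_incrementally(records):
    """Grow the result one piece at a time, like repeated `&=` in a macro loop."""
    output = ""
    for position, record in enumerate(records):
        if position > 0:
            output += DELIM_B
        output += record
    return output


def concat_with_join(records):
    """Collect the pieces first, then concatenate them in one call."""
    return DELIM_B.join(records)


# Made-up records in the shape "attributes, delimiter a, run text".
records = [f"attributes-{n}\x02run-{n}" for n in range(5)]
incremental = concat_incrementally(records)
joined = concat_with_join(records)
print(incremental == joined)  # True
```

Both functions return the same string. The first asks the runtime to extend an ever-longer result on every iteration, while the second hands all the pieces over at once, so the total size can be known before any copying happens. That is the general reason a join-style command has more room to be efficient than a chain of appends.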