Why is this macro so slow? 

Joined: 2007-01-17 05:46:17
Posts: 145
Location: Tokyo, Japan
Hello,

I have two versions of a macro that lists all the style attributes of a given text. One of them [version a] runs fairly quickly, while the other [version b], which looks simpler, is VERY slow. Could you please tell me why the latter is so slow, and how I can make it run faster?

Here is version a:

Code:
# Begin macro 'nwp_1-2_to_utf8_test1'

Require Application Version "3.2"
$doc = Document.active
$text = $doc.text
$text = $text.copy
$footnotes = $doc.footnotes
$endnotes = $doc.endnotes
$section_notes = $doc.sectionnotes

$footnotes_ct = $footnotes.count
$endnotes_ct = $endnotes.count
$sectionnotes_ct = $section_notes.count

if $footnotes_ct
   while $footnotes_ct
      $this_footnote = $footnotes[$footnotes_ct - 1]
      $fn_text = $this_footnote.contentSubtext
      $fn_text.findAndReplace '^\s+', '', 'E'
      $fn_text = "<fn>" & $fn_text
      $fn_text &= "</fn>"
      $fn_ref = $this_footnote.documentTextRange
      $text.replaceInRange $fn_ref, $fn_text
      $footnotes_ct -= 1
   End
End

if $endnotes_ct
   while $endnotes_ct
      $this_endnote = $endnotes[$endnotes_ct - 1]
      $en_text = $this_endnote.contentSubtext
      $en_text.findAndReplace '^\s+', '', 'E'
      $en_text = "<en>" & $en_text
      $en_text &= "</en>"
      $en_ref = $this_endnote.documentTextRange
      $text.replaceInRange $en_ref, $en_text
      $endnotes_ct -= 1
   End
End

if $sectionnotes_ct
   while $sectionnotes_ct
      $this_sectionnote = $section_notes[$sectionnotes_ct - 1]
      $sn_text = $this_sectionnote.contentSubtext
      $sn_text.findAndReplace '^\s+', '', 'E'
      $sn_text = "<sn>" & $sn_text
      $sn_text &= "</sn>"
      $sn_ref = $this_sectionnote.documentTextRange
      $text.replaceInRange $sn_ref, $sn_text
      $sectionnotes_ct -= 1
   End
End

$charIndex = 0
$limit = $text.length
$ranges = Array.new
$attribute_text = ""

# inspect the attributes applied to every character in the text
While $charIndex < $limit
   $range = $text.rangeOfDisplayAttributesAtIndex($charIndex)
   $ranges.push ($range)
   $attributes = $text.displayAttributesAtIndex($charIndex)
   $attribute_text &= $attributes

   # move to the next area that has different attributes
   $charIndex = $range.bound
End

$tmp_file = '/Users/me/Desktop/test_things/NWP_rtf_to_utf8/test_aby.txt'
$separator = Text.newWithCodepoint (2)
File.writeDataToPath "$text $separator $ranges $separator $attribute_text", $tmp_file
exit


And here is version b:
Code:
# Begin macro 'nwp_1-2_to_utf8_test2'

Require Application Version "3.2"
$doc = Document.active
$text = $doc.text
$text = $text.copy
$footnotes = $doc.footnotes
$endnotes = $doc.endnotes
$section_notes = $doc.sectionnotes

$footnotes_ct = $footnotes.count
$endnotes_ct = $endnotes.count
$sectionnotes_ct = $section_notes.count

if $footnotes_ct
   while $footnotes_ct
      $this_footnote = $footnotes[$footnotes_ct - 1]
      $fn_text = $this_footnote.contentSubtext
      $fn_text.findAndReplace '^\s+', '', 'E'
      $fn_text = "<fn>" & $fn_text
      $fn_text &= "</fn>"
      $fn_ref = $this_footnote.documentTextRange
      $text.replaceInRange $fn_ref, $fn_text
      $footnotes_ct -= 1
   End
End

if $endnotes_ct
   while $endnotes_ct
      $this_endnote = $endnotes[$endnotes_ct - 1]
      $en_text = $this_endnote.contentSubtext
      $en_text.findAndReplace '^\s+', '', 'E'
      $en_text = "<en>" & $en_text
      $en_text &= "</en>"
      $en_ref = $this_endnote.documentTextRange
      $text.replaceInRange $en_ref, $en_text
      $endnotes_ct -= 1
   End
End

if $sectionnotes_ct
   while $sectionnotes_ct
      $this_sectionnote = $section_notes[$sectionnotes_ct - 1]
      $sn_text = $this_sectionnote.contentSubtext
      $sn_text.findAndReplace '^\s+', '', 'E'
      $sn_text = "<sn>" & $sn_text
      $sn_text &= "</sn>"
      $sn_ref = $this_sectionnote.documentTextRange
      $text.replaceInRange $sn_ref, $sn_text
      $sectionnotes_ct -= 1
   End
End

$charIndex = 0
$limit = $text.length
$ranges = Array.new
$attribute_text = ""
$black_text = 'textColor=<Red=0.000, Green=0.000, Blue=0.000, Opacity=1.000>, textBackgroundColor=<Red=1.000, Green=1.000, Blue=1.000, Opacity=1.000>'

$the_delimiter_a = Text.newWithCharacter(2)
$the_delimiter_b = Text.newWithCharacter(3)

# inspect the attributes applied to every character in the text
While $charIndex < $limit
   $attributes = ''
   $range = $text.rangeOfDisplayAttributesAtIndex($charIndex)
   $attributes &= $text.displayAttributesAtIndex($charIndex)
   $attributes.findAndReplace $black_text, ''
   $attribute_text &= $attributes
   $this_text = $text.subtextInRange ($range)
   $this_text = $the_delimiter_a & $this_text
   $this_text &= $the_delimiter_b
   $attribute_text &= $this_text

   # move to the next area that has different attributes
   $charIndex = $range.bound
End


$tmp_file = '/Users/me/Desktop/test_things/NWP_rtf_to_utf8/test_abz.txt'
#$separator = Text.newWithCodepoint (2)
File.writeDataToPath $attribute_text, $tmp_file
exit


Thank you in advance.

_________________
Best regards,

Nobumi Iyanaga
Tokyo,
Japan


2009-07-11 18:52:28

Joined: 2008-05-17 04:02:32
Posts: 400
Use “$this_text = $text.substringInRange ($range)” instead of “$this_text = $text.subtextInRange ($range)”. Appending attributed text using & is very expensive.

By inserting “Debug.setCodeProfilingEnabled true” at the top of the macro, you can see which commands are slow. Martin very kindly made this new command available, probably because I had pestered him repeatedly and insistently with feedback about macro performance ;-)
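
For readers who want to try the same measure-first approach outside Nisus, here is a rough Python analogue. It is not Nisus macro code and says nothing about how the NWP profiler works internally; `timed`, `totals`, and the two step labels are hypothetical names made up for this sketch.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical helper: accumulates wall-clock seconds per labelled step.
totals = defaultdict(float)

@contextmanager
def timed(label):
    """Add the time spent inside the `with` block to totals[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[label] += time.perf_counter() - start

# Sample data standing in for the pieces a macro would process.
lines = ["   line %d" % i for i in range(1000)]
cleaned = []
for line in lines:
    with timed("strip"):
        stripped = line.lstrip()
    with timed("collect"):
        cleaned.append(stripped)

# Slowest step first, so the expensive command stands out.
report = sorted(totals.items(), key=lambda item: item[1], reverse=True)
for label, seconds in report:
    print(f"{label}: {seconds:.6f} s")
```

Wrapping each command separately, rather than timing the whole loop, is what lets you see which one dominates.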


2009-07-11 19:48:01

Joined: 2007-01-17 05:46:17
Posts: 145
Location: Tokyo, Japan
Hello Kino,

Thank you for your reply. I tried your advice and ran into some problems. The following code is still slow, but it is much faster than the previous version. I found that the most time-consuming commands are string concatenation and findAndReplace...

Code:
# Begin macro 'nwp_1-2_to_utf8_test2'

Require Application Version "3.2"
# Debug.setCodeProfilingEnabled true
# Debug.setDestination "new"
$doc = Document.active
$text = $doc.text
$text = $text.copy
$footnotes = $doc.footnotes
$endnotes = $doc.endnotes
$section_notes = $doc.sectionnotes

$footnotes_ct = $footnotes.count
$endnotes_ct = $endnotes.count
$sectionnotes_ct = $section_notes.count

if $footnotes_ct
   while $footnotes_ct
      $this_footnote = $footnotes[$footnotes_ct - 1]
      $fn_text = $this_footnote.contentSubtext
      $fn_text.findAndReplace '^\s+', '', 'E'
      $fn_text = "<fn>" & $fn_text
      $fn_text &= "</fn>"
      $fn_ref = $this_footnote.documentTextRange
      $text.replaceInRange $fn_ref, $fn_text
      $footnotes_ct -= 1
   End
End

if $endnotes_ct
   while $endnotes_ct
      $this_endnote = $endnotes[$endnotes_ct - 1]
      $en_text = $this_endnote.contentSubtext
      $en_text.findAndReplace '^\s+', '', 'E'
      $en_text = "<en>" & $en_text
      $en_text &= "</en>"
      $en_ref = $this_endnote.documentTextRange
      $text.replaceInRange $en_ref, $en_text
      $endnotes_ct -= 1
   End
End

if $sectionnotes_ct
   while $sectionnotes_ct
      $this_sectionnote = $section_notes[$sectionnotes_ct - 1]
      $sn_text = $this_sectionnote.contentSubtext
      $sn_text.findAndReplace '^\s+', '', 'E'
      $sn_text = "<sn>" & $sn_text
      $sn_text &= "</sn>"
      $sn_ref = $this_sectionnote.documentTextRange
      $text.replaceInRange $sn_ref, $sn_text
      $sectionnotes_ct -= 1
   End
End

$charIndex = 0
$limit = $text.length
$ranges = Array.new
$attribute_text = ""
$black_text = 'textColor=<Red=0.000, Green=0.000, Blue=0.000, Opacity=1.000>, textBackgroundColor=<Red=1.000, Green=1.000, Blue=1.000, Opacity=1.000>'

$the_delimiter_a = Text.newWithCharacter(2)
$the_delimiter_b = Text.newWithCharacter(3)

# inspect the attributes applied to every character in the text
While $charIndex < $limit
   $attributes = ''
   $range = $text.rangeOfDisplayAttributesAtIndex($charIndex)
   $attributes &= $text.displayAttributesAtIndex($charIndex)
#   $attributes.findAndReplace $black_text, ''
#   $attribute_text &= $attributes
   $this_text = $text.substringInRange ($range)
#   $this_text = $the_delimiter_a & $this_text
#   $this_text &= $the_delimiter_b
   $attribute_text &= "$attributes$the_delimiter_a$this_text$the_delimiter_b"

   # move to the next area that has different attributes
   $charIndex = $range.bound
End

$tmp_file = '/Users/me/Desktop/test_things/NWP_rtf_to_utf8/test_abz.txt'
#$separator = Text.newWithCodepoint (2)
File.writeDataToPath $attribute_text, $tmp_file
exit

_________________
Best regards,

Nobumi Iyanaga
Tokyo,
Japan


2009-07-11 20:53:50

Joined: 2008-05-17 04:02:32
Posts: 400
Using an array is much faster because, if I don't misunderstand what I have heard from Martin, values in an array (and a hash) remain virtual until they are accessed.
Code:
[...]
# inspect the attributes applied to every character in the text
$output = Array.new
while $charIndex < $limit
   $range = $text.rangeOfDisplayAttributesAtIndex($charIndex)
   $attribute_text = $text.displayAttributesAtIndex($charIndex)
   $attribute_text &= $the_delimiter_a & $text.substringInRange($range)
   $output.appendValue $attribute_text
   # move to the next area that has different attributes
   $charIndex = $range.bound
end
$output = $output.join $the_delimiter_b
Document.newWithText $output
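
The shape of that loop (jump from one attribute run to the next, collect one record per run, join once at the end) can be sketched in Python as follows. This is only an illustration, not Nisus code: the per-character `attrs` list stands in for the document's display attributes, and `attribute_runs` and `dump_runs` are hypothetical names.

```python
from itertools import groupby

DELIM_A = chr(2)  # separates attributes from the run's text
DELIM_B = chr(3)  # separates one run's record from the next

def attribute_runs(attrs):
    """Yield (start, bound) for each maximal run of equal attribute values."""
    index = 0
    for _, group in groupby(attrs):
        length = sum(1 for _ in group)
        yield index, index + length
        index += length  # move to the next area with different attributes

def dump_runs(text, attrs):
    """Collect one record per run in a list, then join them in a single step."""
    pieces = []
    for start, bound in attribute_runs(attrs):
        pieces.append(f"{attrs[start]}{DELIM_A}{text[start:bound]}")
    return DELIM_B.join(pieces)

# Assumed sample data: one attribute label per character.
text = "Hello world"
attrs = ["bold"] * 5 + ["plain"] * 6
result = dump_runs(text, attrs)
print(repr(result))
```

The loop body only ever builds small strings; the large output is assembled once, by the join.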


2009-07-11 21:32:40

Joined: 2007-01-17 05:46:17
Posts: 145
Location: Tokyo, Japan
Hello Kino,

Thank you for your reply. I tried your method. It is definitely the fastest way. I will use this code. Thank you very much!

_________________
Best regards,

Nobumi Iyanaga
Tokyo,
Japan


2009-07-12 04:44:15
Official Nisus Person

Joined: 2002-07-11 17:14:10
Posts: 4251
Location: San Diego, CA
Kino wrote:
Using an array is much faster because, if I don't misunderstand what I have heard from Martin, values in an array (and a hash) remain virtual until they are accessed.

NWP does do some macro optimizations where text objects aren't created (as you say, remain "virtual") until they are actually used. However, that doesn't help here, because all the text objects Nobumi creates in the loop are used. The "join" command is faster because it does some internal work that is essentially the most efficient way a macro can concatenate multiple text objects.

All this said, there's always room for improvement to speed up the macro language. Certainly the "&=" operator should be faster.
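
To give a feel for why a single join can beat repeated appends, here is a small cost model in Python. It rests on an assumption that is not confirmed anywhere in this thread: that each append rebuilds the whole accumulated text, while a join copies each piece once. How "&=" and "join" are really implemented in NWP is not described here, and the function names are hypothetical.

```python
def copies_with_append(pieces):
    """Characters copied if every append rebuilds the whole accumulated text."""
    copied = 0
    total = 0
    for piece in pieces:
        total += len(piece)
        copied += total  # old contents plus the new piece are copied again
    return copied

def copies_with_join(pieces):
    """Characters copied if the result is assembled once from all pieces."""
    return sum(len(piece) for piece in pieces)

# Assumed workload: 1000 attribute runs of 10 characters each.
pieces = ["x" * 10] * 1000
print(copies_with_append(pieces))  # 5005000
print(copies_with_join(pieces))    # 10000
```

Under this model the append cost grows with the square of the number of pieces, while the join cost grows linearly, which would match the large difference Nobumi observed on a long document.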


2009-07-13 12:22:47