Sort Oddity

ntx
Posts: 12
Joined: 2006-03-04 08:40:08
Location: tokyo - japan

Sort Oddity

Post by ntx »

Hi,
I am trying to sort a list of file names containing numbers but something is wrong.
For instance, when "sort paragraphs A-Z" macro is applied to the following list :

c
b
a

I get :

a
b
c

Perfect :D . Then I try the same thing with

c
b
a1

I get :

a1
b
c

Okay :D . But when I add a numeric prefix to each entry :

0c
0b
0a1

NWE returns now:

0b
0c
0a1 <-- ?? :(

Any fix or workaround to try ? I'm using NWE 2.61 on OS X 10.4.5.
Metalsmith
Posts: 7
Joined: 2002-08-30 10:59:22
Location: Not Quite Downtown Nevada

Re: Sort Oddity

Post by Metalsmith »

[quote]Hi,
---snip---
Okay :D . But when I add a numeric prefix to each entry :

0c
0b
0a1

NWE returns now:

0b
0c
0a1 <-- ?? :(

Any fix or workaround to try ? I'm using NWE 2.61 on OS X 10.4.5.[/quote]

---

Extending this, you get:

000a
00a
0a
0b
0c
00a1
0a1
10a
a001
a01
a011
a01a
a02
a02a
a1
a11
a11a
a1a

a2
a3

Which seems to indicate that the break between 00a/0a and 00a1/0a1 is a unique oddity. I have NO idea why this would be so :roll:, but note the a1, a11, a11a, a1a sequence, which seems to repeat the problem. If a11a follows a11, then it seems as if a1a should follow a1, not a11a. :wink:
midwinter
Posts: 333
Joined: 2004-09-09 18:07:11
Location: Utah
Contact:

Post by midwinter »

That's just bizarre. I don't see a "use random logic on sort" preference anywhere. ;)
dshan
Posts: 334
Joined: 2003-11-21 19:25:28
Location: Sydney, Australia

Sort Weirdness

Post by dshan »

midwinter wrote:That's just bizarre. I don't see a "use random logic on sort" preference anywhere. ;)
Fortunately there's no need for such a feature: it's more than covered by a combination of the Perl sort() function and the way the "Sort Paragraphs" macros are written.

By default the Perl sort() routine orders arrays in character (string) order, i.e. 0-9 before A-Z before a-z. Explicit variable comparisons in Perl require the programmer to know whether they're dealing with numeric values or alpha values and to use the appropriate relational operators in each case (==, <, >, != for numeric values and eq, lt, gt, ne, etc. for alpha values). This presents problems when you're sorting lines/paragraphs of mixed alphanumeric text, as in the Sort Paragraphs case.
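To make the string-versus-numeric distinction concrete, here is a minimal sketch. It is written in Python purely as a stand-in, since the macros themselves are Perl and their source isn't quoted in this thread. The sample values are made up for illustration.

```python
# Illustrative only: Python standing in for the Perl behaviour described above.
lines = ["10", "9", "02", "2"]

# Character comparison: strings are compared one character at a time,
# so "10" sorts before "2" because "1" < "2".
as_text = sorted(lines)

# Numeric comparison: convert each entry to a number before comparing.
as_numbers = sorted(lines, key=int)

print(as_text)     # ['02', '10', '2', '9']
print(as_numbers)  # ['02', '2', '9', '10']

# "02" and "2" differ as strings but are equal as numbers.
print("02" == "2", int("02") == int("2"))  # False True
```

The same list comes out in two different orders depending on which kind of comparison is used, which is the root of the trouble with mixed alphanumeric lines.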

There is code in the macros that specifically looks for lines that begin with numbers (presumably those produced by the Add Line Numbers and Number Renumber Lines macros or similar) and tries to ensure they are sorted numerically by those line numbers first, not just alphabetically by character value as lines that don't begin with numbers are (e.g. you want 02 and 2 to compare as equal, which they won't if you just compare them as character strings). Alas, there seems to be a little bug here: once a leading line number is identified, the sort routine strips all non-numeric characters from the rest of the line and initially orders the lines by the resulting number. If two of those numbers are equal (e.g. 0 as in 0c and 0b), it then does a character comparison on the whole line. So:

0c, 0b, 0a1 sorts first on the numerics (0, 0, 01), then by character on the first two lines, giving 0b, 0c, 0a1. The third line is left where it is because 01 (numerically 1) is not equal to 0, so it is ordered purely numerically.
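The behaviour just described can be modelled roughly as follows. This is a Python sketch of my reading of the macro, not the macro's actual Perl code. The name `buggy_key` is hypothetical, and placing lines without a leading digit after the numbered ones is an assumption (it matches the lists posted in this thread).

```python
import re

def buggy_key(line):
    """Hypothetical model of the sort key the macro seems to use."""
    if re.match(r"[0-9]", line):
        # Strip every non-digit in the line, so "0a1" becomes "01", i.e. 1.
        number = int(re.sub(r"[^0-9]", "", line))
        # Equal numbers fall back to a character comparison of the whole line.
        return (0, number, line)
    # Assumption: lines with no leading digit sort after numbered lines,
    # in plain character order.
    return (1, 0, line)

print(sorted(["0c", "0b", "0a1"], key=buggy_key))  # ['0b', '0c', '0a1']
```

Run on Metalsmith's longer list, this model gives the same order he posted, including 00a1 and 0a1 landing after 0c, which at least suggests the explanation is on the right track.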

The problem is that lines that begin with numbers can have other, completely separate numeric values further on, following some alpha chars, like 0a1 in the example. When you strip out all non-numerics you end up concatenating separate numeric values and treating the result as a line number: 0a1 becomes 01, which sorts after any 0<a-z> lines. I think what the macro should do is remove all characters following any line number, not just non-numerics, but I haven't tried this yet.
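A rough sketch of that suggested fix, again in Python as a stand-in and untested against the real macros: take only the run of digits at the very start of the line as the line number and ignore everything after it. The name `leading_number_key` is hypothetical, and sending lines with no leading digit to the end is an assumption.

```python
import re

def leading_number_key(line):
    """Hypothetical fixed key: use only the digits at the start of the line."""
    match = re.match(r"[0-9]+", line)
    if match:
        # "0a1" now yields 0, not 01: digits after the letters are ignored.
        return (0, int(match.group()), line)
    # Assumption: lines with no leading digit sort after numbered lines.
    return (1, 0, line)

print(sorted(["0c", "0b", "0a1"], key=leading_number_key))  # ['0a1', '0b', '0c']
```

With this key all three lines share the line number 0, so the whole-line character comparison decides the order and 0a1 comes first, as ntx expected.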