Sorting In REXX Made Easy

Автор: Martin Packer

Дата: 15.07.2013

Источник: IBM developerWorks

In REXX sorting is a fraught exercise - and you tend to have to resort to other programs to do it:

    DFSORT for the “heavy lifting”
    UNIX Sort

but you might decide your sort is low volume enough to do in REXX. The trouble is it’s difficult to write a routine that isn’t specific on how the data is sorted, or rather how two items are compared - as we shall see.

Following on from Dragging REXX Into The 21st Century? here’s a technique that is quite general. It uses the REXX INTERPRET instruction.

The problem of sorting comprises two challenges:
    * Comparing two items.
    * Deciding which two items to compare when, and acting on the results of the comparison. Much has been written about this, perhaps most notably in Donald Knuth’s Sorting & Searching.

This post is mostly about Challenge 1, though the code presented also addresses Challenge 2.
Comparing Two Items

It’s easy to do a straight numeric comparison in REXX, such as

if a   ...
end

but more complex comparisons are, ahem, more complex. :-) But, more to the point, a generalised comparison is difficult to do, which is where INTERPRET comes in.
Sample Sort Code

In what follows the program works. That isn’t to say the sorting algorithm is the most efficient over all sets of data, but this post isn’t really about the sort’s efficiency.

/* REXX */

/* Set up stem variable to sort */
sortin.0=5
sortin.1="AXXX XS"
sortin.2="HXXXX BSS"
sortin.3="FXXX IS"
sortin.4="XSSSSSSSS ZS"
sortin.5="A JSS"

/* Sort the stem variable */
call sortStem "comp1"

/* Print the results */
do s=1 to sortin.0
  say sortin.s
end

exit

/* Sort comparison routine based solely on length of second word */
comp1: procedure
parse arg x,y

/* Calculate lengths of second word of x and y */
lx=length(word(x,2))
ly=length(word(y,2))

select
when lx=ly then return 0
when lx>ly then return 1
otherwise return -1
end

sortStem:
parse arg comparison_routine
flip_flop=0
do i=sortin.0 to 1 by -1
  flip_flop=1
  do j=1 to i-1
    /* Decide whether to swap */
    interpret "compRes="comparison_routine"('"sortin.i"','"sortin.j"')"
    if compRes<0 then do
      should_swap=1
    end
    else do
      should_swap=0
    end

    if should_swap=1 then do
      /* Swap values */
      xchg=sortin.j
      sortin.j=sortin.i
      sortin.i=xchg

      flip_flop=0
    end
  end
end
return

The item comparison routine in this example is, of course, somewhat artificial - and not one I’m likely to use. It compares the length of the second token in each stem variable. I chose it as an example of something that’s not easy to do in, say, DFSORT. The real point is you can compare items using arbitrarily complex criteria. If your collection of stem variables was more complex than this - essentially a list of objects - you’d want the sophistication this approach allows.

Incidentally when it comes to INTERPRET performance I think it depends on which statements are actually interpreted and which have already been tokenised:

    If your “heavy lifting” is the code passed to the INTERPRET routine that would be expensive.
    If you pass a call to a routine to INTERPRET that isn’t - so long as the actual routine is “static”, meaning appears in your source code where it can be tokenised.

In the above example code you’ll notice I’m using a hard-coded set of stem variable names. In my actual production code I’ve generalised to allow any stem variable names - so I’m actually much heavier in my use of INTERPRET.

I’m leaning towards INTERPRET in more cases - provided it doesn’t obscure the meaning.

At the start of this post I mentioned DFSORT and Unix sort as alternatives. I think you should consider them, though their limitations and the need to work hard at “marshalling” to use them will limit their appeal. So, I hope you’ll find the technique presented in this post useful for the cases where they’re not appealing.

The most important piece of this post - and the main reason for writing it - is the notion of using a single sorting routine and handing it the name of an item comparison routine. This should lead to code that’s more maintainable. Its long been the staple of other programming languages. You can do it with REXX, too!