tagset

Definition: sequence s = tagset(integer lim, start=1, step=1)
-- or --
sequence s = tagstart(integer start, len, step=1)
Description: When passed a single parameter (by far the most common use) tagset returns a sequence of integers: {1,2,3,..,lim}.

The optional start and step parameters are reasonably intuitive, as long as you remember that start comes after lim in the parameter list, eg tagset(20,10,2) returns {10,12,14,16,18,20}.

Sometimes you neither know nor care where the sequence ends, for instance tagstart('A',5) is "ABCDE". Sometimes tagstart(1,26) is simply more natural than tagset(26[,1]). Likewise tagstart(5,5,5) and tagset(25,5,5) both produce {5,10,15,20,25}. The tagstart() routine is in fact a trivial one-line wrapper of tagset() that performs the slightly fiddly arithmetic for you, calculating the lim for tagset() as start+(len-1)*step, that’s all.
In other words a tagstart() is just a tagset() where you specify first and length, instead of last [and optionally first].
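The relationship between the two routines can be sketched as follows. This is only an illustration of the arithmetic described above: my_tagstart is a hypothetical name, not part of the distribution, and the real routine in builtins\ptagset.e may differ in detail.

```phix
-- Hypothetical stand-in for tagstart(), for illustration only.
function my_tagstart(integer start, integer len, integer step=1)
    integer lim = start+(len-1)*step    -- the last element of the result
    return tagset(lim, start, step)
end function

?my_tagstart(5,5,5)     -- lim is 25, so this is tagset(25,5,5)
?my_tagstart('A',5)     -- lim is 'E', so this is tagset('E','A')
```

The two calls should print {5,10,15,20,25} and "ABCDE" respectively, matching the tagstart() results quoted above.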
pwa/p2js: Supported.
Comments: This routine is particularly useful when performing a tag sort, but can freely be used for other purposes.

A tag sort reorders a list of indexes but leaves the data they refer to completely unaltered.

When step is non-1, the result is not guaranteed to contain lim, eg tagset(25,21,3) ==> {21,24}.
When start=lim the result is always of length 1.
Negative steps are also permitted, with obvious consequences for start and lim, eg tagset(24,28,-2) ==> {28,26,24}.
If start is greater than lim and step is positive the result is always an empty sequence, likewise when start is less than lim and step is negative.
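The rules above can be checked with a handful of one-liners. The argument values are arbitrary picks for illustration, all kept below #20 so that plain integer sequences are printed.

```phix
?tagset(25,21,3)    -- {21,24}: lim (25) is not part of the result
?tagset(7,7)        -- {7}: start=lim always gives length 1
?tagset(24,28,-2)   -- {28,26,24}: a negative step counts down to lim
?tagset(5,10)       -- {}: start>lim with a positive step
?tagset(10,5,-1)    -- {}: start<lim with a negative step
```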

All inputs must be integers, and every element of the output is always an integer. A step of zero is illegal.

Of course characters are just integers, hence some other possible uses: tagset('Z','A') is "ABC...Z", tagset('z','a') is "abc...z" and tagset('9','0') is "012...9". In fact a string result is generated whenever both start and lim are in the range #20 (' ') to #7E ('~'), ie the basic latin ascii graphical characters. Any subscripting etc you might want to do works identically on strings and dword-sequences.
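A short sketch of the string case described above. The variable name upper_az is purely illustrative.

```phix
sequence upper_az = tagset('Z','A')     -- both bounds lie within #20..#7E
?string(upper_az)       -- non-zero (true), the result is a string
?length(upper_az)       -- 26
?upper_az[1..3]         -- "ABC", slicing works just as on any sequence
```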
Example:
sequence Names = {"C. C. Catch", "Army of Lovers", "Boney M.", "Dschinghis Khan"}
sequence Years = { 1985,          1987,             1975,       1979            }

function by_year(integer i, integer j)
integer res = compare(Years[i],Years[j])
    if res=0 then
        res = compare(Names[i],Names[j])
    end if
    return res
end function
sequence yeartags = custom_sort(routine_id("by_year"),tagset(length(Years)))

for i=1 to length(yeartags) do
    integer ti = yeartags[i]
    printf(1,"Year: %d, Name: %s\n",{Years[ti],Names[ti]})
end for

This program is included in the distribution as demo\tagsort.exw, where it sorts by name as well as by year. That file also contains a non-tagsort version which uses the new sort_columns() routine and needs no custom comparison routine.

Despite its apparent simplicity, the humble tag sort can be extremely powerful once fully mastered and, just as importantly, minimises unintended side effects, since the original data is never modified.
Example 2:
sequence cases = {"Case 3", "caSe 1", "cAse 4", "casE 2"}

sequence casetags = custom_sort(lower(cases),tagset(length(cases)))

for i=1 to length(casetags) do
    printf(1,"%s\n",{cases[casetags[i]]})
end for

This example is also included in demo\tagsort.exw.

Note that, especially on a large dataset, invoking upper() or lower() once at the start as the above does would be significantly faster than calling it twice for each comparison.
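For contrast, here is a sketch of the slower alternative that the note above warns against, in which lower() is called twice on every comparison. The names by_lower and slowtags are hypothetical and do not appear in demo\tagsort.exw.

```phix
sequence cases = {"Case 3", "caSe 1", "cAse 4", "casE 2"}

-- Compares two tags case-insensitively, converting both strings each time.
function by_lower(integer i, integer j)
    return compare(lower(cases[i]),lower(cases[j]))
end function

sequence slowtags = custom_sort(routine_id("by_lower"),tagset(length(cases)))
?slowtags   -- {2,4,1,3}, the same ordering as Example 2
```

The result is the same either way. The difference is only in how many times lower() gets invoked, which grows with the number of comparisons the sort performs.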
Example 3:
?tagstart('A',5) -- "ABCDE"
?tagstart('A',15) -- "ABCDEFGHIJKLMNO"
?tagset(5,2) -- {2,3,4,5}
?tagstart(2,4) -- {2,3,4,5}
?tagstart(5,5,5) -- {5,10,15,20,25}
?tagset(25,5,5) -- {5,10,15,20,25}
Implementation: See builtins\ptagset.e (an autoinclude) for details of the actual implementation.
See Also: custom_sort (which also details sort_columns)