Expand/Shrink

permute

Definition: sequence res = permute(integer n, sequence s, bool bFast=false)
-- or --
sequence res = permutes(sequence s, integer cb=0, k=length(s))
-- or --
sequence res = combinations(sequence s, integer k, at=1, sequence res={}, part="")
-- or --
sequence res = combinations_with_repetitions(sequence s, integer k=length(s), at=1, sequence res={}, part="")
Description: permute() returns the nth permute of the given sequence.
n: an integer in the range 1 to factorial(length(s))
s: the sequence or string to permute
bFast: [optional] use older somewhat quirky but faster algorithm (see below)

permutes() generates every unique k-permutation of s in lexicographical first-instance order,
and either returns that as a full set, or invokes the provided callback on each and returns {}.
s: the sequence or string to permute (let l be length(s))
cb: [optional] if non-zero, a function to process each permutation, and res will be {}
k: [optional] to specify partial permutations. If k>l then {} is returned.
when cb=0, and k=l and s contains no duplicates, returns a sequence of length factorial(l), in other cases the length is a bit harder to calculate/describe, but of course follows standard theory.
For the callback function bool continue = cb(sequence p[, integer kth[, tk]]):
p: a single permutation
kth, tk: indicates the kth of tk permutes (see notes below)
The callback should return true to continue or false to terminate.

combinations() returns all subsets of k distinct elements of s in lexicographical order.
s: the sequence or string to permute (any duplicates are stripped, let u be unique(s) and n be length(u)).
k: the length of each subset. Note that should k=n the result would [just] be {u}, and k>n yields {}.
at, res, part: intended for internal use only.
returns a sequence of length choose(n,k).

combinations_with_repetitions() returns all subsets of k non-descending elements of s in lexicographical order.
s: the sequence or string to permute (ditto unique’d, n as above, noting that k does not necessarily default the same).
k: the length of each subset. Somewhat more useful results are yielded when k>=n than from combinations(), as per examples.
at, res, part: intended for internal use only.
returns a sequence of length choose(n+k-1,k).
pwa/p2js: Supported.
Comments: The elements of s can be any type.

Prior to 1.0.2, permute() used a somewhat quirky but faster (2-fold) algorithm that generated all possible permutations but in no particular order, and that can still be used, whereas the default is now in lexicographic position order, and of course that is proper lexicographic order when s is sorted on entry, but a full set can still be generated even when it isn’t.
For both the old and new algorithms in permute(), it is just as fast to generate the (n!)th permutation as the first, so some applications may benefit by storing an integer key rather than duplicating all the elements of the given set.

Neither the old nor new algorithms in permute() handle partial permutations or duplicate elements in s correctly, for that a new routine permutes() must be used [added in 1.0.2], which (alas) does not (and cannot) support integer key reconstruction, that is of course apart from using permutes() to generate a full no duplicates set and saving the position in that (or the kth parameter of the callback) for later reconstruction via permute().
It needs to build a full set or requires a callback to process each, since it uses a mutated tagset internally, which of course would be lost if it tried to return each permutation individually, and/or would be unnecessarily difficult to reconstruct from some prior permutation, especially so should that be a partial one with duplicates.
Obviously building a complete set of permutations can be extremely wasteful if you don’t need the vast majority of them.

For best results any s passed to permute[s]() should be sorted, in which case permutations are generated in a proper lexicographic order.
When s is unsorted, they are generated in "first instance order", eg babb -> {bbba, bbab, babb, abbb}, in other words you get all 3 b at the front first because the first b occurred in s before the first a, and likewise you will get a perfect mirror order of results from "cba" that you would get from "abc".

The callback can take 1, 2, or 3 parameters: it will always be passed a permutation, and optionally two atoms indicating the permutation number and total of them, which may be useful for a progress indicator. (While technically kth and tk are atom, it is probably wiser to use integer as shown, since that may help catch attempts to generate more permutations than nanoseconds since the big bang.)
The callback should return true to continue or false to cease and desist the generation of any more permutations.

Note that the total number of permutations, ie tk as passed to cb(), is incorrect/meaningless for partial permutes with duplicates in s.
For instance, there are 7 length-2 permutes of "UVXX" (UV,UX,VU,VX,XU,XV,XX), but tk is given/calculated as 12, which btw is correct for no duplicates in s. I don’t believe there is a formula to calculate that 7 for such circumstances, but happy to be proved wrong, and (of course) should you happen to know what tk should be, use that instead of the wrong value from this! The value given for tk should however be correct for partial permutes with no duplicates, and full permutes with (or without) duplicates, just not both at the same time. It will of course generate the correct number of permutations, it is just tk that is (sometimes) wrong.
I would of course accept the argument and a suitable modification to the source file builtins\permute.e which passes, say, -1, to the callback instead of a blatently wrong value, see the four ((x...?)) comments in that code [for which there has been no attempt at testing].

For permutations, ordering is important, whereas for combinations it is not, hence the latter tend to return smaller result sets for the equivalent parameters, though that is certainly not the case every time.
Both combinations() and combinations_with_repetitions() perform s=unique(s) when first called, though the latter goes on to duplicate the one remaining copy (as opposed to duplicating each of the duplicates).
The results from combinations() are strictly ascending, those from combinations_with_duplicates() non-descending.
You could of course supply a tagset to these routines, and use those results to extract from your still-with-duplicates and/or out-of-order original set, should that help any.

Aside: Obviously, any way a specific application can cut down the number of permutations or combinations, including simply using a callback to terminate should (say) the 73rd fit the bill, would make for a much better and faster program. It is in fact a common optimisation technique, to scan a much smaller set of permutations or combinations than some much larger range, but if you’re not careful there can sometimes be far more than the original range, especially should you somehow be able to skip some quite large segments early on, whereas it is much harder to say anything like "no more permutations beginning 1,2,3 please". (I am of course always open to suggestions.)
Example 1:
for i=1 to factorial(3) do
    ?permute(i,"abc")       -- displays "abc" "acb" "bac" "bca" "cab" "cba" 
--  ?permute(i,"abc",true)  -- displays "bca" "cab" "bac" "cba" "acb" "abc"
end for

sequence res = permutes("112223445556")
pp(join(shorten(res,"items",2)))
-- `112223445556 112223445565 ... 655544322121 655544322211  (3,326,400 items)`

?join(permutes("123",0,1),’,’)                    -- "1,2,3"
?join(combinations("123",1),’,’)                  -- "1,2,3"
?join(combinations_with_repetitions("123",1),’,’) -- "1,2,3"

?join(permutes("123",0,2),’,’)                    -- "12,13,21,23,31,32"
?join(combinations("123",2),’,’)                  -- "12,13,23"
?join(combinations_with_repetitions("123",2),’,’) -- "11,12,13,22,23,33"

?join(permutes("123",0,3),’,’)                    -- "123,132,213,231,312,321"
?join(combinations("123",3),’,’)                  -- "123"
?join(combinations_with_repetitions("123",3),’,’) -- "111,112,113,122,123,133,222,223,233,333"

?join(permutes("123",0,4),’,’)                    -- ""
?join(combinations("123",4),’,’)                  -- ""
?join(combinations_with_repetitions("123",4),’,’) -- "1111,1112,1113,..,1333,2222,..,2333,3333"

?combinations_with_repetitions(tagset(9,0),3)     -- {{0,0,0},{0,0,1},..{8,9,9},{9,9,9}} (length 220)
?combinations_with_repetitions(tagset(9),3)       -- {{1,1,1},{1,1,2},..{8,9,9},{9,9,9}} (length 165)
Example 2: You could use an outer loop to generate all the different length permutes in length order:

sequence res = {}
for k=1 to 3 do res &= permutes("123",0,k) end for
pp(join(res))
-- `1 2 3 12 13 21 23 31 32 123 132 213 231 312 321`

Or instead, you might prefer do something like this, and maybe doing a bit of filtering or (actually doing, in real time) other useful things in the callback:

sequence res = {}, prev
function collect(sequence s, integer kth)
    integer k = 1
    if kth!=1 then
--      while k<length(s) -- (a strictly unnecessary test,
--        and ...         -- albeit a logically valid one)
        while s[k]=prev[k] do 
            k += 1
        end while
    end if
    for l=k to length(s) do
        res = append(res,s[1..l])
    end for
    prev = s
    return true -- (continue permuting)
end function

{} = permutes("123",collect) -- (instead of k=1,2,3)
pp(join(res))
-- `1 12 123 13 132 2 21 213 23 231 3 31 312 32 321`
Implementation: See builtins\permute.e (an autoinclude) for details of the actual implementation.
See Also: factorial
Expand/Shrink