closures

The file builtins/closures.e (an auto-include) provides an explicit alternative to the traditional implicit closures of other programming languages.

Phix does not support any form of implicit "value capture": instead you must be explicit. If you want the specified code to receive a value, you specify that value; if instead you want that code to receive a reference from which it can retrieve a current value, then guess what, you pass it a suitable reference. What you don’t do is slap down some implicit code and either hope it works or, deliberately or by trial and error and quite probably sheer luck, somehow manage to create the "appropriate implicit lexical scope" or "rebind" for it to work.

Many programmers swear by [implicit] closures: they can make for some neat-looking code but they also require a far from intuitive "value capture" and inevitably introduce hidden state, which can make debugging nigh on impossible (hence all the swearing). Clearly [implicit] closures are great when they work, but they are also confusing and counter-intuitive for beginners, a notorious source of memory leaks and performance issues, and often don’t work as expected without a seemingly pointless sleight-of-hand additional function nesting level to force the required lexical scope. For instance, this is (it really is) an abridged summary of how MDN explains matters (in JavaScript):
 function showHelp(help) {
   document.getElementById('help').textContent = help;
 }
 
 //function makeHelpCallback(help) {
 //  return function () {
 //    showHelp(help);
 //  };
 //}
 
 function setupHelp() {
   var helpText = [
     { id: 'email', help: 'Your e-mail address' },
     { id: 'name', help: 'Your full name' },
     { id: 'age', help: 'Your age (you must be over 16)' },
   ];
 
   for (var i = 0; i < helpText.length; i++) {
 //  // Culprit is the use of `var` on this line
     var item = helpText[i];
     document.getElementById(item.id).onfocus = function () { showHelp(item.help); };
 //  document.getElementById(item.id).onfocus = makeHelpCallback(item.help);
 //or:
 //  (function () {
 //    var item = helpText[i];
 //    document.getElementById(item.id).onfocus = function () {
 //      showHelp(item.help);
 //    };
 //  })(); // Immediate event listener attachment with the current value of item (preserved until iteration).
   }
 //or:
 //helpText.forEach(function (text) {
 //  document.getElementById(text.id).onfocus = function () {
 //    showHelp(text.help);
 //  };
 //});   
 }
 
 setupHelp();
One fix is to replace the var with let (or const), another is to use makeHelpCallback(), another is to use the anonymous closure, perhaps inside a forEach instead of the for loop. Call me old-fashioned, but if you have to explain something in terms of all the ways it can go wrong, followed by several (in this case four) methods of "fixing" it, well, something ain’t quite right.
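The failure mode can be reproduced without a browser. The following is a minimal sketch, not taken from MDN: the function names collectWithVar and collectWithLet are made up for illustration, and plain strings stand in for the DOM elements.

```javascript
// With `var`, `item` is a single function-scoped binding shared by
// every callback created in the loop, so they all see its final value.
function collectWithVar(items) {
  const callbacks = [];
  for (var i = 0; i < items.length; i++) {
    var item = items[i];
    callbacks.push(function () { return item; });
  }
  return callbacks.map(function (cb) { return cb(); });
}

// With `let`, each iteration gets a fresh block-scoped binding,
// so each callback keeps the value it was created with.
function collectWithLet(items) {
  const callbacks = [];
  for (let i = 0; i < items.length; i++) {
    let item = items[i];
    callbacks.push(function () { return item; });
  }
  return callbacks.map(function (cb) { return cb(); });
}

console.log(collectWithVar(['email', 'name', 'age'])); // [ 'age', 'age', 'age' ]
console.log(collectWithLet(['email', 'name', 'age'])); // [ 'email', 'name', 'age' ]
```

The two functions differ by a single keyword, which is rather the point being made above.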

The problem is the obsessive yearning for excessively spartan code with implicit and non-immediately-intuitive "magic", without a care in the world for either practical run/compile-time costs or the cognitive overheads unwittingly burdened on the programmer, or the maintainer.

Some further reading on closures can be found here, and it rather tickles me to contemplate that all of that was almost certainly written in favour of [implicit] closures!!! Also this.

Phix solves these problems by forcing the programmer to be explicit.

type lambda(object l)
Of the form {cdx}, where cdx is an index into the private tables of closures.e; the rest of the application should treat such values as opaque/meaningless. Strongly typed, especially when bCleanup is/was true, but of course generally speaking even a 0.1s saving would be worth a thousand times as much as some vague smudge of extra type-safety.

lambda res = define_lambda(object rid, sequence partial_args, bool bSoleOwner=false, bCleanup=true)

rid: 
usually the plain name of a function, but can also be some prior defined/partially curried function,
ie the result of some previous invocation of define_lambda(), itself perhaps also nested to any reasonable depth.
In fact if rid is from a previous define_lambda(), it simply prepends a copy of that lambda’s captures to partial_args before continuing.
Note that direct use of any such partially curried functions would be difficult, unless the base function somehow
returned a variable number of captured values, which would not be shared with any derivatives anyway (see below).
partial_args: 
hopefully self-explanatory, the explicit captures for rid, such that call_lambda() only has to provide the rest.
bSoleOwner: 
if true, rid is given the only copy of any captures, and must return them, which may help performance.
bCleanup: 
by default builtins/closures.e reclaims unused slots in its internal tables when things go out of scope,
but you can disable that should it be unnecessary, in particular for a closure that never goes out of scope.
(You never know, it might make a big difference when you’ve got thousands of them...)

The result from rid must be an atom or {[captures,]res}. rid must return [all] captures if it is the sole owner, or if it updates any of them (except as noted next).
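For readers more at home in JavaScript, the return contract can be modelled roughly as follows. This is a hypothetical sketch and not the closures.e source: defineLambda, callLambda and the slots table are invented names, a non-array value stands in for a Phix atom, and bSoleOwner/bCleanup are ignored entirely.

```javascript
// Illustrative model only; all names here are assumptions.
const slots = [];   // private table, indexed by the opaque handle

function defineLambda(fn, partialArgs) {
  // Keep a copy of the explicit captures alongside the routine.
  slots.push({ fn: fn, captures: partialArgs.slice() });
  return [slots.length - 1];            // mirrors the {cdx} form
}

function callLambda(handle, args) {
  const slot = slots[handle[0]];
  const rest = Array.isArray(args) ? args : [args];  // a lone value becomes [value]
  const result = slot.fn(...slot.captures, ...rest);
  if (!Array.isArray(result)) return result;         // plain result, captures untouched
  if (result.length === 1) return result[0];         // [res] form, captures untouched
  const [captures, res] = result;                    // [captures, res] form
  if (captures.length !== slot.captures.length) {
    throw new Error('returned captures must match partial_args in length');
  }
  slot.captures = captures;                          // store the updated captures
  return res;
}

// Plain result: the capture is only read.
function add(x, y) { return x + y; }
const add5 = defineLambda(add, [5]);
console.log(callLambda(add5, 2));     // 7

// Updated capture: the routine hands back [captures, res].
function counter(count, step) { return [[count + step], count + step]; }
const nextId = defineLambda(counter, [0]);
console.log(callLambda(nextId, 1));   // 1
console.log(callLambda(nextId, 1));   // 2
```

The state that an implicit closure would hide in a lexical scope is here just the first parameter of an ordinary function.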

Note that Phix closures do not naturally share [updates to] captures, as typically achieved in other programming languages by returning several closures at the same time, which will not work in Phix. Instead you would have to generate/capture some form of index or other lookup key (as part of partial_args) and fetch/store from somewhere else via that. A similar approach is also generally advised for other reference types such as classes and mpfr variables, or perhaps explicitly cloning them to prevent them from interfering with each other. In particular, should closures.e invoke deep_copy() it might object to any delete_routine() it encounters, since just copying such would be a pretty sure-fire way to be left with a reference to reclaimed memory or similar.
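The "capture a lookup key, keep the shared state elsewhere" approach can be sketched in JavaScript, with bind() standing in for explicit partial application. The store, the key "hits" and the function names are all made up for illustration, and nothing here reflects how closures.e is implemented.

```javascript
// Shared state lives in one central store rather than in any closure.
const store = new Map([['hits', 0]]);

// Both routines receive the lookup key as an explicit first parameter.
function increment(key, amount) {
  const updated = store.get(key) + amount;
  store.set(key, updated);
  return updated;
}

function current(key) {
  return store.get(key);
}

// Explicitly bind the key: two separate callables, one shared value.
const bumpHits = increment.bind(null, 'hits');
const readHits = current.bind(null, 'hits');

bumpHits(1);
bumpHits(2);
console.log(readHits());   // 3
```

Only the key is captured, so there is nothing to copy and nothing to fall out of step; the cost is that every access goes through the store.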

set_captures(sequence lambdas, object caps)

lambdas: one or more results from define_lambda() (see footnote 1).
caps: a replacement for partial_args, assuming that wasn’t quite ready any earlier.
An example use of this can be found in demo\rosetta\Variadic_fixed-point_combinator.exw
There would be no real harm in using an index capture and central store as mentioned above,
so things can be updated from anywhere, and at any time, or even one-at-a-time, but then again
such things would also have to be fetched/stored from everywhere.

1 Such params would often be of type object and that way imply one or many, but in this case a single lambda is also a sequence; fear not, it is smart enough to cope with lambda and {lambda}, treating them identically.

object res = call_lambda(object f, args)

f: a result from define_lambda(), can also be a plain function name should that help any.
args: any remaining parameters not already provided via partial_args.

call_lambda() can thus be used as a simple shim to call_func(), if f is int, ie a plain function name as opposed to a result from define_lambda(), which might simplify some uses. Since the args parameter of call_lambda() can be an atom, which is currently illegal for call_func(), the shim forwards an atom(args) as {args}, which might even further simplify some things. Alongside instinctively knowing that delete_routine(<int>,rid) never worked [now fixed], this is another reason why the return value from define_lambda() was made {cdx} rather than a plain cdx.

Note:
  • all "captured" values must (instead) be explicitly specified in the define_lambda() call.
  • the return from a closure must be either an atom result or of the form {[captures,]res}.
  • if a closure is the sole owner of any captures, it must return the full {captures,res}.
  • in the length-2 sequence case length(result[1]) must match define_lambda’s partial_args.
  • the first k parameters replace an implicit "environment" containing an arbitrary set of captures.
While no doubt some will complain these things should be implicit:
  • there is zero chance explicit closures will be accidentally and unintentionally created.
  • there is far less danger of accidentally capturing things you don’t actually really need.
  • there is no sleight-of-hand additional nesting level to force the required lexical scope.
  • there is no need to list reasons why they don’t work and what you must do to fix them.
  • there is no need to ask what on earth and in the name of all that is holy "rebind" actually means (1).
(1) I don’t know, and I don’t care to ever know.

Consider the following snippet (in JavaScript, again):
 function test() {
     function makeAdder(x) {
       return function (y) {
         return x + y;
       };
     }
 
     const add5 = makeAdder(5);
     const add10 = makeAdder(10);
 
     console.log(add5(2)); // 7
     console.log(add10(2)); // 12
 }
 test()
Which creates a couple of implicit lexical scopes that contain the x=5 and x=10.
Phix takes the view that replacing that implicit x that isn’t there anymore with an explicit 5 or 10 is no bad thing:
procedure test()
    nested function adder(integer x, y)
        return x + y
    end nested function

    object add5 = define_lambda(adder,{5})
    object add10 = define_lambda(adder,{10})
 
    ?call_lambda(add5,2) // 7
    ?call_lambda(add10,2) // 12
end procedure
test()
An alternative equivalent for the above can also be found here. Another quick example (in js):
 const findByIndex = () => {
   console.time('array creation');
   const numbers = Array.from(Array(1000000).keys());
   console.timeEnd('array creation');
 
   return (index) => {
     const result = numbers[index];
 
     console.log(`item by index ${index}=${result}`);
 
     return result;
   };
 };
 
 const find_by_index = findByIndex();
 
 find_by_index(110351);
 find_by_index(911234);
 find_by_index(520109);
 find_by_index(398);
The above in Phix looks like this:
atom t0 = time()
function findByIndex(sequence numbers, integer index)
    if numbers={} then
        integer k = 1000000
        numbers = apply(true,sprintf,{{"n%d"},tagset(k)})
        printf(1,"n(%d) built (%s)\n",{k,elapsed(time()-t0)})
    end if
    return {{numbers},numbers[index]}
end function
constant find_by_index = define_lambda(findByIndex,{{}},true)
?call_lambda(find_by_index,110351)
?call_lambda(find_by_index,911234)
?call_lambda(find_by_index,520109)
?call_lambda(find_by_index,398)
?elapsed(time()-t0)
-- output:
--  n(1000000) built (1.6s)
--  "n110351"
--  "n911234"
--  "n520109"
--  "n398"
--  "1.7s"
In that example, making findByIndex the sole owner avoids a deep_copy on every call and as a result saves about 0.3s overall.
One thing I certainly prefer is "define_lambda()" looking useful, over "const find_by_index =" looking pointless, not that it is.
Likewise an equivalent in Phix for the above can also be found here, which may in fact be slightly better.

define_lambda() can also [partially] curry already [partially] curried functions, for example:
function f2(integer a, b, c)
    return {{a,b},a+b+c}
end function    

object f_2 = define_lambda(f2,{1})
object f12 = define_lambda(f_2,{2})
?call_lambda(f12,5) -- 8

object f13 = define_lambda(f_2,{3})
object f14 = define_lambda(f12,{})
?call_lambda(f13,5) -- 9
?call_lambda(f14,5) -- 8
Note however that f_2 /cannot/ be invoked via call_lambda(), since it was declared as a closure with one captured value whereas f2() returns two such. Such would however be possible as long as f2 did not return any [updated] captures. Also, of course, f12 creates an entirely new and independent capture set: were f2() to modify a, and you defined an f13, then it would do its own thing with the original 1, and not be affected in any way by anything f12 does or has done. Conversely, f14 takes a fresh copy of the current state of the captures of f12, and hence would be affected by anything that had happened to f12 before f14 was defined, but not after.
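The copy-at-definition behaviour described above can be modelled in JavaScript. This is an illustrative sketch under assumptions, not the closures.e implementation: defineLambda, callLambda and accumulate are invented names, handles are plain objects, and unlike f2 the routine deliberately updates its first capture so that the effect is visible.

```javascript
// Illustrative model only; all names here are assumptions.
function defineLambda(rid, partialArgs) {
  if (typeof rid === 'function') {
    return { fn: rid, captures: partialArgs.slice() };
  }
  // rid is an earlier handle: prepend a copy of its current captures.
  return { fn: rid.fn, captures: rid.captures.concat(partialArgs) };
}

function callLambda(handle, arg) {
  const [captures, res] = handle.fn(...handle.captures, arg);
  handle.captures = captures;   // store the updated captures
  return res;
}

// Adds step to the running total on every call, then adds extra.
function accumulate(total, step, extra) {
  return [[total + step, step], total + step + extra];
}

const base = defineLambda(accumulate, [1]);   // captures [1]
const byTwo = defineLambda(base, [2]);        // captures [1, 2]
console.log(callLambda(byTwo, 5));            // 8, captures now [3, 2]

const snapshot = defineLambda(byTwo, []);     // copies [3, 2]
console.log(callLambda(byTwo, 5));            // 10
console.log(callLambda(byTwo, 5));            // 12
console.log(callLambda(snapshot, 5));         // 10, unaffected by the later calls

const byThree = defineLambda(base, [3]);      // starts from the original [1]
console.log(callLambda(byThree, 5));          // 9
```

Here snapshot plays the role of f14 and byThree that of f13: the former inherits whatever had already happened to byTwo, the latter starts from the untouched base.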

Lastly, since I haven’t yet documented them [or rather managed to find the best place to do so], I should point out that anonymous lambda expressions do in fact already exist:
integer f = function (integer i) return i*2 end function
?f(2) -- 4
--Entirely equivalent, apart from the additional identifier:
--function sq(integer i) return i*2 end function
--integer f = sq
--NO:
--integer f = function sq(integer i) return i*2 end function
--                      ^ ’(’ expected
--integer p = procedure (integer i) ?i*2 end procedure
--            ^ illegal use of reserved word
Obviously there is no gain in that particular case over a normal function, but a similar rhs expression could be embedded in something else and that way prove more useful, not that I have a suitable example to hand. I could probably be persuaded to quietly ignore the embedded sq, letting it be there solely to clarify intent. I might even be persuaded to permit "{ ... }" instead of " ... end function", not that I’m really keen, and not that said offer would extend to any embedded end if/for/while, which I always find rather ugly in an inline lambda expression whatever the programming language is anyway (and as with regular expressions, lambdas may sometimes be the perfect band-aid for minor duties but just not meant for anything more serious).

So the answer to the question “Does Phix support closures?” is “In a more explicit fashion, that doesn’t need the docs to explain all the ways to fix them when they don’t work.”