Nested routines

It is also possible, albeit in a restrained sense and of limited practical use, to nest functions and procedures.
Note that the only real difference is in the scope of the declared identifiers (i.e. just the routine names): nesting does not extend the "available scopes" in any way.

As of 1.0.5, Phix also supports an explicit form of closures, which are perhaps slightly more useful, slightly neater, can be partially curried, and probably a better choice whenever recursion is involved.

First-time readers are encouraged to skip the rest of this page and come back to it the first time they encounter an actual use of the static/nested keywords, if ever. As a language designer I am not particularly proud of any of this, nor do I expect anyone to be particularly impressed: explicit closures turned out to be a much better idea, even though they required no changes whatsoever to the language itself. Then again, there might yet be some small benefit to using nested functions for said closures, specifically for the "auto-hide" part (and note that closures are always functions, so the procedure parts of this page do not apply to them).

first example

function test()
    static [accs]                       -- (a compiler directive)
    static sequence accs = {}           -- (an actual definition)

    nested procedure dump()
        printf(1,"accs is %v\n",{accs})
    end nested procedure

    nested function makeAdder(integer x)
        accs &= x
        return length(accs)
    end nested function

    nested function addd(integer adx, y)
        accs[adx] += y
        return accs[adx]
    end nested function

    -- Note you could make these "constant", but for very little
    -- benefit, & would have to add them to the static directive
    integer add5 = makeAdder(5),
           add10 = makeAdder(10)

    ?addd(add5,2)   -- 7
    ?addd(add10,2)  -- 12
    dump()  -- accs is {7,12}

    return {add5,add10,makeAdder,addd}
end function
integer {a5,a10,make_adder,addd} = test(),
         a15 = make_adder(15)
?addd(a5,2)     -- 9
?addd(a10,2)    -- 14
?addd(a15,2)    -- 17

The above imitates the first example from closures.

Note that I have not had to explain why the above does not work, or offer four different ways to fix it.
[Aside: the bulk of this was written prior to builtins\closures.e and partly argues why Phix don’t need ’em.]
While accs is clearly scoped to the test() function, it appears in an ex.err the same as a file-scoped variable, and of course it all works exactly as it would if declared as such, bar undefined/out of scope errors. In fact, the above produces exactly the same output as the following and is entirely equivalent to it, apart from scope: in the above, accs/dump/makeAdder/addd drop out of scope at the end of test(), whereas below they don’t:

/*local*/ sequence accs = {}

/*local*/ procedure dump()
    printf(1,"accs is %v\n",{accs})
end procedure

/*local*/ function makeAdder(integer x)
    accs &= x
    return length(accs)
end function

/*local*/ function addd(integer adx, y)
    accs[adx] += y
    return accs[adx]
end function

function test()

    integer add5 = makeAdder(5),
           add10 = makeAdder(10)

    ?addd(add5,2)   -- 7
    ?addd(add10,2)  -- 12
    dump()  -- accs is {7,12}

    return {add5,add10}
end function
integer {a5,a10} = test(),
         a15 = makeAdder(15)
?addd(a5,2)     -- 9
?addd(a10,2)    -- 14
?addd(a15,2)    -- 17

Note in particular that dump/makeAdder/addd have access to exactly the same things in both versions: the nested ones do not get magical access to a containing scope which may no longer be present by the time they are called. That is also technically true for explicit closures, albeit they get some additional parameters automatically.

Should you have squirrelled this code away in a separate file, such that accs/dump/makeAdder/addd are hidden from the rest of the application anyway, you may as well just use the second form. It is at the very least 25 fewer characters to type, even with the four [optional] clarifying "local " suggested above, which change nothing apart from clarity of intent, as in "I really did want this one to be local".

Several questions raised by the above are covered in more detail under "edge cases" below.

another example

atom t0 = time()
function findByIndex(integer index)
    static [numbers]

    nested function n(integer k)
        sequence res = apply(true,sprintf,{{"n%d"},tagset(k)})
        printf(1,"n(%d) built (%s)\n",{k,elapsed(time()-t0)})
        return res
    end nested function

    constant numbers = n(1000000)
    return numbers[index]
end function

?findByIndex(110351)
?findByIndex(911234)
?findByIndex(520109)
?findByIndex(398)
?elapsed(time()-t0)
-- output:
--  n(1000000) built (1.7s)
--  "n110351"
--  "n911234"
--  "n520109"
--  "n398"
--  "1.7s"

Of course the whole point here is merely to make "numbers" private to findByIndex(), and created once-only and on-demand. The above is equivalent to the second example from closures. Of course all this really saves is a single boolean flag... (similarly should define_lambda() only be invoked once on any given function) and once again I am driven to wonder why anyone would ever think closures are a good idea, but they do, and in droves.
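For comparison, the "single boolean flag" alternative mentioned above might look something like the following. This is only a sketch: numbers_built is a name invented here, numbers becomes an ordinary file-level variable, and the n() helper has been inlined.

```phix
sequence numbers                -- file-level, assigned on first use
bool numbers_built = false      -- (hypothetical flag name)

function findByIndex(integer index)
    if not numbers_built then
        -- same construction as n(1000000) in the example above
        numbers = apply(true,sprintf,{{"n%d"},tagset(1000000)})
        numbers_built = true
    end if
    return numbers[index]
end function

?findByIndex(110351)    -- "n110351"
?findByIndex(398)       -- "n398"
```

The behaviour is the same, except that numbers and the flag remain visible to the rest of the file.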

Perhaps I simply no longer quite get or fully appreciate many of the horrible limitations of other programming languages, especially when they relate to scope and/or "data hiding" issues. Those are very rarely, if at all, any kind of problem in Phix, probably simply because file-level scope is the natural intuitive norm in Phix. In contrast, in C-based languages #include statements extend the file scope (or more accurately the "translation unit"); in JavaScript almost everything ends up in an anonymous immediately invoked function, since otherwise absolutely everything is visible absolutely everywhere; and other programming languages have to mess about with modules, packages, or classes to achieve anything remotely similar.

One other thing to note is that Phix does not support proper block scope for nested functions. They must instead be declared at the "second-level" within a routine (and most certainly not within any conditional or loop constructs), which shouldn’t raise any significant issues since they can only refer to statics in the containing scope [directive/file] anyway [and any globals of course]. No effort whatsoever has been or will be made to implement or test anything "triple-stacked", since any benefits would be tiny - if you think it would be nice for an already nested function to have a couple of private methods, remember they would only be able to access the same things they would if moved outside/before the nested routine (but still in and local to the same top-level one).

edge cases and the nitty gritty details

procedure edge_cases(integer i)
//  static <accs,fred>  -- (triggers (^fred) "static not defined" error,
//                      --  technically at end procedure but shown here)
    static [accs,ctest]
    static sequence accs = {}
//29/4/24:
//  string s = "inner scope"

    nested function makeAdder(integer x)
//      ?s  // illegal (undefined)
--      string s = "adder scope"            -- (but this is fine)
        accs &= x
--      ?s                                  -- ("", but see note)
        return length(accs)
    end nested function

    nested function addd(integer adx, y)
        constant ctest = deep_copy(accs)    -- (for testing only)
        accs[adx] += y
        return accs[adx]
    end nested function
    
    string s = "inner scope"

    integer add5 = makeAdder(5),
           add10 = makeAdder(10)

    ?addd(add5,2)   -- 7
    ?addd(add10,2)  -- 12
    ?i      -- 55
    ?s      -- "inner scope" -- (whether "adder scope" defined or not)
    ?ctest  -- {5,10}        -- nb assignment on declaration semantics
end procedure
edge_cases(55)
--?accs -- invalid (undefined)
--?makeAdder -- ""
--?9/0 -- shows accs in ex.err as {7,12}, ctest as {5,10}

You could also easily have a get_running_total(adx) function, slightly messier with classic closures, since "there is only one x" or "it would create a different lexical binding to a different x", and for that reason (if no other) Phix uses the above more explicit form. Of course you could just as easily have reset/destroy/print/[un]lock procedures and suchlike, which can only be achieved for classic closures by returning multiple functions simultaneously, at which point picking one out of that is no real difference to passing an extra parameter anyway.
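A minimal sketch of such a get_running_total(adx), written in the file-level form of the first example for brevity. The function name comes from the paragraph above, but its body is an assumption about what it would do.

```phix
sequence accs = {}

function makeAdder(integer x)
    accs &= x
    return length(accs)
end function

function addd(integer adx, y)
    accs[adx] += y
    return accs[adx]
end function

function get_running_total(integer adx)
    return accs[adx]        -- read-only peek, no update
end function

integer add5 = makeAdder(5)
?addd(add5,2)               -- 7
?get_running_total(add5)    -- 7
```

Because the adder is just an index into accs, any number of extra routines can share it without returning several functions at once.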

A static [identifier{,identifier}] directive causes the compiler to allocate the correct number of required/named static slots, technically outside/external to test() itself, but dropped from scope/visibility at the end of the routine. You can use square, angle, or curly braces to delimit the list (of plain identifiers), or omit the braces completely, the idea behind that simply being to let you choose whatever you think looks the least like code, since it ain’t.
In truth, a "static [id-list]" directive exists to overcome a version 1 "single pass" limitation of the compiler, and may well not be needed in version 2+ (assuming, as planned, there is by then a proper parse sub-tree that can quickly be scanned).

In fact I considered making "static" optional on subsequent actual declarations, before realising that might one day make removing the no-longer-needed id-list quietly break existing code. I also considered making the same optional in "single-id" cases, which is likewise not very beneficial long-term. Behind the scenes, even a simple "= {}" is enough to push the compiler beyond the "allocate slots" phase, which is why you (currently) need to explicitly declare such an id-list. Each slot is initially just "object", but that can be, and usually is, overridden by the formal declaration. An error occurs at the end of a routine definition ("static not defined") should no matching formal declaration be found for any entry in the static id-list, but they do not need to be fulfilled in the same order. Likewise static [fred]; static ferd = .., since fred!=ferd, or a similar orphaned constant declaration, generates the error message "not specified in static directive".

Shadowing a static variable, as shown for s in makeAdder() above, is currently fine on desktop/Phix, but I shall make no promises about, and not even bother to test, that under the current version of pwa/p2js. The plans for version 2 of Phix include merging the two parsers (etc) and revisiting scope handling, in particular preserving more scope information for debugging purposes, and hence that level of detail is deferred until then. Should such shadowing cause any grief in pwa/p2js, I may simply prohibit it, which to be honest I would have done immediately and without hesitation had said shadowing needed any actual work on desktop/Phix.

You can also define static-ids as constants, which obviously just prevents subsequent assignment/modification.

The static id list is only valid on a top-level routine, but can be used to pre-allocate slots for nested routines.
Handling of assignment on declaration is similar to that for optional parameters, in other words effectively the same as if not object(accs) then accs := {} end if, and likewise for constants. Should you declare a constant in a nested routine, such as ctest in addd() above, and of course also in the containing routine’s static id list, it is in scope for the whole of the containing routine, but will not actually be assigned until the nested routine is first called.
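The "assigned once, on first entry" behaviour can be imitated with an ordinary file-level variable, along the following lines. This is a rough sketch of the semantics only, not of what the compiler actually emits, and it relies on object() yielding false for an unassigned variable, as the idiom quoted above implies.

```phix
sequence accs               -- file-level and deliberately left unassigned

function test()
    -- roughly what "static sequence accs = {}" amounts to on entry:
    if not object(accs) then accs := {} end if
    accs &= 1
    return length(accs)
end function

?test()     -- 1
?test()     -- 2 (accs was not reset by the second call)
```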

Nested routines must be explicitly marked as such using the "nested" keyword as shown above, on both start and end lines. Of course the error for omitting it ("nested missing?") may also occur because you accidentally deleted/commented out some prior end function/procedure/type. Speaking of the latter, nested types are not formally supported: not that I have any particular deep objection, however they add very little benefit over (randomly-named) non-nested types, yet quite a few edge cases. One deliberate thing is that Edita does not recognise "nested", so they do not appear in the Quick Jump (Ctrl Q) popup or routine drop-down (and it might get confused if start/end don’t match), though I could perhaps be persuaded to add a new configuration option so they are recognised, and perhaps highlight any that are non-matching. Incidentally and in contrast, <Ctrl [> and <Ctrl ]> needed to be, and have been, tweaked to recognise nested routines.

Any nested routines only have access to their own parameters and local variables, plus anything static, including such statics private to the containing scope, but nothing else in the containing scope proper, such as the string s in the above example, or any parameters of the containing routine. Accessing non-static fields in the calling frame would at best be rather messy, especially should you throw in a few indirect calls and maybe some recursion, let alone the whole closure "outer scope no longer exists" concept. In some cases you may want to make static copies of various parameters and locals before invoking a nested routine, and in other cases you may want to relay any such on as normal parameters, that is to the nested routine itself.
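As an illustration of the "relay as normal parameters" option, something along these lines should do. The scale_all/scale names are made up for this sketch, which has not been tested against any particular release and assumes a nested function needs no static directive when it uses no statics.

```phix
function scale_all(sequence s, atom factor)

    nested function scale(atom x, f)
        -- factor itself is not visible in here, hence it arrives as f
        return x*f
    end nested function

    sequence res = repeat(0,length(s))
    for i=1 to length(s) do
        res[i] = scale(s[i],factor)     -- relayed explicitly
    end for
    return res
end function

?scale_all({1,2,3},2)
```

Note that the local res is declared after the nested routine, in line with the ordering constraint described on this page.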

As of 29/4/24, nested routines must be declared at the start of their containing routine, except for statics. In truth, something is/was not being reset quite right in pmain.e, and as a quick fix I simply enforced that constraint. I may need to revisit that, it may be left like that forever, but at least you get a crystal clear message indicating what needs to be done/the thing it cannot cope with, and apart from a minor annoyance said quick fix in no way restricts the permitted functionality, and indirectly implicitly enforces the fact that nested routines simply cannot reference anything in the containing local scope, by forcing you to move local variable declarations to after any nested routine declarations. It is also quite likely to become far easier to remove that restriction in 2.0+.

Note that a JavaScript local const is, erm, well, just about anything but constant, and p2js hoists local constants and statics suitably renamed into the global namespace, eg "$staticNN$accs" where NN is incremented system-wide. Otherwise it does nothing special and don’t be complaining to me should it start creating (real) closures because you insist on testing it in the browser before/without testing it on the desktop first, where you’ll probably get a better compile-time error message, at least until desktop/Phix and pwa/p2js are [hopefully] merged in version 2.0.0.

The key difference between Phix nested functions and the closures of other programming languages is, of course, add5(2) vs. addd(add5,2): Phix wants/needs things to be a touch more explicit.
Of course I know all those hidden/implied values are perfectly visible when something is actually called, but suppose you have set up a bunch of adders and cannot figure out why 1045 is not being generated: that is quite a bit harder to answer when you simply cannot see half the stuff. Not that JavaScript is particularly at fault; several other programming languages are significantly worse in that regard.

So the answer to the question “Does Phix support closures?” is “Not in the traditional/implicit sense, but it has a perfectly adequate and sensible way to achieve the same thing, that is often much easier to debug.”

A revised version of that answer can now be found at the end of the closures page.