For in Statement
In many cases the following variant of the for statement permits noticeably neater code, for instance:
for i,e in s do
is equivalent to/shorthand for
for i=1 to length(s) do
object e = s[i]
The control variable, i, can be omitted, in which case an anonymous temporary variable is used (which may make debugging harder). The default starting point of 1 can be overridden with a from clause, and likewise the end point of length(s) with a to clause, which can sometimes save an unnecessary slice operation.
The element, e, is automatically declared as type object [1] and scoped to the loop if it does not already exist. [2]
The target sequence, s, can be any legal expression or literal; non-sequences of course trigger the usual "length of an atom is not defined" error at run time. Other valid forms include but are not limited to:
for e in s do
for i,e in s[k] do
for e in {1,"two",{3,4.5}} do
for word in {"one","two","three"} do
for word in filter(unix_dict(),twovowels) do
for ch in "word" do
for i in tagstart(5,5,5) do -- {5,10,15,20,25}
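As a minimal sketch of the equivalence described above, the following wraps each form in a routine and collects the {index, element} pairs, so the two can be compared directly. The routine names shorthand and longhand are purely illustrative and not part of Phix.

```phix
-- Shorthand form: i is the index, e is auto-declared as object.
function shorthand(sequence s)
    sequence res = {}
    for i,e in s do
        res = append(res,{i,e})
    end for
    return res
end function

-- The longhand loop that the above is shorthand for.
function longhand(sequence s)
    sequence res = {}
    for i=1 to length(s) do
        object e = s[i]
        res = append(res,{i,e})
    end for
    return res
end function

sequence s = {1,"two",{3,4.5}}
?shorthand(s)
?longhand(s)    -- should print the same thing as the line above
```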
Technicalia
Aside: Everything past this point is a deep dive into the nitty-gritty, along with attempts to justify why it was
deliberately kept as simple and straightforward as possible, that you might want to skip on first reading.
The element is just a normal variable: nothing special happens should it be assigned something else mid-loop, and there is no problem, error, or warning should you nest two or more "for e in" loops. However, as with traditional "for to" statements, it is not permitted to nest "for i,e in" loops that share the same i, since the inner i would clobber the outer i and at best behave erratically, whereas you can nest "j,e" in "i,e" or vice versa.
Should object be in any way an inadequate type for e, or the thought of a nested "for e in" clobbering some outer e horrifies you, simply pre-declare it [with a better type], and perhaps/probably reconsider whether you should be using a "to" loop - the whole point is to get things neatly and elegantly into one line, and as soon as it doesn’t, or you otherwise struggle, this ain’t the answer.
One thing to note is that modifying some e[k] of a subsequence of s is more than likely to trigger a p2js violation. To actually modify s[i], do so explicitly; an index is obviously needed, so use the i,e form, or probably better yet revert to a traditional longhand loop.
Since this is merely a shorthand notation, pwa/p2js writes out the equivalent longhand JavaScript [3], using an i$idx identifier when the control variable is omitted, and i$seq when s is not a plain variable, both following in the footsteps of the already well-proven i$lim approach, and makes no attempt to revert to shorthand form unless either i$idx or i$seq was in fact required. Likewise, and independently of each other, a "let" will automatically be used or not depending on whether the index/element has previously been declared or not. For more details see mappings.
There is no connection to any of the various "in" or "forEach" loop flavours of JavaScript; it simply uses the plainest possible standard loop. In particular the JavaScript for (let x of "one") extracts diddy-strings (single-character strings) not characters, whereas for (let x in "one") sets x to the indexes "0", "1", "2" (as strings), neither of which seem particularly useful from a desktop/Phix or pwa/p2js point of view.
MDN actually suggests the for..in construct of JavaScript is only of any real use in a quick & dirty debug dump capacity.
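Returning to the point about modifying s[i]: a minimal sketch of the explicit, indexed update using the i,e form might look like the following. The routine name doubled is hypothetical, and the sketch assumes s holds only atoms.

```phix
-- Update the sequence itself via s[i]; assigning to e alone
-- would only change the loop-local element variable.
function doubled(sequence s)
    for i,e in s do
        s[i] = e*2      -- explicit indexed store into s
    end for
    return s
end function

?doubled({1,2,3})
```

If the body grows much beyond a line or two, the traditional longhand loop is probably the clearer choice.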
You can override the default start of 1 with a from clause, and/or the default finish of length(s) with a to clause (use positive integers only; mixed +ve & -ve certainly won’t work!). Note however that you cannot get the length of an anonymously defined expression without storing it in a named variable first. As well as the obvious option of simply reverting to a traditional longhand loop, you could perhaps also use (say):
for e in s[2..5] do
for e in s from i do -- == for e in s[i..$] do, but w/o the slice op
for e in s to j do -- s[1..j]
for e in s from i to j do -- s[i..j]
for e in extract(s,{3,5,7}) do
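To make the from/to forms above concrete, here is a small sketch that collects the elements visited by "for e in s from lo to hi" so that they can be compared against the slice s[lo..hi]. The routine name collect_range and the parameter names lo and hi are illustrative assumptions only.

```phix
-- Visits s[lo]..s[hi] without building an intermediate slice.
function collect_range(sequence s, integer lo, integer hi)
    sequence res = {}
    for e in s from lo to hi do
        res = append(res,e)
    end for
    return res
end function

sequence s = {10,20,30,40,50}
?collect_range(s,2,4)
?s[2..4]        -- the slice the loop above is equivalent to
```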
Also note that for..in does not permit a "by step" clause; a longhand/"to" loop must be used instead. [4] Although it does not yet do so, the compiler is at liberty to optimise away an outer reverse() or trailing slice and similar, and of course I’d like to ensure that p2js simultaneously manages the same. Let me know if you hit a pressing need in that regard.
There is no special handling for dictionaries or maps, however you can use any of the following:
integer d = new_dict(...)
for key in getd_all_keys(d) do .. end for
traverse_dict(rid,user_data,d)
include map.e
map m = new_map(...)
for k in keys(m) do .. end for
for v in values(m) do .. end for
for kv in pairs(m) do .. end for
object res = for_each(m,rid)
(Obviously traverse_dict() and for_each() have nothing to do with for .. in, but just belong in that list.)
A related change is that previous versions of Phix would generate five new symbol table entries for five consecutive "for i" statements, whereas since 1.0.1 it "resurrects" any local-but-dropped-from-scope entries, temporarily of course, making some ex.err less confusing, ditto some debug sessions, and obviously needing/wasting a tiny little bit less memory. (Much the same could always be achieved, both now and pre-1.0.1, by pre-declaring the control var[s].) No such attempt is made on file-level for loops, and the table of resurrectables is cleared at the end of DoRoutineDef, which obviously prevents any sharing between routines, as it should. I had been of a mind to do this resurrection stuff for quite a while now, and the option to share some of the testing load with the "for..in" work was simply too good to pass up on.
I can assure you this relatively simple-looking tweak went through quite some design process, including vastly over-complicating everything, before being clawed all the way back down to some semblance of sanity:
[1] Initially the plan was to allow the type of the element to be explicitly declared, but I fell down the rabbit hole of allowing an explicit "integer" on the index to mean "must not already exist" (on both "in" and "to" flavours of the loop), then trying to get all fancy, before realising that apart from said idea of the control variable not pre-existing, which was never likely to be all that significant anyway, it was adding nothing that the simple two-or-three-liner this is all shorthand for would not already do much better in the first place. In other words, while "for id in" is fine and auto-declares id as an object, instead of "for integer id in...", just do "integer id; for id in...", that is, if you /really/ want a typecheck, and/or id to persist/remain in scope past the end for - the [optional] pre-declaration can solve two problems, not just the one.
[2] Allowing multiple assignment within the for construct itself was also carefully considered, eg for {x,y} in points do, however it added significant complications to parsing and scope handling (presumably with automatic declaration of nested and individually typed elements and all that jazz) that in the end simply could not be justified, along with it inevitably leading to uglier and harder to read and/or debug code. I am not denying there is still a certain appeal, but it is just plainly and simply more trouble than it’s worth. A typecheck error on (say) y, when you have no idea where you are in points, or perhaps even worse no such error because it defaulted to object, is simply not helpful, and for {string key, atom v} in data do and similar just quickly all start getting far too messy, and ruin all the neat/elegant aspects we’re aiming for. Again, a simple longhand two/three-liner does all of that stuff and more quite easily enough anyway.
[3] Another reason to keep it simple: every change in desktop/Phix must be matched in pwa/p2js.
[4] It is difficult to justify, let alone adequately test, the additional complexities that a by clause could/would introduce. Think of a step of -1 flipping from 1..length(s) to length(s)..1, then instead of a constant step a variable expression, with all the extra run-time checks and code that might need, plus never being more than two lines away from explicit expression of intent, and you should understand why I dropped "in".."by" like a hot potato.
Despite some initial concerns that I had made this "too simple", it quickly became a feature I could almost not stop using, and at least so far has coped admirably with everything I’ve tried throwing at it, which is quite a lot. Although now syntax-coloured as one, technically "in" is not a (reserved) keyword, though (as previously documented) pwa/p2js treats it as one, and similarly "from".
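The alternatives mentioned in notes 1 and 2 can be sketched roughly as follows: pre-declaring the element to get a typecheck, and unpacking each element with an ordinary multiple assignment on the first line of the loop body. The routine names sum_ids and sum_xy, and the data shapes they expect, are assumptions made purely for illustration.

```phix
-- Note 1: pre-declare the element to have it typechecked as an integer
-- (it also remains in scope after the end for).
function sum_ids(sequence ids)
    integer id
    integer total = 0
    for id in ids do
        total += id
    end for
    return total
end function

-- Note 2: rather than "for {x,y} in points", unpack inside the loop.
function sum_xy(sequence points)
    atom x, y
    atom total = 0
    for p in points do
        {x,y} = p       -- plain multiple assignment, typechecked as atoms
        total += x+y
    end for
    return total
end function

?sum_ids({1,2,3})
?sum_xy({{1,2},{3,4.5}})
```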