builtins/structs.e

The file builtins/structs.e (an autoinclude) contains the actual implementation of Structs and Classes.
It is documented here for completeness only. pmain.e/DoStruct() and several other places in that file map the new syntax to routines in this file, both as part of the compilation process and again at run-time, using much the same calls plus a few more. Hence some of these routines may have a subtly awkward trick to their invocation, and it may often prove significantly easier to write an extra shim layer than to alter the calling convention directly. You are of course free to experiment, however some of the problems that arise may prove rather difficult, perhaps even impossible, to fix without breaking something else. There is no commitment that these routines will not alter drastically between releases. Should there be any doubt, always trust the actual code of builtins/structs.e over these notes, and (secondarily) these notes over any comments in that file.

I cannot possibly stress enough (as already mentioned in the above link) that structs are a programmer convenience that inevitably incurs some performance overhead. Should that be neither insignificant nor outweighed by improved code clarity, simply don’t use structs! Everything should be just fine for up to hundreds of thousands of calls, but not necessarily on the other side of several million calls.

In truth, a large part of the reason for documenting this is to act as a subtle deterrent, by properly syntax-colouring these routines in Edita/Edix and giving F1 lookup something to do, and perhaps helping you to understand why, for instance, your own new() routine might have gone so horribly wrong, that is if it somehow got confused with this one.

bool res = struct(object s)
The lowest-level builtin struct, identical to class(), implemented as is_struct(s,0), see next
Aside: the F1 lookup of Edita/Edix is meant to go to struct, ditto new(), and likewise class, but the rest here.

bool res = is_struct(object s, integer rid)
Implements the implicit user defined types created for each struct/class, for example (automatically generated):
type bridge(object b) return is_struct(b,routine_id("bridge")) end type
NB: global so that it can be called from such implicit user defined types, rather than being useful itself directly.
The builtin struct/class passes 0 for rid, other types pass their own routine_id, which is guaranteed to be unique.
res: true if s is an instance of the specified class/struct, false otherwise.
TIP: since class() and struct() are in fact identical, get_struct_type(s)=S_CLASS/S_STRUCT (see below) can(/should perhaps) be used to distinguish between them (lowest-level-builtins) instead.
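The type checks described above might be exercised roughly as follows. This is a minimal sketch rather than anything taken from structs.e: bridge and its fields are hypothetical, and it assumes the S_STRUCT constant is visible to user code, as the TIP implies.

```phix
-- A sketch only: bridge, name and span are made-up names for illustration.
struct bridge
    string name
    integer span
end struct

bridge b = new({"Forth",2528})

-- The implicit user defined type, ie is_struct(b,routine_id("bridge")):
?bridge(b)
?bridge("not a bridge")

-- Telling struct instances apart from class instances
-- (assumes S_STRUCT is available without any explicit include):
?get_struct_type(b)=S_STRUCT
```

Calling is_struct() directly should rarely be necessary, since the generated type does it for you.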

procedure struct_start(integer flags, string struct_name, integer rid=routine_id(struct_name), string base_name="")
flags: S_STRUCT(1)/S_CLASS(2)/S_DYNAMIC(3)/S_CFFI(9)
struct_name: eg "bridge" or "person"
rid: routine_id of the implicit user defined type (same as the rid of is_struct)
base_name: the class being extended, or in the case of S_CFFI, the C definition

procedure struct_add_field(string name, integer tid, field_flags=0, object dflt=NULL, bool bDflt=false)
name: eg "name"
tid: ST_INTEGER(1)/ST_ATOM(3)/ST_STRING(8)/ST_SEQUENCE(12)/ST_OBJECT(15), or the routine_id of the element’s type
field_flags: SF_PUBLIC(0)|SF_PRIVATE(1) [+ SF_PROC(#10)|SF_FUNC(#20)]
dflt/bDflt work in tandem: if neither is passed, a NULL dflt gets replaced with a more type-safe setting, such as "" for ST_STRING, whereas a true in bDflt means "use NULL as the actual default, if that's what it happens to be".

procedure end_struct()
End the struct/class definition. Not needed for S_CFFI.

procedure extend_struct(string struct_name, base_name="")
struct_name: eg "bridge" or "person" (must already exist)
base_name: for multiple inheritance:
To implement "extends a,b" the compiler invokes struct_start(..,"a"); end_struct(); extend_struct(..,"b").
Note that fields in a and b must be unique, else it just simply dies, whereas methods are cheerfully overwritten in the order encountered.
The compiler would then carry on adding any fields and methods being defined up to the end class, as usual.
In theory this could also be used to manually extend a class at some later point, but that is completely untested.

integer sdx = struct_dx(object s, bool rid=false)
NB: this is currently private to structs.e, but potentially quite useful...
s: can be a string, routine_id (if rid is true), or an instance
rid: if true then s can be a routine_id, otherwise it cannot [specifically not the field_dx() routine’s sdx]
sdx: a unique internal index, which really only has meaning to structs.e - cannot be zero
Aside: be advised that struct_dx (and to a slightly lesser extent field_dx) may differ between compile-time and run-time, especially if the programmer resorts to directly invoking any of the routines detailed in this very file. Obviously they are completely consistent within those two halves but cannot cross over: the compiler must not emit actual values, but re-calc at run-time.

integer fdx = field_dx(object sdx, string name)
NB: this is currently private to structs.e, but potentially quite useful...
sdx: can be string or instance, or (internally) a prior struct_dx() result, but not a routine_id, nor a c-struct/S_CFFI
name: eg "name"
fdx: a unique internal index, which really only has meaning to structs.e - can be zero

integer res = get_struct_type(object s)
s: can be a string, routine_id, or an instance
res: S_STRUCT(1)/S_CLASS(2)/S_DYNAMIC(3)/S_CFFI(9)

string res = get_struct_name(object s)
s: as per get_struct_type (str/rid/inst)
res: eg "bridge"
You can, rather pointlessly, query get_struct_name("bridge") and/or get_struct_name(routine_id("bridge")), however it is probably slightly more useful when passed an instance. Useful perhaps for diagnostic output, but if you are (regularly) using this to control the processing logic, you are probably doing it all wrong/missing a trick or two - the very bits that vary by class deserve to be defined within the very class where they vary!

integer/string res = get_struct_flags(object s, bool bAsText=false)
s: as per get_struct_type (str/rid/inst)
bAsText: if true the result is a string, from decode_flags()
res: [S_ABSTRACT(#10)]+[S_NULLABLE(#20)]

sequence res = get_struct_fields(object s)
s: as per get_struct_type (str/rid/inst)
res: for experimentation/diagnostics only
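For diagnostic purposes, the struct-level queries above could be tried out along these lines. This is a sketch under assumptions: person and its fields are hypothetical, and no particular output is promised, since the exact results (especially of get_struct_fields) are internal details that may change.

```phix
-- A sketch only: person, name and age are made-up names for illustration.
struct person
    string name
    integer age
end struct

person p = new({"boris",3})

?get_struct_name(p)         -- the name of the struct, from an instance
?get_struct_type(p)         -- one of S_STRUCT/S_CLASS/S_DYNAMIC/S_CFFI
?get_struct_flags(p,true)   -- the flags, decoded as text
?get_struct_fields(p)       -- experimentation/diagnostics only, do not rely on the layout
```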

struct/class res = new(integer|string sdx=<inferred_from_context>, sequence imm={})
Aside: the F1 lookup for new() of Edita/Edix is meant to go to struct, ditto struct, but the rest here.
sdx: the routine_id of the user_defined type, inferred when possible, or a string
imm: initial field settings, if any, assigned in numerical order
res: an instance of the requested type
The result is a reference type, which is not suitable for passing to repeat().
The front-end attempts to provide/insert a suitable default for sdx from context, see T_new in pmain.e/ParamList().
No attempt has been made to support or mimic inline named parameters (that is, beyond "sdx" and "imm" with above caveat). Instead of

            politician p := new(name := "boris", age := 3)
use
            politician p := new()
                       p.name := "boris"
                       p.age := 3
Alternatively you can of course write your own new_politician() function and use named parameters on that.
Note that since new() is now a proper (auto-included) builtin, along with all that T_new handling in pmain.e, you are not allowed another of that name - hence the one that was in builtins/map.e has been renamed to new_map().
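The new_politician() alternative mentioned above might look something like this. It is only a sketch: the politician struct and the defaults chosen for name and age are assumptions made for illustration.

```phix
-- A sketch only: politician and new_politician() are illustrative names.
struct politician
    string name
    integer age
end struct

function new_politician(string name="", integer age=0)
    -- new() infers the type from the declaration,
    -- and assigns imm to the fields in numerical order
    politician p = new({name,age})
    return p
end function

-- Named parameters are fine here, since this is an ordinary function:
politician p = new_politician(name:="boris", age:=3)
printf(1,"%s is %d\n",{p.name,p.age})
```

The wrapper keeps all knowledge of field order in one place, so callers need not care how imm is laid out.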

procedure store_field(struct s, string field_name, object v, context=0)
s: an instance
field_name: eg "name"
v: the value to store
context: as sdx of new(), allows class methods to access private fields

object res = fetch_field(struct s, string field_name, object context=0)
s: an instance
field_name: usually a string field name
context: as sdx of new(), allows class methods to access private fields
res: any type - you can store pretty much anything in a struct
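As a rough illustration, store_field() and fetch_field() are what the dot notation is expected to map to, so the two forms below should be interchangeable for public fields. This is a sketch only: person and its fields are hypothetical, and context is left at its default since nothing here is private.

```phix
-- A sketch only: person, name and age are made-up names for illustration.
struct person
    string name
    integer age
end struct

person p = new({"boris",3})

store_field(p,"age",4)      -- intended to be equivalent to p.age = 4
?fetch_field(p,"age")       -- intended to be equivalent to ?p.age
?p.age
```

In normal code the dot notation is clearly preferable. Direct calls mainly make sense when the field name is only known at run-time.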

object res = get_field_default(object s, string field_name)
Aside: not entirely sure this has much use, besides showing off on rosettacode
s: as per get_struct_type (str/rid/inst). A fatal error occurs if s is S_CFFI.
field_name: eg "name"
res: any type - you can store pretty much anything in a struct

procedure set_field_default(object s, string field_name, object v)
Required (by the compiler) for replacing virtual methods (ie override whatever[/virtual] struct_start() or extend_struct() set, when we encounter another[/the real deal])
s: as per get_struct_type (str/rid/inst). A fatal error occurs if s is S_CFFI.
field_name: eg "task"
v: any type, though obviously for methods it is an integer routine_id

integer res = get_field_flags(object s, string field_name, bool bAsText=false)
s: as per get_struct_type (str/rid/inst)
field_name: eg "name"
bAsText: if true the result is a string, from decode_flags()
res: SF_PUBLIC(0)|SF_PRIVATE(1) [+SF_PROC(#10)|SF_FUNC(#20)], as per struct_add_field(), or NULL if field_name not found.
Note that a NULL ==> "SF_PUBLIC" result occurs when bAsText is true, but only if field_name actually exists.
Used by pmain.e when checking for compile-time errors (duplicate fields).
A fatal error occurs if s is S_CFFI: at best, perhaps
integer sdx = s[I_SDX],
        id = structs[sdx][S_BDX],
        {offset,size,sign} = cffi:get_field_details(id,field_name)
but that is/gets nothing remotely like the "res" detailed above.

integer res = get_field_type(object s, string field_name, bool bAsText=false)
s: as per get_struct_type (str/rid/inst). A fatal error occurs if s is S_CFFI.
field_name: eg "name"
bAsText: if true the result is a string, one of "ST_INTEGER".."ST_OBJECT", or an embedded struct type. Note however that other user defined types (ie legacy, not struct or class) will not be recognised correctly.
res: ST_INTEGER(1)..ST_OBJECT(15), or the routine_id of a type, as per struct_add_field(), or NULL if not found.
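The field-level queries can be tried in much the same spirit. Again this is only a sketch: person and its fields are hypothetical, it assumes a field default can be given as part of the declaration, and no specific output is claimed.

```phix
-- A sketch only: person, name and age are made-up names for illustration.
struct person
    string name
    integer age = 21        -- assumed syntax for a field default
end struct

person p = new()

?get_field_type(p,"age",true)       -- as text, eg one of "ST_INTEGER".."ST_OBJECT"
?get_field_flags(p,"age",true)      -- as text, via decode_flags()
?get_field_default(p,"age")         -- whatever default the definition set
?get_field_type(p,"nosuchfield")    -- documented to yield NULL if not found
```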

atom pMem = struct_mem(object s)
s: A fatal error occurs if s is not an instance of an S_CFFI struct.
pMem: the result, suitable/intended for passing to c_func/proc(), and possibly set_struct_field(), poke(), peek(), etc.



TODO:

Structure fields are not (yet) fully integrated into every possible aspect/corner of the language, for instance:
s.name[1] = upper(s.name[1])
       ^ not yet supported
The rhs is in fact fine, however the left hand side needs a fetch/modify/store, since it is not implemented in the same (low-level) way as (eg) %opRepe. Similar things can happen for s.age += 1 and/or s.name &= " (deceased)", etc. Some things may never ever bother anyone enough to actually warrant being implemented, but you should always get a human readable rejection, as opposed to requiring a detailed debug session. Two of many possible ways to fix the above:
string name = s.name
name[1] = upper(name[1])    -- (this line be all-traditional-non-struct-code)
s.name = name
-- or --
s.name = proper(s.name)

-- and, for those last two compound assignments, the obvious workaround...
s.age = s.age + 1
s.name = s.name & " (deceased)"
Once I get stuck in, I expect/hope such things might be reasonably easy to fix/implement, but I also expect there to be quite a few.

There is some code for handling a simple constructor in new() [see variable ctor]. Currently, for class bridge, new() creates a new instance then looks for a bridge() constructor, and should one be found passes it two parameters, the implicit/hidden "this" (just created), and the sequence imm from the new() call.
However that argument handling is probably not ideal and probably conflicts with how it might be called directly.
A derived class pontoon extends bridge could invoke this.bridge([args]) without problem, unlike a more generic or ambiguous (super|parent).constructor/__init__(), especially under multiple inheritance, hence (de)constructors are more sensibly named as (~)bridge/pontoon.
I need to have a bit more of a think/play with constructor arguments, specifically x = new bridge(...) syntax - which is much more constructor-specific than anything we have right now, but still gets an implicit this, which might only be permitted/prove sensible for nullable classes. [Ding, penny just dropped: have builtins/structs.e maintain [CRID] on each class and make is_struct() do a final "if flags!=S_NULLABLE then end if".] Plus of course document and properly test all these things.
Also note that, currently, destructors are not guaranteed to be called in a timely fashion, or indeed at all. This is because something like s.show(true) is invoked using call_proc(s.show,{s,true}) and that s buried in the opMkSq of the call_proc() clings on to a reference count, potentially forever. There /must/ be a way to fix that... [Erm, test that, shouldn’t the tmp be passed non-incref & nulled to call_proc()...] There may also be other (non-opMkSq) things that ultimately cause the same or very similar effects. I am beginning to think that you should never be allowed to call a destructor directly/explicitly but should instead invoke delete() [optionally, and with appropriate care not to carry on using whatever it is you just threw at delete()], and there should be nothing in the form of hierarchical/automatic/nested destructors, except that if say ~bridge() calls release_mem() then the ~pontoon() routine would/could/should also need to invoke release_mem() explicitly as well. Destructors would naturally be argument-less, except for the single implicit this.

To facilitate lazy binding, all methods are invoked via call_func/proc(), which interferes with/prevents any named parameter handling, which should really be available under early binding...

Implement/honour abstract/final on class methods.

Resurrect sequence-based structs? The main difficulty is nesting/embedding them, and/or perhaps not converting to literal integer indexes asap, so if we bar/do then things should progress a bit better? [PROBABLY NOT]

Also, S_CFFI handling deserves quite a bit more testing than just the light skimming it got.