sockets

The file builtins/sockets.e (not an autoinclude) contains routines for low-level handling of network sockets.

Most (client) applications will not use sockets directly: libcurl can meet all client-related application needs (in a much simpler and better tested way). It does not, however, contain any server-side handling, for which sockets.e can (perhaps) be used. There may also be some cases where the full might and power of libcurl is not needed, or at least not worth the hassle of ensuring the required dll/so are distributed and installed.

Based heavily on the work of Brett A. Pantalone (Windows), and Irv Mullins, jbrown and Pete Eberlein (Linux), sockets.e is intended to unify those code bases into a single cross-platform file, while adding 64-bit support (achieved via extensive reworking to cffi). At the time of writing, the library is Windows only; however, the Linux API is apparently almost identical, so (touch wood) porting it should be fairly straightforward (any volunteers?).

See demo\pGUI\Chat.exw, demo\rosetta\SimpleHttpServer.exw and demo\rosetta\Sockets.exw for examples of use.

As you can clearly see, there is no real attempt to explain socket programming in any depth; such things (done far better than I ever could) can be found easily enough elsewhere. This is just a quick lookup of (Phix) result & parameter types, and is more suited to translating existing code than anything else.

None of these routines are supported by pwa/p2js.

constants

AF_UNSPEC  = 0
AF_UNIX  = 1
AF_INET  = 2
SOCK_STREAM  = 1
SOCK_DGRAM  = 2
INVALID_SOCKET  = #FFFFFFFF
INADDR_ANY  = 0
INADDR_NONE  = INVALID_SOCKET
SOCKET_ERROR  = -1
SD_RECEIVE  = 0 -- Shutdown receive operations.
SD_SEND  = 1 -- Shutdown send operations.
SD_BOTH  = 2 -- Shutdown both send and receive operations.
SOL_SOCKET  = #FFFF
SO_RCVTIMEO  = #1006

routines

sequence res = 
get_socket_error(integer err=SOCKET_ERROR) -- (Phix specific) obtain error details

err: An error code, if you happen to have one. Should probably be omitted more often than not.

res is of the form {integer err, string id, string short}, where:

err is eg 10049 (and never SOCKET_ERROR, which is -1)
id is eg "WSAEADDRNOTAVAIL" (matching in this case the constant WSAEADDRNOTAVAIL = 10049)
short is eg "Cannot assign requested address."

Invoked when socket/bind/listen/select/accept/recv/connect/send/etc return SOCKET_ERROR or otherwise indicate failure.

Note that SOCKET_ERROR effectively means "go fetch the real error" and obviously there is no point in passing that along inside an "if res=SOCKET_ERROR then" conditional branch (nor any real harm, apart from proving to everyone that you don’t quite know what you’re doing).

Also note that many Windows WSA codes are the same as the Linux equivalents, eg WSAEWOULDBLOCK == EWOULDBLOCK. The (private) constant MAPWSA in builtins\sockets.e is expected to be gradually extended as and when needed; at the time of writing it only contains about half a dozen entries (including the one just given).
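
The following is a minimal sketch of the intended error-handling pattern, not a definitive recipe: test the result of the failing routine, then call get_socket_error() with no argument to fetch the details. The variable names errcode, errid and desc are purely illustrative.

```phix
include builtins\sockets.e

atom sock = socket(AF_INET, SOCK_STREAM)
if sock=INVALID_SOCKET then
    -- no real error code to hand, so omit the parameter and let
    -- get_socket_error() go and fetch it
    {integer errcode, string errid, string desc} = get_socket_error()
    printf(1,"socket() failed: %d %s (%s)\n",{errcode,errid,desc})
else
    sock = closesocket(sock)    -- one closesocket() per socket()
end if
WSACleanup()
```

Where a failure is not recoverable, passing the whole result to crash(), eg crash("socket (%v)",{get_socket_error()}), is a shorter alternative.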

atom hSocket = 
socket(integer af, socktype, pf=0) -- create a new socket bound to a specific transport service provider

af: Address family, typically AF_INET for internet addresses.
socktype: Socket type, use SOCK_DGRAM for UDP or SOCK_STREAM for TCP.
pf: Protocol, eg IPPROTO_xxx, or 0 to let the service provider choose, based on the address family and socket type.

There should be a matching closesocket() call for every socket() call.

Returns INVALID_SOCKET on error.

atom long = 
htonl(atom long) -- host to network byte order for a u_long.

Network byte order is big-endian, but your machine is most likely little-endian.
This routine swaps the byte order for a long (4-byte) field, if necessary.

atom long = 
ntohl(atom long) -- network to host byte order for a u_long.

The inverse of htonl (functionally identical, but with different semantics).

integer short = 
htons(integer short) -- host to network byte order for a u_short.

Network byte order is big-endian, but your machine is most likely little-endian.
This routine swaps the byte order for a short (2-byte) field, if necessary.

integer short = 
ntohs(integer short) -- network to host byte order for a u_short.

The inverse of htons (functionally identical, but with different semantics).
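
As a rough illustration of the four byte-order routines, the sketch below converts a port number and an address and checks that each round trip restores the original value. The values 8080 and #7F000001 are arbitrary examples.

```phix
include builtins\sockets.e

integer port = 8080         -- #1F90 in host byte order
-- on a little-endian machine htons(#1F90) yields #901F,
-- on a big-endian machine it is returned unchanged
printf(1,"host #%x, network #%x\n",{port,htons(port)})

-- whatever the machine, a round trip restores the original value
if ntohs(htons(port))!=port then
    crash("16-bit round trip failed")
end if
if ntohl(htonl(#7F000001))!=#7F000001 then
    crash("32-bit round trip failed")
end if
```

Note that sockaddr_in() already performs these conversions for you, so they are mainly of interest when building or inspecting structures by hand.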

atom pSockAddr = 
getsockname(atom hSocket) -- retrieves the local name for a socket.

hSocket: a socket handle, typically a result from socket(), or perhaps something out of libcurl, etc.

Returns a SOCKADDR structure; terminates on error.

integer port = 
getsockport(atom hSocket) -- (Phix specific) retrieves the port number from a socket.

hSocket: a socket handle, typically a result from socket(), or perhaps something out of libcurl, etc.

Returns the port number, from the SOCKADDR structure obtained from getsockname().

atom addr = 
getsockaddr(atom hSocket) -- (Phix specific) retrieves the address from a socket.

hSocket: a socket handle, typically a result from socket(), or perhaps something out of libcurl, etc.

Returns the address from the SOCKADDR structure obtained from getsockname(), suitable for passing to eg ip_to_string().

atom addr = 
gethostbyname(string host) -- retrieves host information corresponding to a host name from a host database.

host: a host name, can be "".

Returns just the address portion of a hostent structure, INADDR_ANY if host is "", or SOCKET_ERROR if an error occurred.
Note that the C function of the same name returns the entire hostent structure.

Apparently gethostbyname is deprecated and getaddrinfo should be used instead, erm - good luck with that.

atom pSockAddr = 
sockaddr_in(integer af=AF_INET, string host="", integer port=0) -- allocate and populate a sockaddr_in structure.

af: Address family, typically AF_INET for internet addresses.
host: a host name, can be "" (for localhost).
port: a port number, or 0 (meaning "any").

Applies gethostbyname() and takes care of any host-to-network byte order conversions as needed.

Returns a sockaddr_in structure, or SOCKET_ERROR if an error occurred.

integer res = 
bind(atom sock, pSockAddr, integer len=socklen) -- associates a local address with a socket.

sock: a prior result from socket()
pSockAddr: a prior result from (eg) sockaddr_in()
len: Defaulted to the value used in sockaddr_in()

Must be used on an unconnected socket before a subsequent call to the listen() function.

Returns zero or SOCKET_ERROR on failure.

integer res = 
connect(atom sock, pSockAddr, integer len=socklen) -- establishes a connection to a specified socket.

sock: a prior result from socket()
pSockAddr: a prior result from (eg) sockaddr_in()
len: Defaulted to the value used in sockaddr_in()

Returns zero or SOCKET_ERROR on failure.
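
A client-side sketch, loosely modelled on the kind of code found in demo\rosetta\Sockets.exw: create a socket, build the address with sockaddr_in(), then connect(). The host "localhost" and port 8080 are assumptions for illustration only, and something must already be listening there for connect() to succeed.

```phix
include builtins\sockets.e

atom sock = socket(AF_INET, SOCK_STREAM)
if sock=INVALID_SOCKET then
    crash("socket (%v)",{get_socket_error()})
end if
-- hypothetical server address, adjust to suit
atom pSockAddr = sockaddr_in(AF_INET, "localhost", 8080)
if pSockAddr=SOCKET_ERROR then
    crash("sockaddr_in (%v)",{get_socket_error()})
end if
if connect(sock, pSockAddr)=SOCKET_ERROR then
    crash("connect (%v)",{get_socket_error()})
end if
-- ... send()/recv() on sock would go here ...
sock = closesocket(sock)
WSACleanup()
```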

integer res = 
listen(atom sock, integer backlog) -- places a socket in a state in which it is listening for an incoming connection.

sock: a prior result from socket()
backlog: specifies the maximum number of pending connections.

Returns zero or SOCKET_ERROR on failure.
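
On the server side, bind() and listen() are applied in that order. The sketch below asks for port 0 ("any") and then uses getsockport() to report which port was actually assigned; that getsockport() reflects the assigned port after bind() is an assumption based on how getsockname() normally behaves, and BACKLOG is a made-up constant.

```phix
include builtins\sockets.e

constant BACKLOG = 5    -- hypothetical limit on pending connections

atom sock = socket(AF_INET, SOCK_STREAM)
if sock=INVALID_SOCKET then
    crash("socket (%v)",{get_socket_error()})
end if
atom pSockAddr = sockaddr_in(AF_INET, "", 0)    -- port 0 means "any"
if bind(sock, pSockAddr)=SOCKET_ERROR then
    crash("bind (%v)",{get_socket_error()})
end if
if listen(sock, BACKLOG)=SOCKET_ERROR then
    crash("listen (%v)",{get_socket_error()})
end if
printf(1,"listening on port %d\n",getsockport(sock))
sock = closesocket(sock)
WSACleanup()
```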

sequence res = 
select(sequence read_set={}, write_set={}, error_set={}, object timeout={}) -- determine status of one or more sockets, waiting if necessary.

Check/wait the specified socket[s] until one or more is ready to read or write or has an error.
The result is of the form {integer ret_code, sequence {read_set, write_set, error_set}} = res,
or when only checking one socket the simpler integer {ret_code} = select(...) suffices.

read_set: sockets to check for readiness to read.
write_set: sockets to check for readiness to write.
error_set: sockets to check for errors.
timeout: in {seconds,microseconds} or atom microseconds format:
{0,0} or 0 means "return immediately", whereas {} means "wait forever".
{2,0} or 2000000 means "timeout after two seconds".

Each of the handles in each set should be a result from socket(), usually with listen() or connect() applied.
(IANASS - I Am Not A Socket Specialist).
I have also used this successfully with sockets obtained from libcurl/CURLINFO_ACTIVESOCKET.
For thread safety you may want to use sequence none = repeat(0,0) rather than {}, and avoid relying on those parameter defaults.

ret_code is SOCKET_ERROR, 0 for timeout, or a positive total number of socket handles that are ready.
Each of the returned read_set, write_set, and error_set should be a subset of the respective input sets.

atom peer = 
accept(atom sock) -- permits an incoming connection attempt on a socket.

sock: previously passed to listen()

Returns a new socket on which the actual connection is made, or INVALID_SOCKET in the case of error.
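
select() and accept() are typically used together, so that the program is not blocked forever waiting for a client. The sketch below, in the spirit of demo\rosetta\SimpleHttpServer.exw, polls a listening socket with a two second timeout and accepts a single connection. PORT and BACKLOG are hypothetical values, and the checks on socket() and sockaddr_in() are left out for brevity.

```phix
include builtins\sockets.e

constant PORT = 8080,   -- hypothetical port
         BACKLOG = 5    -- hypothetical backlog

atom sock = socket(AF_INET, SOCK_STREAM),
     pSockAddr = sockaddr_in(AF_INET, "", PORT)
if bind(sock, pSockAddr)=SOCKET_ERROR then
    crash("bind (%v)",{get_socket_error()})
end if
if listen(sock, BACKLOG)=SOCKET_ERROR then
    crash("listen (%v)",{get_socket_error()})
end if
while true do
    -- only one socket is being checked, so just take ret_code
    {integer code} = select({sock},{},{},{2,0})     -- wait up to 2s
    if code=SOCKET_ERROR then
        crash("select (%v)",{get_socket_error()})
    end if
    if code>0 then      -- (0 would mean the timeout expired)
        atom peer = accept(sock)
        if peer=INVALID_SOCKET then
            crash("accept (%v)",{get_socket_error()})
        end if
        puts(1,"connection accepted\n")
        -- ... recv()/send() on peer would go here ...
        peer = closesocket(peer)
        exit
    end if
end while
sock = closesocket(sock)
WSACleanup()
```

A real server would usually check for a keypress or some other exit condition each time select() times out, rather than looping unconditionally.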

sequence res = 
recv(atom peer, integer maxlen=2048) -- receives data from a connected socket or a bound connectionless socket.

peer: a connected or bound socket
maxlen: the maximum number of bytes to receive in one call; can be increased if not large enough, or perhaps even reduced.

The result is of the form {integer len, string buffer} = res, where
len is SOCKET_ERROR (-1), 0 if the connection has been gracefully closed, or the number of bytes received, and
buffer is "" for len<=0, otherwise a binary string of length len (<=maxlen).
Note the C function accepts a buffer, but since we allocate a string result anyway, we may as well just use that directly.
Also note that some unreliable protocols (eg UDP) may simply discard any data beyond maxlen.
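
The three possible values of len suggest a simple receive loop. The helper below (recv_until_closed is a hypothetical name, not part of sockets.e) is a sketch that keeps reading from a connected stream socket until the other end closes the connection.

```phix
include builtins\sockets.e

function recv_until_closed(atom peer)   -- hypothetical helper
    string data = ""
    while true do
        {integer len, string buffer} = recv(peer)
        if len=SOCKET_ERROR then
            crash("recv (%v)",{get_socket_error()})
        end if
        if len=0 then exit end if   -- gracefully closed by the other end
        data &= buffer              -- len bytes, at most maxlen (2048)
    end while
    return data
end function
```

This only suits protocols where the sender closes the connection when done; anything else needs a length prefix or terminator to know when a message is complete.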

integer bytes_sent = 
send(atom peer, string message) -- sends data on a connected socket.

peer: a connected socket, eg one previously passed to connect() or returned by accept()
message: can be a binary string.

Returns the actual number of bytes transmitted, which may be less than the total message length, or SOCKET_ERROR on failure.
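
Because send() may transmit only part of the message, callers that need the whole message delivered generally loop. The following sketch (send_all is a hypothetical helper name) keeps sending the unsent remainder until nothing is left or an error occurs.

```phix
include builtins\sockets.e

function send_all(atom peer, string message)    -- hypothetical helper
    while length(message)>0 do
        integer bytes_sent = send(peer, message)
        if bytes_sent<=0 then
            -- SOCKET_ERROR (or no progress at all): give up,
            -- the caller can consult get_socket_error()
            return false
        end if
        message = message[bytes_sent+1..$]      -- drop what was sent
    end while
    return true
end function
```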

procedure 
shutdown(atom sock, integer how) -- disable sends or receives on a socket

sock: a prior result from socket()
how: SD_RECEIVE, SD_SEND, or SD_BOTH

Deliberately crashes on error, eg with ERROR (shutdown): {10057,"WSAENOTCONN","Socket is not connected."}

atom sock = 
closesocket(atom sock) -- close a socket

There should be a matching closesocket() call for every socket() call.

Deliberately triggers a (catchable) crash on error, but does nothing when passed a sock of 0 or SOCKET_ERROR.
Returns 0 on success, intended for clearing the input parameter as shown, or SOCKET_ERROR if it was already 0.

procedure 
WSACleanup() -- terminates use of the Winsock 2 DLL (Ws2_32.dll).

The corresponding WSAStartup() is invoked automatically for you should you forget to call it; it does nothing on Linux, and is not separately documented.

Deliberately crashes on error. The "WSAStartup not called" error should never happen, but WSACleanup() can fail if the network subsystem has failed or something is still in progress, so occasionally you may need to wrap the call in a try/catch. I figured that was probably better than forcing everyone else to ignore an irrelevant error code, and perhaps in so doing miss some other coding error, such as leaving more and more blocked threads hogging resources for no good reason.
Does nothing on Linux, apart from making the program more portable.
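
Where a failure during cleanup should not bring the whole program down, the call can be wrapped as suggested above. This is only a sketch; what to do in the catch clause (here the exception is simply printed) is up to the application.

```phix
include builtins\sockets.e

-- ... all socket work done, every socket closed ...
try
    WSACleanup()
catch e
    -- e describes the exception; report it and carry on
    ?e
end try
```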

atom res = 
inet_addr(string cp) -- converts a string containing an IPv4 dotted-decimal address into a proper address for the IN_ADDR structure

Example: inet_addr("127.0.0.1") ==> #100007F

Returns INADDR_NONE on error.

There is a simple IPv4-only phix-specific inverse: string cp = ip_to_string(atom ip) - which does things manually, rather than wrapping the inet_ntoa C function, which might have been a mistake.
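
A short sketch of the round trip between the two forms, using the loopback address from the example above. Since ip_to_string() is described as the inverse of inet_addr(), the second line printed should be the original string.

```phix
include builtins\sockets.e

atom ip = inet_addr("127.0.0.1")
if ip=INADDR_NONE then
    crash("not a valid dotted-decimal IPv4 address")
end if
printf(1,"#%x\n",ip)                    -- the numeric form
printf(1,"%s\n",{ip_to_string(ip)})     -- back to dotted-decimal
```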

procedure 
setsockopt(atom sock, integer level, optname, atom pOptVal, integer optlen) -- set a socket option

sock: a prior result from socket()
level: eg SOL_SOCKET
optname: eg SO_BROADCAST
pOptVal: a pointer to the buffer in which the value for the requested option is specified
optlen: the size, in bytes, of the buffer pointed to by pOptVal

Not actually used in anger anywhere that I know of yet. Terminates in error on failure.
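
By way of an untested sketch, a receive timeout might be set as follows. It assumes the Windows convention that SO_RCVTIMEO takes a 4-byte count of milliseconds (other platforms expect a timeval structure instead), and the two second value is arbitrary.

```phix
include builtins\sockets.e

atom sock = socket(AF_INET, SOCK_STREAM)
if sock=INVALID_SOCKET then
    crash("socket (%v)",{get_socket_error()})
end if
atom pOptVal = allocate(4)
poke4(pOptVal, 2000)    -- assumption: milliseconds, as a 4-byte value
setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, pOptVal, 4)
free(pOptVal)
-- ... a subsequent recv(sock) should now give up after about 2s ...
sock = closesocket(sock)
WSACleanup()
```

As setsockopt() terminates in error on failure, there is no result to check.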

Note that none of these routines have yet undergone any significant real-world testing, but should be fairly easy to fix/enhance as needed.
Error handling is most likely wholly inadequate, and many constants are as yet missing, as no doubt are some other useful routines.