Phix vs Conventional Languages

By basing phix on a single simple, general, recursive data structure, the sequence, a tremendous amount of the complexity normally found in programming languages has been avoided. The arrays, structures, unions, arrays of records, multidimensional arrays, etc. of other languages can all be easily represented in phix with sequences. So can higher-level structures such as lists, stacks, queues and trees.
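As a minimal sketch of that claim (the variable names and sample values below are purely illustrative), a two-dimensional array, a record and a small tree can each be written as plain nested sequences:

```phix
-- a 2 x 3 "array": a sequence of two rows, each a sequence of three numbers
sequence grid = repeat(repeat(0, 3), 2)
grid[2][3] = 7                      -- only the last cell of row 2 changes

-- a "record": name, age, and a nested list of hobbies (illustrative data)
sequence person = {"Alice", 30, {"reading", "chess"}}
string name = person[1]
integer age = person[2]

-- a binary "tree": each node is {value, left, right}, with {} as the empty tree
sequence tree = {5, {3, {}, {}}, {8, {}, {}}}
integer left_value = tree[2][1]     -- the value held by the left child

?grid
?person
?left_value
```

Nothing here needed a declaration of the layout in advance; the shape of each structure is simply whatever the sequence currently holds.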

Furthermore, in phix you can have sequences of mixed type; you can assign any object to an element of a sequence; and sequences easily grow or shrink in length without you having to worry about storage allocation issues. The exact layout of a data structure does not have to be declared in advance, and can change dynamically as required. It is easy to write generic code, where, for instance, you push or pop a mix of various kinds of data objects using a single stack. Making a flexible list that contains a variety of different kinds of data objects is trivial in phix, but requires dozens of lines of ugly code in other languages.
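The single mixed stack mentioned above might look something like the following sketch. The variable names are made up for illustration; append(), length() and the $ end-of-sequence index are ordinary phix.

```phix
sequence stack = {}

-- push: append() adds exactly one element, whatever its type
stack = append(stack, 42)                 -- an integer
stack = append(stack, "hello")            -- a string
stack = append(stack, {1.5, "nested"})    -- a sequence of mixed type

-- pop: take the last element, then slice it off
object top = stack[$]
stack = stack[1..$-1]

?top            -- the nested sequence that was pushed last
?length(stack)  -- two elements remain
```

Note that append() is used rather than &= so that a string or sequence is pushed as one element instead of being spliced in character by character.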

Data structure manipulations are very efficient, since the phix interpreter just points to large data objects rather than copying them; a copy is made only when a shared object is actually modified.

Programming in phix is based entirely on creating and manipulating flexible, dynamic sequences of data. Sequences are it - there are no other data structures to learn. You operate in a simple, safe, elastic world of values far removed from the rigid, tedious, dangerous world of bits, bytes, pointers and machine crashes.

Unlike in some other languages, such as LISP and Smalltalk, phix's "garbage collection" of unused storage is a continuous process that never causes random delays in the execution of a program, and does not pre-allocate huge regions of memory.

The language definitions of conventional languages such as C, C++, Ada, etc. are very complex. Most programmers become fluent in only a subset of the language. The ANSI standards for these languages read like complex legal documents.

In these languages you are forced to write different code for different data types simply to copy the data, ask for its current length, concatenate it, compare it, etc. The manuals for those languages are packed with routines such as strcpy, wcscpy, _tcscpy, lstrcpy, StrCpy, StringCbCpy, StringCbCpyEx, StringCchCpy, StringCchCpyEx, strncpy, _strncpy_l, wcsncpy, _wcsncpy_l, _mbsncpy, _tcsncpy, _mbsncpy_l, strcat, lstrcat, wcscat, _mbscat, strcat_s, wcscat_s, _mbscat_s, StrCat, StringCbCat, StringCbCatEx, StringCbCatN, StringCbCatNEx, StringCchCat, StringCchCatEx, StringCchCatN, StringCchCatNEx, strncat, _strncat_l, wcsncat, _wcsncat_l, _mbsncat, _mbsncat_l, strlen, lstrlen, StringCbLength, StringCchLength, strcmp, lstrcmp, strncmp, wcsncmp, _mbsncmp, _mbsncmp_l, _strnicmp, _wcsnicmp, _mbsnicmp, _strnicmp_l, _wcsnicmp_l, _mbsnicmp_l, strrchr, wcsrchr, _mbsrchr, _mbsrchr_l, strspn, wcsspn, _mbsspn, _mbsspn_l, etc. (clearly I could be here all day just listing them), each of which only works on one of the many types of data.

Much of the complexity surrounds issues of data type. How do you define new types? Which types of data can be mixed? How do you convert one type into another in a way that will keep the compiler happy? When you need to do something requiring flexibility at run-time, you frequently find yourself trying to fake out the compiler.

In these languages the numeric value 4 (for example) can have a different meaning depending on whether it is an int, a char, a short, a double, an int *, etc. In phix, 4 is the atom 4, period. Phix has something called user-defined types, as we shall see later, but it is a much simpler concept, used mainly for validation and debugging.
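To give a rough idea of what such a type looks like, here is a small sketch. The type name hour24 and the variable are hypothetical, chosen only for illustration; a type is essentially a function that returns true or false for a candidate value.

```phix
-- a hypothetical type for illustration: whole-number hours from 0 to 23
type hour24(integer h)
    return h >= 0 and h <= 23
end type

hour24 start = 9    -- passes the check
start = 17          -- still passes
-- start = 24       -- uncommenting this would stop the program with a type check error

?start
```

The value stored is still just the ordinary integer 17; the type only adds a validation step on assignment.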

Arithmetic operations in phix always generate sensible answers, and terminate the program with a human-readable error message when they go badly out of range. For instance, 1/2 is always 0.5, rather than 0 or needing to be coded as 1.0/2.0. Likewise, zero minus one is always -1, not (on a 32-bit architecture) +4GB, as can happen with unsigned values in C-based languages. Similarly, #100_0000 * #10_0000 is #1000_0000_0000 (which will trigger an error if stored in an integer instead of an atom), whereas 32-bit arithmetic in C-based languages silently yields 0, and 1,000,000*100,000 yields a meaningless 1,215,752,192 in C, because it quietly discards the leading #17_0000_0000. If you need to replicate the math behaviour of C (e.g. in cryptographic routines) then you must explicitly apply the appropriate mask or rounding, e.g. and_bits(x,#FFFFFFFF), floor(x), trunc(x) or ceil(x), in those few special cases where it is required. (Note that translating working C code is pretty much always a mug's game: you are far better off spending that time learning to build a dll/so and invoking that, and if you hit problems, find a better version or a prebuilt one.)
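The following sketch simply restates those examples as code. The variable names are illustrative, and the masked result is expected, going by the figures quoted above, to be the same 1,215,752,192 that C would produce.

```phix
atom half = 1/2                     -- 0.5, not 0
atom minus_one = 0-1                -- -1, not a huge positive number

-- too large for an integer, but fine in an atom
atom big = #100_0000 * #10_0000

-- C-style 32-bit wraparound has to be requested explicitly with a mask
atom wrapped = and_bits(1000000 * 100000, #FFFFFFFF)

?half
?minus_one
printf(1, "%d\n", big)
printf(1, "%d\n", wrapped)
```

Declaring big as an integer instead of an atom is the case the text describes as triggering an error, rather than silently storing a truncated value.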

Issues of dynamic storage allocation and deallocation consume a great deal of programmer coding time and debugging time in these other languages, and make the resulting programs much harder to understand. Programs that must run continuously often exhibit storage "leaks", since it takes a great deal of discipline to safely and properly free all blocks of storage once they are no longer needed. (The same can be said for phix code that interfaces to other languages, but not for normal high-level phix code.)

In many C-based languages (Java, C# and JavaScript, for example), strings are "immutable". In phix, you can modify strings any way you like, everything is automatically taken care of for you, and all with no nasty surprises. Note, however, that in phix you are expected to say str = modified(str), rather than modify(str). On that note, when reading phix code it is always absolutely clear what is being modified; you are not expected to continuously assume that arguments to routines might be modified, or hope and pray they are not.
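A short sketch of the str = modified(str) convention follows. The function shout is a hypothetical helper written for this example; upper() is the standard phix routine.

```phix
-- hypothetical helper: returns a changed copy, never touching the caller's variable
function shout(string s)
    s[1] = upper(s[1])      -- modifies only the local s
    return s & "!"
end function

string str = "hello"
string loud = shout(str)
?str                -- still "hello"
?loud               -- "Hello!"

str = shout(str)    -- the assignment is what changes str
?str                -- "Hello!"

str[1] = 'J'        -- strings can also be modified in place
?str                -- "Jello!"
```

Because a routine cannot alter its caller's variables, the only lines that can change str are the ones that visibly assign to it.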

In other languages pointer variables are extensively used. The pointer has been called the "go to" of data structures. It forces programmers to think of data as being bound to a fixed memory location where it can be manipulated in all sorts of low-level, non-portable, tricky ways. A picture of the actual hardware that your program will run on is never far from your mind. Except when needed to invoke routines written in other languages, and even then only as a normal atom value, with no special syntax, phix does not have pointers and does not need them.