Tom-lwn

Jake Edge <jake@lwn.net>

Hi Tom,

This is not at all an LWN article, sorry to say ... I think the topic could make for an LWN article, but the style of this seems targeted at a very different audience ... You very well might have some success with it at other sites/magazines ... thanks,

jake

---

Lessons I have gleaned:

  • emphasis on could was by Jake
  • submission was roughly what is the most recent article here
  • had some diagrams
  • ok, too much Phix bias
  • ok!, too much Phix propaganda
  • should be exclusively Rosetta-Code emphasis
  • initially, Jake was not familiar with Rosetta
  • let Phix take a gentle ride along with Rosetta wiki, what it is, what it's good for, ...
  • equal billing to Phix,Wren,Julia (as they are top three)
  • emphasis on comparing code between languages
  • still, could say "Phix is a Rosetta key, ... since has all solved"
  • show solutions in other languages leads to synergy

? Phix used to solve javascript challenges (I mentioned it as a plus)

? must consider "taste" in syntax, or at least c--python mind control due to uni courses

? conventional languages have huge inertia

? classic Euphoria had no traction, I don't fully understand why

? NEED more technical presentation, only Pete can attack this angle

? serious presentation of Phix hybrid nature, engine, ... would fit LWN better

? I guess the quotations can go


Phix is First at Rosetta Code


Phix is first--and only--to cross the finish line; Google "Rosetta Code most popular" to see how the remaining eight hundred programming languages are doing. The challenge is to solve about 1507 computer science tasks. The scope ranges from "Hello World" and 99 Bottles of Beer, through writing a compiler, to some fiendishly difficult problems. You can find Phix at http://phix.x10.mx ; what Pete Lomax has achieved in creating Phix is special.
../images/julia.png

Not only is Phix first, but the accompanying p2js (Phix to Javascript) transpiler is ahead of pure Javascript in solving Rosetta tasks.

A concept in a terminal: ? "Hello" becomes a desktop gui:

with js
include pGUI.e
IupOpen()
IupMessage("Hello", "Hello, World!")
IupClose()

../images/hello_desktop.png

which can be transpiled to Javascript and run in a browser: ../images/hello_browser.png

Phix is the "Simple is Better" language. Simplicity starts with a hybrid interpreter|compiler that works in Windows and Linux. Code is written in UTF8 encoding; in a terminal type: p hello.exw to interpret; type: p -c hello.exw to compile. Phix is self-hosting; type: p -c p to see Phix recompile itself within seconds. Simplicity is notable in the freeform syntax: no special punctuation, no special rules, always predictable, symmetrical indexing, and ... no surprises. The type system is reduced to two concepts: atom and sequence. You can write desktop Phix-code, and transpile it to Javascript to work in a browser. Simple means that regardless of your programming background, Phix is easy to read. Phix simplicity does have its advantages.

			"Simplify and add lightness." fundamental engineering, oft repeated at Lotus motorcars. 

Rosetta Code (www.rosettacode.org) is a programming chrestomathy wiki: a collection of solutions to the same tasks in many languages. You can compare the approaches taken by various programming languages to solve the same problem. On the Rosetta Stone, the same text is carved in three scripts: Egyptian hieroglyphs, Egyptian Demotic, and Greek. The Rosetta Stone was the key to deciphering hieroglyphs. The modern Rosetta is the key to deciphering programming languages. You can compare Phix-code to various favorite, conventional, trendy, and other languages.

To make understanding algorithms easier, some textbooks use a pseudo language. Pseudo-code is accessible due to its simplicity, generic nature, and understandable syntax. Once you understand an algorithm in pseudo-code you can freely write code in your chosen language. Phix reads like pseudo-code, comes with no surprises or quirks, actually executes, and runs fast. That makes Phix a great key language in decoding the various languages found at Rosetta.

			"A language that doesn't have everything is actually easier to program in than some that do." — Dennis M. Ritchie

Phix is a "deliberately simple language" using Pete's own description. You can see this design paradigm at several levels: the interpreter|compiler, self-hosting design, p2js transpiler, and language syntax. Clearly this is a winning strategy at Rosetta.

That means some fancy computer science features are left out in the interest of making the interpreter simpler and the language syntax simpler. For example, see the Closures/Value capture task [rosettacode.org/wiki/Closures/Value_capture]. The problem was solved without requiring an extra feature. It's interesting that languages that do have extra features (like closures) have yet to make it to the Rosetta finish line.

Phix has tracing, which lets you step through lines of code, and profiling, which reveals bottlenecks in algorithms. But the "deliberately simple" design excludes elaborate debugging features. When a debugger operates with hidden values the interpreter itself becomes complicated. In turn, debugging the interpreter itself becomes difficult. Once you abandon simplicity the cost of maintaining a language becomes problematic.

Phix is nimble. An advantage of self-hosting is that an added dimension is available in problem solving: you can modify Phix itself. The multi-line shebang task is for the Linux operating system, but Phix also aims to be Windows and Javascript compliant. The solution was to permit #[ and #] to behave like multi-line comments, which are normally delimited by /* and */ markers. Since the language tokenizer ptok.e is written in Phix-code, it could be modified (changing six lines and adding twelve more) to create a more comprehensive solution. Phix has no external dependencies; p -c p produced the new, improved Phix just 15 seconds later.

In contrast to Phix, I doubt that many would consider re-compiling a programming language to introduce a feature or fix a bug.

For the more casual programmer look to the standard library which is written in Phix-code. You can examine it as an example of Phix coding. It is easy to add-to and improve the standard library.

In addition, the Phix to Javascript transpiler (p2js) currently solves more Javascript tasks than pure Javascript does; advantage Phix 830 to 758. Of the remaining tasks some ~7% are unsuitable for p2js, leaving ~40% yet to be explored. This is with an engine that is still in development.

			"The readability of programs is immeasurably more important than their writeability." — C. A. R. Hoare

Phix is a predictable language because it is rigorously consistent. Once you learn that the addition operator + "adds numbers" you can depend on that truth for all programming tasks. The result 2+2 is four is as natural as 'A'+32 is 'a'; the results are both achieved through addition. Learn that the concatenation operator is & and you will see the similarity between "bob"&"cat" producing the string "bobcat", as the result of 2&2 is the sequence {2,2}. You can, more easily, solve many problems when the same principles apply to all kinds of data.

Phix is based on the idea that all values belong to either the atom data-type (single value), or the sequence data-type (a list of values). If a language does not specify the data-type of a variable the reader has no clue about the role of a variable; reading code is harder. When there are many data-types the reader is overwhelmed by complexity; reading code is harder. Phix maximizes readability.

The atom|sequence data-types make Phix a generic language. The routines and syntax of Phix often apply universally to all data values. Indexing works the same for strings, sequences of numbers, and all nested forms. Sorting works the same for strings and numbers. Many routines work the same: for example you can take the square root of a number, a character, a sequence of numbers, or a string. There is no trickery here because all values are numbers, and you can take the square root of a number. Phix permits:

sqrt(9) is 3
sqrt('a') is 9.848857802
sqrt( "cat" ) is {9.949874371,9.848857802,10.77032961}

How meaningful sqrt("dog") is depends on your algorithm. Phix will issue a polite warning message, but will not crash.

The other way to be "generic" is to introduce overloading, polymorphism, inheritance, and type-casting to the language. The results of such "extreme generic code" can be slick and incomprehensible at the same time.

If you doubt the value of atom|sequence data-types just benchmark Phix against some popular interpreters. Phix is faster.

			"Are you quite sure that all those bells and whistles, all those wonderful facilities, belong to the solution set rather than the problem set?" — Edsger W. Dijkstra

You could debate the value of ersatz simplicity. Some Rosetta solutions "look" simple because they exploit special language features; which may be exactly what you want. Then again, solving a problem in one line of code with terse notation appeals to some people. Writing code with controlled whitespace is another popular way to simulate simplicity. But, learning to write this kind of "simple" code is an advanced skill. Programming is way too difficult on its own; are the complications of ersatz simplicity worth the extra effort?

I am personally in awe of what Pete Lomax has achieved. First in creating Phix, and then showing what Phix is capable of.


PL: wrong title - isn't the story about being the first programming language to hit 100% on RC?
PL: likewise the following reads as a bog-standard intro when it should be the story/newsworthy.

Rosetta Code

The Phix programming language ranks first at Rosetta Code; Phix currently has 1499 programming challenges solved (Jan 2022).

The Rosetta Stone is a rock carved with the same text in three scripts: Egyptian hieroglyphs, Egyptian Demotic, and Greek. This was the key to deciphering Egyptian hieroglyphs. The Rosetta Code wiki www.rosettacode.org is doing the same for programming languages.

The programming challenges range from "Hello World" and common tasks, through writing a compiler, to some devilishly difficult problems. There are over 800 languages participating. But it is Phix that is leading in problems solved.

Introducing Phix

Phix is the "simple is better" programming language. If you have doubts consider:

"A language that doesn't have everything is actually easier to program in than some that do." — Dennis M. Ritchie

"Are you quite sure that all those bells and whistles, all those wonderful facilities, belong to the solution set rather than the problem set?" — Edsger W. Dijkstra

"The readability of programs is immeasurably more important than their writeability." — C. A. R. Hoare

Phix validates the opinions of some leading computer scientists.

Introducing Phix

Phix is a hybrid, self-hosting interpreter/compiler that also has a developing transpiler to JavaScript. From the prompt in a terminal, $ p -c p compiles Phix from its own source-code, all within seconds. Phix targets x86 (32bit and 64bit) Windows and Linux platforms.

Any language must address the needs of data|action|flow. In this sense all computer languages are similar. Using conventional thinking: you examine the binary nature of a computer word, add data-types, use offsets, mix in features, and end up with a conventional language. Using alternative thinking: you start with a programmer's needs, simplify them to the essentials, and end up with the Phix programming language.

data

We say all data belongs to the object data-type. You can program with just one data-type in Phix if you wish.

Code becomes clearer if you recognize a data object as "single or many." An atom is a single number: integer, float, character, or boolean — the emphasis is on single. A sequence is a list of many objects, which includes atoms, sequences, and nested sequences as items — the emphasis is on many. That is all you need to write code.

// double slashes are line comments
-- double dashes are also line comments

// atom variables defined:
atom a = 4, b = 6.7 , c = 'a'

// sequence variables defined:
sequence s = {1,2,3}, t = "hello", x = { 1, {2,3}, {}, "world" }

When you need more refinement you can use the built-in integer and string data-types. You can also define your own user type based on the five built-in types.
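A user type can be sketched as follows. The type name hour and the variable lunch are hypothetical, chosen only for illustration; the idea is that a user type is a routine that returns true for permitted values.

```phix
-- a hypothetical user type: an integer restricted to 0..23
type hour(integer h)
    return h >= 0 and h <= 23
end type

hour lunch = 12     -- accepted, 12 is in range
lunch += 1          -- still in range
? lunch             -- 13
-- lunch = 25       -- would stop the program with a type check failure
```

The out-of-range assignment is left commented out so the sketch runs to completion.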

Since all values are objects, the idea of "data-type" serves to limit the permitted values that may be assigned to a variable. The basis of Phix data-types is: any, single, many (object, atom, sequence).

Therefore, you can assign any value to an object variable. You must assign a single value to an atom. You must assign a many value to a sequence.

But, you are free to calculate and express any value in an expression. Data-types only limit assignments.
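A minimal sketch of the any/single/many rule, using made-up variable names. The expression on the right-hand side may mix atoms and sequences freely; only the assignment is checked against the declared type.

```phix
object o = 42           -- an object accepts any value
o = "text"
o = {1, {2,3}}

atom a = 6.7            -- single value only
sequence s = {1,2,3}    -- many values only

s = s & a               -- the expression mixes many and single
? s                     -- {1,2,3,6.7}
-- a = s                -- would fail: a sequence cannot be assigned to an atom
```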

In Phix you can view an atom as a real number. Phix takes care of casting numbers into integer and float types and automatically provides the needed storage.

In Phix you can view a sequence as something you can do anything with. Pass a huge sequence as an argument to a function and Phix just copies a pointer; updates only happen if you change parts of the sequence. You can pass and copy sequences without cost of making actual copies of huge data-sets.
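The effect can be observed with a small sketch; the routine name bump_first is hypothetical. The caller's sequence is untouched even though the routine changes its parameter, and a real copy is only needed at the moment of the update.

```phix
function bump_first(sequence s)
    s[1] += 1           -- the update applies to the routine's own value only
    return s
end function

sequence big = repeat(0, 1000000)   -- a large sequence of zeroes
sequence changed = bump_first(big)

? big[1]        -- 0, the original is unchanged
? changed[1]    -- 1
```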

Phix only looks simple, but it does a lot for the programmer.

action

If all data is object, then all actions should apply equally to any values. This is a tremendous simplification.

For example the concatenation & operator joins data values:

? "hello" & " world"  //"hello world"
? 2 & 4               //{2,4}

For example the addition + operator adds numbers:

? 3+1   //4
? 'A'+32 //97, the character code of 'a'
? {1,2,3}+{10,10,10} //{11,12,13}

Phix is inherently generic. You can learn an aspect of Phix, like the meaning of & and + , and be confident that there will be no surprises.

The sequence paradigm is used consistently. For multiple assignment you write:

atom a,b,c
{a,b,c} = {10,2,300}  // a is 10, b is 2, and so on

The left-hand-side always receives values; the right-hand-side is the source of values.

The atom and sequence data-types work together in a predictable way. If you concatenate two atoms, 3 & 7 , you get a sequence {3,7}. When you concatenate a sequence and atom, {1,2,3} & 100 , you get a longer sequence {1,2,3,100} .

flow

Phix has relaxed syntax: no extra rules, indentation restrictions, or out of place punctuation are needed. This means you can write code that favours readability instead of someone's idea of style.

The usual while, for, if, switch statements behave as expected. You start a statement with a keyword, like while, and finish with end while. This pattern is used, predictably, for all statements.
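A short sketch of the keyword ... end keyword pattern, with arbitrary values chosen for illustration:

```phix
for i=1 to 3 do
    if remainder(i,2)=0 then
        printf(1, "%d is even\n", {i})
    else
        printf(1, "%d is odd\n", {i})
    end if
end for

integer n = 3
while n > 0 do
    n -= 1
end while
? n     -- 0
```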

There are two kinds of routines: procedure and function. A procedure is for action only. A function must return a value. This reduces ambiguity when reading code, and catches silly errors.
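The distinction can be sketched with two hypothetical routines, greet and square: the procedure is called as a statement, while the function's result is used in an expression.

```phix
procedure greet(string name)
    printf(1, "Hello, %s\n", {name})    -- action only, nothing returned
end procedure

function square(atom x)
    return x*x                          -- a function must return a value
end function

greet("world")
? square(7)     -- 49
```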

There are more statements, such as try/catch, type, struct, and class. They follow the same uniform syntax.

The include statement places the contents of a file (written in Phix) into your main program. The included files are in their own namespace. You must selectively export identifiers into the main program, thus avoiding any messy conflicts.

You have co-operative multitasking. You can suspend action for a block of code and execute another block, alternating between blocks as required.

You can also write multithreaded code which allows the simultaneous execution of code blocks in parallel.

Simplicity does not mean sacrificing capabilities.

sequence indexing

Sequences can represent most data-structures: list, queue, stack, tree, ... , using a common syntax for all operations.

Indexing uses [ ] as subscripts to each item in the sequence. Indexing is symmetrical: 3 means "third item from the head", while -3 means "third item from the tail".

Indexing can be used for all data manipulations: prepending, appending, inserting, deletion, retrieval, and mutating. You can apply indexing to one item or a slice of a sequence.
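The operations above might look like the following sketch. The values are arbitrary, and the last step assumes Phix's variable-length slice assignment, so it is shown without an expected result.

```phix
sequence s = {10,20,30,40,50,60}

? s[3]          -- 30, third item from the head
? s[-3]         -- 40, third item from the tail
? s[2..4]       -- {20,30,40}, a slice

s[1] = 11               -- mutate one item
s = append(s, 70)       -- add an item at the tail
s = prepend(s, 0)       -- add an item at the head
? s             -- {0,11,20,30,40,50,60,70}

string t = "hello"
? t[1..2]       -- "he", strings are indexed the same way
? t[-1]         -- 111, the character code of 'o'

s[2..3] = {}            -- assumption: replacing a slice with {} deletes those items
```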

sequence or string

Phix is simple because sequences and strings work the same. Indexing, operators, and routines can be applied equally to sequences and strings. A string is not a unique (possibly immutable) data-type but a subset of the sequence data-type.

? sqrt( "cat" ) //{9.949874371,9.848857802,10.77032961}

Yes you can, each ascii character is a number; you are not prevented from taking the square root of any number. However, you will be issued a polite warning message, and you will not be allowed to assign this new value to a string variable.

A string is a sequence. The purpose of a string is to save on storage; each item is limited to one byte; this corresponds directly to a UTF8 Unicode string.

As you would expect there are "string" specific functions. The upper function (convert to upper case) makes sense for words and would not be used on plain numbers. The regular expression library also makes sense for strings and not arbitrary data.

Simplicity means "work in a predictable consistent manner"; simplicity does not mean you are forced into one mode of thinking.

find match

Data naturally splits into a single or many viewpoint. Phix searching functions follow this viewpoint.

The find function locates an item in a sequence (or string) — think single.

? find( 'a', "cat and mouse" ) //2
? find( {2,1}, { {1,1},{1,2},{1,3},{2,1},{2,2} } ) //4

The match function locates a slice in a sequence (or string) — think many.

? match( "ta", "nice tea, ta" ) // →  11
? match( {3,4}, {1,2,3,4,5} )   //3

It does not matter if your data is a string of text or a list of complex values — find and match work the same all the time.
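Both functions return 0 when nothing is found, so the result can be used directly as a test. A small sketch with made-up data:

```phix
string text = "cat and mouse"

? find('z', text)       -- 0, no such item
? match("dog", text)    -- 0, no such slice

integer k = match("and", text)
if k then
    printf(1, "found at position %d\n", {k})    -- found at position 5
end if
```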

performance

Phix simplicity also means better performance: faster, and greener.

Phix is a fast interpreter. Compare the same algorithm in Phix and Python; using an impromptu benchmark:

// Phix
atom N = 31
function fib( integer n)
        if n < 2 then
                return n
        end if
        return fib(n-1) + fib(n-2)
end function
? fib( N )

# Python
N = 31
def fib(n):
        if n<2:
                return n
        return fib(n-1)+fib(n-2)
print( fib(N) )

Calculating fib(32)

	i0 Netbook
	* Phix 1.2 s  /vs/ Phix compiled 1.1 s /vs/ Python 6.0 s
	i7 Desktop
	* Phix 0.38 s  /vs/ Phix compiled 0.35 s /vs/ Python 0.47 s

A compiled program will be faster than an interpreted program. Launch Phix-code as p myapp to interpret, and p -c myapp to compile into a stand-alone application.
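How the timings above were taken is not stated. One way to reproduce such a measurement, offered only as a sketch, is to wrap the call with the built-in time() function and run the same file once interpreted and once compiled:

```phix
function fib(integer n)
    if n < 2 then
        return n
    end if
    return fib(n-1) + fib(n-2)
end function

atom t0 = time()                -- seconds, taken before the call
integer result = fib(31)
printf(1, "fib(31) = %d in %.2f s\n", {result, time()-t0})
```

The elapsed time depends on the machine, so no figure is given here; the printed value of fib(31) is 1346269.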

Computers will never be fast enough; it makes sense to use a fast interpreter.

cognitive load

They all say "easy to learn." Then, they teach you pseudo-code in preparation of writing real code. Well, Phix is simple like pseudo-code, but executes quickly.

A seasoned programmer may be able to exploit: aliasing, mutable/immutable, a dozen data-types, unique methods, tricky syntax rules and punctuation, convoluted error messages, ..., and surprises. It's hard to recommend a conventional language to a new programmer.


about 1500 words so far

PL:

Many have tried, many have failed, many that is bar one.

Rosetta Code is a wiki-based programming website with implementations of common algorithms and solutions to various programming problems in many different programming languages. Created in 2007, it has since grown (as of 26/01/2022) to a total of 1,503 tasks, some of which are trivial, and some fiendishly difficult. At last, one programming language now has submissions for all 100% of them, and it is one you have probably never heard of: Phix

Phix is a zero-dependency 31.5MB download (though Linux users will almost certainly want/need to install IUP), with extensive help files and over 1000 bundled demos (~100 of which are win32-only), which can recompile itself (naturally the full source code is included) in under 20 seconds. The only real downside is that most of the runtime is written using inline x86 assembly, so it won't run on (eg) arm devices.

While Phix is primarily a desktop programming language, with versions for Windows and Linux, it comes bundled with p2js which can transpile most programs to JavaScript so they can be run in a web browser. Naturally certain activities such as reading and writing disk files are not permitted in a web browser due to security considerations, however of those 1,503 tasks, 830 or 55% have been marked as JavaScript compatible (more in fact than have been submitted for JavaScript itself), with just 112 or 7.5% marked as incompatible, and obviously the remaining 37.5% are as yet unclassified.

Case study: multiline shebangs.

While trawling through the remaining 45% mentioned above it became clear the solution offered for this task was Linux-only, so I decided to add a solution that also worked seamlessly on Windows and under pwa/p2js, namely make #[ and #] be treated identically to /* and */, in other words multiline comments.

The first task was to make Edita/Edix syntax colour them correctly, which was a trivial matter of adding "#[ #]" to the BlockComment line in both demo\edita\syn\Euphoria.syn and demo\edix\syn\Phix.syn - both of which could be done without even restarting the editor(s). Fairly similar tweaks were needed to the GeSHi syntax highlighting as used on the Phix website, but I'll skip that part if you don't mind, especially since I took it upon myself to rebuild the phix.php file automatically from p2js_keywords.e, which has now become my single source of truth, and that way get all the keyword lists and everything else all matching and up-to-date.

I then set up a suitable test.exw file (a few ! were added 'cos GeSHi don't do nested comments):

#!/bin/bash
#[
  echo Phix ignores all text between #![ and #!] in exactly the same way as /* and */
  echo (both "and" are nested comments), allowing arbitrary shell code, for example
  cd /user/project/working
  exec /path/to/phix "$0" "$@"
  exit # may be needed for the shell to ignore the rest of this file.
# comments ignored by Phix end here -> #]


with javascript_semantics
puts(1,"This is Phix code\n")

Running the unmodified compiler on that predictably gave an error, and quickly inserting

?{Ch&"",toktype,SYMBOL,SPACE,tokline}

to ptok.e/getToken() and running p p test.exw gave me

{"#",4,3,2,247}

C:\Program Files (x86)\Phix\test.exw:2
#![
^ illegal

so I now know the toktype I need to handle there is HEXDEC.
All the changes needed to ptok.e (change 6 lines, insert a single new block of 12) are clearly marked 31/1/22.
Note that we're not handling "--#[" and "--#]" in the same way as "--/*" and "--*/".

Similar changes were needed to the tokeniser in pwa/p2js, and in fact I'd never gotten round to ignoring an initial shebang, so that had to go in as well, but the four changes that needed (ditto marked) were even easier. Anyway, job done, including syntax colouring in four different places, and proper compatibility for Windows, Linux, and JavaScript. Oh, there was one small thing I forgot: I ran `p -c p` and had to wait maybe 15 seconds for that to finish.

Page last modified on March 06, 2022, at 11:30 AM