Number Bases
Numbers (integers, including larger values quietly stored as atoms, but excluding any real numbers with fractions or
(decimal) exponents) can also be entered in hexadecimal, octal, binary, or in fact any number base between 2 and 36.
[There is in fact an example in demo/rosetta that deals with non-decimal fractions, but that is not and never
will be part of the core language, you may be relieved to hear.]
We humans are so used to working in base 10 we forget it is quite arbitrary, and used simply because of how many fingers and thumbs we have. I once heard that Mayans counted their wrists as well and worked in base 12, and the Babylonians did something even more clever with the twelve finger knuckles of one hand, also pre-1970 the UK had 12 pennies in a shilling, so 11p+2p was 1s1p. Of course there are 12 inches in a foot, 12 months in a year, 24 hours in a day, and 60 minutes in an hour and 60 seconds in a minute - all examples of non-decimal number bases, albeit ones we tend to mix with a healthy dose of decimal.
Very briefly, a decimal number is actually a sum of powers of 10 starting from 0: given that 100=1, 101=10, 102=100, 103=1000, etc, the decimal number 27 is in fact the sum of 2*10 + 7*1. Binary numbers are a sum of powers of 2, so 0b00011011 means 16+8+2+1 (=27) and hexadecimal numbers are a sum of powers of 16, with A..F standing in for 10..15, so a B means 11 and #1B means 1*16+11*1 (=27), in other words any (whole) number can be expressed in any base. Technically it is possible to express fractions in other bases, while neither the Phix language nor any other that I know of has any features to directly support that, some (not very pretty) examples of such beasts can be found in demo\rosetta\Generic_multiplication.exw as included with the standard distribution.
As you probably already know, a computer ultimately only has on/off switches, and therefore works in binary, or base 2. Arranging those switches into groups of 3 or 4 leads to octal (base 8) and hexadecimal (base 16) respectively. There may never be much use for say base 7, although weeks may be one possibility, or for that matter any above 16, however it was very little extra work over just 2/8/10/16 (see ptok.e/loadBase() for the full implementation details) and the other bases (2..36) are available should anyone ever need them. As the above examples suggest there may well be times when using base 12 actually simplifies matters, or maybe defining/solving an N-sided rubiks cube is easier in base N. As yet another example, pilx86.e (part of the compiler) makes heavy use of octal and would be even harder to understand if everything had to be expressed in decimal, or even hexadecimal. An x86 CPU is in essence an octal machine: an xrm of 250 or #FA tells me absolutely nothing(!), whereas even though it represents exactly the same number as the other two, 0o372 immediately suggests regs:edi(=7),edx(=2). If you can see that edi(7) hiding in 250 in less than 20 seconds [tip: remainder(floor(250/8),8)], then clearly you are much faster at mental arithmetic than I am. #FA is a tad easier but you still have to mentally extract and glue together #3 and #8, to get that 7. One last example: hexadecimal numbers are very often used for bit settings. Should you see a bunch of constants defined as (#)1,2,4,8,10,20,40,80,100, etc you can immediately tell there are no conflicts, and likewise a flags setting of #4106 clearly has 4 bits set, which is far from obvious when looking at the equivalent decimal 16646.
If all this is very confusing, there is no need to worry about it, just be aware that some problems are much easier to reason logically about using these strange numbers. Edita has builtin functions, hidden under Edit/Case, to convert any current block selected number to the more common number bases. (You have to block select a number, or at least some text, to enable that particular sub-menu, and said functions will mutely do nothing if they do not recognise the entire selection as a number.)
The supported number bases are:
There is one known minor gotcha (hardly worth mentioning really) in that "a=1b=2" compiles fine whereas "a=0b=1" does not, with the same problem when b is replaced with any identifier starting with any of b/t/o/x/d (including type/procedure names, break, then, object, do, etc) - and obviously this can be cured completely by the trivial insertion of a single space, linebreak, or semicolon.
We humans are so used to working in base 10 we forget it is quite arbitrary, and used simply because of how many fingers and thumbs we have. I once heard that Mayans counted their wrists as well and worked in base 12, and the Babylonians did something even more clever with the twelve finger knuckles of one hand, also pre-1970 the UK had 12 pennies in a shilling, so 11p+2p was 1s1p. Of course there are 12 inches in a foot, 12 months in a year, 24 hours in a day, and 60 minutes in an hour and 60 seconds in a minute - all examples of non-decimal number bases, albeit ones we tend to mix with a healthy dose of decimal.
Very briefly, a decimal number is actually a sum of powers of 10 starting from 0: given that 100=1, 101=10, 102=100, 103=1000, etc, the decimal number 27 is in fact the sum of 2*10 + 7*1. Binary numbers are a sum of powers of 2, so 0b00011011 means 16+8+2+1 (=27) and hexadecimal numbers are a sum of powers of 16, with A..F standing in for 10..15, so a B means 11 and #1B means 1*16+11*1 (=27), in other words any (whole) number can be expressed in any base. Technically it is possible to express fractions in other bases, while neither the Phix language nor any other that I know of has any features to directly support that, some (not very pretty) examples of such beasts can be found in demo\rosetta\Generic_multiplication.exw as included with the standard distribution.
As you probably already know, a computer ultimately only has on/off switches, and therefore works in binary, or base 2. Arranging those switches into groups of 3 or 4 leads to octal (base 8) and hexadecimal (base 16) respectively. There may never be much use for say base 7, although weeks may be one possibility, or for that matter any above 16, however it was very little extra work over just 2/8/10/16 (see ptok.e/loadBase() for the full implementation details) and the other bases (2..36) are available should anyone ever need them. As the above examples suggest there may well be times when using base 12 actually simplifies matters, or maybe defining/solving an N-sided rubiks cube is easier in base N. As yet another example, pilx86.e (part of the compiler) makes heavy use of octal and would be even harder to understand if everything had to be expressed in decimal, or even hexadecimal. An x86 CPU is in essence an octal machine: an xrm of 250 or #FA tells me absolutely nothing(!), whereas even though it represents exactly the same number as the other two, 0o372 immediately suggests regs:edi(=7),edx(=2). If you can see that edi(7) hiding in 250 in less than 20 seconds [tip: remainder(floor(250/8),8)], then clearly you are much faster at mental arithmetic than I am. #FA is a tad easier but you still have to mentally extract and glue together #3 and #8, to get that 7. One last example: hexadecimal numbers are very often used for bit settings. Should you see a bunch of constants defined as (#)1,2,4,8,10,20,40,80,100, etc you can immediately tell there are no conflicts, and likewise a flags setting of #4106 clearly has 4 bits set, which is far from obvious when looking at the equivalent decimal 16646.
If all this is very confusing, there is no need to worry about it, just be aware that some problems are much easier to reason logically about using these strange numbers. Edita has builtin functions, hidden under Edit/Case, to convert any current block selected number to the more common number bases. (You have to block select a number, or at least some text, to enable that particular sub-menu, and said functions will mutely do nothing if they do not recognise the entire selection as a number.)
The supported number bases are:
0b{0..1} 0(2){0..1} binary
0(3){0..2} ternary
0(4){0..3}
...
0o{0..7} 0(8){0..7} octal
{0..9} 0(10){0..9} decimal
0x{0..9|A..F} 0(16){0..9|A..F} hexadecimal
...
0(36){0..9|A..Z} base 36
Some examples of various number bases, with our familiar decimal equivalents on the right:
#FE -- 254
#fe -- 254
0xA000 -- 40,960
0(16)FFFF00008 -- 68,718,428,168 (which Phix must store as an atom)
#123456789ABCD -- 320,255,973,501,901 ("")
-#10 -- -16
0d10 -- 10 (ie, explicitly decimal)
0o377 -- 255 (octal, not compatible with Euphoria)
0t377 -- 255 (Euphoria uses t for octal, so Phix supports that too)
0b111 -- 7 (binary)
0(7)10 -- 7 (for even more examples see test\t37misc.e)
The main idea is that Phix can handle almost all number formats ("as found on'tinternet") with minimal editing.
Notable and deliberate exceptions are the "trailing h" of assembly and C, and the cruel and mischievous practical joke that is
"leading 0 is octal" of C (unless it has a trailing h or u, and, I kid thee not, 07 and below are fine but 08 and 09 are invalid,
and of course that exact same nonsense all occurs in JavaScript etc). While the core language itself does not [yet] go that far,
several standard routines support number bases up to 62, such as printf(), scanf(),
mpz_set_str() and mpz_get_str().
There is one known minor gotcha (hardly worth mentioning really) in that "a=1b=2" compiles fine whereas "a=0b=1" does not, with the same problem when b is replaced with any identifier starting with any of b/t/o/x/d (including type/procedure names, break, then, object, do, etc) - and obviously this can be cured completely by the trivial insertion of a single space, linebreak, or semicolon.