
printf

Definition: printf(integer fn, string fmt, object args={})
Description: fn: a file or device, typically 1 (stdout) for console apps, or a result from open().
fmt: a format string, eg "Hello %s\n".
args: the object(s) to be printed.

pwa/p2js: Supported, with some minor differences (see sprintf), and no support for unicode_align (which was really only ever meant to align terminal output anyway).
Comments: If args is an atom then all formats in fmt are applied to it.
If args is a sequence, then formats in fmt are applied to successive elements.
Note that printf() takes at most 3 arguments; however, the length of the last argument, containing the values to be printed, can vary.
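
As a minimal sketch of the atom-versus-sequence rule above (the value 10 is purely illustrative), the following two calls should print the same line:

```phix
-- An atom is applied to every format in fmt:
printf(1, "%d (%08x, %b)\n", 10)
-- A sequence supplies one element per format, in order:
printf(1, "%d (%08x, %b)\n", {10, 10, 10})
```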

The basic formats are:

%s - print a utf-8 or ansi string or sequence as a string of characters, or print an atom as a single character
%q - as %s but enclosed in quotes, using backticks instead of doublequotes when advantageous
%Q - as %q but always using doublequotes with backslash-escaped characters when necessary
%t - print "true" or "false" from a boolean, from an args[i] of non-zero or zero respectively
%c - print an atom as a single character. NB: not unicode, performs and_bits(args[i],#FF)
%v - print the string result from sprint(args[i])
%V - as %v but without characters. Prints {65,66,67} as-is, whereas %v prints {65'A',66'B',67'C'}.
%d - print an atom as a decimal integer (nb. truncated rather than rounded)
%x - print an atom as a hexadecimal integer (0..9 and A..F)
%X - as %x but using lower case(!!) letters (a..f)
%o - print an atom as an octal integer
%b - print an atom as a binary integer
%a - print in base 2..36 using 0-9, a-z. Corresponding argument must be a {base,num} pair.
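%A - print in base 2..62 (2..36 under p2js) using 0-9, A-Z, a-z. Corresponding argument must likewise be a {base,num} pair. For bases 2..36, %A gives the same result as upper(%a); under p2js.js a base above 36 crashes.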
%e - print an atom as a floating point number with exponential notation
%E - as %e but with a capital E
%f - print an atom as a floating-point number with a decimal point but no exponent
%g - print an atom as a floating point number using either the %f or %e format, whichever seems more appropriate
%G - as %g except %E instead of %e
%% - print the '%' character itself
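
A few of the basic formats side by side, as a rough sketch. The expected results in the comments are taken from the descriptions above rather than from a test run, and the values are illustrative only:

```phix
-- %v and %V take one object each, hence the extra pair of braces:
printf(1, "%v\n", {{65,66,67}})     -- {65'A',66'B',67'C'}
printf(1, "%V\n", {{65,66,67}})     -- {65,66,67}
-- %t maps non-zero/zero to true/false:
printf(1, "%t %t\n", {1, 0})        -- true false
-- %c prints an atom as a single character:
printf(1, "%c\n", 65)               -- A
-- %% prints a literal percent sign:
printf(1, "%d%%\n", 50)             -- 50%
```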

Field widths can be added to the basic formats, e.g. %5d, %8.2f, %10.4s.
The number before the decimal point is the minimum field width to be used.
The number after the decimal point is the precision to be used.

For %f and %e, the precision specifies how many digits follow the decimal point character, whereas
for %g, the precision specifies how many significant digits to print, and
for %s the precision specifies how many (the maximum number of) characters to print.
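
The field width and precision rules above can be sketched as follows (square brackets are added only to make the padding visible):

```phix
printf(1, "[%5d]\n", 42)                -- [   42]       minimum width 5
printf(1, "[%8.2f]\n", 3.14159)         -- [    3.14]    width 8, 2 decimals
printf(1, "[%10.4s]\n", {"ABCDEFGH"})   -- [      ABCD]  width 10, max 4 chars
printf(1, "[%.3g]\n", 1234.5678)        -- 3 significant digits
```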

For floating point formats (%e, %f and %g), attempts to specify a precision over 20 trigger a format error, since that would exceed the precision limits of IEEE 754 floating point numbers on 64-bit, and any further digits would essentially just be gibberish. See mpfr if you need higher precision than native atoms allow. On 32-bit, precisions between 17 and 20 are quietly truncated to 16, to match the IEEE 754 precision limit on 32-bit. In that way eg "%24.20f" shows the maximum sensible precision on both 32 and 64 bit, albeit with extra padding on 32 bit. I have found that in practice a field width of precision+4 can sometimes align positive and negative numbers more nicely, however ymmv.

If the field width starts with a leading 0, e.g. %08d then leading zeros will be supplied to fill up the field.
If the field width starts with a '+' (not %s, %c, or %v) e.g. %+7d then a plus sign will be printed for positive values.
If the field width starts with a '-' e.g. %-5d then the value will be left-justified within the field. Normally it will be right-justified.
If the field width starts with '=' e.g. %=8s then it will be centred. A '|' is similar except it splits odd padding (eg) 4:3 whereas '=' splits it 3:4.
If the field width starts with a ',' (%d and %f only) then commas are inserted every third character (Phix-specific).
Strictly speaking, 'starts with' is not entirely accurate, since some of these flags can be combined. The alignment flags '-', '=' and '|' are mutually exclusive (at most one may be given), and none of them can co-exist with '0'.
Likewise '0' and '+' are mutually exclusive, however '+' can co-exist with '-', '=' or '|' as long as it is specified first, whereas ',' can be used with any of '0+-=|', as long as it is specified last.
You can actually zero-fill the string formats ('s','c', and 'v') if you want, not that I can think of any good reason to.
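
A short sketch of the individual flags described above. The square brackets only make the padding visible, and the centred example uses an even amount of padding so that the '=' versus '|' split does not matter:

```phix
printf(1, "[%08d]\n", 42)          -- [00000042]   zero-filled
printf(1, "[%+7d]\n", 42)          -- [    +42]    explicit plus sign
printf(1, "[%-5d]\n", 42)          -- [42   ]      left-justified
printf(1, "[%=8s]\n", {"abcd"})    -- [  abcd  ]   centred
printf(1, "[%,d]\n", 1234567)      -- [1,234,567]  comma-separated
```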

A statement such as printf(1,"%s","John Smith") should by rights just print 'J', as the %s should apply to the first element of args, which in this case is 'J'. The correct statement is printf(1,"%s",{"John Smith"}) where no such confusion can arise.
However this is such an easy mistake to make that in Phix it is caught specially (args is a single string and fmt has only one %-format, apart from any %%) and the full name printed.
Note however that Euphoria will just print 'J'.
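
To illustrate, a minimal sketch of the safe and the specially-caught forms (the variable name is illustrative only):

```phix
string name = "John Smith"
-- Always unambiguous, and portable to Euphoria:
printf(1, "%s\n", {name})
-- Caught specially in Phix (single string, single %-format),
-- so this also prints the full name; Euphoria would print just J:
printf(1, "%s\n", name)
-- With more than one format the braces are required anyway:
printf(1, "%s scored %d\n", {name, 97})
```

Wrapping the string in braces every time costs nothing and avoids relying on the special case.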

%q uses backticks in preference to doublequotes when the string contains one or more singlequote, doublequote, or (single) backslash characters, but not when it contains none of them, or when it contains other characters which need backslash escapes, such as '\n', which cannot be represented within backticks. Note that it prefers doublequotes when it would otherwise make no difference - see the "prefer_backtick" setting below.

Unicode is supported via utf-8 strings, which this routine treats exactly the same as ansi.
%c does not support the printing of single unicode characters, but instead performs and_bits(a,#FF) and prints it as a standard ascii character.
To print a single unicode character, held in the integer uchar, I recommend using %s on the result from utf32_to_utf8({uchar}).
To print a utf-32 or utf-16 sequence it should first be passed through utf32_to_utf8() or utf16_to_utf8() respectively.
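
The recommendations above might look something like this in practice. The code points are chosen purely for illustration, and what actually appears on screen depends on the console or terminal being set up for utf-8:

```phix
-- A single unicode character held in an integer:
integer uchar = #20AC                       -- euro sign
printf(1, "%s\n", {utf32_to_utf8({uchar})})

-- A utf-32 sequence is converted the same way before printing:
sequence s32 = {#3B1, #3B2, #3B3}           -- Greek alpha, beta, gamma
printf(1, "%s\n", {utf32_to_utf8(s32)})
```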

Note that %x and %X are "the wrong way round" for historical/compatibility reasons.

The argument for %a/%A must be a {base,num} pair, eg printf(1,"%d is %a in base %d",{i,{b,i},b}).
Note that while eg sprintf("%d",i) works like sprintf("%d",{i}), and also sprintf("%d (%08x, %b)",i) works like sprintf("%d (%08x, %b)",{i,i,i}), the same trick(s) cannot be applied to %a/%A.
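
For instance, a small sketch with a fixed base of 16 (the value 255 is illustrative only). The expected results in the comments follow from the digit sets listed for %a and %A above:

```phix
integer i = 255
-- The {base,num} pair is a single element of args:
printf(1, "%d is %a in base %d\n", {i, {16, i}, 16})   -- 255 is ff in base 16
printf(1, "%d is %A in base %d\n", {i, {16, i}, 16})   -- 255 is FF in base 16
```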

Should you have gotten used to these in some other language’s version of (s)printf(), be advised that "%#x" is not supported, use "0x%x" instead, likewise replace "%#o" with "0%o", ie explicitly provide any required (fixed) prefix yourself, then print the number normally.
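
A minimal sketch of supplying the prefix explicitly (values are illustrative only):

```phix
printf(1, "0x%x\n", 255)    -- 0xFF  (instead of "%#x")
printf(1, "0%o\n", 8)       -- 010   (instead of "%#o")
```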
Settings: [Phix only, there is no equivalent for Euphoria]
If called with fn of 0 (stdin, which cannot be printed to anyway) and a fmt of "" (so there wouldn’t be any output anyway), then args is treated as a pair-list of settings, eg

            printf(0,"",{"unicode_align",true,"prefer_backtick",true})

"unicode_align" expects a bool and controls whether utf8_to_utf32() is invoked when padding to the minimum field width, which obviously makes a big difference when aligning unicode strings being displayed to a console/terminal, but is a completely unnecessary overhead in legacy/ansi-only code. In fact, making the default false significantly sped up some of the listing and ex.err file generation in the Phix compiler itself, or perhaps more accurately undid a fairly noticeable and starting-to-get-annoying performance drop, specifically on error, when writing out multi-megabyte ex.err files.

"prefer_backtick" expects a bool and controls whether eg sprintf("%q",{"hello"}) yields `hello` or "hello". Note that doublequotes will still be used if the string contains any characters which must be escaped, for instance '\n' cannot be represented in a backtick-enclosed string. The default is false, ie prefer doublequotes unless backticks are advantageous, as they are in eg `name="pete"`, that is compared to "name=\"pete\"". Obviously this setting has no effect whatsoever on the "%Q" format, or for that matter any others.

At the moment those are the only options implemented. A fatal error occurs if an odd-numbered element (a setting name) is not a recognised string, or an even-numbered element (its value) is the wrong type.
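
As a rough sketch of how a setting might be toggled around a piece of output, assuming the defaults described above are in effect beforehand (expected results in the comments follow the "prefer_backtick" description rather than a test run):

```phix
printf(1, "%q\n", {"hello"})                  -- "hello"  (default)
printf(0, "", {"prefer_backtick", true})
printf(1, "%q\n", {"hello"})                  -- `hello`
printf(1, "%Q\n", {"hello"})                  -- "hello"  (%Q is unaffected)
printf(0, "", {"prefer_backtick", false})     -- restore the default
```

Restoring the previous value straight away keeps the change from leaking into unrelated output elsewhere in the program.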

Note that settings are not thread-safe (as in setting it on/off applies instantly to all threads, and if one is already in progress it could end up getting done half on/half off). Then again, while I have tried my best to make printf() as thread-safe as possible, it should probably be avoided in a background worker thread if at all possible, or perhaps only performed under the protection of a critical section.
Example 1:
balance = 12347.879
printf(myfile, "The account balance is: %,10.2f\n", balance)
      The account balance is:  12,347.88
Example 2:
name = "John Smith"
score = 97
printf(1, "|%15s, %5d |\n", {name, score})
      |     John Smith,    97 |
Example 3:
printf(1, "%-10.4s $ %s", {"ABCDEFGHIJKLMNOP", "XXX"})
      ABCD       $ XXX
Example 4:
printf(1, "error code %d[#%08x]", ERROR_CANCELLED)
      error code 2147943623[#800704C7]
See Also: sprintf, puts, open, and the gnu clib docs, on which the Phix version is partially based, but does not use directly.
puthex32(a) and putsint(i) are low-level equivalents of printf(1,"%08x[\n]",{a}) and printf(1,"%d[\n]",{i}) respectively, see builtins\puts1[h].e
Implementation: See builtins\VM\pprntfN.e (an autoinclude) for details of the actual implementation.