printf
Definition: | printf(integer fn, string fmt, object args={}) |
Description: |
fn: a file or device, typically 1 (stdout) for console apps, or a result from open().
fmt: a format string, eg "Hello %s\n".
args: the object(s) to be printed. |
pwa/p2js: | Supported, with some minor differences (see sprintf), and no support for unicode_align (which was really only ever meant to align terminal output anyway). |
Comments: |
If args is an atom then all formats in fmt are applied to it. If args is a sequence, then formats in fmt are applied to successive elements. Note that printf() takes at most 3 arguments, however the length of the last argument, containing the values to be printed, can vary.

The basic formats are:
%s - print a utf-8 or ansi string or sequence as a string of characters, or print an atom as a single character
%q - as %s but enclosed in quotes, using backticks instead of doublequotes when advantageous
%Q - as %q but always using doublequotes with backslash-escaped characters when necessary
%t - print "true" or "false" from a boolean, from an args[i] of non-zero or zero respectively
%n - print "\n" or "" from a boolean, from an args[i] of non-zero or zero respectively
%c - print an atom as a single character. Integers #80..#10FFFF are converted to UTF-8 strings.
%v - print the string result from sprint(args[i],asCh:=-1)
%V - as %v but with characters/asCh:=false. Prints {65,66,67} as {65'A',66'B',67'C'}, whereas %v prints {65,66,67}.
%d - print an atom as a decimal integer (nb. truncated rather than rounded)
%x - print an atom as a hexadecimal integer (0..9 and A..F)
%X - as %x but using lower case(!!) letters (a..f)
%o - print an atom as an octal integer
%O - as %o but with a leading 0o prefix
%b - print an atom as a binary integer
%a - print in base 2..36 using 0-9, a-z. Corresponding argument must be a {base,num} pair.
%A - print in base 2..62(|36) using 0-9, A-Z, a-z, ditto. %A(2..36)===upper(%a).
%e - print an atom as a floating point number with exponential notation
%E - as %e but with a capital E
%f - print an atom as a floating-point number with a decimal point but no exponent
%F - as %f but trailing zeroes/decimal point blanked out, eg "1.1 " not "1.10", "1   " not "1.00", "1000" as-is (all length 4).
%g - print an atom as a floating point number using either the %f or %e format, whichever seems more appropriate
%G - as %g except %E instead of %e
%R - print 1..3999 (nb those are hard limits) as a roman numeral, eg 7 -> VII
%r - as %R except lowercase, eg 7 -> vii
%% - print the '%' character itself

Format specifiers may be explicitly subscripted, which simply modifies the current index into args, which then continues to increment normally. For example "%d %[3]d %d %[7]d %d" is equivalent to "%[1]d %[3]d %[4]d %[7]d %[8]d", and of course both will crash should the length of args be less than 8, ie should the index be let run beyond the length of args. Negative subscripts are also valid, eg "%[-2]d %d" == "%[-2]d %[-1]d", and again it will crash should the index be let run to zero. A subscript is specified by an opening '[' immediately after the '%', followed by a fixed integer and then a closing ']'.

Whilst not actually implemented in such a manner, an alternative way of describing what happens is that sprintf("%[3]02d/%[2]02d/%[1]4d",args) is functionally equivalent to sprintf("%02d/%02d/%4d",extract(args,{3,2,1})), and of course you are free to use something more like the latter, however there are situations where that would be more difficult to use. Subscripts [as just shown] can be particularly useful for formatting the output of date() without having to reorder or extract() it, and consequently they can be very handy as IupTable format strings.

Field widths can be added to the basic formats, e.g. %5d, %8.2f, %10.4s. The number before the decimal point is the minimum field width to be used. The number after the decimal point is the precision to be used. For %f and %e, the precision specifies how many digits follow the decimal point character, whereas for %g, the precision specifies how many significant digits to print, and for %s the precision specifies the maximum number of characters to print.

For floating points (%e|f|g), attempts to specify a precision over 20 trigger a format error, since that would exceed the precision limits of IEEE 754 floating point numbers on 64-bit, and any further digits would essentially just be gibberish. See mpfr if you need higher precision than native atoms allow. On 32-bit, precisions between 17 and 20 are quietly truncated to 16, to match the IEEE 754 precision limit on 32-bit, so in that way eg "%24.20f" shows the maximum sensible precision for both 32 and 64 bit (albeit with extra padding on 32 bit, and I have found that in practice a field width of precision+4 can sometimes more nicely align +ve and -ve numbers, however ymmv).

If the field width starts with a '0' e.g. %08d then leading zeros are supplied to fill up the field.
If the field width starts with a '+' e.g. %+7d then a plus sign is printed for positive values (not %s, %c, or %v).
If the field width starts with a '_' e.g. %_7d then a space is printed for non-negative values (ditto, Phix-specific).
If the field width starts with a '-' e.g. %-5d then the value is left-justified within the field. Normally it is right-justified.
If the field width starts with a '=' e.g. %=8s then it is centred.
A '|' is similar except it splits odd padding (eg) 4:3 whereas '=' splits it 3:4.
If the field width starts with a ',' then commas are inserted every third character (Phix-specific, %d and %f only).

The 'starts with' is not entirely accurate. Note that '-=|' (and none) are mutually exclusive, and cannot co-exist with '0'. Likewise '0' and '+' are also mutually exclusive, however '+' (or '_') can co-exist with '-=|' as long as it is specified first, whereas ',' can be used with any ('0+-=|'), as long as it is specified last. You can actually zero-fill the string formats ('s', 'c', and 'v') if you want, not that I can think of any good reason to.

A statement such as printf(1,"%s","John Smith") should by rights just print 'J', as the %s should apply to the first element of args, which in this case is 'J'. The correct statement is printf(1,"%s",{"John Smith"}) where no such confusion can arise. However this is such an easy mistake to make that in Phix it is caught specially (args is a single string and fmt has only one %-format, apart from any %%) and the full name printed. Note however that Euphoria will just/only print the 'J'.

%q uses backticks in preference to doublequotes when the string contains one or more singlequote, doublequote, or (single) backslash, but not when it contains none of them or when it contains other characters which need backslashes, such as '\n', which cannot be represented within backticks. Note that it prefers doublequotes when it would otherwise make no difference - see the "prefer_backtick" setting below.

Unicode is supported via utf-8 strings, which this routine treats exactly the same as ansi. To print a utf-32 or utf-16 sequence it should first be passed through utf32_to_utf8() or utf16_to_utf8() respectively.
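The flag rules and the string-argument caveat above can be tried with a short sketch such as the following. The values are purely illustrative, and the trailing comments only restate what each flag is documented to do rather than promising exact output.

```phix
-- Illustrative values only; each line exercises one flag described above.
printf(1, "[%08d]\n", {42})         -- '0': leading zeros fill the field
printf(1, "[%+7d]\n", {42})         -- '+': plus sign for positive values
printf(1, "[%-5d]\n", {42})         -- '-': left-justified within the field
printf(1, "[%=8s]\n", {"ab"})       -- '=': centred within the field
printf(1, "[%,d]\n", {1234567})     -- ',': commas every third digit
-- Wrapping the string in braces makes %s apply to the whole string,
-- which behaves the same way on Phix and Euphoria.
printf(1, "%s\n", {"John Smith"})
```

Wrapping a single string in braces costs nothing and avoids relying on the Phix-only special case.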
Note the standard Windows console does not render unicode well, for instance (taking Phix out of the equation completely) run `notepad demo\HelloUTF8.exw` and `type demo\HelloUTF8.exw | more` alongside that and compare what they look like screen-by-screen. Obviously it is simply not possible to make printf(1,<utf8>) achieve anything better on a Windows console. Linux & Browsers do much better. Actually, one thing we can and now do is [the equivalent of] `chcp 65001`: run that before running the type|more to get slightly better but still not exactly stellar results, which Phix should match (but only if you have saved the source code in UTF8 rather than ascii, obviously).

Note that %x and %X are "the wrong way round" for historical/compatibility reasons.

The argument for %a/%A must be a {base,num} pair, eg printf(1,"%d is %a in base %d",{i,{b,i},b}). Note that while eg sprintf("%d",i) works like sprintf("%d",{i}), and also sprintf("%d (%08x, %b)",i) works like sprintf("%d (%08x, %b)",{i,i,i}), the same trick(s) cannot be applied to %a/%A.

Should you have gotten used to these in some other language’s version of (s)printf(), be advised that "%#x" is not supported, use "0x%x" instead, likewise replace "%#o" with "0%o", ie explicitly provide any required (fixed) prefix yourself, then print the number normally. |
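As a small sketch of the last two points in the Comments above, the following passes a {base,num} pair to %a and writes the radix prefixes out by hand. The variable n and its value are hypothetical, chosen only for illustration.

```phix
integer n = 255     -- hypothetical sample value
-- %a needs an explicit {base,num} pair, even when the same number
-- is also printed with %d elsewhere in the same format string.
printf(1, "%d is %a in base %d\n", {n, {16, n}, 16})
-- "%#x" and "%#o" are not supported, so supply the fixed prefix yourself.
printf(1, "0x%x\n", {n})
printf(1, "0%o\n", {n})
```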
Settings: |
[Phix only, there is no equivalent for Euphoria]

If called with fn of 0 (stdin, which cannot be printed to anyway) and a fmt of "" (so there wouldn’t be any output anyway), then args is treated as a pair-list of settings, eg
printf(0,"",{"unicode_align",true,"prefer_backtick",true})

"unicode_align" expects a bool and controls whether utf8_to_utf32() is invoked when padding to the minimum field width, which obviously makes a big difference when aligning unicode strings being displayed to a console/terminal, but is a completely unnecessary overhead in legacy/ansi-only code. In fact, making the default false significantly sped up some of the listing and ex.err file generation in the Phix compiler itself, or perhaps more accurately undid a fairly noticeable and starting-to-get-annoying performance drop, specifically on error, when writing out multi-megabyte ex.err files.

"prefer_backtick" expects a bool and controls whether eg sprintf("%q",{"hello"}) yields `hello` or "hello". Note that doublequotes will still be used if the string contains any characters which must be escaped, for instance '\n' cannot be represented in a backtick-enclosed string. The default is false, ie prefer doublequotes unless backticks are advantageous, as they are in eg `name="pete"`, that is compared to "name=\"pete\"". Obviously this setting has no effect whatsoever on the "%Q" format, or for that matter any others.

At the moment those are the only options implemented. A fatal error occurs if the odd element is not a recognised string or the even element is the wrong type.

Note that settings are not thread-safe (as in setting it on/off applies instantly to all threads, and if one is already in progress it could end up getting done half on/half off). Then again, while I have tried my best to make printf() as thread-safe as possible, it should probably be avoided in a background worker thread if at all possible, or perhaps only performed under the protection of a critical section. |
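A minimal sketch of toggling a setting, assuming desktop Phix (not Euphoria). It switches "prefer_backtick" on, prints a plain string with %q, then restores the documented default.

```phix
-- Settings call: fn of 0 and an empty fmt, args is a {name,value} pair-list.
printf(0, "", {"prefer_backtick", true})
printf(1, "%q\n", {"hello"})    -- per the description above: `hello`
-- Put the documented default (false) back, since the setting is global.
printf(0, "", {"prefer_backtick", false})
printf(1, "%q\n", {"hello"})    -- per the description above: "hello"
```

Because settings apply to all threads at once, a toggle like this is best done once at startup rather than around individual calls.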
Example 1: |
balance = 12347.879
printf(myfile, "The account balance is: %,10.2f\n", balance)
The account balance is:  12,347.88 |
Example 2: |
name = "John Smith"
score = 97
printf(1, "|%15s, %5d |\n", {name, score})
|     John Smith,    97 | |
Example 3: |
printf(1, "%-10.4s $ %s", {"ABCDEFGHIJKLMNOP", "XXX"})
ABCD       $ XXX |
Example 4: |
printf(1, "error code %d[#%08x]", ERROR_CANCELLED)
error code 2147943623[#800704C7] |
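The explicit subscripts described under Comments can be sketched in the same way. Here d is a hypothetical {year, month, day} triple standing in for the leading elements of a date() result, and the second call shows the extract() form that the Comments describe as functionally equivalent.

```phix
sequence d = {2024, 3, 9}   -- hypothetical {year, month, day}
-- Subscripts pick the elements out of order: day, month, then year.
printf(1, "%[3]02d/%[2]02d/%[1]4d\n", d)                -- 09/03/2024
-- The same result by reordering the arguments first.
printf(1, "%02d/%02d/%4d\n", extract(d, {3, 2, 1}))     -- 09/03/2024
```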
See Also: |
sprintf, puts, open, and the gnu clib docs, on which the Phix version is partially based, but does not use directly.
puthex32(a) and putsint(i) are low-level equivalents of printf(1,"%08x[\n]",{a}) and printf(1,"%d[\n]",{i}) respectively, see builtins\puts1[h].e |
Implementation: |
See builtins\VM\pprntfN.e (an autoinclude) for details of the actual implementation. The file pwa/p2js.js also contains a completely independent hand-crafted version. |