
utf16_to_utf32

Definition: dword_sequence utf32 = utf16_to_utf32(sequence utf16)
-- or --
string utf8 = utf16_to_utf8(sequence utf16)
Description: Convert a UTF-16 sequence to UTF-32.

Returns a dword_sequence.
pwa/p2js: Not supported, mainly because I cannot think of any good reason to support utf16 in a browser 😀.
It would probably be fairly easy, apart from the fact that p2js.js (now) uses codePointAt() in preference to charCodeAt().
Comments: The input should not contain any elements outside the range 0..#FFFF, nor any unmatched surrogates: values in the range #D800..#DFFF must occur in matching pairs. Likewise, the output should not contain any elements outside the range 0..#10FFFF, nor any values in the range #D800..#DFFF. Any such erroneous value is replaced with #FFFD.
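
The replacement rules above can be illustrated with a short sketch. It is written in Python rather than Phix and is not the library's actual source: the name utf16_to_utf32_sketch is hypothetical, and it assumes that an unmatched high surrogate yields a single #FFFD without consuming the element that follows it, a detail the text above does not specify.

```python
REPLACEMENT = 0xFFFD  # the #FFFD replacement value named in the text


def utf16_to_utf32_sketch(utf16):
    """Decode a list of UTF-16 code units into a list of code points.

    Hypothetical illustration of the documented rules, not the Phix code.
    """
    utf32 = []
    i = 0
    while i < len(utf16):
        unit = utf16[i]
        i += 1
        if not 0 <= unit <= 0xFFFF:
            # Not a valid 16-bit code unit at all.
            utf32.append(REPLACEMENT)
        elif 0xD800 <= unit <= 0xDBFF:
            # High surrogate: must be followed at once by a low surrogate.
            if i < len(utf16) and 0xDC00 <= utf16[i] <= 0xDFFF:
                low = utf16[i]
                i += 1
                utf32.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            else:
                # Assumption: the following element is left for the next pass.
                utf32.append(REPLACEMENT)
        elif 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            utf32.append(REPLACEMENT)
        else:
            # Ordinary BMP value, including a legitimate #FFFD in the input.
            utf32.append(unit)
    return utf32
```

For instance, under these assumptions the matched pair #D83D,#DE00 combines into the single value #1F600, whereas a lone #D800 comes back as #FFFD.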

Unlike utf8_to_utf32() there is no optional fail_flag and no return of -1: when necessary you must manually check the result for the presence of #FFFD.

Note, however, that the input can legally contain #FFFD, which is returned unaltered, so strictly speaking an error is indicated by the output containing more #FFFD than the input did.
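
A minimal sketch of that check, again in Python rather than Phix: the helper name conversion_had_errors is hypothetical, and it simply assumes you still hold both the original UTF-16 sequence and the converted result.

```python
REPLACEMENT = 0xFFFD  # the #FFFD replacement value named in the text


def conversion_had_errors(utf16_in, utf32_out):
    """Return True if the output holds more #FFFD values than the input.

    Hypothetical helper: a #FFFD already present in the input is legal and
    passes through unaltered, so only the surplus indicates an error.
    """
    return utf32_out.count(REPLACEMENT) > utf16_in.count(REPLACEMENT)
```

Comparing counts, rather than merely searching the output for #FFFD, avoids a false alarm when the input already contained a legitimate #FFFD.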

The function utf16_to_utf8(sequence utf16) is a simple wrapper that returns utf32_to_utf8(utf16_to_utf32(utf16)).
See Also: utfconv, utf8_to_utf32, utf32_to_utf8, utf32_to_utf16