utf16_to_utf32
Definition: |
dword_sequence utf32 = utf16_to_utf32(sequence utf16)
-- or -- string utf8 = utf16_to_utf8(sequence utf16) |
Description: | Convert a UTF-16 sequence to UTF-32, or directly to UTF-8.
Returns a dword_sequence (utf16_to_utf32) or a string (utf16_to_utf8). |
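Example: |
A minimal sketch of basic usage, converting "Hi" followed by the surrogate pair for U+1F600 (the expected values shown are hand-computed, not verified output):

sequence utf16 = {#48, #69, #D83D, #DE00}   -- 'H', 'i', high and low surrogate
sequence utf32 = utf16_to_utf32(utf16)
?utf32   -- should show {72,105,128512}, ie {#48,#69,#1F600} |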
pwa/p2js: |
Not supported, mainly because I cannot think of any good reason to support utf16 in a browser 😀. It would probably be fairly easy, apart from the fact that p2js.js (now) uses codePointAt() in preference to charCodeAt(). |
Comments: |
The input should not contain any elements outside the range 0..#FFFF, or any unmatched surrogates (values in the range
#D800..#DFFF must occur in matching high/low pairs). Likewise the output should never contain any elements outside the
range 0..#10FFFF, or any values in the range #D800..#DFFF: any such erroneous values are replaced with the element value #FFFD.
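For instance (a sketch of the expected behaviour, with hand-computed rather than verified values), an unmatched high surrogate in the middle of otherwise plain input should be replaced in situ:

sequence utf32 = utf16_to_utf32({#48, #D800, #69})   -- #D800 has no matching low surrogate
-- utf32 should be {#48, #FFFD, #69}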
Unlike utf8_to_utf32() there is no optional fail_flag: when necessary you must manually check the result for the presence of #FFFD, rather than test for a return value of -1. Note, however, that the input can legally contain #FFFD, which is returned unaltered, so strictly speaking it is having more #FFFD in the output than there were in the input that constitutes an error.
The function utf16_to_utf8(sequence utf16) is a simple wrapper that returns utf32_to_utf8(utf16_to_utf32(utf16)). |
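The sort of manual check just described might look like the following sketch, where count_fffd() is a hypothetical helper written for illustration, not part of the library:

function count_fffd(sequence s)
    integer n = 0
    for i=1 to length(s) do
        if s[i]=#FFFD then n += 1 end if
    end for
    return n   -- number of replacement characters in s
end function

sequence utf16 = {#FFFD, #D800},   -- one legal #FFFD, one unmatched surrogate
         utf32 = utf16_to_utf32(utf16)
if count_fffd(utf32)>count_fffd(utf16) then
    -- at least one erroneous input value was replaced during conversion
end if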
See Also: | utfconv, utf8_to_utf32, utf32_to_utf8, utf32_to_utf16 |