Ensure proper UTF-8 encoding (1 to 4 bytes).
Handle invalid encodings (return 0xFFFD and consume a single byte)
Individually encoded surrogate code points are accepted.
- add `utf8_scan()` to analyze a byte array for UTF-8 contents
detects invalid encoding, computes number of codepoints and content kind:
plain ASCII, 8-bit, 16-bit or larger codepoints.
- add `utf8_encode_len(c)` to compute the number of bytes to encode `c`
- rename `unicode_to_utf8` as `utf8_encode`
- rename `unicode_from_utf8` as `utf8_decode`
- add `utf8_decode_buf8(dest, size, src, len)` to decode a UTF-8 encoded
byte array known to contain only ASCII and 8-bit codepoints.
- add `utf8_decode_buf16(dest, size, src, len)` to decode a UTF-8 encoded
byte array into an array of 16-bit codepoints using UTF-16 surrogate pairs
for non-BMP1 codepoints.
- add `utf8_encode_buf8(dest, size, src, len)` to encode an array of 8-bit
codepoints as a UTF-8 encoded null terminated string
- add `utf16_encode_buf8(dest, size, src, len)` to decode an array of 16-bit
codepoints (including surrogate pairs) as a UTF-8 encoded null terminated string
- detect invalid UTF-8 encoding in RegExp parser
- simplify `JS_AtomGetStrRT`, `JS_NewStringLen` using the above functions
- simplify UTF-8 decoding and error testing
* Add utility functions, improve integer conversion functions
- move `is_be()` to cutils.h
- add `is_upper_ascii()` and `to_upper_ascii()`
- add extensive benchmark for integer conversion variants in **tests/test_conv.c**
- add `u32toa()`, `i32toa()`, `u64toa()`, `i64toa()` based on register shift variant
- add `u32toa_radix()`, `u64toa_radix()`, `i64toa_radix()` based on length_loop variant
- use direct converters instead of `snprintf()`
- copy NaN and Infinity directly in `js_dtoa1()`
- optimize `js_number_toString()` for small integers
- use `JS_NewStringLen()` instead of `JS_NewString()` when possible
- add more precise conversion tests in microbench.js
- disable some benchmark tests for gcc (they cause ASAN failures)