Strings are primitive values in JavaScript and immutable. That is, string-related operations always produce new strings and never change existing strings.
Literals for strings:
const str1 = 'Don\'t say "goodbye"'; // string literal
const str2 = "Don't say \"goodbye\""; // string literal
assert.equal(
`As easy as ${123}!`, // template literal
'As easy as 123!',
);
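Backslashes are used to escape literal delimiters (as in the first two lines of the previous example) and to represent special characters:
\\ represents a backslash
\n represents a newline
\r represents a carriage return
\t represents a tab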
Inside a String.raw tagged template (line A), backslashes are treated as normal characters:
assert.equal(
String.raw`\ \n\t`, // (A)
'\\ \\n\\t',
);
Converting values to strings:
> String(undefined)
'undefined'
> String(null)
'null'
> String(123.45)
'123.45'
> String(true)
'true'
Copying parts of a string
// There is no type for characters;
// reading characters produces strings:
const str3 = 'abc';
assert.equal(
str3[2], 'c' // no negative indices allowed
);
assert.equal(
str3.at(-1), 'c' // negative indices allowed
);
// Copying more than one character:
assert.equal(
'abc'.slice(0, 2), 'ab'
);
Concatenating strings:
assert.equal(
'I bought ' + 3 + ' apples',
'I bought 3 apples',
);
let str = '';
str += 'I bought ';
str += 3;
str += ' apples';
assert.equal(
str, 'I bought 3 apples',
);
Example – a grapheme cluster that consists of multiple code points:
const graphemeCluster = '😵\u200D💫'; // two emojis joined by a zero-width joiner
assert.equal(
  // 5 JavaScript characters
  graphemeCluster.length, 5
);
assert.deepEqual(
// Iteration splits into code points
Array.from(graphemeCluster),
['😵', '\u200D', '💫']
);
For more information on how to handle text, see “Atoms of text: code points, JavaScript characters, grapheme clusters” (§22.7).
This subsection gives a brief overview of the string API. There is a more comprehensive quick reference at the end of this chapter.
Finding substrings:
> 'abca'.includes('a')
true
> 'abca'.startsWith('ab')
true
> 'abca'.endsWith('ca')
true
> 'abca'.indexOf('a')
0
> 'abca'.lastIndexOf('a')
3
Splitting and joining:
assert.deepEqual(
'a, b,c'.split(/, ?/),
['a', 'b', 'c']
);
assert.equal(
['a', 'b', 'c'].join(', '),
'a, b, c'
);
Padding and trimming:
> '7'.padStart(3, '0')
'007'
> 'yes'.padEnd(6, '!')
'yes!!!'
> '\t abc\n '.trim()
'abc'
> '\t abc\n '.trimStart()
'abc\n '
> '\t abc\n '.trimEnd()
'\t abc'
Repeating and changing case:
> '*'.repeat(5)
'*****'
> '= b2b ='.toUpperCase()
'= B2B ='
> 'ΑΒΓ'.toLowerCase()
'αβγ'
Plain string literals are delimited by either single quotes or double quotes:
const str1 = 'abc';
const str2 = "abc";
assert.equal(str1, str2);
Single quotes are used more often because that makes it easier to mention HTML, where double quotes are preferred.
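The next chapter covers template literals, which give us string interpolation, multiline strings, and raw string literals (in which the backslash has no special meaning).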
The backslash lets us create special characters:
'\n'
'\r\n'
'\t'
'\\'
The backslash also lets us use the delimiter of a string literal inside that literal:
assert.equal(
'She said: "Let\'s go!"',
"She said: \"Let's go!\"");
JavaScript has no extra data type for characters – characters are always represented as strings.
const str = 'abc';
// Reading a JavaScript character at a given index
assert.equal(str[1], 'b');
// Counting the JavaScript characters in a string:
assert.equal(str.length, 3);
The characters we see on screen are called grapheme clusters. Most of them are represented by single JavaScript characters. However, there are also grapheme clusters (especially emojis) that are represented by multiple JavaScript characters:
> '🙂'.length
2
How that works is explained in “Atoms of text: code points, JavaScript characters, grapheme clusters” (§22.7).
If at least one operand is a string, the plus operator (+) converts any non-strings to strings and concatenates the result:
assert.equal(3 + ' times ' + 4, '3 times 4');
The assignment operator += is useful if we want to assemble a string, piece by piece:
let str = ''; // must be `let`!
str += 'Say it';
str += ' one more';
str += ' time';
assert.equal(str, 'Say it one more time');
Concatenating via + is efficient
Using + to assemble strings is quite efficient because most JavaScript engines internally optimize it.
Exercise: Concatenating strings
exercises/strings/concat_string_array_test.mjs
Concatenating via Arrays (.push() and .join())
Occasionally, taking a detour via an Array can be useful for concatenating strings – especially if there is to be a separator between them (such as ', ' in line A):
function getPackingList(isAbroad = false, days = 1) {
  const items = [];
  items.push('tooth brush');
  if (isAbroad) {
    items.push('passport');
  }
  if (days > 3) {
    items.push('water bottle');
  }
  return items.join(', '); // (A)
}
assert.equal(
getPackingList(),
'tooth brush'
);
assert.equal(
getPackingList(true, 7),
'tooth brush, passport, water bottle'
);
Converting values to strings in JavaScript is more complicated than it might seem.
Can you spot the problem in the following code?
class UnexpectedValueError extends Error {
  constructor(value) {
    super('Unexpected value: ' + value); // (A)
  }
}
For some values, this code throws an exception in line A:
> new UnexpectedValueError(Symbol())
TypeError: Cannot convert a Symbol value to a string
> new UnexpectedValueError({__proto__:null})
TypeError: Cannot convert object to primitive value
Read on for more information.
There are four common ways of converting a value v to a string:
1. String(v)
2. v.toString()
3. '' + v
4. `${v}`
The following table shows how these operations fare with various values (#4 produces the same results as #3).
v | String(v) | '' + v | v.toString() |
---|---|---|---|
undefined | 'undefined' | 'undefined' | TypeError |
null | 'null' | 'null' | TypeError |
true | 'true' | 'true' | 'true' |
123 | '123' | '123' | '123' |
123n | '123' | '123' | '123' |
"abc" | 'abc' | 'abc' | 'abc' |
Symbol() | 'Symbol()' | TypeError | 'Symbol()' |
{a:1} | '[object Object]' | '[object Object]' | '[object Object]' |
['a'] | 'a' | 'a' | 'a' |
{__proto__:null} | TypeError | TypeError | TypeError |
Symbol.prototype | TypeError | TypeError | TypeError |
() => {} | '() => {}' | '() => {}' | '() => {}' |
Let’s explore why some of these values produce exceptions or results that aren’t very useful.
Symbols must be converted to strings explicitly (via String() or .toString()). Conversion via concatenation throws an exception:
> '' + Symbol()
TypeError: Cannot convert a Symbol value to a string
Why is that? The intent is to prevent accidentally converting a symbol property key to a string (which is also a valid property key).
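As a minimal sketch of explicit conversion in practice, the following helper builds a message from a symbol without triggering the TypeError. The name describeSymbol is hypothetical and only serves this illustration:

```js
// Hypothetical helper: String() converts explicitly,
// so the concatenation never sees a symbol.
function describeSymbol(sym) {
  return 'Key: ' + String(sym);
}

console.log(describeSymbol(Symbol('id'))); // 'Key: Symbol(id)'

// If only the description is needed, .description avoids the wrapper text
console.log(Symbol('id').description); // 'id'
```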
Objects with null prototypes
It’s obvious why v.toString() doesn’t work if there is no method .toString(). However, the other conversion operations call the following methods and use the first primitive value that is returned (after converting it to string). String(v) and template literals try them in the following order, while '' + v tries .valueOf() before .toString():
v[Symbol.toPrimitive]()
v.toString()
v.valueOf()
If none of these methods are present, a TypeError is thrown.
> String({__proto__: null, [Symbol.toPrimitive]() {return 'YES'}})
'YES'
> String({__proto__: null, toString() {return 'YES'}})
'YES'
> String({__proto__: null, valueOf() {return 'YES'}})
'YES'
> String({__proto__: null}) // no method available
TypeError: Cannot convert object to primitive value
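The lookup order can be made visible with a small experiment. This is only a sketch with made-up objects (both and hinted). It relies on the fact that String(v) and template literals request a string (.toString() before .valueOf()), whereas '' + v uses the default hint, which tries .valueOf() first:

```js
// Hypothetical object with both methods but without Symbol.toPrimitive
const both = {
  toString() { return 'via toString'; },
  valueOf() { return 'via valueOf'; },
};
console.log(String(both)); // 'via toString'
console.log(`${both}`);    // 'via toString'
console.log('' + both);    // 'via valueOf'

// Symbol.toPrimitive takes precedence and receives the hint as an argument
const hinted = {
  [Symbol.toPrimitive](hint) { return 'hint: ' + hint; },
  toString() { return 'ignored'; },
};
console.log(String(hinted)); // 'hint: string'
console.log('' + hinted);    // 'hint: default'
```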
Where might we encounter objects with null prototypes?
null prototypes as dictionaries (§30.9.11.4)
null prototypes as fixed lookup tables (§30.9.11.5)
null prototypes in the standard library (§30.9.11.6)
Plain objects have default string representations that are not very useful:
> String({a: 1})
'[object Object]'
Arrays have better string representations, but they still hide much information:
> String(['a', 'b'])
'a,b'
> String(['a', ['b']])
'a,b'
> String([1, 2])
'1,2'
> String(['1', '2'])
'1,2'
> String([true])
'true'
> String(['true'])
'true'
> String(true)
'true'
Symbol.prototype
You’ll probably never encounter the value Symbol.prototype (the object that provides symbols with methods) in the wild, but it’s an interesting edge case: Symbol.prototype[Symbol.toPrimitive]() throws an exception if this isn’t a symbol. That explains why converting Symbol.prototype to a string doesn’t work:
> Symbol.prototype[Symbol.toPrimitive]()
TypeError: Symbol.prototype [ @@toPrimitive ] requires that 'this' be a Symbol
> String(Symbol.prototype)
TypeError: Symbol.prototype [ @@toPrimitive ] requires that 'this' be a Symbol
Using JSON.stringify() to convert values to strings
The JSON data format is a text representation of JavaScript values. Therefore, JSON.stringify() can also be used to convert values to strings. It works especially well for objects and Arrays where the normal conversion to string has significant deficiencies:
> JSON.stringify({a: 1})
'{"a":1}'
> JSON.stringify(['a', ['b']])
'["a",["b"]]'
JSON.stringify() is OK with objects whose prototypes are null:
> JSON.stringify({__proto__: null, a: 1})
'{"a":1}'
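One major downside is that JSON.stringify() only supports the following values:
Primitive values: null, booleans, numbers (except for NaN and Infinity), strings
Non-primitive values: Arrays and plain objects (but not functions)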
For most other values, we get undefined as a result (and not a string):
> JSON.stringify(undefined)
undefined
> JSON.stringify(Symbol())
undefined
> JSON.stringify(() => {})
undefined
Bigints cause exceptions:
> JSON.stringify(123n)
TypeError: Do not know how to serialize a BigInt
Properties with undefined-producing values are omitted:
> JSON.stringify({a: Symbol(), b: 2})
'{"b":2}'
Array elements whose values produce undefined are stringified as null:
> JSON.stringify(['a', Symbol(), 'b'])
'["a",null,"b"]'
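One way to work around these limitations is the optional second parameter of JSON.stringify(), a replacer function that is called for each value before it is serialized. The following is a sketch, not a recommendation: the name lossyReplacer, the 'n' suffix and the use of String() for symbols are assumptions made for this illustration, and the output can no longer be parsed back into the original values.

```js
// Replacer sketch: turn values that JSON cannot represent into strings
function lossyReplacer(key, value) {
  if (typeof value === 'bigint') return value.toString() + 'n';
  if (typeof value === 'symbol') return String(value);
  return value; // everything else is handled by JSON.stringify() itself
}

console.log(
  JSON.stringify({a: 123n, b: Symbol('s'), c: [1n]}, lossyReplacer)
);
// '{"a":"123n","b":"Symbol(s)","c":["1n"]}'
```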
The following table summarizes the results of JSON.stringify(v):
v | JSON.stringify(v) |
---|---|
undefined | undefined |
null | 'null' |
true | 'true' |
123 | '123' |
123n | TypeError |
'abc' | '"abc"' |
Symbol() | undefined |
{a:1} | '{"a":1}' |
['a'] | '["a"]' |
() => {} | undefined |
{__proto__:null} | '{}' |
Symbol.prototype | '{}' |
For more information, see “Details on how data is converted to JSON” (§48.3.1.3).
By default, JSON.stringify() returns a single line of text. However, the optional third parameter enables multiline output and lets us specify how much to indent – for example:
assert.equal(
JSON.stringify({first: 'Robin', last: 'Doe'}, null, 2),
`{
  "first": "Robin",
  "last": "Doe"
}`
);
JSON.stringify() is useful for displaying arbitrary strings: the result fits on a single line, and invisible characters such as newlines and tabs become visible.
Example:
const strWithNewlinesAndTabs = `
TAB-> <-TAB
Second line
`;
console.log(JSON.stringify(strWithNewlinesAndTabs));
Output:
"\nTAB->\t<-TAB\nSecond line \n"
Alas, there are no good built-in solutions for stringification that work all the time. In this section, we’ll explore a short function that works for all simple use cases, along with solutions for more sophisticated use cases.
A simple toString() function
What would a simple solution for stringification look like?
JSON.stringify() works well for a lot of data, especially plain objects and Arrays. If it can’t stringify a given value, it returns undefined instead of a string – unless the value is a bigint, in which case it throws an exception.
Therefore, we can use the following function for stringification:
function toString(v) {
  if (typeof v === 'bigint') {
    return v + 'n';
  }
  return JSON.stringify(v) ?? String(v); // (A)
}
For values that are not supported by JSON.stringify(), we use String() as a fallback (line A). That function only throws for the following two values – which are both handled well by JSON.stringify():
{__proto__:null}
Symbol.prototype
The following table summarizes the results of toString():
v | toString(v) |
---|---|
undefined | 'undefined' |
null | 'null' |
true | 'true' |
123 | '123' |
123n | '123n' |
'abc' | '"abc"' |
Symbol() | 'Symbol()' |
{a:1} | '{"a":1}' |
['a'] | '["a"]' |
() => {} | '() => {}' |
{__proto__:null} | '{}' |
Symbol.prototype | '{}' |
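With such a function, the error class from the beginning of this section no longer throws for tricky values. The sketch below repeats the function under the hypothetical name stringify (to avoid confusion with the method .toString()):

```js
// Same logic as toString() above, under a different (hypothetical) name
function stringify(v) {
  if (typeof v === 'bigint') {
    return v + 'n';
  }
  return JSON.stringify(v) ?? String(v);
}

class UnexpectedValueError extends Error {
  constructor(value) {
    // stringify() copes with symbols and null-prototype objects
    super('Unexpected value: ' + stringify(value));
  }
}

console.log(new UnexpectedValueError(Symbol()).message);
// 'Unexpected value: Symbol()'
console.log(new UnexpectedValueError({__proto__: null}).message);
// 'Unexpected value: {}'
```

Note that this is still not bulletproof: JSON.stringify() throws for circular data and for objects that contain bigints.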
JSON.stringify
just without all the double-quotes”
Node.js has several built-in functions that provide sophisticated support for converting JavaScript values to strings – e.g.:
util.inspect(obj) “returns a string representation of obj that is intended for debugging”.
util.format(format, ...args) “returns a formatted string using the first argument as a printf-like format string which can contain zero or more format specifiers”.
These functions can even handle circular data:
import * as util from 'node:util';
const cycle = {};
cycle.prop = cycle;
assert.equal(
util.inspect(cycle),
'<ref *1> { prop: [Circular *1] }'
);
Console methods such as console.log() tend to produce good output and have few limitations:
console.log({__proto__: null, prop: Symbol()});
Output:
[Object: null prototype] { prop: Symbol() }
However, by default, they only display objects up to a certain depth:
console.log({a: {b: {c: {d: true}}}});
Output:
{ a: { b: { c: [Object] } } }
Node.js lets us specify the depth for console.dir() – with null meaning infinite:
console.dir({a: {b: {c: {d: true}}}}, {depth: null});
Output:
{
  a: { b: { c: { d: true } } }
}
In browsers, console.dir() does not have an options object but lets us interactively and incrementally descend into objects.
We can customize the built-in way of stringifying objects by implementing the method .toString():
const helloObj = {
  toString() {
    return 'Hello!';
  }
};
assert.equal(
String(helloObj), 'Hello!'
);
We can customize how an object is converted to JSON by implementing the method .toJSON():
const point = {
  x: 1,
  y: 2,
  toJSON() {
    return [this.x, this.y];
  }
};
assert.equal(
JSON.stringify(point), '[1,2]'
);
Strings can be compared via the following operators:
< <= > >=
There is one important caveat to consider: These operators compare based on the numeric values of JavaScript characters. That means that the order that JavaScript uses for strings is different from the one used in dictionaries and phone books:
> 'A' < 'B' // ok
true
> 'a' < 'B' // not ok
false
> 'ä' < 'b' // not ok
false
Properly comparing text is beyond the scope of this book. It is supported via the ECMAScript Internationalization API (Intl).
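To give a rough idea of what that looks like, here is a minimal sketch that uses Intl.Collator. The locale 'en' is an arbitrary choice for this illustration; the exact order of the sorted Array depends on the locale data of the platform, which is why no expected result is shown for it.

```js
// Sketch: locale-aware comparison via Intl.Collator
const collator = new Intl.Collator('en');

// .compare() returns a negative number, zero, or a positive number
console.log(collator.compare('a', 'B') < 0); // true
console.log(collator.compare('ä', 'b') < 0); // true

// The comparison function can be passed to .sort()
console.log(['b', 'ä', 'B', 'a'].sort(collator.compare));
```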
Quick recap of “Unicode – a brief introduction (advanced)” (§21):
Code points are the atomic parts of Unicode text. Each code point is 21 bits in size.
Each JavaScript character is a UTF-16 code unit (16 bits). We need one to two code units to encode a code point. Most code points fit into one code unit.
Grapheme clusters (user-perceived characters) represent written symbols, as displayed on screen or paper. One or more code points are needed to encode a single grapheme cluster. Most grapheme clusters are one code point long.
The following code demonstrates that a single code point comprises one or two JavaScript characters. We count the latter via .length:
// 3 code points, 3 JavaScript characters:
assert.equal('abc'.length, 3);
// 1 code point, 2 JavaScript characters:
assert.equal('🙂'.length, 2);
The following table summarizes the concepts we have just explored:
Entity | Size | Encoded via |
---|---|---|
JavaScript character (UTF-16 code unit) | 16 bits | – |
Unicode code point | 21 bits | 1–2 code units |
Unicode grapheme cluster | | 1+ code points |
Indices and lengths of strings are based on JavaScript characters – which are UTF-16 code units.
Code units are accessed like Array elements:
> const str = 'αβγ';
> str.length
3
> str[0]
'α'
str.split('') splits into code units:
> str.split('')
[ 'α', 'β', 'γ' ]
> 'A🙂'.split('')
[ 'A', '\uD83D', '\uDE42' ]
The emoji 🙂 consists of two code units.
To specify a code unit hexadecimally, we can use a Unicode code unit escape with exactly four hexadecimal digits:
> '\u03B1\u03B2\u03B3'
'αβγ'
ASCII escapes: If the code point of a character is below 256, we can refer to it via an ASCII escape with exactly two hexadecimal digits:
> 'He\x6C\x6Co'
'Hello'
Official name of an ASCII escape: hexadecimal escape sequence. It was the first escape that used hexadecimal numbers.
To get the char code of a character, we can use .charCodeAt():
> 'α'.charCodeAt(0).toString(16)
'3b1'
String.fromCharCode() converts a char code to a string:
> String.fromCharCode(0x3B1)
'α'
Iteration (which is described later in this book) splits strings into code points:
const codePoints = 'A🙂';
for (const codePoint of codePoints) {
  console.log(codePoint + ' ' + codePoint.length);
}
Output:
A 1
🙂 2
Array.from() uses iteration:
> Array.from('A🙂')
[ 'A', '🙂' ]
Therefore, this is how we can count the number of code points in a string:
> Array.from('A🙂').length
2
> 'A🙂'.length
3
A Unicode code point escape lets us specify a code point hexadecimally (1–6 digits; the highest code point is 0x10FFFF). It produces one or two JavaScript characters.
> '\u{1F642}'
'🙂'
.codePointAt() returns the code point number for a sequence of 1–2 JavaScript characters:
> '🙂'.codePointAt(0).toString(16)
'1f642'
String.fromCodePoint() converts a code point number to 1–2 JavaScript characters:
> String.fromCodePoint(0x1F642)
'🙂'
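Iteration and .codePointAt() can be combined to inspect the code points of a string. The following helper is a sketch; its name toCodePointNotation and the U+ output format are choices made for this illustration:

```js
// Hypothetical helper: lists the code points of a string in U+ notation
function toCodePointNotation(str) {
  // Array.from() iterates over code points, not code units
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(toCodePointNotation('A\u{1F642}'));
// [ 'U+0041', 'U+1F642' ]
```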
If we use the flag /v for a regular expression, it supports Unicode better and matches code points, not code units:
> '🙂'.match(/./g)
[ '\uD83D', '\uDE42' ]
> '🙂'.match(/./gv)
[ '🙂' ]
More information: “Flag /v: limited support for multi-code-point grapheme clusters ES2024” (§46.11.4).
This is a grapheme cluster that consists of 3 code points:
const graphemeCluster = '😵\u200D💫'; // two emojis joined by a zero-width joiner
assert.deepEqual(
  // Iteration splits into code points
  Array.from(graphemeCluster),
  ['😵', '\u200D', '💫']
);
assert.equal(
  // 5 JavaScript characters
  graphemeCluster.length, 5
);
To split a string into grapheme clusters, we can use Intl.Segmenter – a class that isn’t part of ECMAScript proper, but part of the ECMAScript Internationalization API. It is supported by most JavaScript platforms. This is how we can use it:
const segmenter = new Intl.Segmenter('en-US', { granularity: 'grapheme' });
assert.deepEqual(
Array.from(segmenter.segment('A🙂😵💫')),
[
{ segment: 'A', index: 0, input: 'A🙂😵💫' },
{ segment: '🙂', index: 1, input: 'A🙂😵💫' },
{ segment: '😵💫', index: 3, input: 'A🙂😵💫' },
]
);
.segment() returns an iterable over segment objects. We can use it via for-of, Array.from(), Iterator.from(), etc.
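Putting the three kinds of text atoms side by side, a helper such as the following can count all of them. It is a sketch that assumes a platform with Intl.Segmenter; the name countTextAtoms and the default locale are hypothetical.

```js
// Hypothetical helper that counts the three kinds of text atoms
function countTextAtoms(str, locale = 'en-US') {
  const segmenter = new Intl.Segmenter(locale, {granularity: 'grapheme'});
  return {
    codeUnits: str.length,
    codePoints: Array.from(str).length,
    graphemeClusters: Array.from(segmenter.segment(str)).length,
  };
}

// Two emoji code points joined by a zero-width joiner (U+200D)
console.log(countTextAtoms('\u{1F635}\u200D\u{1F4AB}'));
// { codeUnits: 5, codePoints: 3, graphemeClusters: 1 }
```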
The regular expression flag /v provides some limited support for grapheme clusters – e.g., we can match emojis with potentially multiple code points like this:
> 'A🙂😵💫'.match(/\p{RGI_Emoji}/gv)
[ '🙂', '😵💫' ]
More information: “Flag /v: limited support for multi-code-point grapheme clusters ES2024” (§46.11.4).
Table 22.1 describes how various values are converted to strings.
x | String(x) |
---|---|
undefined | 'undefined' |
null | 'null' |
boolean | false → 'false' , true → 'true' |
number | Example: 123 → '123' |
bigint | Example: 123n → '123' |
string | x (input, unchanged) |
symbol | Example: Symbol('abc') → 'Symbol(abc)' |
object | Configurable via, e.g., toString() |
Table 22.1: Converting values to strings.
String.fromCharCode() ES1
.charCodeAt() ES1
String.fromCodePoint() ES6
.codePointAt() ES6
String.prototype.*: regular expression methods
The following methods are listed in the quick reference for regular expressions:
String.prototype.match()
String.prototype.matchAll()
String.prototype.replace()
String.prototype.replaceAll()
String.prototype.search()
String.prototype.split()
String.prototype.*: finding and matching
String.prototype.startsWith(searchString, startPos=0) ES6
Returns true if searchString occurs in the string at index startPos. Returns false otherwise.
> '.gitignore'.startsWith('.')
true
> 'abcde'.startsWith('bc', 1)
true
String.prototype.endsWith(searchString, endPos=this.length) ES6
Returns true if the string would end with searchString if its length were endPos. Returns false otherwise.
> 'poem.txt'.endsWith('.txt')
true
> 'abcde'.endsWith('cd', 4)
true
String.prototype.includes(searchString, startPos=0) ES6
Returns true if the string contains the searchString and false otherwise. The search starts at startPos.
> 'abc'.includes('b')
true
> 'abc'.includes('b', 2)
false
String.prototype.indexOf(searchString, minIndex=0) ES1
If searchString appears at minIndex or after: Return the lowest index where it is found. Otherwise: Return -1.
> 'aaax'.indexOf('aa', 0)
0
> 'aaax'.indexOf('aa', 1)
1
> 'aaax'.indexOf('aa', 2)
-1
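The parameter minIndex makes it possible to continue a search after a previous match. As a sketch, the following helper (with the hypothetical name findAllIndices) collects all indices at which a search string occurs, including overlapping matches:

```js
// Hypothetical helper: every index at which searchString occurs in str
function findAllIndices(str, searchString) {
  const indices = [];
  if (searchString.length === 0) return indices; // avoid an endless loop
  let index = str.indexOf(searchString);
  while (index !== -1) {
    indices.push(index);
    // Continue one position after the previous match
    index = str.indexOf(searchString, index + 1);
  }
  return indices;
}

console.log(findAllIndices('aaax', 'aa')); // [ 0, 1 ]
```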
String.prototype.lastIndexOf(searchString, maxIndex?) ES1
If searchString appears at maxIndex or before: Return the highest index where it is found. Otherwise: Return -1.
If maxIndex is missing, the search starts at this.length - searchString.length (assuming that searchString is shorter than this).
> 'xaaa'.lastIndexOf('aa', 3)
2
> 'xaaa'.lastIndexOf('aa', 2)
2
> 'xaaa'.lastIndexOf('aa', 1)
1
> 'xaaa'.lastIndexOf('aa', 0)
-1
String.prototype.*: extracting
String.prototype.slice(start=0, end=this.length) ES3
Returns the substring of the string that starts at (including) index start and ends at (excluding) index end. If an index is negative, it is added to .length before it is used (-1 becomes this.length-1, etc.).
> 'abc'.slice(1, 3)
'bc'
> 'abc'.slice(1)
'bc'
> 'abc'.slice(-2)
'bc'
String.prototype.at(index: number) ES2022
Returns the JavaScript character at index as a string.
If index is out of bounds, it returns undefined.
If index is negative, it is added to .length before it is used (-1 becomes this.length-1, etc.).
> 'abc'.at(0)
'a'
> 'abc'.at(-1)
'c'
String.prototype.substring(start, end=this.length) ES1
Use .slice() instead of this method. .substring() wasn’t implemented consistently in older engines and doesn’t support negative indices.
String.prototype.*: combining
String.prototype.concat(...strings) ES3
Returns the concatenation of the string and strings. 'a'.concat('b') is equivalent to 'a'+'b'. The latter is much more popular.
> 'ab'.concat('cd', 'ef', 'gh')
'abcdefgh'
String.prototype.padEnd(len, fillString=' ') ES2017
Appends (fragments of) fillString to the string until it has the desired length len. If it already has or exceeds len, then it is returned without any changes.
> '#'.padEnd(2)
'# '
> 'abc'.padEnd(2)
'abc'
> '#'.padEnd(5, 'abc')
'#abca'
String.prototype.padStart(len, fillString=' ') ES2017
Prepends (fragments of) fillString to the string until it has the desired length len. If it already has or exceeds len, then it is returned without any changes.
> '#'.padStart(2)
' #'
> 'abc'.padStart(2)
'abc'
> '#'.padStart(5, 'abc')
'abca#'
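A typical use of .padStart() is producing fixed-width numbers. The following sketch formats a time of day; the function name formatTime is hypothetical:

```js
// Hypothetical helper: formats hours and minutes as HH:MM
function formatTime(hours, minutes) {
  // .padStart() is a string method, so the numbers are converted first
  const hh = String(hours).padStart(2, '0');
  const mm = String(minutes).padStart(2, '0');
  return hh + ':' + mm;
}

console.log(formatTime(7, 5)); // '07:05'
console.log(formatTime(12, 30)); // '12:30'
```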
String.prototype.repeat(count=0) ES6
Returns the string, concatenated count times.
> '*'.repeat()
''
> '*'.repeat(3)
'***'
String.prototype.*: transforming
String.prototype.toUpperCase() ES1
Returns a copy of the string in which all lowercase alphabetic characters are converted to uppercase. How well that works for various alphabets depends on the JavaScript engine.
> '-a2b-'.toUpperCase()
'-A2B-'
> 'αβγ'.toUpperCase()
'ΑΒΓ'
String.prototype.toLowerCase() ES1
Returns a copy of the string in which all uppercase alphabetic characters are converted to lowercase. How well that works for various alphabets depends on the JavaScript engine.
> '-A2B-'.toLowerCase()
'-a2b-'
> 'ΑΒΓ'.toLowerCase()
'αβγ'
String.prototype.trim() ES5
Returns a copy of the string in which all leading and trailing whitespace (spaces, tabs, line terminators, etc.) is gone.
> '\r\n#\t '.trim()
'#'
> ' abc '.trim()
'abc'
String.prototype.trimStart() ES2019
Similar to .trim() but only the beginning of the string is trimmed:
> ' abc '.trimStart()
'abc '
String.prototype.trimEnd() ES2019
Similar to .trim() but only the end of the string is trimmed:
> ' abc '.trimEnd()
' abc'
String.prototype.normalize(form = 'NFC') ES6
Normalizes the string according to the Unicode Normalization Forms.
Values of form: 'NFC', 'NFD', 'NFKC', 'NFKD'
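Normalization matters when the same text can be encoded in more than one way. The following sketch compares two encodings of the letter é; the variable names are only illustrative:

```js
// Two ways of writing "é": one code point vs. letter plus combining accent
const composed = '\u00E9';
const decomposed = 'e\u0301';

console.log(composed === decomposed); // false
console.log(composed.normalize('NFC') === decomposed.normalize('NFC')); // true

// NFC composes, NFD decomposes
console.log(decomposed.normalize('NFC').length); // 1
console.log(composed.normalize('NFD').length); // 2
```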
String.prototype.isWellFormed() ES2024
Returns true if a string is well-formed, that is, if it does not contain lone surrogates (see .toWellFormed() for more information). Otherwise, it returns false.
> '🙂'.split('') // split into code units
[ '\uD83D', '\uDE42' ]
> '\uD83D\uDE42'.isWellFormed()
true
> '\uD83D\uDE42\uD83D'.isWellFormed() // lone surrogate 0xD83D
false
String.prototype.toWellFormed() ES2024
Each JavaScript string character is a UTF-16 code unit. One code point is encoded as either one UTF-16 code unit or two UTF-16 code units. In the latter case, the two code units are called leading surrogate and trailing surrogate. A surrogate without its partner is called a lone surrogate. A string with one or more lone surrogates is ill-formed.
.toWellFormed() converts an ill-formed string to a well-formed one by replacing each lone surrogate with code point 0xFFFD (“replacement character”). That character is often displayed as a � (a black rhombus with a white question mark). It is located in the Specials Unicode block of characters, at the very end of the Basic Multilingual Plane. This is what Wikipedia says about the replacement character: “It is used to indicate problems when a system is unable to render a stream of data to correct symbols.”
assert.deepEqual(
'🙂'.split(''), // split into code units
['\uD83D', '\uDE42']
);
assert.deepEqual(
// 0xD83D is a lone surrogate
'\uD83D\uDE42\uD83D'.toWellFormed().split(''),
['\uD83D', '\uDE42', '\uFFFD']
);
Exercise: Using string methods
exercises/strings/remove_extension_test.mjs