JavaScript for impatient programmers (beta)

18. Strings



Strings are primitive values in JavaScript and immutable. That is, string-related operations always produce new strings and never change existing strings.

18.1. Plain string literals

Plain string literals are delimited by either single quotes or double quotes:

const str1 = 'abc';
const str2 = "abc";
assert.equal(str1, str2);

Single quotes are used more often, because they make it easier to mention HTML with its double quotes.

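The next chapter covers template literals, which give you string interpolation, multi-line strings, and raw string literals (in which the backslash has no special meaning).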

18.1.1. Escaping

The backslash lets you create special characters, such as a line break ('\n'), a tab ('\t'), and the backslash itself ('\\').

The backslash also lets you use the delimiter of a string literal inside that literal:

'She said: "Let\'s go!"'
"She said: \"Let's go!\""

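As a minimal sketch of escaping (the variable names are only illustrative): an escape sequence such as \n or \\ takes up two characters in the source code, but it produces a single JavaScript character in the resulting string.

```javascript
// Each escape sequence below produces exactly one character.
const twoLines = 'a\nb';         // 'a', line break, 'b'
const windowsPath = 'C:\\temp';  // a single backslash between ':' and 't'
const quoted = 'Let\'s go!';     // the escaped quote is a plain apostrophe

console.log(twoLines.length);        // 3
console.log(windowsPath.length);     // 7
console.log(quoted === "Let's go!"); // true
```
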
18.2. Accessing characters and code points

18.2.1. Accessing JavaScript characters

JavaScript has no separate data type for characters; a character is always represented as a string.

const str = 'abc';

// Reading a character at a given index
assert.equal(str[1], 'b');

// Counting the characters in a string:
assert.equal(str.length, 3);
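
One related detail, shown here as a small sketch with illustrative variable names: reading an index that does not exist does not throw an exception. Bracket access returns undefined, while the method .charAt() returns an empty string.

```javascript
const word = 'abc';

// Bracket access beyond the end yields undefined (no exception).
const missingViaBrackets = word[5];      // undefined

// .charAt() yields an empty string instead.
const missingViaCharAt = word.charAt(5); // ''

// The last character sits at index length - 1.
const lastChar = word[word.length - 1];  // 'c'
```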

18.2.2. Accessing Unicode code points via for-of and spreading

Iterating over strings via for-of or spreading (...) visits Unicode code points. Each code point is represented by 1–2 JavaScript characters. For more information, see the section on the atoms of text in this chapter.

This is how you iterate over the code points of a string via for-of:

for (const ch of 'abc') {
  console.log(ch);
}
// Output:
// 'a'
// 'b'
// 'c'

And this is how you convert a string into an Array of code points via spreading:

assert.deepEqual([...'abc'], ['a', 'b', 'c']);

18.3. String concatenation via +

If at least one operand is a string, the plus operator (+) converts any non-strings to strings and concatenates the results:

assert.equal(3 + ' times ' + 4, '3 times 4');
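
Because + is evaluated from left to right, it matters where the first string appears in a longer expression. The following sketch (with made-up variable names) shows numbers being added before a string is encountered, and being concatenated after one:

```javascript
// (1 + 2) is computed first, then the string is appended.
const numbersFirst = 1 + 2 + ' apples'; // '3 apples'

// ('apples: ' + 1) is a string, so 2 is appended as text.
const stringFirst = 'apples: ' + 1 + 2; // 'apples: 12'

// Parentheses force the numeric addition.
const grouped = 'apples: ' + (1 + 2);   // 'apples: 3'
```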

The assignment operator += is useful if you want to assemble a string, piece by piece:

let str = ''; // must be `let`!
str += 'Say it';
str += ' one more';
str += ' time';

assert.equal(str, 'Say it one more time');

As an aside, this way of assembling strings is quite efficient, because most JavaScript engines internally optimize it.

  Exercise: Concatenating strings

exercises/strings/concat_string_array_test.js

18.4. Converting to string

These are three ways of converting a value x to a string: String(x), '' + x, and x.toString() (which does not work for undefined and null).

Recommendation: use the descriptive and safe String().

Examples:

assert.equal(String(undefined), 'undefined');
assert.equal(String(null), 'null');

assert.equal(String(false), 'false');
assert.equal(String(true), 'true');

assert.equal(String(123.45), '123.45');
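
The following sketch illustrates why String() can be considered the safe choice. The helper attempt() is hypothetical and only exists for this example: it returns either the result of a conversion or the name of the error that the conversion throws.

```javascript
// Hypothetical helper: run a conversion, report the error name if it throws.
function attempt(convert) {
  try {
    return convert();
  } catch (err) {
    return err.name;
  }
}

// null: String() and + work, the method call throws.
const nullViaString = attempt(() => String(null));     // 'null'
const nullViaPlus = attempt(() => '' + null);          // 'null'
const nullViaMethod = attempt(() => null.toString());  // 'TypeError'

// Symbols: String() works, + throws.
const symViaString = attempt(() => String(Symbol('id'))); // 'Symbol(id)'
const symViaPlus = attempt(() => '' + Symbol('id'));      // 'TypeError'
```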

Pitfall for booleans: If you convert a boolean to a string via String(), you can’t convert it back via Boolean(), because Boolean() returns true for every non-empty string.

> String(false)
'false'
> Boolean('false')
true
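
One possible way around this pitfall is to compare against the exact strings. The function booleanFromString() below is a hypothetical helper, not a built-in; rejecting all other inputs is a design choice of this sketch.

```javascript
// Hypothetical helper: only the exact strings 'true' and 'false' are accepted.
function booleanFromString(str) {
  if (str === 'true') return true;
  if (str === 'false') return false;
  throw new TypeError('Not a stringified boolean: ' + str);
}

// Round trip: boolean -> string -> boolean
const roundTripFalse = booleanFromString(String(false)); // false
const roundTripTrue = booleanFromString(String(true));   // true
```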

18.4.1. Stringifying objects

Plain objects have a default representation that is not very useful:

> String({a: 1})
'[object Object]'

Arrays have a better string representation, but it still hides much information:

> String(['a', 'b'])
'a,b'
> String(['a', ['b']])
'a,b'

> String([1, 2])
'1,2'
> String(['1', '2'])
'1,2'

> String([true])
'true'
> String(['true'])
'true'
> String(true)
'true'

Stringifying functions returns their source code:

> String(function f() {return 4})
'function f() {return 4}'

18.4.2. Customizing the stringification of objects

You can override the built-in way of stringifying objects by implementing the method toString():

const obj = {
  toString() {
    return 'hello';
  }
};

assert.equal(String(obj), 'hello');
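
A custom toString() is also used when an object is concatenated via + or when it is an element of an Array that is being stringified. The class Point below is made up for this sketch:

```javascript
// Hypothetical class with its own string representation
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  toString() {
    return '(' + this.x + ', ' + this.y + ')';
  }
}

const point = new Point(3, 4);
const viaFunction = String(point);      // '(3, 4)'
const viaPlus = 'Position: ' + point;   // 'Position: (3, 4)'
const inArray = String([point, point]); // '(3, 4),(3, 4)'
```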

18.4.3. An alternate way of stringifying values

The JSON data format is a text representation of JavaScript values. Therefore, JSON.stringify() can also be used to stringify data:

> JSON.stringify({a: 1})
'{"a":1}'
> JSON.stringify(['a', ['b']])
'["a",["b"]]'

The caveat is that JSON only supports null, booleans, numbers, strings, Arrays and objects (which it always treats as if they were created by object literals).
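
To make this caveat more concrete, here is a small sketch (the object lossy and its property names are invented for the example) of what happens to values that JSON does not support:

```javascript
const lossy = {
  missing: undefined,                  // property is omitted
  callback: function () {},            // property is omitted
  notANumber: NaN,                     // becomes null
  created: new Date(0),                // becomes an ISO date string
  lookup: new Map([['key', 'value']]), // treated like an empty plain object
};

const json = JSON.stringify(lossy);
// '{"notANumber":null,"created":"1970-01-01T00:00:00.000Z","lookup":{}}'

// Inside Arrays, unsupported values become null instead of being omitted:
const jsonArray = JSON.stringify([undefined]); // '[null]'
```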

Tip: The third parameter lets you switch on multi-line output and specify how much to indent. For example:

console.log(JSON.stringify({first: 'Jane', last: 'Doe'}, null, 2));

This statement produces the following output:

{
  "first": "Jane",
  "last": "Doe"
}

18.5. Comparing strings

Strings can be compared via the following operators:

< <= > >=

There is one important caveat to consider: These operators compare based on the numeric values of JavaScript characters. That means that the order that JavaScript uses for strings is different from the one used in dictionaries and phone books:

> 'A' < 'B' // ok
true
> 'a' < 'B' // not ok
false
> 'ä' < 'b' // not ok
false

Properly comparing text is beyond the scope of this book. It is supported via the ECMAScript Internationalization API (Intl).
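
Just to give a first impression, here is a minimal sketch that uses Intl.Collator. It assumes that the JavaScript engine supports the Internationalization API and the locale 'en'; the variable names are illustrative.

```javascript
// Assumption: Intl.Collator and the locale 'en' are available.
const collator = new Intl.Collator('en');

// .compare() returns a negative number if the first string comes first.
const lowerBeforeUpper = collator.compare('a', 'B') < 0; // true
const umlautBeforeB = collator.compare('ä', 'b') < 0;    // true

// Without a collator, .sort() orders by the numeric values of code units:
const byCodeUnits = ['b', 'B', 'ä', 'a'].sort(); // ['B', 'a', 'b', 'ä']

// With a collator, sorting follows the rules of the locale:
const byCollator = ['b', 'B', 'ä', 'a'].sort(collator.compare);
```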

18.6. Atoms of text: JavaScript characters, code points, grapheme clusters

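Quick recap of the chapter on Unicode: Code points are Unicode characters, with a range of 21 bits. JavaScript characters are UTF-16 code units, with a range of 16 bits. Grapheme clusters are user-perceived characters; each one consists of one or more code points.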

To represent code points in JavaScript strings, one or two JavaScript characters are used. You can see that when counting characters via .length:

// 3 Unicode code points, 3 JavaScript characters:
assert.equal('abc'.length, 3);

// 1 Unicode code point, 2 JavaScript characters:
assert.equal('🙂'.length, 2);

18.6.1. Working with code points

Let’s explore JavaScript’s tools for working with code points.

Code point escapes let you specify code points hexadecimally. They expand to one or two JavaScript characters.

> '\u{1F642}'
'🙂'

Converting from code points:

> String.fromCodePoint(0x1F642)
'🙂'

Converting to code points:

> '🙂'.codePointAt(0).toString(16)
'1f642'

Iteration honors code points. For example, the iteration-based for-of loop:

const str = '🙂a';
assert.equal(str.length, 3);

for (const codePoint of str) {
  console.log(codePoint);
}

// Output:
// '🙂'
// 'a'

Or iteration-based spreading (...):

> [...'🙂a']
[ '🙂', 'a' ]

Spreading is therefore a good tool for counting code points:

> [...'🙂a'].length
2
> '🙂a'.length
3
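
Spreading also helps with other operations that should not tear code points apart. As a sketch (variable names are made up), compare reversing a string by code units with reversing it by code points. Note that even the second approach can still break up grapheme clusters.

```javascript
const text = '🙂a';

// Reversing code units swaps the two halves of the emoji’s surrogate pair.
const reversedUnits = text.split('').reverse().join('');
// 'a\uDE42\uD83D' (not a valid emoji anymore)

// Reversing code points keeps the emoji intact.
const reversedPoints = [...text].reverse().join('');
// 'a🙂'
```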

18.6.2. Working with code units

Indices and lengths of strings are based on JavaScript characters (i.e., code units).

To specify code units numerically, you can use code unit escapes:

> '\uD83D\uDE42'
'🙂'

And you can use so-called char codes:

> String.fromCharCode(0xD83D) + String.fromCharCode(0xDE42)
'🙂'

To get the char code of a character, use .charCodeAt():

> '🙂'.charCodeAt(0).toString(16)
'd83d'
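
Since indices are based on code units, a classic index-based loop visits code units, not code points. The helper codeUnitsOf() below is hypothetical and only serves as an illustration:

```javascript
// Hypothetical helper: lists the code units of a string as hexadecimal numbers.
function codeUnitsOf(str) {
  const result = [];
  for (let i = 0; i < str.length; i++) {
    result.push(str.charCodeAt(i).toString(16));
  }
  return result;
}

// One emoji (two code units) followed by one letter (one code unit)
const units = codeUnitsOf('🙂a'); // ['d83d', 'de42', '61']
```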

18.6.3. Caveat: grapheme clusters

When working with text that may be written in any human language, it’s best to split at the boundaries of grapheme clusters, not at the boundaries of code units.

Intl.Segmenter, which is part of the ECMAScript Internationalization API, supports Unicode segmentation (along grapheme cluster boundaries, word boundaries, sentence boundaries, etc.).

In environments that don’t support it, you can use one of several libraries that are available (do a web search for “JavaScript grapheme”).
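
In engines that ship Intl.Segmenter (an assumption that you need to check for your platform), counting grapheme clusters could look roughly as follows. The function countGraphemeClusters() is a hypothetical helper for this sketch.

```javascript
// Assumption: the engine provides Intl.Segmenter.
// Hypothetical helper: counts the grapheme clusters of a string.
function countGraphemeClusters(str, locale = 'en') {
  const segmenter = new Intl.Segmenter(locale, { granularity: 'grapheme' });
  // .segment() returns an iterable with one entry per grapheme cluster.
  return [...segmenter.segment(str)].length;
}

// 'e' followed by a combining acute accent is displayed as a single symbol.
const accented = 'e\u0301';
const codeUnitCount = accented.length;                 // 2
const codePointCount = [...accented].length;           // 2
const clusterCount = countGraphemeClusters(accented);  // 1
```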

18.7. Quick reference: Strings

Strings are immutable; none of the string methods ever modify their strings.

18.7.1. Converting to string

Tbl. 16 describes how various values are converted to strings.

Table 16: Converting values to strings.

| x             | String(x)                         |
|---------------|-----------------------------------|
| undefined     | 'undefined'                       |
| null          | 'null'                            |
| Boolean value | false → 'false', true → 'true'    |
| Number value  | Example: 123 → '123'              |
| String value  | x (input, unchanged)              |
| An object     | Configurable via, e.g., toString() |

18.7.2. Numeric values of characters and code points

18.7.3. String operators

// Access characters via []
const str = 'abc';
assert.equal(str[1], 'b');

// Concatenate strings via +
assert.equal('a' + 'b' + 'c', 'abc');
assert.equal('take ' + 3 + ' oranges', 'take 3 oranges');

18.7.4. String.prototype: finding and matching

18.7.5. String.prototype: extracting

18.7.6. String.prototype: combining

18.7.7. String.prototype: transforming

18.7.8. String.prototype: chars, char codes, code points

18.7.9. Sources

  Exercise: Using string methods

exercises/strings/remove_extension_test.js

  Quiz

See quiz app.