JavaScript for impatient programmers (beta)

16. Strings

Strings are primitive values in JavaScript and immutable. That is, string-related operations always produce new strings and never change existing strings.

16.1. Plain string literals

Plain string literals are delimited by either single quotes or double quotes:

const str1 = 'abc';
const str2 = "abc";
assert.equal(str1, str2);

Single quotes are used more often, because they make it easier to mention HTML with its double quotes.

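The next chapter covers template literals, which give you string interpolation, multi-line strings, and raw string literals (in which the backslash has no special meaning).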

16.1.1. Escaping

The backslash lets you create special characters, such as a line break ('\n'), a tab ('\t'), or a literal backslash ('\\').

The backslash also lets you use the delimiter of a string literal inside that literal:

'She said: "Let\'s go!"'
"She said: \"Let's go!\""

16.2. Accessing characters and code points

16.2.1. Accessing JavaScript characters

JavaScript has no extra data type for characters – characters are always transported as strings.

const str = 'abc';

// Reading a character at a given index
assert.equal(str[1], 'b');

// Counting the characters in a string:
assert.equal(str.length, 3);

16.2.2. Accessing Unicode code points via for-of and spreading

Iterating over strings via for-of or spreading (...) visits Unicode code points (whose range is 21 bits). Each code point is represented by 1–2 JavaScript characters (whose range is 16 bits). For more information, see the section on code points in this chapter.

This is how you iterate over the code points of a string via for-of:

for (const ch of 'abc') {
  console.log(ch);
}
// Output:
// 'a'
// 'b'
// 'c'

And this is how you convert a string into an Array of code points via spreading:

assert.deepEqual([...'abc'], ['a', 'b', 'c']);

16.3. String concatenation via +

If at least one operand is a string, the plus operator (+) converts any non-strings to strings and concatenates the result:

assert.equal(3 + ' times ' + 4, '3 times 4');

The assignment operator += is useful if you want to assemble a string, piece by piece:

let str = ''; // must be `let`!
str += 'Say it';
str += ' one more';
str += ' time';

assert.equal(str, 'Say it one more time');

As an aside, this way of assembling strings is quite efficient, because most JavaScript engines internally optimize it.

  Exercise: Concatenating strings

exercises/strings/concat_string_array_test.js

16.4. Converting to string

These are three ways of converting a value x to a string: String(x), '' + x, and x.toString() (which does not work for undefined and null).

Recommendation: use the descriptive and safe String().

Examples:

assert.equal(String(undefined), 'undefined');
assert.equal(String(null), 'null');

assert.equal(String(false), 'false');
assert.equal(String(true), 'true');

assert.equal(String(123.45), '123.45');

Pitfall for booleans: If you convert a boolean to a string via String(), you can’t convert it back via Boolean().

> String(false)
'false'
> Boolean('false')
true

16.4.1. Stringifying objects

Plain objects have a default representation that is not very useful:

> String({a: 1})
'[object Object]'

Arrays have a better string representation, but it still hides much information:

> String(['a', 'b'])
'a,b'
> String(['a', ['b']])
'a,b'

> String([1, 2])
'1,2'
> String(['1', '2'])
'1,2'

> String([true])
'true'
> String(['true'])
'true'
> String(true)
'true'

Stringifying functions returns their source code:

> String(function f() {return 4})
'function f() {return 4}'

16.4.2. Customizing the stringification of objects

You can override the built-in way of stringifying objects by implementing the method toString():

const obj = {
  toString() {
    return 'hello';
  }
};

assert.equal(String(obj), 'hello');

16.4.3. An alternate way of stringifying values

The JSON data format is a text representation of JavaScript values. Therefore, JSON.stringify() can also be used to stringify data:

> JSON.stringify({a: 1})
'{"a":1}'
> JSON.stringify(['a', ['b']])
'["a",["b"]]'

The caveat is that JSON only supports null, booleans, numbers, strings, Arrays and objects (which it always treats as if they were created by object literals).

Tip: The third parameter lets you switch on multi-line output and specify how much to indent. For example:

console.log(JSON.stringify({first: 'Jane', last: 'Doe'}, null, 2));

This statement produces the following output.

{
  "first": "Jane",
  "last": "Doe"
}

16.5. Comparing strings

Strings can be compared via the following operators:

< <= > >=

There is one important caveat to consider: These operators compare based on the numeric values of JavaScript characters. As a result, the order that JavaScript uses for strings differs from the one used in dictionaries and phone books:

> 'A' < 'B' // ok
true
> 'a' < 'B' // not ok
false
> 'ä' < 'b' // not ok
false

Properly comparing text is beyond the scope of this book. It is supported via the ECMAScript Internationalization API (Intl).

16.6. JavaScript characters vs. Unicode code points

Unicode’s atomic unit of text is called a code point. In many ways, code points are Unicode characters, but you occasionally need multiple code points to represent a single text symbol (a so-called grapheme cluster; details soon).

The range of Unicode code points is 21 bits. To represent them in JavaScript strings, one or two JavaScript characters (whose range is 16 bits) are used. You can see that when counting characters via .length:

// 3 Unicode code points, 3 JavaScript characters:
assert.equal('abc'.length, 3);

// 1 Unicode code point, 2 JavaScript characters:
assert.equal('😀'.length, 2);

16.6.1. Working with Unicode code points

Let’s explore JavaScript’s tools for working with code points.

Code point escapes let you specify code points hexadecimally. They expand to one or two JavaScript characters.

> '\u{1F600}'
'😀'

Converting from code points:

> String.fromCodePoint(0x1F600)
'😀'

Converting to code points:

> '😀'.codePointAt(0).toString(16)
'1f600'

Iteration honors code points. For example, the iteration-based for-of loop:

const str = '😀a';
assert.equal(str.length, 3);

for (const codePoint of str) {
  console.log(codePoint);
}

// Output:
// '😀'
// 'a'

Or iteration-based spreading (...):

> [...'😀a']
[ '😀', 'a' ]

Spreading is therefore a good tool for counting code points:

> [...'😀a'].length
2
> '😀a'.length
3

16.6.2. Caveat: grapheme clusters

A grapheme cluster is what corresponds most closely to a symbol displayed on screen or paper. It is defined as “a horizontally segmentable unit of text”. One or more code points are needed to encode a grapheme cluster.

For example, one family emoji is composed of 7 code points: 4 of them are graphemes themselves, and they are joined by 3 invisible code points (zero-width joiners).

[Figure: Splitting a family emoji into its code points.]

Another example is flag emojis:

[Figure: Splitting a flag emoji into its code points.]

For more information, consult “Let’s Stop Ascribing Meaning to Code Points” by Manish Goregaokar.

16.7. Quick reference: Strings

Strings are immutable: none of the string methods ever modifies the receiver.

16.7.1. Converting to string

Tbl. 14 describes how various values are converted to strings.

Table 14: Converting values to strings.
x                String(x)
undefined        'undefined'
null             'null'
Boolean value    false → 'false', true → 'true'
Number value     Example: 123 → '123'
String value     x (input, unchanged)
An object        Configurable via, e.g., toString()

16.7.2. Numeric values of characters and code points

16.7.3. String operators

// Access characters via []
const str = 'abc';
assert.equal(str[1], 'b');

// Concatenate strings via +
assert.equal('a' + 'b' + 'c', 'abc');
assert.equal('take ' + 3 + ' oranges', 'take 3 oranges');

16.7.4. String.prototype.*: finding and matching

16.7.5. String.prototype.*: extracting

16.7.6. String.prototype.*: combining

16.7.7. String.prototype.*: transforming

16.7.8. String.prototype.*: chars, char codes, code points

16.7.9. Sources

  Exercise: Using string methods

exercises/strings/remove_extension_test.js

  Quiz

See quiz app.