Shell scripting with Node.js

8 Working with the file system on Node.js



Given that the focus of this book is on shell scripting, we only work with textual data.

8.1 Concepts, patterns and conventions of Node’s file system APIs

8.1.1 Ways of accessing files

  1. We can read or write the whole content of a file via a string.
  2. We can open a stream for reading or a stream for writing and process a file in smaller pieces, one at a time. Streams only allow sequential access.
  3. We can use file descriptors or FileHandles and get both sequential and random access, via an API that is loosely similar to streams.
    • File descriptors are integer numbers that represent files. They are managed via these functions (only the synchronous names are shown; there are also callback-based versions such as fs.open()):
      • fs.openSync(path, flags?, mode?) opens a new file descriptor for a file at a given path and returns it.
      • fs.closeSync(fd) closes a file descriptor.
      • fs.fchmodSync(fd, mode)
      • fs.fchownSync(fd, uid, gid)
      • fs.fdatasyncSync(fd)
      • fs.fstatSync(fd, options?)
      • fs.fsyncSync(fd)
      • fs.ftruncateSync(fd, len?)
      • fs.futimesSync(fd, atime, mtime)
    • Only the synchronous API and the callback-based API use file descriptors. The Promise-based API has a better abstraction, class FileHandle, which is based on file descriptors. Instances are created via fsPromises.open(). Various operations are provided via methods (not via functions):
      • fileHandle.close()
      • fileHandle.chmod(mode)
      • fileHandle.chown(uid, gid)
      • Etc.

Note that we don’t use (3) in this chapter – (1) and (2) are enough for our purposes.

8.1.2 Function name prefixes

8.1.2.1 Prefix “l”: symbolic links

Functions whose names start with an “l” usually operate on symbolic links, e.g. fs.lstatSync(), fs.lchownSync() and fs.lutimesSync().

8.1.2.2 Prefix “f”: file descriptors

Functions whose names start with an “f” usually manage file descriptors, e.g. fs.fstatSync(), fs.fchmodSync() and fs.ftruncateSync().

8.1.3 Important classes

Several classes play important roles in Node’s file system APIs.

8.1.3.1 URLs: an alternative to file system paths in strings

Whenever a Node.js function accepts a file system path in a string (line A), it usually also accepts an instance of URL (line B):

assert.equal(
  fs.readFileSync(
    '/tmp/text-file.txt', {encoding: 'utf-8'}), // (A)
  'Text content'
);
assert.equal(
  fs.readFileSync(
    new URL('file:///tmp/text-file.txt'), {encoding: 'utf-8'}), // (B)
  'Text content'
);

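Manually converting between paths and file: URLs seems easy but has surprisingly many pitfalls: percent encoding or decoding, Windows drive letters, etc. Instead, it’s better to use the following two functions from module node:url:

  • url.pathToFileURL(path) converts a file system path to a file: URL.
  • url.fileURLToPath(url) converts a file: URL (a string or an instance of URL) to a file system path.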

We don’t use file URLs in this chapter. Use cases for them are described in §7.11.1 “Class URL”.

8.1.3.2 Buffers

Class Buffer represents fixed-length byte sequences on Node.js. It is a subclass of Uint8Array (a TypedArray). Buffers are mostly used when working with binary files and therefore of less interest in this book.

Whenever Node.js accepts a Buffer, it also accepts a Uint8Array. Thus, given that Uint8Arrays are cross-platform and Buffers aren’t, the former is preferable.

Buffers can do one thing that Uint8Arrays can’t: encoding and decoding text in various encodings. If we need to encode or decode UTF-8 in Uint8Arrays, we can use class TextEncoder or class TextDecoder. These classes are available on most JavaScript platforms:

> new TextEncoder().encode('café')
Uint8Array.of(99, 97, 102, 195, 169)
> new TextDecoder().decode(Uint8Array.of(99, 97, 102, 195, 169))
'café'

8.1.3.3 Node.js streams

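Some functions accept or return native Node.js streams:

  • stream.Readable is Node’s class for readable streams. fs.createReadStream() returns an instance of fs.ReadStream, which is a subclass.
  • stream.Writable is Node’s class for writable streams. fs.createWriteStream() returns an instance of fs.WriteStream, which is a subclass.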

Instead of native streams, we can now use cross-platform web streams on Node.js. How to do that is explained in §10 “Using web streams on Node.js”.

8.2 Reading and writing files

8.2.1 Reading a file synchronously into a single string (optional: splitting into lines)

fs.readFileSync(filePath, options?) reads the file at filePath into a single string:

assert.equal(
  fs.readFileSync('text-file.txt', {encoding: 'utf-8'}),
  'there\r\nare\nmultiple\nlines'
);

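Pros and cons of this approach (vs. using a stream):

  • Pro: It is simple to use and synchronous, which is good enough for many use cases.
  • Con: It is not a good choice for large files because the whole file has to be read before we can process any of the data.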

Next, we’ll look into splitting the string we have read into lines.

8.2.1.1 Splitting lines without including line terminators

The following code splits a string into lines while removing line terminators. It works with Unix and Windows line terminators:

const RE_SPLIT_EOL = /\r?\n/;
function splitLines(str) {
  return str.split(RE_SPLIT_EOL);
}
assert.deepEqual(
  splitLines('there\r\nare\nmultiple\nlines'),
  ['there', 'are', 'multiple', 'lines']
);

“EOL” stands for “end of line”. We accept both Unix line terminators ('\n') and Windows line terminators ('\r\n', like the first one in the previous example). For more information, see §8.3 “Handling line terminators across platforms”.

8.2.1.2 Splitting lines while including line terminators

The following code splits a string into lines while including line terminators. It works with Unix and Windows line terminators (“EOL” stands for “end of line”):

const RE_SPLIT_AFTER_EOL = /(?<=\r?\n)/; // (A)
function splitLinesWithEols(str) {
  return str.split(RE_SPLIT_AFTER_EOL);
}

assert.deepEqual(
  splitLinesWithEols('there\r\nare\nmultiple\nlines'),
  ['there\r\n', 'are\n', 'multiple\n', 'lines']
);
assert.deepEqual(
  splitLinesWithEols('first\n\nthird'),
  ['first\n', '\n', 'third']
);
assert.deepEqual(
  splitLinesWithEols('EOL at the end\n'),
  ['EOL at the end\n']
);
assert.deepEqual(
  splitLinesWithEols(''),
  ['']
);

Line A contains a regular expression with a lookbehind assertion. It matches at locations that are preceded by a match for the pattern \r?\n but it doesn’t capture anything. Therefore, it doesn’t remove anything between the string fragments that the input string is split into.

On engines that don’t support lookbehind assertions, we can use the following solution:

function splitLinesWithEols(str) {
  if (str.length === 0) return [''];
  const lines = [];
  let prevEnd = 0;
  while (prevEnd < str.length) {
    // Searching for '\n' means we’ll also find '\r\n'
    const newlineIndex = str.indexOf('\n', prevEnd);
    // If there is a newline, it’s included in the line
    const end = newlineIndex < 0 ? str.length : newlineIndex+1;
    lines.push(str.slice(prevEnd, end));
    prevEnd = end;
  }
  return lines;
}

This solution is simple, but more verbose.

In both versions of splitLinesWithEols(), we again accept both Unix line terminators ('\n') and Windows line terminators ('\r\n'). For more information, see §8.3 “Handling line terminators across platforms”.

8.2.2 Reading a file via a stream, line by line

We can also read text files via streams:

import {Readable} from 'node:stream';

const nodeReadable = fs.createReadStream(
  'text-file.txt', {encoding: 'utf-8'});
const webReadableStream = Readable.toWeb(nodeReadable);
const lineStream = webReadableStream.pipeThrough(
  new ChunksToLinesStream());
for await (const line of lineStream) {
  console.log(line);
}

// Output:
// 'there\r\n'
// 'are\n'
// 'multiple\n'
// 'lines'

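We used the following external functionality:

  • fs.createReadStream(filePath, options?) creates a native Node.js readable stream.
  • Readable.toWeb(nodeReadable) converts a native Node.js readable stream to a web stream.
  • ChunksToLinesStream is a TransformStream that converts chunks of arbitrary length to lines. It is not built into Node.js.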

Web streams are asynchronously iterable, which is why we can use a for-await-of loop to iterate over lines.

If we are not interested in text lines, then we don’t need ChunksToLinesStream: we can iterate over webReadableStream directly and get chunks of arbitrary length.

More information:

  • Web streams are covered in §10 “Using web streams on Node.js”.

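Pros and cons of this approach (vs. reading a single string):

  • Pro: It works well with large files because we can process the data incrementally, in smaller pieces.
  • Con: It is more complicated to use and not synchronous.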

8.2.3 Writing a single string to a file synchronously

fs.writeFileSync(filePath, str, options?) writes str to a file at filePath. If a file already exists at that path, it is overwritten.

The following code shows how to use this function:

fs.writeFileSync(
  'new-file.txt',
  'First line\nSecond line\n',
  {encoding: 'utf-8'}
);

For information on line terminators, see §8.3 “Handling line terminators across platforms”.

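Pros and cons (vs. using a stream):

  • Pro: It is simple to use and synchronous.
  • Con: The whole content must exist as a single string before we can write it, which makes this approach less suitable for large amounts of data.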

8.2.4 Appending a single string to a file (synchronously)

The following code appends a line of text to an existing file:

fs.appendFileSync(
  'existing-file.txt',
  'Appended line\n',
  {encoding: 'utf-8'}
);

We can also use fs.writeFileSync() to perform this task:

fs.writeFileSync(
  'existing-file.txt',
  'Appended line\n',
  {encoding: 'utf-8', flag: 'a'}
);

This code is almost the same as the one we used to overwrite existing content (see the previous section for more information). The only difference is that we added the option .flag: The value 'a' means that we append data. Other possible values (e.g. to throw an error if a file doesn’t exist yet) are explained in the Node.js documentation.

Watch out: In some functions, this option is named .flag, in others .flags.

8.2.5 Writing multiple strings to a file via stream

The following code uses a stream to write multiple strings to a file:

import {Writable} from 'node:stream';

const nodeWritable = fs.createWriteStream(
  'new-file.txt', {encoding: 'utf-8'});
const webWritableStream = Writable.toWeb(nodeWritable);

const writer = webWritableStream.getWriter();
try {
  await writer.write('First line\n');
  await writer.write('Second line\n');
  await writer.close();
} finally {
  writer.releaseLock();
}

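We used the following functions:

  • fs.createWriteStream(filePath, options?) creates a native Node.js writable stream.
  • Writable.toWeb(nodeWritable) converts a native Node.js writable stream to a web stream.
  • webWritableStream.getWriter() returns a writer whose Promise-returning methods .write() and .close() we use to write data.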

More information:

  • Web streams are covered in §10 “Using web streams on Node.js”.

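Pros and cons (vs. writing a single string):

  • Pro: We can write data incrementally, which works well for large amounts of data.
  • Con: It is more complicated to use and not synchronous.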

8.2.6 Appending multiple strings to a file via a stream (asynchronously)

The following code uses a stream to append text to an existing file:

import {Writable} from 'node:stream';

const nodeWritable = fs.createWriteStream(
  'existing-file.txt', {encoding: 'utf-8', flags: 'a'});
const webWritableStream = Writable.toWeb(nodeWritable);

const writer = webWritableStream.getWriter();
try {
  await writer.write('First appended line\n');
  await writer.write('Second appended line\n');
  await writer.close();
} finally {
  writer.releaseLock();
}

This code is almost the same as the one we used to overwrite existing content (see the previous section for more information). The only difference is that we added the option .flags: The value 'a' means that we append data. Other possible values (e.g. to throw an error if a file doesn’t exist yet) are explained in the Node.js documentation.

Watch out: In some functions, this option is named .flag, in others .flags.

8.3 Handling line terminators across platforms

Alas, not all platforms use the same line terminator characters to mark the end of a line (EOL):

  • On Windows, EOL is '\r\n'.
  • On Unix (including macOS), EOL is '\n'.

To handle EOL in a manner that works on all platforms, we can use several strategies.

8.3.1 Reading line terminators

When reading text, it’s best to recognize both EOLs.

What might that look like when splitting a text into lines? We can include the EOLs (in either format) at the ends. That enables us to change as little as possible if we modify those lines and write them to a file.

When processing lines with EOLs, it’s sometimes useful to remove them – e.g. via the following function:

const RE_EOL_REMOVE = /\r?\n$/;
function removeEol(line) {
  const match = RE_EOL_REMOVE.exec(line);
  if (!match) return line;
  return line.slice(0, match.index);
}

assert.equal(
  removeEol('Windows EOL\r\n'),
  'Windows EOL'
);
assert.equal(
  removeEol('Unix EOL\n'),
  'Unix EOL'
);
assert.equal(
  removeEol('No EOL'),
  'No EOL'
);

8.3.2 Writing line terminators

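When it comes to writing line terminators, we have two options:

  • The constant EOL in module node:os contains the EOL of the current platform.
  • We can detect the EOL format of an input file and use that format when we change the file.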

8.4 Traversing and creating directories

8.4.1 Traversing a directory

The following function traverses a directory and lists all of its descendants (its children, the children of its children, etc.):

import * as path from 'node:path';

function* traverseDirectory(dirPath) {
  const dirEntries = fs.readdirSync(dirPath, {withFileTypes: true});
  // Sort the entries to keep things more deterministic
  dirEntries.sort(
    (a, b) => a.name.localeCompare(b.name, 'en')
  );
  for (const dirEntry of dirEntries) {
    const fileName = dirEntry.name;
    const pathName = path.join(dirPath, fileName);
    yield pathName;
    if (dirEntry.isDirectory()) {
      yield* traverseDirectory(pathName);
    }
  }
}

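We used this functionality:

  • fs.readdirSync(thePath, options?) returns the children of the directory at thePath.
    • If option .withFileTypes is true, the function returns directory entries, instances of fs.Dirent. These have properties such as dirent.name and methods such as dirent.isDirectory().
    • If .withFileTypes is false or missing, the function returns strings with file names.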

The following code shows traverseDirectory() in action:

for (const filePath of traverseDirectory('dir')) {
  console.log(filePath);
}

// Output:
// 'dir/dir-file.txt'
// 'dir/subdir'
// 'dir/subdir/subdir-file1.txt'
// 'dir/subdir/subdir-file2.csv'

8.4.2 Creating a directory (mkdir, mkdir -p)

We can use the following function to create directories:

fs.mkdirSync(thePath, options?): undefined | string

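options.recursive determines how the function creates the directory at thePath:

  • If .recursive is missing or false, mkdirSync() returns undefined and an exception is thrown if a directory (or a file) already exists at thePath or if the parent directory of thePath does not exist.
  • If .recursive is true, mkdirSync() creates missing ancestor directories, does not throw if the directory at thePath already exists, and returns the path of the first directory it created (or undefined if it created none).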

This is mkdirSync() in action:

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
  ]
);
fs.mkdirSync('dir/sub/subsub', {recursive: true});
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/sub',
    'dir/sub/subsub',
  ]
);

Function traverseDirectory(dirPath) lists all descendants of the directory at dirPath.

8.4.3 Ensuring that a parent directory exists

If we want to set up a nested file structure on demand, we can’t always be sure that the ancestor directories exist when we create a new file. Then the following function helps:

import * as path from 'node:path';

function ensureParentDirectory(filePath) {
  const parentDir = path.dirname(filePath);
  if (!fs.existsSync(parentDir)) {
    fs.mkdirSync(parentDir, {recursive: true});
  }
}

Here we can see ensureParentDirectory() in action (line A):

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
  ]
);
const filePath = 'dir/sub/subsub/new-file.txt';
ensureParentDirectory(filePath); // (A)
fs.writeFileSync(filePath, 'content', {encoding: 'utf-8'});
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/sub',
    'dir/sub/subsub',
    'dir/sub/subsub/new-file.txt',
  ]
);

8.4.4 Creating a temporary directory

fs.mkdtempSync(pathPrefix, options?) creates a temporary directory: It appends 6 random characters to pathPrefix, creates a directory at the new path and returns that path.

pathPrefix shouldn’t end with a capital “X” because some platforms replace trailing Xs with random characters.

If we want to create our temporary directory inside an operating-system-specific global temporary directory, we can use function os.tmpdir():

import * as os from 'node:os';
import * as path from 'node:path';

const pathPrefix = path.resolve(os.tmpdir(), 'my-app');
  // e.g. '/var/folders/ph/sz0384m11vxf/T/my-app'

const tmpPath = fs.mkdtempSync(pathPrefix);
  // e.g. '/var/folders/ph/sz0384m11vxf/T/my-app1QXOXP'

It’s important to note that temporary directories are not automatically removed when a Node.js script terminates. We either have to delete them ourselves or rely on the operating system to periodically clean up its global temporary directory (which it may or may not do).

8.5 Copying, renaming, moving files or directories

8.5.1 Copying files or directories

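fs.cpSync(srcPath, destPath, options?) copies a file or directory from srcPath to destPath. Interesting options:

  • .recursive (default: false): Directories (including empty directories) are only copied if this option is true.
  • .force (default: true): If true, existing files are overwritten. If false, existing files are preserved.
    • In the latter case, setting .errorOnExist to true leads to errors being thrown if file paths clash.
  • .dereference (default: false): If true, the files that symbolic links point to are copied, not the symbolic links themselves.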

This is the function in action:

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir-orig',
    'dir-orig/some-file.txt',
  ]
);
fs.cpSync('dir-orig', 'dir-copy', {recursive: true});
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir-copy',
    'dir-copy/some-file.txt',
    'dir-orig',
    'dir-orig/some-file.txt',
  ]
);

Function traverseDirectory(dirPath) lists all descendants of the directory at dirPath.

8.5.2 Renaming or moving files or directories

fs.renameSync(oldPath, newPath) renames or moves a file or a directory from oldPath to newPath.

Let’s use this function to rename a directory:

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'old-dir-name',
    'old-dir-name/some-file.txt',
  ]
);
fs.renameSync('old-dir-name', 'new-dir-name');
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'new-dir-name',
    'new-dir-name/some-file.txt',
  ]
);

Here we use the function to move a file:

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/subdir',
    'dir/subdir/some-file.txt',
  ]
);
fs.renameSync('dir/subdir/some-file.txt', 'some-file.txt');
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/subdir',
    'some-file.txt',
  ]
);

Function traverseDirectory(dirPath) lists all descendants of the directory at dirPath.

8.6 Removing files or directories

8.6.1 Removing files and arbitrary directories (shell: rm, rm -r)

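fs.rmSync(thePath, options?) removes a file or directory at thePath. Interesting options:

  • .recursive (default: false): Directories (including empty directories) are only removed if this option is true.
  • .force (default: false): If false, an exception is thrown if there is no file or directory at thePath.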

Let’s use fs.rmSync() to remove a file:

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/some-file.txt',
  ]
);
fs.rmSync('dir/some-file.txt');
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
  ]
);

Here we use fs.rmSync() to recursively remove a non-empty directory.

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/subdir',
    'dir/subdir/some-file.txt',
  ]
);
fs.rmSync('dir/subdir', {recursive: true});
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
  ]
);

Function traverseDirectory(dirPath) lists all descendants of the directory at dirPath.

8.6.2 Removing an empty directory (shell: rmdir)

fs.rmdirSync(thePath, options?) removes an empty directory (an exception is thrown if a directory isn’t empty).

The following code shows how this function works:

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/subdir',
  ]
);
fs.rmdirSync('dir/subdir');
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
  ]
);

Function traverseDirectory(dirPath) lists all descendants of the directory at dirPath.

8.6.3 Clearing directories

A script that saves its output to a directory dir often needs to clear dir before it starts, i.e. to remove every file in dir so that it is empty. The following function does that.

import * as path from 'node:path';

function clearDirectory(dirPath) {
  for (const fileName of fs.readdirSync(dirPath)) {
    const pathName = path.join(dirPath, fileName);
    fs.rmSync(pathName, {recursive: true});
  }
}

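We used two file system functions:

  • fs.readdirSync(dirPath) returns the names of all children of the directory at dirPath.
  • fs.rmSync(pathName, options?) removes files and directories (including non-empty ones if .recursive is true).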

This is an example of using clearDirectory():

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/dir-file.txt',
    'dir/subdir',
    'dir/subdir/subdir-file.txt'
  ]
);
clearDirectory('dir');
assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
  ]
);

8.6.4 Trashing files or directories

The library trash moves files and folders to the trash. It works on macOS, Windows, and Linux (where support is limited and help is wanted). This is an example from its readme file:

import trash from 'trash';

await trash(['*.png', '!rainbow.png']);

trash() accepts either an Array of strings or a string as its first parameter. Any string can be a glob pattern (with asterisks and other meta-characters).

8.7 Reading and changing file system entries

8.7.1 Checking if a file or directory exists

fs.existsSync(thePath) returns true if a file or directory exists at thePath:

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/some-file.txt',
  ]
);
assert.equal(
  fs.existsSync('dir'), true
);
assert.equal(
  fs.existsSync('dir/some-file.txt'), true
);
assert.equal(
  fs.existsSync('dir/non-existent-file.txt'), false
);

Function traverseDirectory(dirPath) lists all descendants of the directory at dirPath.

8.7.2 Checking the stats of a file: Is it a directory? When was it created? Etc.

fs.statSync(thePath, options?) returns an instance of fs.Stats with information on the file or directory at thePath.

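Interesting options:

  • .throwIfNoEntry (default: true): What happens if there is no entry at thePath? If this option is true, an exception is thrown. If it is false, undefined is returned.
  • .bigint (default: false): If true, numeric values (such as timestamps) are returned as bigints.

Properties of instances of fs.Stats:

  • What kind of file system entry is it? stats.isFile(), stats.isDirectory(), stats.isSymbolicLink()
  • stats.size is the size in bytes.
  • Timestamps are available as Date objects (stats.atime, stats.mtime, stats.birthtime) and as milliseconds since the POSIX epoch (stats.atimeMs, stats.mtimeMs, stats.birthtimeMs).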

In the following example, we use fs.statSync() to implement a function isDirectory():

function isDirectory(thePath) {
  const stats = fs.statSync(thePath, {throwIfNoEntry: false});
  return stats !== undefined && stats.isDirectory();
}

assert.deepEqual(
  Array.from(traverseDirectory('.')),
  [
    'dir',
    'dir/some-file.txt',
  ]
);

assert.equal(
  isDirectory('dir'), true
);
assert.equal(
  isDirectory('dir/some-file.txt'), false
);
assert.equal(
  isDirectory('non-existent-dir'), false
);

Function traverseDirectory(dirPath) lists all descendants of the directory at dirPath.

8.7.3 Changing file attributes: permissions, owner, group, timestamps

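Let’s briefly look at functions for changing file attributes:

  • fs.chmodSync(path, mode) changes the permissions of a file.
  • fs.chownSync(path, uid, gid) changes the owner and the group of a file.
  • fs.utimesSync(path, atime, mtime) changes the timestamps of a file (atime: time of last access; mtime: time of last modification).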

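8.8 Working with links

Functions for working with hard links:

  • fs.linkSync(existingPath, newPath) creates a hard link.
  • fs.unlinkSync(path) removes a hard link and possibly the file it points to (if it is the last hard link to that file).

Functions for working with symbolic links:

  • fs.symlinkSync(target, path, type?) creates a symbolic link from path to target.
  • fs.readlinkSync(path, options?) returns the target of the symbolic link at path.

The following functions operate on symbolic links without dereferencing them (note the name prefix “l”):

  • fs.lchmodSync(path, mode) changes the permissions of the symbolic link at path (only available on macOS).
  • fs.lchownSync(path, uid, gid) changes the owner and the group of the symbolic link at path.
  • fs.lutimesSync(path, atime, mtime) changes the timestamps of the symbolic link at path.
  • fs.lstatSync(path, options?) returns the stats of the symbolic link at path.

Other useful functions:

  • fs.realpathSync(path, options?) computes the canonical pathname by resolving dots (.), double dots (..) and symbolic links.

Options of functions that affect how symbolic links are handled:

  • fs.cpSync(srcPath, destPath, options?):
    • .dereference (default: false): If true, the files that symbolic links point to are copied, not the symbolic links themselves.
    • .verbatimSymlinks (default: false): If false, the target of a copied symbolic link is updated so that it still points to the same location. If true, the target isn’t changed.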

8.9 Further reading