Shell scripting with Node.js
You can buy the offline version of this book (HTML, PDF, EPUB, MOBI) and support the free online version.
(Ad, please don’t block.)

16 Parsing command line arguments with util.parseArgs()



In this chapter, we explore how to use the Node.js function parseArgs() from module node:util to parse command line arguments.

16.1 Imports that are implied in this chapter

The following two imports are implied in every example in this chapter:

import * as assert from 'node:assert/strict';
import {parseArgs} from 'node:util';

The first import is for test assertions we use to check values. The second import is for function parseArgs() that is the topic of this chapter.

16.2 The steps involved in processing command line arguments

The following steps are involved in processing command line arguments:

  1. The user inputs a text string.
  2. The shell parses the string into a sequence of words and operators.
  3. If a command is called, it gets zero or more words as arguments.
  4. Our Node.js code receives the words via an Array stored in process.argv. process is a global variable on Node.js.
  5. We use parseArgs() to turn that Array into something that is more convenient to work with.

Let’s use the following shell script args.mjs with Node.js code to see what process.argv looks like:

#!/usr/bin/env node
console.log(process.argv);

We start with a simple command:

% ./args.mjs one two
[ '/usr/bin/node', '/home/john/args.mjs', 'one', 'two' ]

If we install the command via npm on Windows, the same command produces the following result on the Windows Command shell:

[
  'C:\\Program Files\\nodejs\\node.exe',
  'C:\\Users\\jane\\args.mjs',
  'one',
  'two'
]

No matter how we invoke a shell script, process.argv always starts with the path of the Node.js binary that is used to run our code. Next is the path of our script. The Array ends with the actual arguments the were passed to the script. In other words: The arguments of a script always start at index 2.

Therefore, we change our script so that it looks like this:

#!/usr/bin/env node
console.log(process.argv.slice(2));

Let’s try more complicated arguments:

% ./args.mjs --str abc --bool home.html main.js
[ '--str', 'abc', '--bool', 'home.html', 'main.js' ]

These arguments consist of:

Two styles of using arguments are common:

Written as a JavaScript function call, the previous example would look like this (in JavaScript, options usually come last):

argsMjs('home.html', 'main.js', {str: 'abc', bool: false});

16.3 Parsing command line arguments

16.3.1 The basics

If we want parseArgs() to parse an Array with arguments, we first need to tell it how our options work. Let’s assume our script has:

We describe these options to parseArgs() as follows:

const options = {
  'verbose': {
    type: 'boolean',
    short: 'v',
  },
  'color': {
    type: 'string',
    short: 'c',
  },
  'times': {
    type: 'string',
    short: 't',
  },
};

As long as a property key of options is a valid JavaScript identifier, it is up to you if you want to quote it or not. Both have pros and cons. In this chapter, they are always quoted. That way, options with non-identifier names such as my-new-option look the same as those with identifier names.

Each entry in options can have the following properties (as defined via a TypeScript type):

type Options = {
  type: 'boolean' | 'string', // required
  short?: string, // optional
  multiple?: boolean, // optional, default `false`
};

The following code uses parseArgs() and options to parse an Array with arguments:

assert.deepEqual(
  parseArgs({options, args: [
    '--verbose', '--color', 'green', '--times', '5'
  ]}),
  {
    values: {__proto__:null,
      verbose: true,
      color: 'green',
      times: '5'
    },
    positionals: []
  }
);

The prototype of the object stored in .values is null. That means that we can use the in operator to check if a property exists, without having to worry about inherited properties such as .toString.

As mentioned before, the number 5 that is the value of --times, is processed as a string.

The object we pass to parseArgs() has the following TypeScript type:

type ParseArgsProps = {
  options?: {[key: string], Options}, // optional, default: {}
  args?: Array<string>, // optional
    // default: process.argv.slice(2)
  strict?: boolean, // optional, default `true`
  allowPositionals?: boolean, // optional, default `false`
};

This is the type of the result of parseArgs():

type ParseArgsResult = {
  values: {[key: string]: ValuesValue}, // an object
  positionals: Array<string>, // always an Array
};
type ValuesValue = boolean | string | Array<boolean|string>;

Two hyphens are used to refer to the long version of an option. One hyphen is used to refer to the short version:

assert.deepEqual(
  parseArgs({options, args: ['-v', '-c', 'green']}),
  {
    values: {__proto__:null,
      verbose: true,
      color: 'green',
    },
    positionals: []
  }
);

Note that .values contains the long names of the options.

We conclude this subsection by parsing positional arguments that are mixed with optional arguments:

assert.deepEqual(
  parseArgs({
    options,
    allowPositionals: true,
    args: [
      'home.html', '--verbose', 'main.js', '--color', 'red', 'post.md'
    ]
  }),
  {
    values: {__proto__:null,
      verbose: true,
      color: 'red',
    },
    positionals: [
      'home.html', 'main.js', 'post.md'
    ]
  }
);

16.3.2 Using options multiple times

If we use an option multiple times, the default is that only the last time counts. It overrides all previous occurrences:

const options = {
  'bool': {
    type: 'boolean',
  },
  'str': {
    type: 'string',
  },
};

assert.deepEqual(
  parseArgs({
    options, args: [
      '--bool', '--bool', '--str', 'yes', '--str', 'no'
    ]
  }),
  {
    values: {__proto__:null,
      bool: true,
      str: 'no'
    },
    positionals: []
  }
);

If, however, we set .multiple to true in the definition of an option, parseArgs() gives us all option values in an Array:

const options = {
  'bool': {
    type: 'boolean',
    multiple: true,
  },
  'str': {
    type: 'string',
    multiple: true,
  },
};

assert.deepEqual(
  parseArgs({
    options, args: [
      '--bool', '--bool', '--str', 'yes', '--str', 'no'
    ]
  }),
  {
    values: {__proto__:null,
      bool: [ true, true ],
      str: [ 'yes', 'no' ]
    },
    positionals: []
  }
);

16.3.3 More ways of using long and short options

Consider the following options:

const options = {
  'verbose': {
    type: 'boolean',
    short: 'v',
  },
  'silent': {
    type: 'boolean',
    short: 's',
  },
  'color': {
    type: 'string',
    short: 'c',
  },
};

The following is a compact way of using multiple boolean options:

assert.deepEqual(
  parseArgs({options, args: ['-vs']}),
  {
    values: {__proto__:null,
      verbose: true,
      silent: true,
    },
    positionals: []
  }
);

We can directly attach the value of a long string option via an equals sign. That is called an inline value.

assert.deepEqual(
  parseArgs({options, args: ['--color=green']}),
  {
    values: {__proto__:null,
      color: 'green'
    },
    positionals: []
  }
);

Short options can’t have inline values.

16.3.4 Quoting values

So far, all option values and positional values were single words. If we want to use values that contain spaces, we need to quote them – with double quotes or single quotes. The latter is not supported by all shells, however.

16.3.4.1 How shells parse quoted values

To examine how shells parse quoted values, we again use the script args.mjs:

#!/usr/bin/env node
console.log(process.argv.slice(2));

On Unix, these are the differences between double quotes and single quotes:

The following interaction demonstrates option values that are doube-quoted and single-quoted:

% ./args.mjs --str "two words" --str 'two words'
[ '--str', 'two words', '--str', 'two words' ]

% ./args.mjs --str="two words" --str='two words'
[ '--str=two words', '--str=two words' ]

% ./args.mjs -s "two words" -s 'two words'
[ '-s', 'two words', '-s', 'two words' ]

In the Windows Command shell single quotes are not special in any way:

>node args.mjs "say \"hi\"" "\t\n" "%USERNAME%"
[ 'say "hi"', '\\t\\n', 'jane' ]

>node args.mjs 'back slash\' '\t\n' '%USERNAME%'
[ "'back", "slash\\'", "'\\t\\n'", "'jane'" ]

Quoted option values in the Windows Command shell:

>node args.mjs --str 'two words' --str "two words"
[ '--str', "'two", "words'", '--str', 'two words' ]

>node args.mjs --str='two words' --str="two words"
[ "--str='two", "words'", '--str=two words' ]

>>node args.mjs -s "two words" -s 'two words'
[ '-s', 'two words', '-s', "'two", "words'" ]

In Windows PowerShell, we can quote with single quotes, variable names are not interpolated inside quotes and single quotes can’t be escaped:

> node args.mjs "say `"hi`"" "\t\n" "%USERNAME%"
[ 'say hi', '\\t\\n', '%USERNAME%' ]
> node args.mjs 'backtick`' '\t\n' '%USERNAME%'
[ 'backtick`', '\\t\\n', '%USERNAME%' ]
16.3.4.2 How parseArgs() handles quoted values

This is how parseArgs() handles quoted values:

const options = {
  'times': {
    type: 'string',
    short: 't',
  },
  'color': {
    type: 'string',
    short: 'c',
  },
};

// Quoted external option values
assert.deepEqual(
  parseArgs({
    options,
    args: ['-t', '5 times', '--color', 'light green']
  }),
  {
    values: {__proto__:null,
      times: '5 times',
      color: 'light green',
    },
    positionals: []
  }
);

// Quoted inline option values
assert.deepEqual(
  parseArgs({
    options,
    args: ['--color=light green']
  }),
  {
    values: {__proto__:null,
      color: 'light green',
    },
    positionals: []
  }
);

// Quoted positional values
assert.deepEqual(
  parseArgs({
    options, allowPositionals: true,
    args: ['two words', 'more words']
  }),
  {
    values: {__proto__:null,
    },
    positionals: [ 'two words', 'more words' ]
  }
);

16.3.5 Option terminators

parseArgs() supports so-called option terminators: If one of the elements of args is a double hyphen (--), then the remaining arguments are all treated as positional.

Where are option terminators needed? Some executables invoke other executables, e.g. the node executable. Then an option terminator can be used to separate the caller’s arguments from the callee’s arguments.

This is how parseArgs() handles option terminators:

const options = {
  'verbose': {
    type: 'boolean',
  },
  'count': {
    type: 'string',
  },
};

assert.deepEqual(
  parseArgs({options, allowPositionals: true,
    args: [
      'how', '--verbose', 'are', '--', '--count', '5', 'you'
    ]
  }),
  {
    values: {__proto__:null,
      verbose: true
    },
    positionals: [ 'how', 'are', '--count', '5', 'you' ]
  }
);

16.3.6 Strict parseArgs()

If the option .strict is true (which is the default), then parseArgs() throws an exception if one of the following things happens:

The following code demonstrates each of these cases:

const options = {
  'str': {
    type: 'string',
  },
};

// Unknown option name
assert.throws(
  () => parseArgs({
      options,
      args: ['--unknown']
    }),
  {
    name: 'TypeError',
    message: "Unknown option '--unknown'",
  }
);

// Wrong option type (missing value)
assert.throws(
  () => parseArgs({
      options,
      args: ['--str']
    }),
  {
    name: 'TypeError',
    message: "Option '--str <value>' argument missing",
  }
);

// Unallowed positional
assert.throws(
  () => parseArgs({
      options,
      allowPositionals: false, // (the default)
      args: ['posarg']
    }),
  {
    name: 'TypeError',
    message: "Unexpected argument 'posarg'. " +
      "This command does not take positional arguments",
  }
);

16.4 parseArgs tokens

parseArgs() processes the args Array in two phases:

We can get access to the tokens if we set config.tokens to true. Then the object returned by parseArgs() contains a property .tokens with the tokens.

These are the properties of tokens:

type Token = OptionToken | PositionalToken | OptionTerminatorToken;

interface CommonTokenProperties {
    /** Where in `args` does the token start? */
  index: number;
}

interface OptionToken extends CommonTokenProperties {
  kind: 'option';

  /** Long name of option */
  name: string;

  /** The option name as mentioned in `args` */
  rawName: string;

  /** The option’s value. `undefined` for boolean options. */
  value: string | undefined;

  /** Is the option value specified inline (e.g. --level=5)? */
  inlineValue: boolean | undefined;
}

interface PositionalToken extends CommonTokenProperties {
  kind: 'positional';

  /** The value of the positional, args[token.index] */
  value: string;
}

interface OptionTerminatorToken extends CommonTokenProperties {
  kind: 'option-terminator';
}

16.4.1 Examples of tokens

As an example, consider the following options:

const options = {
  'bool': {
    type: 'boolean',
    short: 'b',
  },
  'flag': {
    type: 'boolean',
    short: 'f',
  },
  'str': {
    type: 'string',
    short: 's',
  },
};

The tokens for boolean options look like this:

assert.deepEqual(
  parseArgs({
    options, tokens: true,
    args: [
      '--bool', '-b', '-bf',
    ]
  }),
  {
    values: {__proto__:null,
      bool: true,
      flag: true,
    },
    positionals: [],
    tokens: [
      {
        kind: 'option',
        name: 'bool',
        rawName: '--bool',
        index: 0,
        value: undefined,
        inlineValue: undefined
      },
      {
        kind: 'option',
        name: 'bool',
        rawName: '-b',
        index: 1,
        value: undefined,
        inlineValue: undefined
      },
      {
        kind: 'option',
        name: 'bool',
        rawName: '-b',
        index: 2,
        value: undefined,
        inlineValue: undefined
      },
      {
        kind: 'option',
        name: 'flag',
        rawName: '-f',
        index: 2,
        value: undefined,
        inlineValue: undefined
      },
    ]
  }
);

Note that there are three tokens for option bool because it is mentioned three times in args. However, due to phase 2 of parsing, there is only one property for bool in .values.

In the next example, we parse string options into tokens. .inlineValue has boolean values now (it is always undefined for boolean options):

assert.deepEqual(
  parseArgs({
    options, tokens: true,
    args: [
      '--str', 'yes', '--str=yes', '-s', 'yes',
    ]
  }),
  {
    values: {__proto__:null,
      str: 'yes',
    },
    positionals: [],
    tokens: [
      {
        kind: 'option',
        name: 'str',
        rawName: '--str',
        index: 0,
        value: 'yes',
        inlineValue: false
      },
      {
        kind: 'option',
        name: 'str',
        rawName: '--str',
        index: 2,
        value: 'yes',
        inlineValue: true
      },
      {
        kind: 'option',
        name: 'str',
        rawName: '-s',
        index: 3,
        value: 'yes',
        inlineValue: false
      }
    ]
  }
);

Lastly, this is an example of parsing positional arguments and an option terminator:

assert.deepEqual(
  parseArgs({
    options, allowPositionals: true, tokens: true,
    args: [
      'command', '--', '--str', 'yes', '--str=yes'
    ]
  }),
  {
    values: {__proto__:null,
    },
    positionals: [ 'command', '--str', 'yes', '--str=yes' ],
    tokens: [
      { kind: 'positional', index: 0, value: 'command' },
      { kind: 'option-terminator', index: 1 },
      { kind: 'positional', index: 2, value: '--str' },
      { kind: 'positional', index: 3, value: 'yes' },
      { kind: 'positional', index: 4, value: '--str=yes' }
    ]
  }
);

16.4.2 Using tokens to implement subcommands

By default, parseArgs() does not support subcommands such as git clone or npm install. However, it is relatively easy to implement this functionality via tokens.

This is the implementation:

function parseSubcommand(config) {
  // The subcommand is a positional, allow them
  const {tokens} = parseArgs({
    ...config, tokens: true, allowPositionals: true
  });
  let firstPosToken = tokens.find(({kind}) => kind==='positional');
  if (!firstPosToken) {
    throw new Error('Command name is missing: ' + config.args);
  }

  //----- Command options

  const cmdArgs = config.args.slice(0, firstPosToken.index);
  // Override `config.args`
  const commandResult = parseArgs({
    ...config, args: cmdArgs, tokens: false, allowPositionals: false
  });

  //----- Subcommand

  const subcommandName = firstPosToken.value;

  const subcmdArgs = config.args.slice(firstPosToken.index+1);
  // Override `config.args`
  const subcommandResult = parseArgs({
    ...config, args: subcmdArgs, tokens: false
  });

  return {
    commandResult,
    subcommandName,
    subcommandResult,
  };
}

This is parseSubcommand() in action:

const options = {
  'log': {
    type: 'string',
  },
  color: {
    type: 'boolean',
  }
};
const args = ['--log', 'all', 'print', '--color', 'file.txt'];
const result = parseSubcommand({options, allowPositionals: true, args});

const pn = obj => Object.setPrototypeOf(obj, null);
assert.deepEqual(
  result,
  {
    commandResult: {
      values: pn({'log': 'all'}),
      positionals: []
    },
    subcommandName: 'print',
    subcommandResult: {
      values: pn({color: true}),
      positionals: ['file.txt']
    }
  }
);