NPM registry internals

Jan 24, 2018 • packagecloud

TL;DR

This blog post dives into the NPM registry metadata format and core APIs. We’ll cover the different metadata files that make up an NPM registry, what the files mean, and show how a user can inspect metadata using command line tools.

What is an NPM registry?

An NPM registry is a collection of Node.js packages, metadata, and several API endpoints that are primarily accessed via the npm command line tool. Most Node.js programmers will be familiar with running npm install to install Node.js packages.

Set up your own NPM registry for free.

Publishing Node.js packages to NPM registries

Node.js packages are published to NPM by running the npm publish command from the command line.

The npm publish command will upload the Node.js package along with some JSON metadata which describes the package. The package and its metadata will be used by the NPM registry to generate a series of JSON files that npm can use in the future when installing a package.

You can read more about the npm publish command in its official documentation.

NPM registry metadata

Each package published to an NPM registry will cause the registry to generate JSON metadata describing the package.

This metadata is accessed by npm install and is used to:

  • Install the requested version of the Node.js package
  • Install the required dependencies needed by the package

among several other operations.

Manually examine NPM registry metadata

You can manually examine NPM registry metadata by using the curl command. It is strongly recommended to use a JSON pretty printer (like json_pp or jq) so that the results will be more readable.

To examine the metadata for the popular Node.js package chalk, you would run the following command:

% curl https://registry.npmjs.org/chalk

To make the output easier to read, pipe it to json_pp, like so:

% curl https://registry.npmjs.org/chalk | json_pp

The JSON formatted metadata returned is known as the “full” package metadata, described in the official NPM registry documentation.

Some useful fields in the metadata include:

  • Name
  • Versions
  • Authors
  • Maintainers
  • License
  • Checksums
  • URL for the package file itself
  • Creation and updated times
  • And much, much more.
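
To make the fields above more concrete, here is a minimal Python sketch that pulls a few of them out of a full metadata document. The sample document is hand-written and heavily trimmed, the package name, maintainer, and URLs in it are made up for illustration, and the summarize helper is hypothetical rather than part of any npm tooling.

```python
import json

# Hand-written, trimmed sample in the shape of a "full" metadata document.
# Every value here is illustrative; it is not real registry data.
SAMPLE_FULL_METADATA = {
    "name": "example-pkg",
    "dist-tags": {"latest": "1.1.0"},
    "license": "MIT",
    "maintainers": [{"name": "alice", "email": "alice@example.com"}],
    "time": {
        "created": "2018-01-01T00:00:00.000Z",
        "modified": "2018-01-15T00:00:00.000Z",
    },
    "versions": {
        "1.0.0": {
            "name": "example-pkg",
            "version": "1.0.0",
            "dist": {
                "shasum": "0" * 40,
                "tarball": "https://registry.example.com/example-pkg/-/example-pkg-1.0.0.tgz",
            },
        },
        "1.1.0": {
            "name": "example-pkg",
            "version": "1.1.0",
            "dist": {
                "shasum": "1" * 40,
                "tarball": "https://registry.example.com/example-pkg/-/example-pkg-1.1.0.tgz",
            },
        },
    },
}


def summarize(metadata):
    """Collect a few commonly inspected fields from a full metadata document."""
    versions = metadata.get("versions", {})
    times = metadata.get("time", {})
    return {
        "name": metadata.get("name"),
        "license": metadata.get("license"),
        "maintainers": [m.get("name") for m in metadata.get("maintainers", [])],
        "versions": list(versions),
        "created": times.get("created"),
        "modified": times.get("modified"),
        # Per-version package file URL and checksum live under "dist".
        "tarballs": {v: info.get("dist", {}).get("tarball") for v, info in versions.items()},
        "checksums": {v: info.get("dist", {}).get("shasum") for v, info in versions.items()},
    }


if __name__ == "__main__":
    print(json.dumps(summarize(SAMPLE_FULL_METADATA), indent=2))
```

To try this against real data, you could load the JSON returned by the curl command above instead of the sample dictionary.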

This metadata can be quite large as it contains a lot of information. Luckily, NPM also provides a second set of metadata for use by npm install that contains fewer fields and is known as “abbreviated” metadata.

Abbreviated NPM metadata

The abbreviated metadata generated by an NPM registry includes fewer fields than the full metadata and is aimed at providing just enough data so that npm install commands will work.

You can request the abbreviated metadata from an NPM registry by specifying an HTTP Accept header that is set to application/vnd.npm.install-v1+json.

In order to do this with curl, you would issue the following command:

% curl -H "Accept: application/vnd.npm.install-v1+json" https://registry.npmjs.org/chalk

The resulting metadata returned will be significantly shorter, but will provide enough information for npm install to install the specified package version and any required dependencies.

The abbreviated metadata format is described in the official NPM registry documentation.
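
If you would rather script this than use curl, the same request can be sketched in Python with only the standard library. The build_metadata_request helper is a hypothetical name introduced for this example. It only constructs the request and sends nothing until you pass it to urlopen, and it assumes an unscoped package name.

```python
import urllib.request

# Accept header value that asks the registry for abbreviated metadata.
ABBREVIATED_ACCEPT = "application/vnd.npm.install-v1+json"


def build_metadata_request(registry, package, abbreviated=False):
    """Build (but do not send) a metadata request for an unscoped package."""
    url = registry.rstrip("/") + "/" + package
    request = urllib.request.Request(url)
    if abbreviated:
        request.add_header("Accept", ABBREVIATED_ACCEPT)
    return request


if __name__ == "__main__":
    req = build_metadata_request("https://registry.npmjs.org", "chalk", abbreviated=True)
    print(req.full_url)
    print(req.get_header("Accept"))
    # To actually fetch the document (requires network access):
    #   import json
    #   with urllib.request.urlopen(req) as response:
    #       metadata = json.load(response)
```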

Scoped Node.js package metadata

NPM provides a way to namespace packages under a common identifier called a scope. These packages are called scoped packages.

Scoped packages start with an ‘@’ and have the scope and package name separated by a ‘/’. So, a package with scope “scope” and name “example-pkg” would be written as: @scope/example-pkg.

In order to retrieve metadata about packages named this way, requests for metadata against an NPM registry must ensure the ‘/’ in the name is URL encoded as %2F. For example, in order to examine the metadata for @elastic/eui you would issue the following command to request the full metadata:

% curl https://registry.npmjs.org/@elastic%2Feui

Similarly, if you’d like to request the abbreviated metadata for a scoped Node.js package, you’d issue the same request, but with an HTTP Accept header set to application/vnd.npm.install-v1+json:

% curl -H "Accept: application/vnd.npm.install-v1+json" https://registry.npmjs.org/@elastic%2Feui
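
The encoding rule for scoped names is easy to get wrong by hand, so here is a small Python sketch of it. The metadata_url helper is a hypothetical name. It keeps the leading ‘@’ as-is and percent-encodes the ‘/’, matching the URLs shown above.

```python
from urllib.parse import quote


def metadata_url(registry, package_name):
    """Return the metadata URL for a package, encoding '/' in scoped names."""
    # safe="@" keeps the scope marker literal; '/' is not in the safe set,
    # so it is percent-encoded as %2F.
    encoded_name = quote(package_name, safe="@")
    return registry.rstrip("/") + "/" + encoded_name


if __name__ == "__main__":
    print(metadata_url("https://registry.npmjs.org", "@elastic/eui"))
    # -> https://registry.npmjs.org/@elastic%2Feui
    print(metadata_url("https://registry.npmjs.org", "chalk"))
    # -> https://registry.npmjs.org/chalk
```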

Core NPM workflow and APIs

There are several APIs that an NPM registry may provide to ensure the user can take advantage of all the workflow features of NPM via the npm command line client. We’ll examine a few that are particularly interesting and play a key role in most NPM workflows:

Package publish

The package publish API is typically used by running npm publish on the command line in the directory of the package to be published to the remote NPM registry.

This API endpoint is accessed with the PUT HTTP verb against the NPM registry. The request body includes the package file itself (base64 encoded), as well as JSON metadata describing the package.

The NPM registry receiving this request will process the data and generate the metadata described above for use by npm install.
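
As a rough illustration of what such a request body might look like, the Python sketch below assembles one for an unscoped package. The exact field layout (_id, _attachments, content_type, and so on) is an assumption modeled on what the npm client is commonly observed to send, not something this post specifies, so check it against the registry you are targeting. The build_publish_body helper and the registry URL are hypothetical.

```python
import base64
import hashlib
import json


def build_publish_body(name, version, tarball_bytes, registry):
    """Assemble a publish request body for an unscoped package (a sketch)."""
    tarball_name = f"{name}-{version}.tgz"
    # Per-version metadata, including where the package file will live
    # and a SHA-1 checksum of its contents.
    manifest = {
        "name": name,
        "version": version,
        "dist": {
            "shasum": hashlib.sha1(tarball_bytes).hexdigest(),
            "tarball": f"{registry.rstrip('/')}/{name}/-/{tarball_name}",
        },
    }
    return {
        "_id": name,
        "name": name,
        "dist-tags": {"latest": version},
        "versions": {version: manifest},
        # The package file itself travels inside the JSON body, base64 encoded.
        "_attachments": {
            tarball_name: {
                "content_type": "application/octet-stream",
                "data": base64.b64encode(tarball_bytes).decode("ascii"),
                "length": len(tarball_bytes),
            }
        },
    }


if __name__ == "__main__":
    body = build_publish_body(
        "example-pkg", "1.0.0", b"not a real tarball", "https://registry.example.com"
    )
    print(json.dumps(body, indent=2))
```

A real client would send this body with the PUT verb along with authentication, which this sketch leaves out.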

Package unpublish

The package unpublish API is typically used by running npm unpublish on the command line.

If you are removing a publicly accessible package from the official NPM registry, it is recommended that you use the deprecate API (described below) instead. Unpublishing a package removes it from the registry, thereby breaking any other package which depends on the version that was unpublished. For private repositories where dependencies are closely monitored, this may pose less of an issue.

This API endpoint is accessed with the PUT HTTP verb against the NPM registry. The request contains the package metadata with the version to be unpublished removed. To unpublish all versions, the HTTP verb DELETE is used instead.

Note that on the official NPM registry, npm unpublish is time-limited: at the time of writing, it is only available during the first 24 hours after a package is published.
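
The metadata edit behind a single-version unpublish can be sketched in Python as follows. This is a simplification under stated assumptions: the remove_version helper is hypothetical, it drops any dist tag that pointed at the removed version rather than repointing it, and it does not perform the PUT request itself.

```python
import copy


def remove_version(metadata, version):
    """Return a copy of the package metadata with one version removed."""
    updated = copy.deepcopy(metadata)
    updated.get("versions", {}).pop(version, None)
    # Assumption: per-version timestamps are keyed by version under "time".
    updated.get("time", {}).pop(version, None)
    # Drop dist tags that still point at the removed version.
    dist_tags = updated.get("dist-tags", {})
    for tag in [t for t, v in dist_tags.items() if v == version]:
        del dist_tags[tag]
    return updated


if __name__ == "__main__":
    metadata = {
        "name": "example-pkg",
        "dist-tags": {"latest": "1.0.0", "beta": "2.0.0-beta.1"},
        "versions": {"1.0.0": {}, "2.0.0-beta.1": {}},
        "time": {"1.0.0": "2018-01-01T00:00:00.000Z"},
    }
    print(remove_version(metadata, "2.0.0-beta.1"))
```

The original dictionary is left untouched, so the result can be compared against it before anything is sent to the registry.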

Package deprecation

The package deprecate API is typically used by running npm deprecate on the command line.

This is preferable to unpublish as it leaves the package available for anything that depends on it, but informs the user that a newer version should be used instead.

This API endpoint is accessed with the PUT HTTP verb against the NPM registry. The request body contains the package metadata with a ‘deprecated’ field added to each version being deprecated.
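
Here is a minimal Python sketch of that metadata change. It assumes the deprecation message is stored in a field named deprecated on each affected version, which is how deprecation messages appear in version metadata as far as we know. The deprecate_versions helper is hypothetical and does not send the PUT request.

```python
import copy


def deprecate_versions(metadata, versions, message):
    """Return a copy of the metadata with a deprecation message on some versions."""
    updated = copy.deepcopy(metadata)
    for version in versions:
        # Assumption: the message is stored per version under "deprecated".
        updated["versions"][version]["deprecated"] = message
    return updated


if __name__ == "__main__":
    metadata = {
        "name": "example-pkg",
        "versions": {
            "1.0.0": {"version": "1.0.0"},
            "2.0.0": {"version": "2.0.0"},
        },
    }
    updated = deprecate_versions(metadata, ["1.0.0"], "Please upgrade to 2.0.0")
    print(updated["versions"]["1.0.0"])
```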

Distribution tags (dist tags)

The NPM distribution tags (dist tags) API is used in two ways:

  1. Automatically when publishing a package. By default, the most recently published version is marked with the dist tag “latest.”
  2. Manually with npm dist-tag on the command line. Distribution tags can be added, removed, or listed using the npm dist-tag command.

Dist tags allow package authors to associate strings with particular package versions. One common use case for this is associating the string ‘beta’ with a beta version of a package.

Once a dist tag has been associated with a version, that dist tag may be used during npm install. For example, in order to install a package named ‘example’ at the version marked with the ‘beta’ dist tag, you would run:

% npm install example@beta

The NPM command line client, internally, will request the package metadata. The package metadata contains the dist tags and the versions mapped to them, so the NPM client will look up the version associated with the dist tag you’ve requested in order to install the desired version.

When you run npm install package, the dist tag latest is implied.
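
The lookup described above can be sketched in a few lines of Python. This is not how the npm client is implemented. It is a simplified illustration that handles only dist tags and exact versions (no semver ranges), and the split_spec and resolve_version helpers are hypothetical names.

```python
def split_spec(spec):
    """Split 'name@tag' into (name, tag), defaulting the tag to 'latest'."""
    name, separator, tag = spec.rpartition("@")
    # No '@' at all, or only the leading '@' of a scoped name: no tag given.
    if not separator or not name:
        return spec, "latest"
    return name, tag


def resolve_version(metadata, tag_or_version="latest"):
    """Map a dist tag (or an exact version) to a version in the metadata."""
    dist_tags = metadata.get("dist-tags", {})
    versions = metadata.get("versions", {})
    if tag_or_version in dist_tags:
        return dist_tags[tag_or_version]
    if tag_or_version in versions:
        return tag_or_version
    raise KeyError(f"no dist tag or version named {tag_or_version!r}")


if __name__ == "__main__":
    metadata = {
        "name": "example",
        "dist-tags": {"latest": "1.2.0", "beta": "2.0.0-beta.1"},
        "versions": {"1.2.0": {}, "2.0.0-beta.1": {}},
    }
    name, tag = split_spec("example@beta")
    print(name, resolve_version(metadata, tag))  # example 2.0.0-beta.1
    name, tag = split_spec("example")
    print(name, resolve_version(metadata, tag))  # example 1.2.0
```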

Conclusion

An NPM registry is a combination of JSON metadata and API endpoints that, together, provide a rich workflow for developers to create, update, and manipulate Node.js packages.
