Document module loading
parent 35e32225d1
commit ff39d413a3
@ -21,13 +21,13 @@ one-to-one correspondence. As an example, `foo.js` loads the module

 The contents of `foo.js`:

-    var circle = require('./circle');
+    var circle = require('./circle.js');
     console.log( 'The area of a circle of radius 4 is '
                  + circle.area(4));

 The contents of `circle.js`:

-    var PI = 3.14;
+    var PI = Math.PI;

     exports.area = function (r) {
       return PI * r * r;
@ -39,78 +39,285 @@ The contents of `circle.js`:

 The module `circle.js` has exported the functions `area()` and
 `circumference()`. To export an object, add to the special `exports`
-object. (Alternatively, one can use `this` instead of `exports`.) Variables
-local to the module will be private. In this example the variable `PI` is
-private to `circle.js`. The function `puts()` comes from the module `'util'`,
-which is a built-in module. Modules which are not prefixed by `'./'` are
-built-in modules--more about this later.
+object.

-### Module Resolving
+Variables
+local to the module will be private. In this example the variable `PI` is
+private to `circle.js`.
+
+### Core Modules
+
+Node has several modules compiled into the binary. These modules are
+described in greater detail elsewhere in this documentation.
+
+The core modules are defined in node's source in the `lib/` folder.
+
+Core modules are always preferentially loaded if their identifier is
+passed to `require()`. For instance, `require('http')` will always
+return the built-in HTTP module, even if there is a file by that name.
+
+### File Modules
+
+If the exact filename is not found, then node will attempt to load the
+required filename with the added extension of `.js`, and then `.node`.
+
+`.js` files are interpreted as JavaScript text files, and `.node` files
+are interpreted as compiled addon modules loaded with `dlopen`.
+
+A module prefixed with `'/'` is an absolute path to the file. For
+example, `require('/home/marco/foo.js')` will load the file at
+`/home/marco/foo.js`.

 A module prefixed with `'./'` is relative to the file calling `require()`.
 That is, `circle.js` must be in the same directory as `foo.js` for
 `require('./circle')` to find it.

-Without the leading `'./'`, like `require('assert')` the module is searched
-for in the `require.paths` array. `require.paths` on my system looks like
-this:
+Without a leading `'/'` or `'./'` to indicate a file, the module is either a
+"core module" or is loaded from a `node_modules` folder.

-`[ '/home/ryan/.node_modules' ]`
+### Loading from `node_modules` Folders

-That is, when `require('foo')` is called Node looks for:
+If the module identifier passed to `require()` is not a native module,
+and does not begin with `'/'`, `'../'`, or `'./'`, then node starts at the
+parent directory of the current module, and adds `/node_modules`, and
+attempts to load the module from that location.

-* 1: `/home/ryan/.node_modules/foo`
-* 2: `/home/ryan/.node_modules/foo.js`
-* 3: `/home/ryan/.node_modules/foo.node`
-* 4: `/home/ryan/.node_modules/foo/index.js`
-* 5: `/home/ryan/.node_modules/foo/index.node`
+If it is not found there, then it moves to the parent directory, and so
+on, until either the module is found, or the root of the tree is
+reached.

-interrupting once a file is found. Files ending in `'.node'` are binary Addon
-Modules; see 'Addons' below. `'index.js'` allows one to package a module as
-a directory.
+For example, if the file at `'/home/ry/projects/foo.js'` called
+`require('bar.js')`, then node would look in the following locations, in
+this order:

-Additionally, a `package.json` file may be used to treat a folder as a
-module, if it specifies a `'main'` field. For example, if the file at
-`./foo/bar/package.json` contained this data:
+* `/home/ry/projects/node_modules/bar.js`
+* `/home/ry/node_modules/bar.js`
+* `/home/node_modules/bar.js`
+* `/node_modules/bar.js`
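
The walk up the directory tree described above can be sketched in a few lines of JavaScript. This is only an illustration, not node's actual implementation: the helper name `candidatePaths` is hypothetical, POSIX-style paths are assumed, and no lookup optimizations are applied.

```javascript
const path = require('path');

// Hypothetical helper: list the node_modules locations that would be tried
// for `request`, starting next to the calling file and moving up to the root.
function candidatePaths(callerFile, request) {
  const results = [];
  let dir = path.posix.dirname(callerFile);
  while (true) {
    results.push(path.posix.join(dir, 'node_modules', request));
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // dirname('/') is '/', so the root was reached
    dir = parent;
  }
  return results;
}

console.log(candidatePaths('/home/ry/projects/foo.js', 'bar.js'));
```

For the file and request used in the example, the sketch yields the same four locations in the same order.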

-    { "name" : "bar",
-      "version" : "1.2.3",
-      "main" : "./lib/bar.js" }
+This allows programs to localize their dependencies, so that they do not
+clash.

-then `require('./foo/bar')` would load the file at
-`'./foo/bar/lib/bar.js'`. This allows package authors to specify an
-entry point to their module, while structuring their package how it
-suits them.
+#### Optimizations to the `node_modules` Lookup Process

-Any folders named `"node_modules"` that exist in the current module path
-will also be appended to the effective require path. This allows for
-bundling libraries and other dependencies in a 'node_modules' folder at
-the root of a program.
+When there are many levels of nested dependencies, it is possible for
+these file trees to get fairly long. The following optimizations are thus
+made to the process.

-To avoid overly long lookup paths in the case of nested packages,
-the following 2 optimizations are made:
+First, `/node_modules` is never appended to a folder already ending in
+`/node_modules`.

-1. If the module calling `require()` is already within a `node_modules`
-   folder, then the lookup will not go above the top-most `node_modules`
-   directory.
-2. Node will not append `node_modules` to a path already ending in
-   `node_modules`.
+Second, if the file calling `require()` is already inside a `node_modules`
+hierarchy, then the top-most `node_modules` folder is treated as the
+root of the search tree.

-So, for example, if the file at
-`/usr/lib/node_modules/foo/node_modules/bar.js` were to do
-`require('baz')`, then the following places would be searched for a
-`baz` module, in this order:
+For example, if the file at
+`'/home/ry/projects/foo/node_modules/bar/node_modules/baz/quux.js'`
+called `require('asdf.js')`, then node would search the following
+locations:

-* 1: `/usr/lib/node_modules/foo/node_modules`
-* 2: `/usr/lib/node_modules`
+* `/home/ry/projects/foo/node_modules/bar/node_modules/baz/node_modules/asdf.js`
+* `/home/ry/projects/foo/node_modules/bar/node_modules/asdf.js`
+* `/home/ry/projects/foo/node_modules/asdf.js`
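
A rough sketch of the two optimizations described above, written to reproduce the example list rather than to mirror node's source. The helper name `optimizedPaths` is hypothetical, and POSIX-style absolute paths are assumed.

```javascript
const path = require('path');

// Hypothetical helper: node_modules folders searched on behalf of `callerFile`,
// with both optimizations from the text applied.
function optimizedPaths(callerFile) {
  const parts = path.posix.dirname(callerFile).split('/');
  // Second optimization: the top-most node_modules folder is the search root,
  // so stop at the folder that contains it.
  const top = parts.indexOf('node_modules');
  const stop = top === -1 ? 0 : top - 1;
  const dirs = [];
  for (let i = parts.length - 1; i >= stop; i--) {
    // First optimization: never append node_modules to a node_modules folder.
    if (parts[i] === 'node_modules') continue;
    dirs.push(parts.slice(0, i + 1).concat('node_modules').join('/'));
  }
  return dirs;
}

const caller = '/home/ry/projects/foo/node_modules/bar/node_modules/baz/quux.js';
console.log(optimizedPaths(caller).map((dir) => dir + '/asdf.js'));
```

With the `quux.js` caller from the example, only three folders remain; nothing above `/home/ry/projects/foo/node_modules` is searched.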

-`require.paths` can be modified at runtime by simply unshifting new
-paths onto it, or at startup with the `NODE_PATH` environmental
-variable (which should be a list of paths, colon separated).
+### Folders as Modules

-The second time `require('foo')` is called, it is not loaded again from
-disk. It looks in the `require.cache` object to see if it has been loaded
-before.
+It is convenient to organize programs and libraries into self-contained
+directories, and then provide a single entry point to that library.
+There are three ways in which a folder may be passed to `require()` as
+an argument.
+
+The first is to create a `package.json` file in the root of the folder,
+which specifies a `main` module. An example package.json file might
+look like this:
+
+    { "name" : "some-library",
+      "main" : "./lib/some-library.js" }
+
+If this were in a folder at `./some-library`, then
+`require('./some-library')` would attempt to load
+`./some-library/lib/some-library.js`.
+
+This is the extent of Node's awareness of package.json files.
+
+If there is no package.json file present in the directory, then node
+will attempt to load an `index.js` or `index.node` file out of that
+directory. For example, if there was no package.json file in the above
+example, then `require('./some-library')` would attempt to load:
+
+* `./some-library/index.js`
+* `./some-library/index.node`
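
To see the folder rules in action, the sketch below builds a throwaway `some-library` folder in the system temp directory and requires it. The temp-directory layout and the `loadedFrom` property are made up for the demonstration, and a reasonably recent Node is assumed (for `fs.mkdtempSync` and the `recursive` option).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a scratch folder that mirrors the example: a package.json with a
// "main" field, the file it points to, and a competing index.js.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'folder-module-'));
const lib = path.join(root, 'some-library');
fs.mkdirSync(path.join(lib, 'lib'), { recursive: true });
fs.writeFileSync(path.join(lib, 'package.json'),
  JSON.stringify({ name: 'some-library', main: './lib/some-library.js' }));
fs.writeFileSync(path.join(lib, 'lib', 'some-library.js'),
  "exports.loadedFrom = 'main';\n");
fs.writeFileSync(path.join(lib, 'index.js'),
  "exports.loadedFrom = 'index';\n");

// Requiring the folder follows the "main" field; index.js is only a fallback.
const someLibrary = require(lib);
console.log(someLibrary.loadedFrom);
```

Deleting the `package.json` file from the scratch folder before the `require()` call would make the same call load `index.js` instead.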
+
+### Caching
+
+Modules are cached after the first time they are loaded. This means
+(among other things) that every call to `require('foo')` will get
+exactly the same object returned, if it would resolve to the same file.
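
A small way to observe the cache, assuming a scratch module written to the system temp directory. The file name `counter.js` and the global `counterLoads` are invented for this sketch.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway module that records how many times its body is evaluated.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(file,
  'global.counterLoads = (global.counterLoads || 0) + 1;\n' +
  'exports.hits = 0;\n');

const first = require(file);
const second = require(file);
first.hits += 1;

console.log(first === second);     // true: both names refer to one object
console.log(second.hits);          // 1: the change made via `first` is visible
console.log(global.counterLoads);  // 1: the module body ran only once
```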
+
+### All Together...

 To get the exact filename that will be loaded when `require()` is called, use
 the `require.resolve()` function.
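
For instance, `require.resolve()` can show which extension was picked for a file module. The sketch assumes a scratch `circle.js` written to the system temp directory; the directory prefix is arbitrary.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// realpathSync is used because the temp directory may itself be a symlink.
const dir = fs.realpathSync(
  fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-demo-')));
fs.writeFileSync(path.join(dir, 'circle.js'),
  'exports.area = function (r) { return Math.PI * r * r; };\n');

// The request omits the extension; the resolved filename includes it.
const resolved = require.resolve(path.join(dir, 'circle'));
console.log(resolved);

// For a core module, the identifier itself is returned.
console.log(require.resolve('http'));
```

Nothing is loaded or executed by `require.resolve()`; it only reports the filename.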
+
+Putting together all of the above, here is the high-level algorithm
+in pseudocode of what `require.resolve` does:
+
+    require(X) from module at path Y
+    1. If X is a core module,
+       a. return the core module
+       b. STOP
+    2. If X begins with `./`, `../`, or `/`,
+       a. LOAD_AS_FILE(Y + X)
+       b. LOAD_AS_DIRECTORY(Y + X)
+    3. LOAD_NODE_MODULES(X, dirname(Y))
+    4. THROW "not found"
+
+    LOAD_AS_FILE(X)
+    1. If X is a file, load X as JavaScript text. STOP
+    2. If X.js is a file, load X.js as JavaScript text. STOP
+    3. If X.node is a file, load X.node as binary addon. STOP
+
+    LOAD_AS_DIRECTORY(X)
+    1. If X/package.json is a file,
+       a. Parse X/package.json, and look for "main" field.
+       b. let M = X + (json main field)
+       c. LOAD_AS_FILE(M)
+    2. LOAD_AS_FILE(X/index)
+
+    LOAD_NODE_MODULES(X, START)
+    1. let DIRS=NODE_MODULES_PATHS(START)
+    2. for each DIR in DIRS:
+       a. LOAD_AS_FILE(DIR/X)
+       b. LOAD_AS_DIRECTORY(DIR/X)
+
+    NODE_MODULES_PATHS(START)
+    1. let PARTS = path split(START)
+    2. let ROOT = index of first instance of "node_modules" in PARTS, or 0
+    3. let I = count of PARTS - 1
+    4. let DIRS = []
+    5. while I > ROOT,
+       a. if PARTS[I] = "node_modules", let I = I - 1 and CONTINUE
+       b. DIR = path join(PARTS[0 .. I] + "node_modules")
+       c. DIRS = DIRS + DIR
+       d. let I = I - 1
+    6. return DIRS
+
+### Loading from the `require.paths` Folders
+
+In node, `require.paths` is an array of strings that represent paths to
+be searched for modules when they are not prefixed with `'/'`, `'./'`, or
+`'../'`. For example, if `require.paths` were set to:
+
+    [ '/home/micheil/.node_modules',
+      '/usr/local/lib/node_modules' ]
+
+Then calling `require('bar/baz.js')` would search the following
+locations:
+
+* 1: `'/home/micheil/.node_modules/bar/baz.js'`
+* 2: `'/usr/local/lib/node_modules/bar/baz.js'`
+
+The `require.paths` array can be mutated at run time to alter this
+behavior.
+
+It is set initially from the `NODE_PATH` environment variable, which is
+a colon-delimited list of absolute paths. In the previous example,
+the `NODE_PATH` environment variable might have been set to:
+
+    /home/micheil/.node_modules:/usr/local/lib/node_modules
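
The mapping from a `NODE_PATH` value to search locations can be sketched without touching `require.paths` itself, which the text warns may disappear. The helper name `nodePathCandidates` is hypothetical, and the colon separator and POSIX paths follow the example above.

```javascript
const path = require('path');

// Hypothetical helper: expand a NODE_PATH-style string into the locations
// that would be tried, in order, for `request`.
function nodePathCandidates(nodePath, request) {
  return nodePath
    .split(':')               // colon-delimited, as described in the text
    .filter(Boolean)          // ignore empty entries such as a trailing colon
    .map((dir) => path.posix.join(dir, request));
}

console.log(nodePathCandidates(
  '/home/micheil/.node_modules:/usr/local/lib/node_modules',
  'bar/baz.js'));
```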
+
+#### **Note:** Please Avoid Modifying `require.paths`
+
+For compatibility reasons, `require.paths` is still given first priority
+in the module lookup process. However, it may disappear in a future
+release.
+
+While it seemed like a good idea at the time, and enabled a lot of
+useful experimentation, in practice a mutable `require.paths` list is
+often a troublesome source of confusion and headaches.
+
+##### Setting `require.paths` to some other value does nothing.
+
+This does not do what one might expect:
+
+    require.paths = [ '/usr/lib/node' ];
+
+All that does is lose the reference to the *actual* node module lookup
+paths, and create a new reference to some other thing that isn't used
+for anything.
+
+##### Putting relative paths in `require.paths` is... weird.
+
+If you do this:
+
+    require.paths.push('./lib');
+
+then it does *not* add the full resolved path to where `./lib`
+is on the filesystem. Instead, it literally adds `'./lib'`,
+meaning that if you do `require('y.js')` in `/a/b/x.js`, then it'll look
+in `/a/b/lib/y.js`. If you then did `require('y.js')` in
+`/l/m/n/o/p.js`, then it'd look in `/l/m/n/o/lib/y.js`.
+
+In practice, people have used this as an ad hoc way to bundle
+dependencies, but this technique is brittle.
+
+##### Zero Isolation
+
+There is (by regrettable design) only one `require.paths` array used by
+all modules.
+
+As a result, if one node program comes to rely on this behavior, it may
+permanently and subtly alter the behavior of all other node programs in
+the same process. As the application stack grows, we tend to assemble
+functionality, and it is a problem when those parts interact in ways
+that are difficult to predict.
+
+## Addenda: Package Manager Tips
+
+If you were to build a package manager, the tools above provide you with
+all you need to very elegantly set up modules in a folder structure such
+that they get the required dependencies and do not conflict with one
+another.
+
+Let's say that we wanted to have the folder at
+`/usr/lib/<some-program>/<some-version>` hold the contents of a specific
+version of a package.
+
+Packages can depend on one another. So, in order to install
+package `foo`, you may have to install a specific version of package `bar`.
+The `bar` package may itself have dependencies, and in some cases, these
+dependencies may even collide or form cycles.
+
+Since Node looks up the `realpath` of any modules it loads, and then
+looks for their dependencies in the `node_modules` folders as described
+above, this situation is very simple to resolve with the following
+architecture:
+
+* `/usr/lib/foo/1.2.3/` - Contents of the `foo` package, version 1.2.3.
+* `/usr/lib/bar/4.3.2/` - Contents of the `bar` package that `foo`
+  depends on.
+* `/usr/lib/foo/1.2.3/node_modules/bar` - Symbolic link to
+  `/usr/lib/bar/4.3.2/`.
+* `/usr/lib/bar/4.3.2/node_modules/*` - Symbolic links to the packages
+  that `bar` depends on.
+
+Thus, even if a cycle is encountered, or if there are dependency
+conflicts, every module will be able to get a version of its dependency
+that it can use.
+
+When the code in the `foo` package does `require('bar')`, it will get
+the version that is symlinked into
+`/usr/lib/foo/1.2.3/node_modules/bar`. Then, when the code in the `bar`
+package calls `require('quux')`, it'll get the version that is symlinked
+into `/usr/lib/bar/4.3.2/node_modules/quux`.
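
The layout above can be tried out on a small scale. The sketch below recreates the `foo` and `bar` part of it under the system temp directory instead of `/usr/lib`; the `index.js` contents and the `barVersion` property are invented for the demonstration, and it assumes a platform where the process is allowed to create symbolic links.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch stand-ins for /usr/lib/foo/1.2.3 and /usr/lib/bar/4.3.2.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'pkg-layout-'));
const fooDir = path.join(root, 'foo', '1.2.3');
const barDir = path.join(root, 'bar', '4.3.2');
fs.mkdirSync(path.join(fooDir, 'node_modules'), { recursive: true });
fs.mkdirSync(barDir, { recursive: true });

fs.writeFileSync(path.join(barDir, 'index.js'),
  "exports.version = '4.3.2';\n");
fs.writeFileSync(path.join(fooDir, 'index.js'),
  "exports.barVersion = require('bar').version;\n");

// foo/1.2.3/node_modules/bar is a symbolic link to bar/4.3.2.
fs.symlinkSync(barDir, path.join(fooDir, 'node_modules', 'bar'), 'dir');

// foo finds its own copy of bar through its node_modules folder.
const foo = require(fooDir);
console.log(foo.barVersion);
```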
+
+Furthermore, to make the module lookup process even more optimal, rather
+than putting packages directly in `/usr/lib`, we could put them in
+`/usr/lib/node_modules/<name>/<version>`. Then node will not bother
+looking for missing dependencies in `/usr/node_modules` or
+`/node_modules`.
+
+In order to make modules available to the node repl, it might be useful
+to also add the `/usr/lib/node_modules` folder to the `NODE_PATH`
+environment variable. Since the module lookups using `node_modules`
+folders are all relative, and based on the real path of the files
+making the calls to `require()`, the packages themselves can be anywhere.