Getting started with node.js

A brief introduction

Craig R Webster

Senior Software Engineer, Agency.com

Hi, I'm Craig and I'm a senior software engineer at Agency.com. Mostly I look after the Rails projects there, but when Rails is your hammer everything looks like a thumb. Rails is nice but it isn't always the right tool for the job. It's good to have a decent idea of the technologies available outside the Rails and Ruby world.

I've been using Javascript to support the work that I do for the past 8 or so years, but I started to get really interested in what Javascript could do about three years ago when I saw Rails spit out some mangled code to make a link POST some data... I basically thought "this is horrible, there has to be a better way". Through that I got into unobtrusive enhancement using Javascript and from there into reverse-http for more complex communication than would normally be available to a web browser. I'd seen a few applications use Javascript in environments outside the browser but hadn't really played with it.

Recently I've been hearing whispers about node.js being the new hotness and figured I'd take a look. I had originally assumed it was something like reverse-http. Totally wrong.

I've been using node.js for a few weeks now for toy projects and I'd like to show you how I got started and what I found out - a very brief introduction.

What is node.js?

This talk is about node.js so a description of node.js would be handy. Unfortunately there's no one-line slogan on the nodejs.org site and the documentation is a little unclear about exactly what node.js is, preferring instead to talk about what it allows us to do, and giving examples of how to do that. After a little digging I found out that it's a Javascript framework geared towards building servers. I guess it's a little like Erlang/OTP's gen_server behavior... just without process supervision and hot code swapping.

It's pretty simple to learn and use. The documentation is easily read and generally understandable in less than 30 minutes.

Node can be extended pretty easily too. By moving towards the CommonJS specification for modules, node.js allows code developed for node.js applications to be re-used elsewhere - and makes it easier to bring code written elsewhere into node.js projects.

One node process can easily handle many hundreds or thousands of connections because it uses evented I/O.

Evented I/O

Traditional I/O:

data = file.read("/foo/bar");
// wait...
doSomething(data);

Evented I/O:

file.read("/foo/bar", function(data) {
  // called after data is read
  doSomething(data);
});
otherThing(); // executes immediately

In case you haven't come across evented I/O before, here's the difference between that and regular I/O.

With traditional I/O when you read a file you have to wait for the data to be read before you can do anything else. You ask for the data, you wait for it to be read, you do something with the data.

By using evented I/O you can register callbacks which are called when an event - such as data having been read - has occurred. After registering a callback the program continues without waiting. When an event occurs node.js wakes up and executes the appropriate callback for that event. Callbacks are developer-defined Javascript functions - nice and simple.

It's not impossible to do this with most languages, but it can get pretty complex and CPU or memory intensive. Node.js uses the operating system's event notification facilities (epoll, kqueue, /dev/poll, or select) together with non-blocking I/O calls, so it doesn't need a bunch of native or green threads to avoid blocking. You don't have to worry about handling all that stuff.
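To make the ordering concrete, here is a minimal sketch of an evented read. It assumes a current Node.js release and its `fs.readFile` callback API, not the `file.read` pseudocode from the slide; the scratch file and the `readWithCallback` name are hypothetical, made up for the illustration.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Records the order in which things happen around one evented read.
function readWithCallback() {
  return new Promise((resolve, reject) => {
    const events = [];

    // Setup only: a hypothetical scratch file standing in for "/foo/bar".
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), "evented-"));
    const file = path.join(dir, "bar");
    fs.writeFileSync(file, "some data");

    fs.readFile(file, "utf8", (err, data) => {
      // Called later, once the read has finished.
      fs.rmSync(dir, { recursive: true, force: true });
      if (err) return reject(err);
      events.push("callback got: " + data);
      resolve(events);
    });

    // Runs straight away, before the callback above.
    events.push("otherThing ran");
  });
}

readWithCallback().then((events) => console.log(events.join("\n")));
```

Running it prints "otherThing ran" first and "callback got: some data" second, because the callback is only invoked after the code that registered it has finished.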

In node.js most of the I/O operations return something called a promise which you then define your callbacks on.

Promises, promises

Naive rename and delete:

posix.rename("/tmp/hello", "/tmp/world");
posix.unlink("/tmp/world");

Might fail - no guarantee which operation will complete first.

Instead:

var promise = posix.rename("/tmp/hello", "/tmp/world");
promise.addCallback(function() {
  posix.unlink("/tmp/world");
});

Since most operations are non-blocking there's no guarantee which operation will complete first, so the first example here may fail because the unlink runs before the rename operation has completed.

rename returns a promise though, and we can set the callback on that to unlink the file once the rename has completed - whenever that may be.

In general a promise is an object that will do something when the operation it was created from has completed or failed - a promise to do something at some point later.
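The same rename-then-delete sequencing can be sketched against a current Node.js release. This assumes the `fs/promises` module and the standard `.then()` method in place of the `posix` module and `addCallback` shown on the slide; the temporary directory and the function names are hypothetical.

```javascript
const fsp = require("node:fs/promises");
const os = require("node:os");
const path = require("node:path");

// Delete the file only once the rename has completed.
function renameThenUnlink(from, to) {
  return fsp.rename(from, to).then(() => fsp.unlink(to));
}

// Try it out in a scratch directory and report what is left behind.
async function demoRenameThenUnlink() {
  const dir = await fsp.mkdtemp(path.join(os.tmpdir(), "promises-"));
  const hello = path.join(dir, "hello");
  const world = path.join(dir, "world");
  await fsp.writeFile(hello, "hi");

  await renameThenUnlink(hello, world);

  const remaining = await fsp.readdir(dir);
  await fsp.rmdir(dir);
  return remaining;
}

demoRenameThenUnlink().then((remaining) => {
  console.log("files left:", remaining.length);
});
```

Because the unlink is only started from the rename's callback, the order is fixed no matter how long the rename takes.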

Anyway, evented I/O isn't particularly big news if you've used EventMachine (or Twisted I believe), so what makes node.js worth looking at? Why shouldn't you just use Erlang/OTP, EventMachine, Twisted or whatever? Those tackle a similar set of problems, are well tested and generally well regarded in this area.

Why is it exciting?

Well, mainly it's about Javascript. The others are seen as hard to get into because they're different to what most people are used to working with. For Javascript this isn't a problem since many people already use Javascript day-to-day.

When using Javascript you're used to the idea of closures (in Javascript: anonymous functions) being passed around.

Thanks to Ajax most Javascript developers have at least a basic idea of what programming asynchronously is like, and unlike other languages Javascript isn't plagued by an abundance of libraries that rely on blocking calls.

Javascript training courses are readily available.

There's also very little to worry about in terms of company adoption since Javascript is most likely already used elsewhere in the company. It's a smaller step to adopt Javascript as a server-side language than it is to introduce something like Go or Erlang.

Since it's Javascript we should hopefully have less of a divide between those who work on the client-side and those who work on the server-side. We're all talking the same language now, and while we'll continue to have specialities there's no reason that a client-side developer can't work side-by-side with a server-side developer.

CommonJS Modules

A simple, standardised way of providing Javascript libraries.

Define methods to make available to the world using exports.

exports.sum = function(a, b) {
  return a + b;
}

Import modules using require.

var math = require("./math");
math.sum(1, 2); // => 3

In a previous slide I mentioned that node.js is easily extensible because it makes use of CommonJS. So what's that?

CommonJS is an attempt to expand the standard library provided by the Javascript specification into something that is useful beyond the browser-based world that Javascript is normally confined to. It's been around for about a year and node.js is just one of many places that it's being implemented.

This code example shows how CommonJS can be used. require is a construct used to import Javascript modules. It takes as an argument the module name that you wish to load. sys is a built-in node.js module for printing to standard out ("puts"), printing debugging information (which it does synchronously unlike most other I/O operations), inspection of objects, and external command execution.

There are other built-in modules for handling things like file I/O (the POSIX module), HTTP servers (there's a server and a client implementation in the HTTP module), generic TCP servers (the TCP module), DNS lookups (the DNS module) and several others. It's also possible to write your own according to the CommonJS specification.

To write your own you define properties on the exports object which are then returned from a require() on that file. This slide is a very simple example of how that would work. There are plenty of libraries out there already, and node.js comes with a few built-in.
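Here is a self-contained sketch of that round trip, assuming a current Node.js release running the script as CommonJS. It writes a small hypothetical `math.js` into a temporary directory and then loads it with `require`; the `product` function and the `callCount` variable are invented for the illustration.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Write a tiny module to disk so the example runs as a single file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "commonjs-"));
const modulePath = path.join(dir, "math.js");
fs.writeFileSync(
  modulePath,
  [
    "var callCount = 0; // not exported, so private to the module",
    "exports.sum = function (a, b) { callCount++; return a + b; };",
    "exports.product = function (a, b) { callCount++; return a * b; };",
  ].join("\n")
);

// require() returns the module's exports object.
const math = require(modulePath);

console.log(math.sum(1, 2));       // 3
console.log(math.product(3, 4));   // 12
console.log(typeof math.callCount); // undefined

fs.rmSync(dir, { recursive: true, force: true });
```

Only the properties attached to `exports` are visible to the caller; anything else declared in the file stays inside the module.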

Hello, World!

The canonical first program is pretty easy to implement.

var sys = require('sys');
sys.puts('Hello, World!');

Running it is simple too.

node hello-world.js

Now we've got an idea of how CommonJS works, let's take a look at a node.js program. This is a typical first program which simply prints "Hello, World!" on the screen.

Node.js programs are run pretty much the same as shell scripts.

Hello, Interwebs!

Node.js helps us build servers. Here's a simple HTTP server.

var sys  = require("sys"),
    http = require("http");
http.createServer(function(request, response) {
  var headers = { "Content-Type": "text/plain" };
  response.sendHeader(200, headers);
  response.sendBody("Hello, World!\n");
  response.finish();
}).listen(8000);
sys.puts("> Running at http://127.0.0.1:8000/");

node hello-interwebs.js
> Running at http://127.0.0.1:8000/

Printing to the screen is a bit boring. Node.js is meant to help us build networked programs. Here's how node.js can be used to produce a simple web server. Run the code and you'll get a server listening on port 8000 that'll say "Hello, World!" to anyone that connects. Simple, huh?

Already it can handle silly amounts of connections - I exceeded the number of available outgoing ports when testing it and it didn't break a sweat.
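Current Node.js releases spell the response calls differently from the slide, so here is a hedged sketch of the same server using `response.writeHead` and `response.end`. To keep it self-contained it listens on an OS-assigned port, makes one request to itself and shuts down; the `createHelloServer`, `fetchText` and `demoHelloServer` names are hypothetical helpers, not part of node.js.

```javascript
const http = require("node:http");

// The server itself: the same idea as the slide, in today's API.
function createHelloServer() {
  return http.createServer((request, response) => {
    response.writeHead(200, { "Content-Type": "text/plain" });
    response.end("Hello, World!\n");
  });
}

// Small helper: GET a URL and collect the status code and body.
function fetchText(url) {
  return new Promise((resolve, reject) => {
    // agent: false uses a one-off connection so nothing lingers afterwards.
    http
      .get(url, { agent: false }, (res) => {
        let body = "";
        res.setEncoding("utf8");
        res.on("data", (chunk) => { body += chunk; });
        res.on("end", () => resolve({ status: res.statusCode, body }));
      })
      .on("error", reject);
  });
}

// Start the server, ask it for a page, then stop it again.
async function demoHelloServer() {
  const server = createHelloServer();
  await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve));
  const { port } = server.address();
  const reply = await fetchText(`http://127.0.0.1:${port}/`);
  server.close();
  return reply;
}

demoHelloServer().then((reply) => console.log(reply.status, reply.body));
```

To run it as a long-lived server like the one in the talk, call `createHelloServer().listen(8000)` instead of the demo function.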

Hello, Internets!

Here's a simple TCP server (the echo service).

var tcp  = require("tcp");
tcp.createServer(function(socket) {
  socket.addListener('receive', function(data) {
    socket.send(data);
  });
}).listen(8001);

node hello-internets.js

Networked programs are a little more than just HTTP servers though. There's a service called echo that is meant to just repeat back whatever data you send it. Here's how to implement it in node.js.
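For readers on a current Node.js release, the echo service can be sketched with the `net` module, which is an assumption on my part: the slide uses the `tcp` module and the `receive` event from the version shown in the talk. The `createEchoServer` and `echoOnce` names are hypothetical, and the demo connects to its own server on an OS-assigned port so that it finishes by itself.

```javascript
const net = require("node:net");

// The echo service: write back whatever arrives.
function createEchoServer() {
  return net.createServer((socket) => {
    socket.on("data", (data) => socket.write(data));
  });
}

// Start a server, send it one message and resolve with the reply.
function echoOnce(message) {
  return new Promise((resolve, reject) => {
    const server = createEchoServer();
    server.on("error", reject);
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      const client = net.connect(port, "127.0.0.1", () => client.write(message));
      client.setEncoding("utf8");
      client.on("error", reject);

      let received = "";
      client.on("data", (chunk) => {
        received += chunk;
        // Wait for the whole message in case it arrives in pieces.
        if (received.length >= message.length) {
          client.end();
          server.close();
          resolve(received);
        }
      });
    });
  });
}

echoOnce("Hello, Internets!").then((reply) => console.log(reply));
```

To run it as a standing service like the one in the talk, call `createEchoServer().listen(8001)` and connect with any TCP client.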

We've seen HTTP and TCP servers built easily using built-in modules. These modules can also create clients, and other built-in modules include POSIX (for working with files), Multipart for parsing submitted form-data, DNS for working with names and IP addresses, Assert for writing tests, path for working with filesystem paths, URL for parsing URIs and Query String for doing the same with query-strings. I won't demonstrate all of those, but you can see that there's plenty of helpful things there for building your own services and it's easy to add more.
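As a taste of the parsing helpers, here is a small sketch that assumes a current Node.js release, where the global `URL` class covers both URL and query-string parsing and the `path` module handles filesystem-style paths. The address and the `describeUrl` helper are made up for the illustration.

```javascript
const path = require("node:path");

// Pull a URL apart into the pieces a server usually cares about.
function describeUrl(text) {
  const address = new URL(text);
  return {
    host: address.hostname,
    port: address.port,
    pathname: address.pathname,
    format: address.searchParams.get("format"),
    page: address.searchParams.get("page"),
    // path.posix keeps "/" as the separator on every platform.
    file: path.posix.basename(address.pathname),
    extension: path.posix.extname(address.pathname),
  };
}

const parts = describeUrl("http://127.0.0.1:8000/files/report.txt?format=plain&page=2");
console.log(parts.host, parts.port);       // 127.0.0.1 8000
console.log(parts.pathname);               // /files/report.txt
console.log(parts.format, parts.page);     // plain 2
console.log(parts.file, parts.extension);  // report.txt .txt
```

Note that query-string values always come back as strings, so `page` is "2" and not the number 2.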

Installing

Well, that's the brief run-down. Hopefully you're thinking that it seems pretty simple to get started building network services with node. If so, installing node.js takes just a couple of minutes. Clone the repository, do the traditional compilation dance and you're good to go. All dependencies are included in the repository. Please do let me know how you get on.

Questions and contact details

Any questions?

Twitter
@craigwebster
Email
[email protected]
My blog
http://barkingiguana.com/
These slides
http://barkingiguana.com/~craig/talks/2010/javascript-london/getting-started-with-node-js
All my talks (version controlled!)
git clone http://barkingiguana.com/~craig/talks.git

I've only been using node.js for a very short while so I'm not sure I'll be able to answer them, but are there any questions?