Stratified Node.js: IO performance

December 21, 2010 by Alexander Fritze

The number one topic we get asked about is performance. This is particularly true for our upcoming server-side StratifiedJS implementation. So here is the first of a couple of blog posts taking a closer look at the current state of StratifiedJS performance on the server side.

In this first post we look at file IO performance, and in the next one we'll look at the HTTP stack. But let me start with some background first:

Server-side StratifiedJS

We've recently started work on a server-side StratifiedJS implementation based on NodeJS. You can track our progress at my node github fork. The code consists of a few C++ and JS patches to the canonical node sources, as well as a bunch of additional library files that get compiled into the node executable.

While the implementation is still at a very early stage, we've 'stratified' the REPL and amended the module system to load both 'normal' JS files and StratifiedJS files (files with extension *.sjs). The patched executable runs normal JS code exactly like the standard one - with no performance degradation or changes in semantics. All the standard node modules (http, dns, fs, stream, ...) are included in the executable and function just like you'd expect.

While you can use the standard builtin modules from StratifiedJS just like you would from normal JS, the whole point of having StratifiedJS on the server side is to program in a callback-less blocking style. So for some of these modules we've started making 'stratified' counterparts (see here for the current state of things). These stratified modules have the same name as the non-stratified ones, but are prefixed with a '$' (e.g. $dns instead of dns). With these abstractions you can then write code like this:

var dns = require('$dns');

var google = dns.lookup("google.com");
console.log("google.com is " + google.address);
console.log(google.address + " is " + dns.reverse(google.address));

File read performance

Without further ado, let's look at some performance figures. We're using the io.js benchmark and its stratified counterpart, io.sjs.

Firstly, we're examining the case of reading fixed-size records from a file. The stratified version of the code looks something like this:

var fs = require('$fs');

function readtest(record_size) {
  var s = fs.openInStream(path, 'r', 0644);
  var buf = new Buffer(record_size);
  var start = Date.now();
  while (s.readBuf(buf) > 0)
    /* use record here */;
  var time_taken = Date.now() - start;
}

We'll benchmark this against the following normal callback-style nodejs code:

var fs = require('fs');

function readtest(record_size, done) {
  var s = fs.createReadStream(path, {'flags': 'r', 'encoding': 'binary', 
                                     'mode': 0644, 'bufferSize': record_size});
  var start = Date.now();
  s.addListener("data", function (record) {
    /* use record here */
  });
  s.addListener('close', function() {
    var time_taken = Date.now() - start;
    done();
  });
}

When running this code on a 1GB file vs various record sizes, we get the following:

Hmm, interesting - so for small record sizes the StratifiedJS version is up to 3 times faster than the normal JS version!

What's going on here? Well, when creating a ReadStream in nodejs with a custom bufferSize, what you are actually setting is the size of the buffer into which the OS read() call will copy its data. Consequently, the buffer passed to the data event listener might not actually be of the size you requested - there is no real out-of-the-box way of reading buffers of a given fixed size with node's fs.js without adding your own buffering logic on top. The stratified streams, on the other hand, interface to the OS with a fixed IO buffer from which data gets copied into the buffer you provide to the readBuf() call.

Take the performance degradation that node experiences in the small-buffer case with a grain of salt, then - if anything it shows that node's libeio interactions are quite slow (a problem that extends to our stratified nodejs, too).

What I think this benchmark shows nicely, though, is that StratifiedJS easily performs as well as normal callback-style code: even though it does more under the hood (buffering and chopping up the data into the size we request - all logic that you'd have to add manually on top of node's stream implementation), there is no performance degradation vs the straight nodejs implementation at any buffer size.

File write performance

Let's look at writing fixed records next. The stratified code looks something like this:

var fs = require('$fs');

function writetest(file_size, record_size) {
  var s = fs.openOutStream(path, 'w', 0644);
  var buf = createBuffer(record_size);
  var start = Date.now();
  for (var remain = file_size; remain>0; remain -= record_size)
    s.writeBuf(buf);
  s.close();
  var time_taken = Date.now() - start;
}

And the normal JS version:

function writetest(file_size, record_size, done) {
  var s = fs.createWriteStream(path, {'flags': 'w', 'mode': 0644});
  var remaining = file_size;
  var buf = createBuffer(record_size);
  var start = Date.now();

  function dowrite() {
    do {
      var rv = s.write(buf);
      remaining -= buf.length;
      if (remaining <= 0) {
        s.emit('done');
        s.end();
        return;
      }
    } while (rv);
  }

  s.on('drain', dowrite);
  s.addListener('close', function() {
    var time_taken = Date.now() - start;
    done();
  });

  dowrite();
}

When running this code for a file size of 1GB, again vs various record sizes, we get:

So now the situation is somewhat reversed from that of the file read example. For small buffer sizes, the straight nodejs implementation wins by a small margin. Why is this? From profiling, it seems that the same slow libeio performance that causes nodejs to perform so poorly in the file read example is now biting the stratified implementation; we get hit a little harder than nodejs, because our interaction with libeio watchers is still not quite up to scratch. In the grand scheme of things it's not something that matters much, but we'll be looking at improving this area soon.

Conclusion

I think what these benchmarks show is that the performance of hand-coded asynchronous code can be matched by high-level stratified programs. The performance of stratified code is pretty much indistinguishable from that of straight nodejs code, apart from 'pathological' cases, where sometimes the stratified code wins and sometimes the straight-up nodejs code wins.

In an upcoming blogpost we'll be comparing the performance of our stratified http stack vs that of the normal nodejs http stack. Stay tuned!