Web Worker Concurrency with StratifiedJS

February 21, 2012 by Alexander Fritze

One of the most common questions we get about our StratifiedJS engine Apollo is whether the implementation uses something like web workers behind the scenes. "True" concurrency, so the thinking goes, surely must require some "true" preemptive concurrency mechanism. Or, if Apollo doesn't use web workers at present, at least it should be improvable by leveraging such a mechanism.

Wrong. The short answer is no, StratifiedJS doesn't depend on web workers or any other preemptive concurrency mechanism, and that is by design. Even if web workers (or threads or something similar) were available on every platform targeted by Apollo (they aren't), the implementation still wouldn't make use of them.

That doesn't mean that StratifiedJS can't be used to control web workers, but before I get into that, let me say a few words about why it's a good idea to have concurrency constructs that by themselves aren't "really" concurrent.

Injecting Concurrency vs Orchestrating Concurrency

StratifiedJS builds on the observation that there are two complementary concerns to concurrency: that of injecting concurrency into a program, and that of orchestrating this concurrency into an overall narrative.

Mainstream programming languages generally have good facilities for injecting concurrency, but they lack facilities for orchestrating concurrency. Mechanisms that inject concurrency into a program include asynchronous I/O constructs (such as XMLHttpRequest), timers (window.setTimeout) and event listeners (onclick handlers, element.addEventListener).

Let's say we want to implement a program with the following logic:

  • Contact two servers (say CNN and BBC) simultaneously
  • As soon as the first response comes in, cancel the other request and display the response to the user

In plain JavaScript it might look something like this:

without StratifiedJS
var cnn_req = sendAsyncRequest('http://cnn.com', handleCNNResponse);
var bbc_req = sendAsyncRequest('http://bbc.com', handleBBCResponse);

function handleCNNResponse(resp) { 
  bbc_req.cancel();
  display(resp);
}

function handleBBCResponse(resp) { 
  cnn_req.cancel();
  display(resp);
}

Firstly, note how, even though no web workers, threads or similar constructs are used here, this is still a "real" concurrent program: the BBC and CNN servers are doing some simultaneous work for us. This work is taking place on remote servers and not on our local machine, but that's beside the point: we are still juggling two concurrent computations that are unrelated to each other. Both are being processed simultaneously, which of the two will complete first is nondeterministic, and our program needs to orchestrate this concurrency.

Secondly, note how injecting this concurrency into our program was easy (by calling sendAsyncRequest, which would use something like XMLHttpRequest behind the scenes). The tricky bit is the orchestration: we have to use awkward continuation-passing style, which, in this case, doesn't look too bad, but if we had a bigger concurrent program with some more involved control flow logic, we would quickly end up with a huge unmaintainable rat's nest of callbacks.
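To make that bookkeeping concrete, here is a minimal sketch of what a reusable "first response wins" helper might look like in plain JavaScript. The name firstOf and the shape of the tasks (a function that takes a callback and returns an object with a cancel() method) are assumptions made for illustration, in the same spirit as the hypothetical sendAsyncRequest above.

```javascript
// Hypothetical helper: start several cancellable tasks, deliver the first
// result to onDone, and cancel all the others.
// Assumed task shape: start(callback) returns an object with cancel().
function firstOf(starters, onDone) {
  var done = false;
  var handles = [];
  starters.forEach(function (start, index) {
    if (done) return; // an earlier task already completed synchronously
    handles[index] = start(function (result) {
      if (done) return; // ignore responses that arrive after the winner
      done = true;
      handles.forEach(function (handle, i) {
        if (i !== index && handle) handle.cancel(); // cancel the losers
      });
      onDone(result);
    });
  });
}
```

With the sendAsyncRequest from above, the CNN/BBC race would then be written as firstOf([function(cb) { return sendAsyncRequest('http://cnn.com', cb); }, function(cb) { return sendAsyncRequest('http://bbc.com', cb); }], display). Even this small helper has to track completion and cancellation by hand, and it only covers a single orchestration pattern.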

Many people think that to fix this problem, you somehow need to leverage web workers, threads, actors, Erlang-style processes, goroutines or similar conventional concurrency mechanisms. But think about it: would any of these help to make the above control flow any easier? Not really. All of these mechanisms inject even more concurrency and nondeterminism into our program. In the best case this added concurrency is just redundant; in the worst case it is downright counterproductive.

E.g. with web workers, you would have code like this:

without StratifiedJS
//---------------------------
// in top-level HTML file:
var bbc_worker = new Worker('worker.js');
var cnn_worker = new Worker('worker.js');
bbc_worker.onmessage = handleBBCResponse;
cnn_worker.onmessage = handleCNNResponse;
bbc_worker.postMessage('http://bbc.com');
cnn_worker.postMessage('http://cnn.com');

// worker message handlers receive a MessageEvent; the payload is in ev.data
function handleCNNResponse(ev) { 
  bbc_worker.postMessage('cancel');
  display(ev.data);
}

function handleBBCResponse(ev) { 
  cnn_worker.postMessage('cancel');
  display(ev.data);
}

//---------------------------
// worker.js
var req;
onmessage = function(event) {
  req = sendAsyncRequest(event.data, handleResponse);
  onmessage = function(event) {
    req.cancel();
  };
};

function handleResponse(resp) {
  postMessage(resp);
}

The orchestration logic still looks pretty much like before; all that web workers have introduced is a redundant layer and more code to write. Contrast this to what the program would look like in StratifiedJS:

StratifiedJS
var response;
waitfor {
  response = http.get('http://bbc.com');
} 
or {
  response = http.get('http://cnn.com');
}
display(response);

Unlike threads or actor-like abstractions, StratifiedJS doesn't introduce any further concurrency; it just adds orchestration facilities (like the waitfor/or combinator, above) to help you program with existing concurrency in your program in a straightforward sequential style. To put it whimsically: StratifiedJS helps you juggle those concurrent balls you have flying around in your application. Where you got those concurrent balls from is outside of the scope of StratifiedJS.

So what about web workers, threads, etc.? They might not be much help in orchestrating concurrency, but they do of course have their use cases. E.g. your program might perform long-running calculations and you need true preemption, so as not to block the UI while the calculation is being performed. Or you might have a program that needs to perform more work in a certain time than can be handled on one core. The point is, web workers et al. are primarily mechanisms for injecting concurrency into a program, not for orchestrating it.

This implies that, from the point of view of orchestration, web workers are not much different from other concurrency-injecting mechanisms. A web worker performs concurrent work on your local machine; an XMLHttpRequest performs work on some remote server. Not that different when you think about it. To StratifiedJS, web workers, events, timers, or requests are all just slightly different ways of throwing another concurrent ball into play. And to illustrate this point, let me now get to the part about controlling web workers with StratifiedJS.

Orchestrating Web Workers with StratifiedJS

Let's say we want to write an interactive program for calculating Pi to a (large) user-specified number of digits. As this calculation might take a considerable amount of time, we really want some preemptive mechanism here that doesn't block the UI while the calculation is in progress. Also, if we want to run several such computations at the same time, it is a good idea to have these calculations proceed in parallel on all cores that our hardware has available. These two requirements make this an example where web workers are an appropriate mechanism.
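To see why a long calculation on the main thread is a problem, consider this small sketch. The busy loop is only a stand-in for a real computation, and the 50 millisecond duration is an arbitrary choice for illustration. A timer that is due immediately still cannot fire until the running code returns, and the same holds for UI events.

```javascript
// A timer that is due "immediately" has to wait for the currently running
// code to return, because code on the main thread is never preempted.
var timerFired = false;
setTimeout(function () { timerFired = true; }, 0);

// Stand-in for a long calculation: spin for roughly 50 milliseconds.
var start = Date.now();
while (Date.now() - start < 50) { /* busy */ }

// The timer callback has not run yet, however long the loop took.
var firedDuringCalculation = timerFired;
console.log('timer fired during calculation:', firedDuringCalculation);
// prints "timer fired during calculation: false"
```

Moving the calculation into a web worker takes it off the main thread, so timers and event handlers keep running while it proceeds.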

This is what our web worker code might look like:

// worker_pi.js

// calculate pi to 'digits' digits; return string with digits
function calcPi(digits) {
  var d = Math.floor(digits/4+4)*14;
  var rv = "", carry = 0, arr = [], sum, i, j;
  for (i = d; i > 0; i -= 14) {
    sum = 0;
    for (j = i; j > 0; --j) {
      sum = sum * j + 10000 * (arr[j] === undefined ? 2000 : arr[j]);
      arr[j] = sum % (j * 2 - 1);
      sum = Math.floor(sum/(j * 2 - 1));
    }
    var s = "" + Math.floor(carry + sum/10000);
    while (s.length < 4) s = "0" + s;
    rv += s;
    carry = sum % 10000;
  }
  if (rv.length > digits)
    rv = rv.substring(0, digits);
  return rv;
}

// pass a message with the result to the caller, then shut the worker down:
self.onmessage = function(e) {
  self.postMessage(calcPi(e.data));
  self.close();
};

To interface with the web worker from plain JavaScript, we'd have to write callback code like this:

without StratifiedJS
function showResult(pi) {
  ...
}

function calculatePi(digits, callback) {
  var worker = new Worker('worker_pi.js');
  worker.onmessage = function(ev) { callback(ev.data); };
  worker.postMessage(digits);
}

calculatePi(digits, showResult);

With StratifiedJS, we can be a bit cleverer and get rid of the callback:

StratifiedJS
function calculatePi(digits) {
  var worker = new Worker('worker_pi.js');
  waitfor (var ev) {
    worker.onmessage = resume;
    worker.postMessage(digits);
  }
  retract {
    worker.terminate();
  }
  return ev.data;
}

Here we've used StratifiedJS's suspending waitfor() construct to make a blocking function out of the web worker call. We can use this function like a 'normal' function: to display the result to the user we can now write code like this:

StratifiedJS
showResult(calculatePi(digits));

This doesn't look too different from the plain JavaScript callback version, so the question is, have we actually gained something by throwing StratifiedJS into the mix? Let's see how we'd have to change the code to add a "cancel" button and some UI feedback while the calculation is in progress. In StratifiedJS, this is straightforward:

StratifiedJS
waitfor {
  showResult(calculatePi(digits));
}
or {
  while (1) {
    ... update ui ...
    hold(1000);
  }
}
or {
  dom.waitforEvent('click', cancel_button);
}

Here, we're using StratifiedJS's waitfor/or combinator to perform three things simultaneously. As soon as the first one of these finishes, it cancels the remaining ones. So if the Pi calculation finishes first, the UI updating loop and the event listener on the cancel button are both aborted. If, on the other hand, the cancel button is clicked before the calculation is done, the third clause of the waitfor/or finishes first, aborting the other two clauses.

So how does the web worker actually get aborted in that case? StratifiedJS sends something like an exception into the aborted clauses - we call this a 'cancellation'. This cancellation can be caught with a retract{} block, and if you look at the calculatePi definition above, that is where we terminate the web worker.

For comparison, this is how a plain JavaScript version of our orchestration logic would look:

without StratifiedJS
function calculatePi(digits, callback) {
  var worker = new Worker('worker_pi.js');
  worker.onmessage = function(ev) { callback(ev.data); };
  worker.postMessage(digits);
  return worker;
}

var timer;
function showTimer() {
  ... update ui ...
  timer = setTimeout(showTimer, 1000);
}

function cancelCalculation() {
  worker.terminate();
  cancel_button.removeEventListener('click', cancelCalculation, true);
  window.clearTimeout(timer);
}

var worker = calculatePi(digits, function(pi) {
  cancel_button.removeEventListener('click', cancelCalculation, true);
  window.clearTimeout(timer);
  showResult(pi);
});
showTimer();
cancel_button.addEventListener('click', cancelCalculation, true);

This code still doesn't look too bad, but if you extrapolate to some more complicated orchestration logic with maybe several interplaying web workers, it should give you a good idea of why StratifiedJS is a handy tool.

If you want to play around with the Pi example yourself, you can find the working (and slightly embellished) code here (requires a modern browser with web worker support):