Developing Async Sense in JavaScript — Part II: Callbacks

Nemanja Stojanovic · Published in Enki Blog · 9 min read · Oct 5, 2018


This series of articles is inspired by Kyle, Jake Archibald, and Douglas Crockford.

If you didn’t get the chance, I suggest you read the previous article in this series, which covers sync vs async, threads, and the JavaScript Event Loop, since this article will refer back to it.

A callback function is the essential building block of asynchronous JavaScript. It is a function that we intend to execute once a specific async operation is finished.

Before seeing how we can combine callbacks with async code, it’s important to understand that a callback by itself is not async in nature. It has no concept of asynchronicity. Using callbacks in a sync manner is, in fact, common, for example in the Array.prototype.forEach method:
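A minimal sketch of this sync usage: the callback passed to forEach runs immediately, once per element, before any code after the call.

```javascript
// forEach invokes its callback synchronously, once per element,
// before any code after it runs.
const doubled = [];
[1, 2, 3].forEach((n) => {
  doubled.push(n * 2);
});
console.log(doubled); // [ 2, 4, 6 ] (already populated)
```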

The reason a function can be passed into other functions such as forEach is because functions in JavaScript are first-class citizens. We can store them into variables, return them or pass them around to other functions, just like any other value.

When combined with native constructs such as the setTimeout or requestAnimationFrame functions, the event system in the browser or NodeJS (event callbacks are usually called handlers), or XMLHttpRequest, callbacks enable us to provide a function to invoke when an async operation is finished.

By combining callbacks with async mechanisms, we syntactically split the code into a part that runs now (sync) and a part that runs later (async).

But how does this actually work?

Let’s see how callbacks and the Event Loop work together by using one of the oldest async callback mechanisms in JavaScript, the setTimeout function.

To get started, can you guess the output of the following code?
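The original snippet isn’t embedded here; a minimal version consistent with the answer below (with the logging order also tracked in an array, for illustration) might be:

```javascript
const order = [];
const log = (value) => {
  order.push(value); // track the order for illustration
  console.log(value);
};

log('A');

setTimeout(() => {
  log('B');
}, 1000);

log('C');
```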

The answer is 'A', 'C', 'B' because the callback that will log 'B' is supposed to be called after 1000ms (1s).

How about if we set the delay to 0ms?
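A sketch of the 0ms variant (order tracked in an array for illustration):

```javascript
const order = [];

order.push('A');

setTimeout(() => {
  order.push('B'); // still queued as a Task, even with a 0ms delay
}, 0);

order.push('C');
```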

The answer doesn’t change.

What is going on here?

Well, the process starts like any other function invocation. The call to setTimeout gets pushed onto the Call Stack (see the previous article for details).

Unlike a regular function though, setTimeout is provided by the environment in which JavaScript is running and it gets to communicate with the native APIs, hidden behind the scenes. When setTimeout is finished running (it gets popped off the Call Stack), it will give the task of handling the callback to a specific native API that, among other things, knows how to measure time in milliseconds. When the given millisecond timeout runs out, that native API will put the callback in the appropriate Task Queue, to be processed by the Event Loop.

This has an interesting consequence. The time given to setTimeout isn’t actually the time in which the callback will be called. It is the minimum time in which the callback will be put in a Task Queue. The reason it is the minimum time is that, as explained in the previous article, many things could be attempting to add Tasks to the Task Queues and they might get a run in the Event Loop before our callback.

In other words, the callback passed to setTimeout will execute after at least the given delay, once the Event Loop is free to execute it.

Example of setTimeout taking longer than the given timeout
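A minimal sketch of that effect: a busy loop keeps the Call Stack occupied, so a 0ms timer fires much later than 0ms.

```javascript
const start = Date.now();
let elapsed = null;

setTimeout(() => {
  elapsed = Date.now() - start; // far more than the requested 0ms
}, 0);

// Block the only thread for ~100ms; the timer callback cannot run
// until the Call Stack is empty again.
while (Date.now() - start < 100) {}
```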

The flip-side to this is that, no matter what delay we use, the callback passed to setTimeout will execute asynchronously, after our synchronous code is done.

Here’s the flow visualized for the two setTimeout examples from above:

Note: 0ms is not the minimum possible value for setTimeout. Browsers clamp deeply nested timeouts to a minimum of 4ms.

setTimeout has historically been the main mechanism for adding concurrency to JavaScript for the sake of efficiency.

It allows us to do things like:

  • Add heavy-weight code to the end of the Event Loop
  • Defer code execution until the Call Stack is empty
  • Allow the browser to re-render and/or process events
    before executing our code
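For instance, deferring work until the Call Stack is empty can be sketched with a tiny helper (a hypothetical `defer`, shown with the order tracked in an array):

```javascript
const order = [];

// Schedule fn to run once the current Call Stack has unwound.
function defer(fn) {
  setTimeout(fn, 0);
}

defer(() => order.push('heavy work'));
order.push('light work'); // runs first, without waiting
```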

Ok, callbacks seem pretty cool thus far. They enable us to pass functions around to be invoked after specific events, asynchronously. They empower us to write more efficient code.

Why are there other constructs like promises then? Why was it necessary to innovate past callbacks?

Problems with callbacks

The best-known callback issue, sometimes (IMHO) wrongly referred to as callback hell, is the “pyramid of doom”.

We can solve this deep nesting issue fairly easily by extracting the functions into standalone units.
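A sketch of that refactor, using hypothetical async helpers (`getUser`, `getOrders`) stubbed with setTimeout:

```javascript
const results = [];

// Hypothetical async helpers, stubbed with setTimeout.
const getUser = (id, cb) => setTimeout(() => cb({ id }), 10);
const getOrders = (user, cb) => setTimeout(() => cb(`orders-for-${user.id}`), 10);

// Deeply nested ("pyramid of doom"):
getUser(1, (user) => {
  getOrders(user, (orders) => {
    results.push(orders);
  });
});

// The same flow, flattened into named standalone units:
function onOrders(orders) { results.push(orders); }
function onUser(user) { getOrders(user, onOrders); }
getUser(2, onUser);
```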

Is that all? Turns out, there’s more.

The following two fundamental issues with callbacks would be a more appropriate explanation for the term “callback hell”:

  1. Non-Sequential Flow
  • Execution order in callbacks is difficult to track and maintain. This leads to code that is counter-intuitive, as this popular StackOverflow post demonstrates.
  • Callbacks mix input and output — they couple an async operation with our reaction to it.

  2. Unreliable Execution
  • Callbacks have no built-in protection against being called too many times, being called too early, or swallowing errors and exceptions. They require manual error handling, i.e. they have no built-in error-handling semantics like try/catch.

Note: the rest of the article will use the term “parallel” a bit freely to mean “pretty much at the same time”. That’s enough for our purposes but if you’re interested in learning about parallelism and concurrency in greater detail, I suggest starting with this awesome talk by Rob Pike.

Non-Sequential Flow

This section will try to demonstrate the control flow issues in async callbacks that lead to code that is bug-prone, difficult to reason about and hard to maintain.

Note: Try to solve each of the problems presented as you’re following along. It will help you better internalize the stated callback issues.

To get started, let’s take the following snippet that invokes two asynchronous functions taking callbacks and needs to run a third callback when both of the async operations are finished:

How would we solve it?

Stop here and try it yourself.

Pretty much any solution requires some type of global state that is shared by both callbacks. Each callback records its own result, checks whether the other callback has already received its result, and calls processValues only when both results are in (the callbacks communicate through the shared state).
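A sketch of that shared-state approach, assuming a hypothetical async `httpGet(url, cb)` stubbed with a random delay:

```javascript
// Hypothetical async httpGet, stubbed with a random delay.
function httpGet(url, cb) {
  setTimeout(() => cb(`response-for-${url}`), Math.random() * 20);
}

const processed = [];
function processValues(v1, v2) {
  processed.push([v1, v2]);
}

// Global shared state that both callbacks can see.
let value1;
let value2;

httpGet('/url1', (res) => {
  value1 = res;
  if (value2 !== undefined) processValues(value1, value2); // did /url2 finish first?
});

httpGet('/url2', (res) => {
  value2 = res;
  if (value1 !== undefined) processValues(value1, value2); // did /url1 finish first?
});
```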

What are some of the issues with this approach?

We can (somewhat) improve the code by fixing the DRY-ness issue (remove repetition):
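One possible DRY-ed sketch, with `httpGet` as a hypothetical async helper stubbed via setTimeout: a single handler factory replaces the duplicated bookkeeping in each callback.

```javascript
// Hypothetical async httpGet, stubbed with a random delay.
function httpGet(url, cb) {
  setTimeout(() => cb(`response-for-${url}`), Math.random() * 20);
}

const results = {};
const processed = [];

// A single handler factory removes the duplicated bookkeeping.
function onValue(key, otherKey) {
  return (res) => {
    results[key] = res;
    if (otherKey in results) {
      processed.push([results.value1, results.value2]);
    }
  };
}

httpGet('/url1', onValue('value1', 'value2'));
httpGet('/url2', onValue('value2', 'value1'));
```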

But that doesn’t remove the existing flow problems:

Code is DRY but still Non-Sequential

What if we were getting more than 2 things in parallel?

How would we write callback-based code that runs N asynchronous calls to httpGet in parallel (based on an array of urls) and calls processValues when all are finished?

Stop here and try it yourself.

Not easy, right?

Here’s one approach:
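A sketch of one counter-based approach, again assuming a hypothetical `httpGet` stub:

```javascript
// Hypothetical async httpGet, stubbed with a random delay.
function httpGet(url, cb) {
  setTimeout(() => cb(`response-for-${url}`), Math.random() * 20);
}

function getAll(urls, done) {
  const results = [];
  let remaining = urls.length;

  urls.forEach((url, i) => {
    httpGet(url, (res) => {
      results[i] = res; // store by index so results align with urls
      remaining -= 1;
      if (remaining === 0) done(results); // whichever finishes last fires done
    });
  });
}

let processedValues = null;
getAll(['/a', '/b', '/c'], (values) => {
  processedValues = values;
});
```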

The code is DRY and generalized for N parallel callbacks but the same control-flow problems still persist:

The code is DRY and generalized but still Non-Sequential

You can start to notice the pattern a little bit, right? How many jumps did your eyes have to make just to follow the execution in the snippets above?

Let’s see a few more examples.

What if we switch the problem such that we still want to make parallel requests for the data but process the data in the order it’s received?

Let’s say we want to request value1 and value2 at the same time but process them in order (value1 then value2).

Stop here and try it yourself.

Here’s what I got:
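A reconstruction of that idea, assuming the same kind of hypothetical `httpGet` stub: value1 is handled on arrival, while value2 has to wait until value1 has been handled.

```javascript
// Hypothetical async httpGet, stubbed with a random delay.
function httpGet(url, cb) {
  setTimeout(() => cb(`response-for-${url}`), Math.random() * 20);
}

const processed = [];
const handle = (value) => processed.push(value);

let value2;
let value1Handled = false;

httpGet('/url1', (res) => {
  handle(res); // value1 can always be handled as soon as it arrives
  value1Handled = true;
  if (value2 !== undefined) handle(value2); // value2 was waiting on us
});

httpGet('/url2', (res) => {
  value2 = res;
  if (value1Handled) handle(value2); // only handle once value1 has been
});
```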

The code above will process value1 then value2 no matter which value actually arrives first.

value 1 received first
value 2 received first

This code doesn’t look particularly flexible though. If we had to change our reaction order in the future, we’d have to change almost all of the code, increasing the chances of introducing a bug.

How would we generalize this approach such that we make any amount of parallel requests using httpGet but still process the values in order, following the order of a given array of urls?

Stop here and try it yourself.

This one’s even tougher, right?

Here’s how I went about it:
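One possible sketch (again with a hypothetical `httpGet` stub): store results by index, and on every arrival handle the longest contiguous run of values that have arrived, in url order.

```javascript
// Hypothetical async httpGet, stubbed with a random delay.
function httpGet(url, cb) {
  setTimeout(() => cb(`response-for-${url}`), Math.random() * 20);
}

function getAllInOrder(urls, handle) {
  const results = [];
  let next = 0; // index of the next value we are allowed to handle

  urls.forEach((url, i) => {
    httpGet(url, (res) => {
      results[i] = res;
      // Handle every contiguous value that has already arrived, in url order.
      while (next in results) {
        handle(results[next]);
        next += 1;
      }
    });
  });
}

const processed = [];
getAllInOrder(['/a', '/b', '/c'], (value) => processed.push(value));
```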

Similar control-flow problems persist:

Non-Sequential flow is still there

Callback code gets significantly harder to follow or maintain the more callbacks are involved.

Unreliability

Besides enforcing non-sequential and tightly-coupled coding patterns, callbacks are also inherently unreliable.

Unreliable Execution

Consider the following snippet:
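The snippet isn’t reproduced here; a minimal reconstruction, with hypothetical `getStore`/`getBooks`/`purchase` helpers standing in for some external library, might look like:

```javascript
let purchaseOrders = 0;

// Hypothetical helpers standing in for some external library.
function getStore(name, cb) { setTimeout(() => cb({ name }), 10); }
function getBooks(store, cb) { setTimeout(() => cb(['book1', 'book2']), 10); }
function purchase(books) { purchaseOrders += 1; }

getStore('my-store', (store) => {
  getBooks(store, (books) => {
    // If getStore misbehaved and called back twice,
    // we would send two purchase orders.
    purchase(books);
  });
});
```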

Just by looking at this code, can we guarantee that getStore won’t call its callback twice, which would call getBooks twice and then send two purchase orders?

Not really. If this code came from an external library, we would have to read its documentation, or even its source, to confidently guarantee its behavior. And even then, we would probably have to make sure this behavior is well tested and doesn’t regress through library updates. It would be nice if there were a built-in protection layer for such situations (stay tuned, because there is).

One way we could think of solving this specific situation is to create a callback that runs once:
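A sketch of that manual protection, guarding a hypothetical `purchase` call with a flag:

```javascript
let purchaseOrders = 0;
const purchase = (books) => { purchaseOrders += 1; };

// Manual protection: a callback that ignores duplicate invocations.
let called = false;
function purchaseOnce(books) {
  if (called) return; // swallow any extra calls
  called = true;
  purchase(books);
}

purchaseOnce(['book1']);
purchaseOnce(['book1']); // ignored
```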

But again, the stated issues do not go away:

Non-Sequential flow

We could extract this behavior into a helper function,
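such as a generic `once` wrapper (a sketch):

```javascript
// A generic helper that wraps any function so it runs at most once.
function once(fn) {
  let called = false;
  return (...args) => {
    if (called) return undefined; // ignore duplicate invocations
    called = true;
    return fn(...args);
  };
}

let count = 0;
const incrementOnce = once(() => { count += 1; });
incrementOnce();
incrementOnce(); // ignored
```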

but we still need to maintain and share these callback-protection tools across our system, increasing our code complexity and maintenance cost.

Unreliable error handling

What about error handling with callbacks? Are there any built-in constructs that solve the non-intuitive control flow behavior or improve reliability?

Let’s first look at the main error-handling approaches generally used with callbacks.

One pattern popularized by jQuery is to separate the error-handling and success-handling into distinct flows:
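A sketch of that style, with a stub `ajax` function mimicking the jQuery-style split (the routing logic inside the stub is purely illustrative):

```javascript
// A stub mimicking the jQuery-style split into success/error flows.
function ajax(url, { success, error }) {
  setTimeout(() => {
    if (url.startsWith('/')) {
      success(`response-for-${url}`);
    } else {
      error(new Error('invalid url'));
    }
  }, 10);
}

const outcomes = [];
ajax('/books', {
  success: (res) => outcomes.push(['ok', res]),
  error: (err) => outcomes.push(['fail', err.message]),
});
```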

This pattern (somewhat) alleviates the issue of Non-Sequential Flow by letting us decouple error handling from success handling (happy path) but it actually clones the unreliability problem. It doubles our need for protection against unreliable execution since now we have two callbacks to worry about instead of one.

The control-flow separation here is purely semantic. It is no different than prefixing class properties with an underscore to denote them as “private”. Correct behavior is in no way guaranteed or supported by the language itself and is entirely based on trusting the programmer to follow the convention and (hopefully) write bug-free code.

This holds true for any callback-based code.

Another error-handling pattern, popularized by NodeJS, is known as the error-first callback approach.
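In this convention, the callback’s first argument is reserved for an error (or null on success). A sketch, using a hypothetical `readConfig` helper:

```javascript
// Node-style error-first callback: cb(err, result).
function readConfig(path, cb) {
  setTimeout(() => {
    if (path === 'missing') {
      cb(new Error('not found')); // failure: the error goes in the first slot
    } else {
      cb(null, { path }); // success: null error, then the result
    }
  }, 10);
}

const results = [];
readConfig('app.json', (err, config) => {
  if (err) {
    results.push(['error', err.message]);
    return; // every callback has to branch on err first
  }
  results.push(['ok', config.path]);
});
```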

A prominent disadvantage of this pattern is that it bloats the code by requiring explicit error handling logic in every callback and can lead to a lot of branching. This can steer the programmer to write cluttered, deeply nested (aka “pyramid of doom”) or heavily branched code, making it hard to reason about and prone to bugs.

Note: as before, we can fix the pyramid-of-doom with modularization.

Consider the following snippet that creates a getJSON function and handles errors first:
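A reconstruction of such a getJSON, built on a hypothetical error-first `httpGet` stub:

```javascript
// Hypothetical error-first httpGet, stubbed to return a JSON string.
function httpGet(url, cb) {
  setTimeout(() => cb(null, '{"title":"dune"}'), 10);
}

function getJSON(url, cb) {
  httpGet(url, (err, res) => {
    if (err) {
      cb(err);
      return;
    }
    // Bug: if res is invalid JSON, JSON.parse throws here and the
    // error never reaches cb; it escapes as an uncaught exception.
    cb(null, JSON.parse(res));
  });
}

let result;
getJSON('/book', (err, data) => { result = data; });
```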

Does it actually handle errors properly? Look carefully.

What if res is an invalid JSON? JSON.parse would throw an error.

Let’s fix it:
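A fixed sketch: the throwing call is wrapped in try/catch so the parse error is routed through the error-first path (the stub deliberately returns invalid JSON to exercise it).

```javascript
// Hypothetical error-first httpGet, stubbed to return invalid JSON on purpose.
function httpGet(url, cb) {
  setTimeout(() => cb(null, 'not-json'), 10);
}

function getJSON(url, cb) {
  httpGet(url, (err, res) => {
    if (err) {
      cb(err);
      return;
    }
    let parsed;
    try {
      parsed = JSON.parse(res); // guard the call that can throw
    } catch (parseErr) {
      cb(parseErr); // route the parse error through the error path
      return;
    }
    // Note: if cb itself throws, we still can't protect against that here.
    cb(null, parsed);
  });
}

let receivedErr;
getJSON('/book', (err) => { receivedErr = err; });
```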

What if cb itself throws an error? Due to the inherent coupling of the input and output of callbacks, we cannot completely protect against this.

The code above now handles most of the errors but is it easy to reason about?

Wouldn’t it be nice if we could just wrap the async mechanism into a structure that inherently guarantees predictable behavior and can be shared around our system independently of when the internal async operation actually finishes?

Perhaps there’s a native construct we can use to jointly solve the issues of Non-Sequential Flow and Unreliable Execution of callbacks?

Next up: Developing Async Sense Part III — Promises


https://nem035.com — Mostly software. Sometimes I play the 🎷. Education can save the world. @EnkiDevs