The JavaScript Event Loop [Presentation]

Thomas Hunter II
I gave a talk this morning on the JavaScript Event Loop at Penguicon 2013. Even though I had used JavaScript for several years, I didn’t completely comprehend how the Event Loop works until a few months ago. When the opportunity came to present at Penguicon, I figured this was as good a topic as any. You can download the presentation below (or view it in your browser), and I’ll throw all the individual slides and the gist of what I said about them on this page.
Download as a Keynote, PowerPoint, or HTML presentation.

Introduction

Slide 2/14: Credibility

I’ve been a web developer for a while, starting at some smaller mom and pop shops (not listed), moving on to a couple of Fortune 50s, before finally ending up at smaller and smaller (and quicker and more advanced) companies. For most of that time I was doing procedural PHP and MySQL programming, before eventually moving to mostly JavaScript (both frontend and backend).
I’m currently working with Packt to get a book on Backbone.js published (which is a frontend JavaScript framework for building Single Page Applications). Be sure to keep an eye out for it and purchase several copies, even if you don’t intend on reading them.

Slide 3/14: MultiThreaded

Let me first begin the presentation by talking about something mostly unrelated to JavaScript: MultiThreaded programming. If an application is built to be MultiThreaded, it can make use of several of your CPU cores simultaneously. This means it can do number crunching in different places at the same time, and we refer to this as Concurrency (or, strictly speaking, Parallelism). An application built in this manner can be a single process within the Operating System. The Operating System itself usually gets to choose which cores an application will run on (even which core a single threaded application will run on).
One way to fake MultiThreaded-ness in SingleThreaded languages is to simply run several different processes and have them communicate with each other.
For the longest time, CPUs were getting faster and faster, but then physics caught up, and we sorta hit a wall with how fast a single CPU core can get. So, to make hardware faster, we now throw more CPU cores at the computer. In order to truly scale and use the hardware to its fullest, one needs to build applications which make use of all CPU cores.
MultiThreading isn’t all butterflies and puppy tails though. There can be some big issues with this type of code, particularly Deadlocks and Race Conditions. Here is one example of a Race Condition: an application is running on two separate threads, both threads read a variable from memory at the same time, and both attempt to update the value by adding 2 to it. If the existing value is 10, and thread A adds 2, it does so by writing 12 to the memory location. If thread B also wants to add 2, it still thinks the value is 10, and writes 12. The programmer would expect the result to be 14 but ends up with 12, and there are no errors. This type of bug can be very hard to track down, and the worst part is that it will happen in an unpredictable way.
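The lost update described above can be sketched in a few lines. This is only a simulation: JavaScript has no threads to race against each other, so the two "threads" are ordinary variables (`threadA` and `threadB` are hypothetical names) and the unlucky interleaving is written out by hand.

```javascript
// Simulated lost update: both "threads" read before either one writes.
var shared = { value: 10 };

// Step 1: thread A and thread B each read the current value (10).
var threadA = shared.value;
var threadB = shared.value;

// Step 2: each adds 2 to its own stale copy and writes the result back.
shared.value = threadA + 2; // writes 12
shared.value = threadB + 2; // also writes 12, overwriting A's update

console.log(shared.value); // prints 12, not the expected 14
```

In a real MultiThreaded program the interleaving is chosen by the scheduler, which is why the bug only shows up some of the time.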

Slide 4/14: SingleThreaded


Now that you know what MultiThreaded means, let’s talk about how JavaScript is not MultiThreaded. A JavaScript engine exists in a single OS process, and runs your code on a single thread. This means that when your application is running, your code is never executed in parallel. By running the JavaScript engine in this manner, it is impossible for developers to get the Deadlocks and Race Conditions which plague MultiThreaded applications.
Developers often refer to their callbacks running in an unexpected order as a Race Condition, however it is not the same thing that happens to MultiThreaded applications, and can usually be solved and tracked down easily enough (e.g., use another callback).
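As a rough illustration of that kind of ordering surprise, and of the "use another callback" fix, here is a small sketch. `fetchLater` is a hypothetical stand-in for any asynchronous call, and the delays are made-up values.

```javascript
// Hypothetical stand-in for an asynchronous call that takes `delay` ms.
function fetchLater(name, delay, callback) {
  setTimeout(function () { callback(name); }, delay);
}

var unordered = [];
var ordered = [];

// Started together: 'second' finishes first simply because it is faster.
fetchLater('first', 30, function (name) { unordered.push(name); });
fetchLater('second', 10, function (name) { unordered.push(name); });

// Fixed with another callback: 'second' only starts once 'first' is done.
fetchLater('first', 30, function (name) {
  ordered.push(name);
  fetchLater('second', 10, function (innerName) {
    ordered.push(innerName);
    console.log('unordered: ' + unordered.join(', ')); // unordered: second, first
    console.log('ordered: ' + ordered.join(', '));     // ordered: first, second
  });
});
```

The "wrong" order here is fully explained by the delays, which is what makes it so much easier to track down than a true Race Condition.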

Slide 5/14: Implementation


There are three important features of a JavaScript engine that deserve mention. These are the Stack, the Heap, and the Queue. Now, different browsers have different JavaScript engines (e.g. Chrome has V8, Firefox has SpiderMonkey, and IE has something written in BASIC called Chakra (just kidding!)) and each browser will implement these features differently, but this explanation should work for all of them.
Heap: The simplest part of this is the Heap. This is a bunch of memory where your objects live (e.g. variables and functions and all those things you instantiate). In the presentation I refer to this as Chaotic, only because the order doesn’t really matter and there’s no guarantee about where in memory they will live. In this heap, different engines may perform different optimizations, e.g., if an object is duplicated many times, it may only exist in memory once, until a change needs to happen, at which point the object is copied.
Stack: This is where the currently running functions get added. If function A() runs function B(), well you’re two levels deep in the stack. Each time one of these functions is added to the stack, it is called a frame. These frames contain pointers to the functions in the heap, as well as the objects available to the function depending on its current scope, and of course the arguments to the function itself. Different JavaScript engines likely have different maximum stack sizes, and unless you have a runaway recursive function, you’ve probably never hit this limit. Once a function call is complete, it gets removed from the stack. Once the stack is empty, we’re ready for the next item in the Queue.
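To see the stack limit in action, here is a minimal sketch of a runaway recursive function. The name `runaway` is hypothetical, and the exact depth reached and the error type thrown depend on the engine.

```javascript
// Each call adds a frame. The addition after the inner call keeps the frame
// alive until that call returns, so the stack keeps growing.
function runaway(depth) {
  return 1 + runaway(depth + 1);
}

var overflowed = false;
try {
  runaway(0);
} catch (err) {
  // V8 throws a RangeError here; other engines may use a different error type.
  overflowed = true;
  console.log('Stack limit hit: ' + err.message);
}
```

Once the error is caught, all of those frames have been removed again and the stack is back to where it started.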
Queue: This is where function calls which are queued up for the future go. If you perform a setTimeout(function() { console.log('hi'); }, 10);, that anonymous function is added to the queue once the 10ms timer expires. No items in the queue will be run until the current stack is complete. So, if you have some work that might be slow that you want to run after you get your data, try a setTimeout() with a delay of 0ms. Callbacks which rely on I/O to complete, or on a long running timer, end up in that same queue: the browser (or Node.js) keeps track of the timer or I/O request outside of the engine, and adds the callback to the queue once it is ready.
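A small sketch of the rule that nothing in the queue runs until the current stack is complete: a timer is requested for 10ms, but the stack is kept busy for roughly 50ms, so the callback has to wait. The 50ms busy loop and the `events` array are illustrative choices only.

```javascript
var events = [];
var started = Date.now();

// Ask for the callback to run in 10ms.
setTimeout(function () {
  events.push('timer callback');
  console.log(events.join(' -> ')); // stack finished -> timer callback
  console.log('callback ran after ' + (Date.now() - started) + 'ms');
}, 10);

// Keep the current stack busy for about 50ms.
while (Date.now() - started < 50) {
  // Busy-wait: nothing from the queue can run during this loop.
}
events.push('stack finished');
```

The delay passed to setTimeout() is therefore a minimum, not a promise.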
It’s worth mentioning Garbage Collection here as well. In JavaScript it’s easy to create tons of objects all willy nilly like. These get added to the Heap. But, once there is no scope remaining that needs those objects, it’s safe to throw them away. JavaScript can keep an eye on the current stack and the items in the Queue, and see what objects in the Heap are being pointed to. If an object no longer has pointers to it, it is safe to assume that object can be thrown away. If you aren’t careful with how you manage your code, it’s easy to keep those pointers around by accident, and we call the resulting wasted memory a Memory Leak.
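Here is a minimal sketch of how such a leak can come about, assuming a hypothetical `cache` object that stays reachable for the life of the application. The function names and sizes are made up for illustration.

```javascript
// Hypothetical cache that is reachable from the top-level scope forever.
var cache = {};

function handleRequest(id) {
  // About 1KB of data per request, kept alive by the reference in `cache`.
  cache[id] = new Array(1025).join('x');
  return cache[id].length;
}

function finishRequest(id) {
  // Dropping the last reference lets the garbage collector reclaim the data.
  delete cache[id];
}

for (var i = 0; i < 1000; i++) {
  handleRequest(i);
}
console.log(Object.keys(cache).length); // prints 1000: nothing can be collected yet

for (var j = 0; j < 1000; j++) {
  finishRequest(j);
}
console.log(Object.keys(cache).length); // prints 0: the strings are now unreachable
```

If finishRequest() were never called, every request would leave its data pinned in the Heap.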

Slide 6/14: Implementation Example

This code-run is an example of the previous slide. So, the very first thing that happens is that functions a() and b() are “hoisted” to the top of the script, and are added to the heap. We then run the first message log “Adding code to the queue” in the current stack. After that we run a setTimeout, and the anonymous function in there is added to the Queue. Then we do another log, and run the a() function with an argument of 42. We are now one level deep in the stack, and that frame knows about the a() function, the b() function, and its argument of 42. Within a() we run b(), and we are now two levels deep in our stack. We print more messages, leave b(), leave a(), and print a final message. At that point, our stack is empty and we’ve run all of our code, and are now ready for the next item in the queue.
Once we’re in the next queue item, we run the anonymous function (which exists in the Heap somewhere), and display our message.
At first glance, one might assume the message “Running next code from queue” could have been run earlier, perhaps after the first message. If this were a MultiThreaded application, that message could have been run at any point in time, randomly placed between any of the outputted messages. But, since this is JavaScript, it is guaranteed to run after the current stack has completed.
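The slide's code is not reproduced on this page, so what follows is a reconstruction from the description above and not the original. Only the messages "Adding code to the queue" and "Running next code from queue" come from the text; the other messages, the `log` helper, and the `output` array are assumptions added for illustration.

```javascript
var output = [];
function log(message) {
  output.push(message);
  console.log(message);
}

log('Adding code to the queue');
setTimeout(function () {
  log('Running next code from queue');
}, 0);

log('Calling a()');
a(42); // works before the definitions below because a() and b() are hoisted
log('Stack is about to be empty');

function a(value) {
  log('In a() with ' + value); // one level deep in the stack
  b();
  log('Leaving a()');
}

function b() {
  log('In b()'); // two levels deep in the stack
}
```

Run as a script, the queued message is always printed last, after "Stack is about to be empty".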

Slide 7/14: Sleeping


I come from a background in writing PHP/MySQL applications. When a PHP script runs, it performs a bunch of work, and then probably runs a MySQL query. Once that call is made to the external server, the application falls asleep. It literally halts everything it is doing and waits for a response from the database server. Once the result comes back, it does some further processing, and then it might perform another I/O function, such as calling an RSS feed. And, as you might guess, it falls asleep again.
Now, what if the call to the RSS feed doesn’t require any of the data we gain from the database call? Then the order of the two calls might not have mattered. But, more importantly, the two calls could have been run simultaneously! The application is as slow as the two calls combined, instead of being as slow as the slowest of the two.
Node.js does something pretty cool, where nearly every I/O request it makes is a non blocking call. This means that the current stack can finish right after making the call, and the callback gets run later on from the Queue, once the result is ready. If we’re performing a bunch of I/O operations, they can be run in parallel. The application will still sleep, but it won’t be blocking.
The web browser is the same. Most of the time it is doing nothing, perhaps waiting for a user to click on something, or waiting for an AJAX request to finish up.

Slide 8/14: Sequential vs Parallel I/O


This is a great graphic I adapted from the CodeSchool Real-Time Web with Node.js course. It shows how the I/O operations for sequential I/O compare to parallel I/O. The sequential graph represents calls made in a more traditional language such as PHP, whereas the parallel graph represents calls made in an EventLoop driven language with non blocking I/O, or even MultiThreaded applications. Notice that the parallel application is only as slow as the slowest I/O operation, instead of as slow as all I/O operations combined.
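The difference can be sketched with timers standing in for real I/O. `fakeIO` is a hypothetical helper, and the 40ms and 60ms latencies are invented; a real database query or feed request would take its place.

```javascript
// Hypothetical stand-in for a non-blocking I/O call that answers after `ms`.
function fakeIO(name, ms, callback) {
  setTimeout(function () { callback(name + ' done'); }, ms);
}

var timings = {};
var started = Date.now();

// Sequential: the feed request only starts after the database answers.
fakeIO('database', 40, function () {
  fakeIO('feed', 60, function () {
    timings.sequential = Date.now() - started; // roughly 40 + 60
    console.log('sequential: ' + timings.sequential + 'ms');
  });
});

// Parallel: both requests start immediately; we are done when both answer.
var pending = 2;
function oneFinished() {
  pending -= 1;
  if (pending === 0) {
    timings.parallel = Date.now() - started; // roughly max(40, 60)
    console.log('parallel: ' + timings.parallel + 'ms');
  }
}
fakeIO('database', 40, oneFinished);
fakeIO('feed', 60, oneFinished);
```

The counter in oneFinished() is the simplest way to notice that every outstanding request has come back; exact timings will vary a little from run to run.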

Slide 9/14: Other Language Event Loops

JavaScript isn’t the only language that can have an Event Loop. They can be implemented in the more traditional procedural languages as well. However, by having it built into the language, it’ll surely be quicker and have a nicer syntax.
Also, when it is implemented in another language, you lose out on the special features if your I/O is blocking, so you’ll have to be careful with which libraries you choose.
Some examples of Event Loops in other languages include Ruby’s EventMachine, Python’s Twisted and Tornado, and PHP’s ReactPHP.

Slide 10/14: Other Language Event Loop Example

Here’s an apples to oranges comparison of the Event Loop working in Node.js to perform a simple TCP echo example, and the (I’m assuming) same application working in Ruby’s EventMachine. I took the Node example from the homepage of nodejs.org, and the EventMachine example from their GitHub readme. They’ve been altered slightly to use the same text and hopefully perform the same function (I honestly don’t know Ruby though).
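Since the slide itself is not shown here, this is a sketch of the Node.js side only, along the lines of the echo server example from nodejs.org. The `createEchoServer` wrapper is an addition for illustration, and the port and address in the usage comment are example values.

```javascript
var net = require('net');

// Build a TCP server that greets each client and echoes back whatever it sends.
function createEchoServer() {
  return net.createServer(function (socket) {
    socket.write('Echo server\r\n');
    socket.pipe(socket); // everything read from the socket is written back to it
  });
}

// Usage (example port and address):
// createEchoServer().listen(1337, '127.0.0.1');
```

The callback passed to net.createServer() runs once per connection, from the Queue, so a single process can serve many clients without any threads.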
Notice that the syntax for JavaScript is more terse.

Slide 11/14: Event Loops are Awesome

There you have it folks, Event Loops are awesome. They don’t have the race conditions or deadlock issues that MultiThreaded applications have. Most web applications waste time waiting on I/O, and this is a good way around it. There is no special syntax for it to work in JavaScript; it is built in. It’s pretty easy to build stateful web applications (whereas if this were PHP you’d need a database to store shared data, in JS you could just use a local variable).
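As a tiny sketch of the stateful point: because the process keeps running between requests, shared data can sit in an ordinary variable. `handleVisit` is a hypothetical request handler; in Node it might be called from an HTTP server callback.

```javascript
// State lives in a plain variable for as long as the process runs.
var visits = 0;

// Hypothetical request handler that every incoming request would call.
function handleVisit() {
  visits += 1;
  return 'You are visitor number ' + visits;
}

console.log(handleVisit()); // prints "You are visitor number 1"
console.log(handleVisit()); // prints "You are visitor number 2"
```

The count is lost when the process restarts, so anything that must survive a restart still belongs in a database.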

Slide 12/14: Event Loops aren’t Awesome


There you have it folks, Event Loops aren’t awesome. If you perform a bunch of CPU intensive work, it will block your process and only use one core. Unless, of course, you use Node.js and offload work to another process. Or, if you’re in a browser, read the next slide. Memory leaks are also possible, as you’re running an application for a long time instead of temporarily. Unless, of course, you program cleanly and are able to avoid those issues.

Slide 13/14: Web Workers

Well, now that I spent this whole time telling you how JavaScript is a SingleThreaded language and you can’t make use of multiple cores, I’ll apologize for being a liar. The core of JavaScript is single threaded, and it’s been that way for many years. However, there’s this cool new thing that came out in the last few years called Web Workers. It will allow your browser (doesn’t exist in Node) to offload work to a separate thread. This feature is available in every modern web browser, so feel free to offload your work today.
How it works is you create a script, and throw some specifically formatted code in there. The main script loads it with var worker = new Worker('task.js');, where task.js is an existing JavaScript file. You also attach a bunch of event handlers to the created worker object, and interact with the worker that way. The script will run in its own instance of the JavaScript engine, and cannot share memory with the main thread (which has the nice side effect of preventing those race conditions).
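A rough sketch of both halves, shown in one listing for brevity. The function names (`sumOfSquares`, `installWorkerHandler`, `startWorker`) and the message shape `{ limit: ... }` are assumptions; in a real page the first two functions would live in task.js, which would call installWorkerHandler(self), and the last one would live in the main script.

```javascript
// The number crunching itself (would live in task.js).
function sumOfSquares(limit) {
  var total = 0;
  for (var i = 1; i <= limit; i++) {
    total += i * i;
  }
  return total;
}

// Worker side (task.js): answer each incoming message with the result.
function installWorkerHandler(scope) {
  scope.onmessage = function (event) {
    scope.postMessage(sumOfSquares(event.data.limit));
  };
}

// Main script side (browser only): create the worker and attach handlers.
function startWorker(onResult) {
  var worker = new Worker('task.js');
  worker.onmessage = function (event) {
    onResult(event.data);
  };
  worker.onerror = function (event) {
    console.error('Worker failed: ' + event.message);
  };
  worker.postMessage({ limit: 1000000 });
  return worker;
}
```

While the worker is busy adding numbers, the main thread stays free to respond to clicks and run items from its own Queue.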
When you want to pass information to and from the worker, you use something called message passing. This allows you to pass simple JSON objects around, but not complex objects that contain functions or anything referencing the DOM. A great use-case for Web Workers would be calculating a SHA1 hash or performing some map/reduce computations. Basically, anything that involves a ton of number crunching and isn’t all DOM operations.
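To get a feel for what survives the trip, the copy can be approximated with a JSON round trip. This is only an approximation: browsers copy messages with the structured clone algorithm, which rejects a function with an error instead of silently dropping it. The `message` object here is a made-up example.

```javascript
var message = {
  id: 7,
  numbers: [1, 2, 3],
  onDone: function () { return 'called'; } // functions cannot be sent
};

// Approximate the copy made during message passing with a JSON round trip.
var received = JSON.parse(JSON.stringify(message));

console.log(received.id);             // prints 7
console.log(received.numbers.length); // prints 3
console.log(typeof received.onDone);  // prints "undefined"
```

The receiving side always gets its own copy, never a reference, which is exactly why the two threads cannot step on each other's memory.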

Slide 14/14: Conclusion


There you have it, the JavaScript Event Loop. It is great for I/O bound applications, and horrible for CPU bound applications. Many people think the engine is MultiThreaded, or at least that it can do things in parallel. Turns out it can do I/O in parallel, but not CPU computations (unless using a separate process with Node.js or a Web Worker in the browser).
