Vertical Scaling with Cluster

The next topic is cluster. This module allows you to scale your processes so that you utilize the CPUs on each machine to the maximum capacity. The way it works, you have one master and multiple workers that do the job. Each of them is a separate process. By doing this, you can execute some heavy CPU [[00:00:30]] processing on those workers, because even if one of them is blocked, the others keep processing your requests. The master monitors the workers: if one of them crashes, it can restart it, and it can spawn new workers as well if you need more scaling.

And here's the code, which is very, very simple. First, you import the module, [[00:01:00]] which is a core module. Again, we don't need to install anything. Then you put in an if/else condition, actually two if/else conditions. With cluster.isMaster you check whether this is the master, and if it is, you use a for loop to create workers. You use .fork and the number of CPUs. That number comes from a different module, the os module, but you could also get that information from within your [[00:01:30]] Node.js process very easily with just a few lines of code. Then we fall back to cluster.isWorker. If it's not the master, it must be a worker. So one file is shared by both the master and the workers, and the worker branch is where your actual code goes. Typically it would be a server. The nice thing about cluster is that the workers all listen on the same [[00:02:00]] port, so in a way you get a load balancer by implementing clusters, because the load is distributed more or less evenly among the workers. The not so nice thing is that you need to modify your source code, but as you can see, it's very, very easy to do.
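The pattern described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the slide: the port number, the function names, and the `--start` flag guard are assumptions made for illustration. Note that `cluster.isMaster` is aliased as `cluster.isPrimary` since Node.js 16.

```javascript
// cluster.js: a minimal sketch of the master/worker pattern.
const cluster = require('cluster');
const http = require('http');
const os = require('os');

const PORT = 3000; // assumed port, not taken from the talk

// Master branch: fork one worker per CPU core and replace any that exit.
function startMaster(workerCount) {
  console.log(`master ${process.pid} is forking ${workerCount} workers`);
  for (let i = 0; i < workerCount; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} exited (${signal || code}), restarting`);
    cluster.fork();
  });
}

// Shared request handler: each worker answers with its own process id.
function handleRequest(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(`handled by process ${process.pid}\n`);
}

// Worker branch: every worker listens on the same port.
function startWorker() {
  http.createServer(handleRequest).listen(PORT);
}

function main() {
  if (cluster.isMaster) {
    startMaster(os.cpus().length);
  } else if (cluster.isWorker) {
    startWorker();
  }
}

// Hypothetical guard: start only when the --start flag is passed,
// so that loading this file does not fork processes by accident.
if (process.argv.includes('--start')) {
  main();
}
```

With this sketch you would run `node cluster.js --start` and then request `http://localhost:3000/` several times; the process id in the response should vary between workers. Forked workers receive the same command-line arguments as the master, so they also pass the guard.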

So let's go and do the demo. There's a file called cluster.js. You can execute it like any [[00:02:30]] other Node.js process: node cluster.js. I'm also using a tool called loadtest. You can get it with npm install -g loadtest or npm install --global loadtest. It's one word, no spaces, no hyphens. loadtest is similar to JMeter or Apache ab. It's a load testing or stress testing tool that submits multiple requests to our server. With each request, on the server you [[00:03:00]] would see a different result, a different process ID. Then, once we finish and terminate the server, you would also see the number of requests handled by each process, so you can check whether the load was distributed evenly or not.
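The demo's cluster.js is not shown in the transcript, so the following is only a sketch of how a worker might log its process ID per request and report its total on shutdown. The function names and the SIGINT handling are assumptions, not the author's code.

```javascript
// Sketch of per-process request counting for a cluster worker.
const http = require('http');

let requestCount = 0; // requests served by this process only

// Log which process served the request and count it.
function countingHandler(req, res) {
  requestCount += 1;
  console.log(`request served by process ${process.pid}`);
  res.end(`process ${process.pid}\n`);
}

// One-line report for this process.
function summary() {
  return `process ${process.pid} handled ${requestCount} requests`;
}

// Hypothetical helper: call this inside the worker branch of cluster.js.
function startCountingWorker(port) {
  const server = http.createServer(countingHandler).listen(port);
  // When the server is stopped with Ctrl+C in a terminal, SIGINT usually
  // reaches the workers too, so each one can print its own total.
  process.on('SIGINT', () => {
    console.log(summary());
    process.exit(0);
  });
  return server;
}

// Example load (assumed invocation): loadtest -n 1000 -c 10 http://localhost:3000/
```

Comparing the totals printed by the workers after a run gives a rough picture of how evenly the requests were spread.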

As for the other libraries I mentioned: the advantage of cluster is that it's part of the core. That's pretty much its only advantage. There are better and, should I say, [[00:03:30]] more feature-rich libraries. For example, pm2 is widely used. pm stands for process manager. I highly recommend one of those tools. You get a load balancer and zero-downtime reload, which basically means you can replace your running processes with newer code and your system stays alive the whole time. It also has good test coverage.

So this is [[00:04:00]] a pm2 example. As you can see, we don't modify the source code. There is no isMaster or isWorker. pm2 just magically takes your source code and scales it: it creates multiple processes, they all listen on the same port, and so you get the load balancer. Wonderful, wonderful tool. It also runs in the background, so you can just launch pm2 and it will start the processes automatically, as many processes as the [[00:04:30]] CPUs you have, and keep them running. It returns you to the command line. You can go back to pm2 and see the list of processes by running pm2 list. It will show you a nice table.

So again, I highly recommend pm2, or maybe StrongLoop cluster control as well. Or, if you need a custom solution, you can also use cluster, the core module.