Scaling hundreds of thousands of database clusters on Kubernetes
By Brian Morrison II
Containers have had an incredible impact on how applications are built, deployed, and distributed.
When building containerized applications, you no longer have to be concerned about dependency mismatches or the age-old “works on my machine” argument. And Kubernetes has simplified deploying and scaling these containerized applications. If a container crashes, Kubernetes can easily spin a new one up to handle the load!
But have you ever wondered if you can run a database in Kubernetes?
The short answer is “yes.” In fact, we utilize Kubernetes extensively at PlanetScale to support hundreds of thousands of databases all over the world. When deploying a database workload to Kubernetes, special considerations need to be made regardless of how big the workload is.
Let's explore how you might want to deploy databases on Kubernetes, and how PlanetScale does it.
A crash course on Kubernetes
Kubernetes is a container orchestration tool used by some of the largest enterprises around the world to manage their fleet of containerized applications.
In a Kubernetes cluster, multiple servers (nodes in Kubernetes parlance) are configured to work together to ensure that the containers deployed to them are always online and available. The smallest deployable unit in Kubernetes is known as a pod, which represents one container or a collection of containers. When a pod crashes for whatever reason, the environment is smart enough to spin up a new instance of that pod to keep the application online, whether that's on the same node or a different one.
This process is known as the Control Loop.
Control loops are run by Kubernetes controllers, while the Kubernetes scheduler decides which node each new pod runs on. Controllers rely on configuration files (written in YAML) that define what pods need to be running for a given application to stay online. A controller compares what's defined in the config file with what’s actually deployed on Kubernetes and takes the necessary steps to reconcile the differences.
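To make the compare-and-reconcile idea concrete, here is a minimal sketch in Python. It is a toy model, not Kubernetes code: the `Cluster` class, the `reconcile` function, and the `web` application name are all hypothetical, and the "cluster" is just an in-memory dictionary. The only thing it tries to show is the shape of the loop: read the desired state, look at the actual state, and act on the difference.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Cluster:
    """Toy in-memory stand-in for a cluster (hypothetical, not a Kubernetes API)."""
    # Maps an application name to the set of pod names currently running.
    running: dict[str, set[str]] = field(default_factory=dict)
    _ids: count = field(default_factory=count, repr=False)

    def start_pod(self, app: str) -> str:
        name = f"{app}-{next(self._ids)}"
        self.running.setdefault(app, set()).add(name)
        return name

    def stop_pod(self, app: str, name: str) -> None:
        self.running[app].discard(name)


def reconcile(desired: dict[str, int], cluster: Cluster) -> list[str]:
    """Run one pass of the loop and return the actions that were taken."""
    actions = []
    for app, want in desired.items():
        have = cluster.running.setdefault(app, set())
        # Too few pods: start replacements until the counts match.
        while len(have) < want:
            actions.append(f"start {cluster.start_pod(app)}")
        # Too many pods: stop the extras.
        while len(have) > want:
            name = sorted(have)[-1]
            cluster.stop_pod(app, name)
            actions.append(f"stop {name}")
    return actions


if __name__ == "__main__":
    desired = {"web": 3}                  # what the config file asks for
    cluster = Cluster()
    print(reconcile(desired, cluster))    # first pass starts three pods
    cluster.stop_pod("web", "web-1")      # simulate a pod crashing
    print(reconcile(desired, cluster))    # next pass starts one replacement
    print(reconcile(desired, cluster))    # nothing left to do
```

In a real cluster this pass runs continuously, so a crashed pod is noticed and replaced without anyone stepping in. Note that the replacement in this sketch is a brand-new pod with no memory of the one it replaces, which is exactly why stateful workloads need more thought.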
This type of setup works great for applications, but what happens when your database is running on a pod that crashes?