In today’s fast-paced development landscape, building software that is fast, scalable, and resistant to outages can make or break the success of any application. Containerized systems with orchestration layers like Kubernetes enable this robustness at the software level. Building the same protections into your database, however, often requires an entire team of developers and database administrators dedicated to managing multiple database replicas with custom sharding logic to protect against outages and enable high availability.
PlanetScale is able to offer these features by using Vitess to power all of the databases on our platform. In this article, I’ll explain what Vitess is, how it works, and why you should care.
Vitess is an open-source database clustering system for MySQL. At its core, it is a collection of systems that work together to make MySQL more resilient, scalable, and performant. It was originally built by the team at YouTube in 2010 to address the increasing database scaling demands of the platform. Today, it continues to scale massive companies like GitHub and Slack. The project is very actively maintained, with contributions from PlanetScale, Google, GitHub, Slack, Square, Stripe, and several more data-heavy companies.
Resiliency. Scalability. Performance. Read any modern database platform’s whitepapers and you’ll likely notice a lot of the same buzzwords, but let’s break down how Vitess ACTUALLY delivers these benefits.
At the heart of it, MySQL is an application just like any other. It is definitely more specialized than most others, but still has some of the same attributes. One of the best ways to increase the resiliency of any application is to add more instances of it. This way, if one goes down, the others can pick up the slack.
Vitess does this by running multiple instances of MySQL (on one or more servers) and using a lightweight proxy, known as VTGate, to intelligently route queries to the proper MySQL instance. Vitess can also automatically detect when a MySQL instance goes offline and determine the best candidate to take its place as the primary MySQL process to serve queries for a given table.
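To make the routing and failover idea concrete, here is a minimal Python sketch. It is a toy model, not the Vitess API: `Instance`, `ToyRouter`, and the `replication_lag_s` field are hypothetical names, and "promote the healthy replica with the least replication lag" is an assumed policy chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Instance:
    """One MySQL process in the toy cluster (hypothetical model)."""
    name: str
    healthy: bool = True
    replication_lag_s: float = 0.0  # assumed metric: seconds behind the primary


class ToyRouter:
    """Toy stand-in for a routing proxy; it only illustrates the concept."""

    def __init__(self, primary: Instance, replicas: list[Instance]):
        self.primary = primary
        self.replicas = replicas

    def _failover(self) -> None:
        # Assumed policy: promote the healthy replica that is least far behind.
        candidates = [r for r in self.replicas if r.healthy]
        if not candidates:
            raise RuntimeError("no healthy replica to promote")
        best = min(candidates, key=lambda r: r.replication_lag_s)
        self.replicas.remove(best)
        self.replicas.append(self.primary)  # old primary stays listed, still unhealthy
        self.primary = best

    def route(self, query: str) -> str:
        # Clients never pick an instance themselves; the router decides.
        if not self.primary.healthy:
            self._failover()
        return f"{self.primary.name} <- {query}"


if __name__ == "__main__":
    router = ToyRouter(
        Instance("mysql-a"),
        [Instance("mysql-b", replication_lag_s=2.0),
         Instance("mysql-c", replication_lag_s=0.5)],
    )
    print(router.route("SELECT 1"))   # mysql-a <- SELECT 1
    router.primary.healthy = False    # simulate the primary going offline
    print(router.route("SELECT 1"))   # mysql-c <- SELECT 1
```

The point of the sketch is that the client code never changes when the primary does: it keeps calling `route`, and the proxy layer absorbs the failure.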
Vitess allows you to scale massive MySQL databases via horizontal sharding with minimal application changes. It can split tables up across multiple MySQL instances to balance the load across multiple servers. When a query is received by VTGate, the system will automatically determine which MySQL instances a row or set of rows lives on, will adjust the query to simultaneously grab the rows from these instances, and return the data just as if you were querying data from a single database. All of this is completely transparent to the developer and, perhaps more importantly, to the user!
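The scatter-and-gather behavior described above can be sketched in a few lines of Python. This is an illustration under loud assumptions: `ToyShardedTable` is a hypothetical class, each shard is just an in-memory dict, rows are placed with a simple `id % shard_count` rule, and shards are queried one after another rather than simultaneously. Vitess's real sharding and query planning are considerably more sophisticated.

```python
from collections import defaultdict


class ToyShardedTable:
    """A `users` table split across several fake MySQL instances (dicts)."""

    def __init__(self, shard_count: int):
        self.shard_count = shard_count
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, user_id: int) -> int:
        # Assumed placement rule for this sketch: modulo on the primary key.
        return user_id % self.shard_count

    def insert(self, user_id: int, name: str) -> None:
        self.shards[self.shard_for(user_id)][user_id] = name

    def select_by_ids(self, user_ids: list) -> list:
        # 1. Work out which shard holds each requested row.
        ids_per_shard = defaultdict(list)
        for uid in user_ids:
            ids_per_shard[self.shard_for(uid)].append(uid)

        # 2. Send each shard only the ids it owns (sequentially here).
        rows = []
        for shard_index, ids in ids_per_shard.items():
            shard = self.shards[shard_index]
            rows.extend((uid, shard[uid]) for uid in ids if uid in shard)

        # 3. Merge into one result set, as if it came from a single database.
        return sorted(rows)


if __name__ == "__main__":
    table = ToyShardedTable(shard_count=4)
    for i in range(1, 11):
        table.insert(i, f"user{i}")
    print(table.select_by_ids([1, 2, 5]))
```

The caller passes a list of ids and gets back one merged list of rows; nothing in the calling code reveals how many shards were involved.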
The points made in the previous two sections alone would massively increase the performance of MySQL simply by balancing the load across multiple servers, but Vitess has a few other enhancements built in to squeeze out as much performance as possible. One of those enhancements is the way that Vitess manages connections between the different subsystems.
The various Vitess components are written in Go and internally communicate with one another over gRPC. With the concurrency features built into the Go language, Vitess is able to easily handle thousands of clients simultaneously. Every client (GUI, application, etc.) that connects to a Vitess instance establishes a lightweight connection to VTGate instead of MySQL directly. VTGate understands the MySQL protocol and performs the intelligent query routing mentioned earlier based on the current Vitess infrastructure. To avoid creating too many connections, each instance of MySQL has an associated process called VTTablet, to which VTGate sends the query.
Vitess takes the lightweight connections established by each client to VTGate and maps them to a smaller pool of MySQL connections managed by VTTablet. Because only VTTablet connects to the underlying MySQL process, the individual MySQL processes are not overloaded with connections, which lowers resource utilization.
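The mapping of many client connections onto a small pool can be sketched as follows. This is a generic connection-pool illustration in Python, not how VTTablet is implemented: `ToyPool` and `run_clients` are hypothetical names, the "connections" are plain strings, and `time.sleep` stands in for query execution.

```python
import queue
import threading
import time


class ToyPool:
    """A fixed-size pool of fake backend connections shared by many clients."""

    def __init__(self, size: int):
        self._conns = queue.Queue()
        for i in range(size):
            self._conns.put(f"mysql-conn-{i}")
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak_in_use = 0  # highest number of connections busy at once

    def execute(self, query: str) -> str:
        conn = self._conns.get()  # blocks until a backend connection is free
        try:
            with self._lock:
                self.in_use += 1
                self.peak_in_use = max(self.peak_in_use, self.in_use)
            time.sleep(0.01)  # pretend the backend is running the query
            return f"{conn}: {query}"
        finally:
            with self._lock:
                self.in_use -= 1
            self._conns.put(conn)  # hand the connection back for reuse


def run_clients(pool: ToyPool, n_clients: int) -> list:
    """Start n_clients threads that each send one query through the pool."""
    results = []
    results_lock = threading.Lock()

    def client(i: int) -> None:
        result = pool.execute(f"SELECT {i}")
        with results_lock:
            results.append(result)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    pool = ToyPool(size=3)
    results = run_clients(pool, n_clients=20)
    print(len(results), "queries served; peak backend connections:", pool.peak_in_use)
```

Twenty clients are served, yet the backend never sees more than three connections at a time. That is the trade the pool makes: a client may wait briefly for a free connection, and in exchange the database is shielded from connection spikes.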
PlanetScale prides itself on being the only MySQL-compatible serverless database that both scales and increases developer velocity, and Vitess is at the very center of it. Every single database created through PlanetScale spins up all of this infrastructure, with all the aforementioned benefits, in mere seconds for you to start building on. The end result is that developers who build on our platform get a MySQL database that truly has the capabilities to resist outages and scale to any size, without having to worry about managing the underlying infrastructure.