What makes up a PlanetScale Vitess database?

By Brian Morrison II | August 23, 2023

Hosting and providing performant, reliable, and scalable databases is no easy feat. PlanetScale strives to provide a straightforward and easy-to-use interface, but behind the scenes it takes a lot of complicated technology to create that simplicity. In this article, we’re going to explore what exactly makes up a PlanetScale database and how each component is leveraged by PlanetScale.

It all begins with Vitess and Kubernetes

Before we can dive into what a PlanetScale database is under the hood, there are a few concepts we need to explore first. Specifically, we’re going to touch on what Vitess is, what Kubernetes is, and how they contribute to your database on PlanetScale.

What is Vitess?

Vitess is a horizontally-scaling, MySQL-compatible database platform originally designed to address scalability at YouTube. In a nutshell, Vitess allows you to run multiple instances of MySQL across multiple servers and have them appear as a single MySQL instance to your application. Along with scalability, Vitess offers a number of other benefits:

Vitess can automatically rewrite and optimize queries that may impact the performance of your database.
Each Vitess cluster has a load balancer that can handle millions of simultaneous connections.
Caching logic is built-in to handle situations where identical queries are sent in at the same time. Instead of querying the database multiple times, the same dataset is returned to each client.
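To make the last point concrete, here is a minimal Python sketch of the idea: callers that send the same query text while it is already running wait for the one in-flight execution instead of starting another. It is an illustration only and not how Vitess implements this (Vitess is written in Go); the `Consolidator` class, the `run_query` callback, and the `demo` function are hypothetical names.

```python
import asyncio


class Consolidator:
    """Share one in-flight execution among callers sending identical query text."""

    def __init__(self, run_query):
        self._run_query = run_query  # coroutine function that really hits the database
        self._in_flight = {}         # query text -> asyncio.Task currently executing

    async def execute(self, sql):
        task = self._in_flight.get(sql)
        if task is None:
            # First caller for this query text: start the real execution.
            task = asyncio.create_task(self._run_query(sql))
            self._in_flight[sql] = task
            # Forget the task once it finishes so later calls run a fresh query.
            task.add_done_callback(lambda _: self._in_flight.pop(sql, None))
        # Every caller, first or not, waits on the same task and gets the same result.
        return await task


async def demo():
    calls = []

    async def slow_query(sql):
        calls.append(sql)          # record each real execution
        await asyncio.sleep(0.01)  # stand-in for database latency
        return [("row",)]

    consolidator = Consolidator(slow_query)
    results = await asyncio.gather(
        *(consolidator.execute("SELECT 1") for _ in range(5))
    )
    return len(calls), results
```

In `demo`, five concurrent callers issue the same query, but `slow_query` is entered only once and all five receive the same rows.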

Throughout the rest of the article, we'll dive a bit deeper into how Vitess is used by PlanetScale, along with other features built on top of Vitess that enhance our users' experience. That said, the biggest takeaway is:

Every database and branch on PlanetScale is an independent Vitess cluster.

Vitess components used by PlanetScale

There are several primary components used by PlanetScale within each Vitess cluster we host:

Each MySQL instance in a Vitess cluster is called a Tablet, which is a mysqld instance with a sidecar process called vttablet. Each cluster has at least one and it's where your data lives.
The vtgate is responsible for accepting queries and routing them to the proper tablet, breaking up the query, and dispatching it to multiple tablets if needed.
The entire cluster is controlled by a vtctld instance, a management interface that our internal systems communicate with to perform administrative operations.
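The routing role of vtgate can be pictured with a toy Python sketch. This is a simplification under stated assumptions: the `ToyGate` class is hypothetical, and it picks a tablet with a plain hash-and-modulo scheme, whereas Vitess uses its own sharding machinery (vindexes) to decide where rows live.

```python
import hashlib


class ToyGate:
    """Toy stand-in for a query router sitting in front of several tablets."""

    def __init__(self, tablets):
        self.tablets = list(tablets)

    def tablet_for(self, sharding_key):
        # Hash the key so the same key always maps to the same tablet.
        digest = hashlib.sha256(str(sharding_key).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.tablets)
        return self.tablets[index]

    def route(self, sharding_key=None):
        """Return the list of tablets a query must be dispatched to."""
        if sharding_key is None:
            # No key to narrow things down: send the query to every tablet.
            return list(self.tablets)
        return [self.tablet_for(sharding_key)]
```

For example, `ToyGate(["tablet-a", "tablet-b"]).route(42)` returns a single tablet, while `route()` with no key returns both, mirroring the "dispatch to multiple tablets if needed" behavior described above.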

While this section of the article provides a brief introduction to some of the Vitess components used by PlanetScale, you can find a more in-depth explanation in our What is Vitess: resiliency, scalability, and performance article, or in the Vitess documentation.

Vitess on Kubernetes

Containers provide a way to host applications and systems in an easily distributable format, regardless of the host system they may be running on. Kubernetes is an orchestration tool that provides a way for infrastructure engineers to define the containers and resources they need to run those applications and keep them online. When running in a cluster, Kubernetes will attempt to automatically distribute the load of your application across multiple hosts, as well as handle outages by automatically spinning up new resources to address offline ones as needed. Individual resources in Kubernetes are known as pods, which are collections of one or more containers running within the cluster.

Each of the Vitess components described in the previous section can be run within a Kubernetes environment, combining the benefits of resilient infrastructure with the horizontal scalability of Vitess for each database on PlanetScale.

How PlanetScale uses Vitess

Kubernetes is used by PlanetScale internally to spin up the necessary resources to host databases within Vitess clusters. Each database within PlanetScale gets its own dedicated Vitess cluster, including all of the necessary infrastructure required to keep it online and resistant to failures. When you create a database, we signal to our own Kubernetes environment that we need to create a new Vitess cluster. Once your database is created, it has at least one vttablet pod to store data and serve queries, a vtctld pod used for controlling the Vitess cluster, and one or more vtgate pods to proxy traffic to the vttablet pod(s), load balancing as needed.

Vitess and database branching

If every database in PlanetScale utilizes all of the above tech when it's created, then you might ask yourself, “How does PlanetScale handle branching my database?” Here’s the neat part: each branch in PlanetScale IS its own database, meaning that it also gets its very own Vitess cluster as described earlier in this article. The only key difference is that when you create a branch, we’ll spin up a new Vitess cluster for it and (using the vtctld component of the two clusters) apply the schema of the source database branch to the one you just created!

Advanced edge connectivity and routing

The MySQL protocol was designed and built way before the era of cloud computing. Typically MySQL servers were hosted in the same network as the applications connecting to them and not over the public internet. Connections to the database can easily be broken if the quality of the connection suffers, resulting in application outages or data loss. This can be especially problematic when connecting to your database over large geographical distances.

In order to help solve this issue, PlanetScale has an edge routing infrastructure across supported cloud providers and regions around the world. When applications connect to databases in PlanetScale, connections are established at the node closest to them. Those connections are then proxied over a global network back to your database in its home region, where the actual data lives.

Since TLS is terminated closer to the code and our own internal network is used when routing traffic to the database, the quality of the connections to the database is improved, resulting in lower latency and faster data access for your application.

Supercharging your database

At this point, you have a powerful, highly available database ready to go, but we don't stop there. Let's look at some of the additional functionality we tack on to your database.

Online schema change infrastructure

Making schema changes to a database can be stressful, as some changes may cause tables to be locked, which results in your application stalling while it waits for the changes to be applied to the database. Online schema change tools perform schema changes using various methods that avoid table locking. These tools, however, also require a certain skill set to manage properly.

Instead of having to implement online schema change tools such as gh-ost or pt-online-schema-change yourself, PlanetScale provides this functionality out of the box. By using the power of Vitess, we can schedule and execute online schema changes, as well as clean up old tables that are no longer being used after the migration process. This allows us to support the concept of database branching and deploy requests.

Database branches and deploy requests

When using a PlanetScale database, you can use database branches to apply and test schema changes without affecting your production database. When you want to merge changes between two branches, you open a deploy request to review the changes and apply them to the upstream branch. Deploy requests also have a number of other benefits. Our system can detect if changes to a table will be destructive to the underlying data, which helps avoid accidental data loss. A series of other checks are run to ensure that the changes being made are compatible with the overall system and won't cause any issues with the target database branch.
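As a rough illustration of what a destructive-change check looks for, the Python sketch below compares two schemas and reports dropped tables and columns. It is a toy under stated assumptions, not PlanetScale's actual checks: the `destructive_changes` function and the dict-of-sets schema representation are hypothetical, and real checks would cover more cases (such as column type changes).

```python
def destructive_changes(current, proposed):
    """Report changes that would discard data.

    Both arguments map a table name to the set of its column names.
    """
    findings = []
    for table, columns in current.items():
        if table not in proposed:
            findings.append(f"drop table {table}")
            continue
        # Columns present today but missing from the proposed schema lose their data.
        for column in sorted(columns - proposed[table]):
            findings.append(f"drop column {table}.{column}")
    return findings
```

Adding a column produces no findings, while removing one is flagged so a reviewer can confirm it is intentional.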

Schema reverts

When a deploy request is merged, we use a concept called a "shadow table", which is effectively a hidden table that contains the updated schema of the original table. During the process, data is synchronized between the live and shadow tables. When the deploy request is applied, we flip the status of the two tables, so the shadow table becomes the live table, and vice versa.

Schema reverts give you a window in which the old live table (now the shadow table) will remain in the system so it is available to be utilized in a situation where the applied schema changes cause an issue with your application. By reverting the schema, the status of both tables is flipped once more. Since data is still being synchronized between the two tables, any writes that have occurred during the revert window will be retained, but with the old schema. This gives you peace of mind that even if unintended changes occur to your database schema, you can quickly recover your application with no data loss.
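The swap-and-revert mechanics can be modeled with a small Python sketch. It is a toy model only: `ToyTable` and `ToyMigration` are hypothetical names, tables are in-memory dicts, and "synchronization" is reduced to writing each row to both tables.

```python
class ToyTable:
    """An in-memory table that only stores the columns its schema defines."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = {}  # primary key -> dict of column values

    def upsert(self, pk, values):
        # Columns unknown to this schema are dropped; missing ones become None.
        self.rows[pk] = {c: values.get(c) for c in self.columns}


class ToyMigration:
    """Keeps a live table and a shadow table in sync and lets them trade places."""

    def __init__(self, live, shadow):
        self.live = live
        self.shadow = shadow

    def write(self, pk, values):
        # Writes land on the live table and are mirrored to the shadow table.
        self.live.upsert(pk, values)
        self.shadow.upsert(pk, values)

    def swap(self):
        # The same flip serves as both the cut-over and the revert.
        self.live, self.shadow = self.shadow, self.live
```

After a first `swap()` the new schema is live; rows written afterwards are still mirrored to the old table, so a second `swap()` (the revert) brings back the old schema with those rows intact, minus any columns the old schema does not have.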

Automated, validated backups

All PlanetScale databases come preconfigured with automated, encrypted backups. You can also configure additional backup schedules as needed.

Whenever a backup occurs, the previous backup for that database is restored to a separate MySQL instance in the database architecture. The data that has changed since is then replicated into that instance before a new backup is taken. Because each backup cycle begins by restoring the previous one, all backups in PlanetScale are also validated, which prevents data loss due to corrupted backups.
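The reason this flow doubles as validation can be shown with a toy Python sketch: the next backup can only be produced if the previous one restores cleanly. Everything here is an assumption made for illustration, including the `restore` and `next_backup` functions and the use of JSON bytes as a stand-in for a real backup file.

```python
import json


def restore(backup_bytes):
    """Load a backup; raises ValueError if the bytes are corrupted."""
    return json.loads(backup_bytes.decode("utf-8"))


def next_backup(previous_backup, changes):
    """Build a new backup from the previous one plus the changes since then.

    `changes` is a list of (key, value) pairs; a value of None means "deleted".
    """
    data = restore(previous_backup)  # fails loudly if the old backup is bad
    for key, value in changes:
        if value is None:
            data.pop(key, None)
        else:
            data[key] = value
    return json.dumps(data, sort_keys=True).encode("utf-8")
```

A corrupted previous backup surfaces as an error at the next backup cycle, rather than at the moment someone urgently needs to restore from it.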

Primary and replicas

In database architecture, replicas are used to add high availability and improve performance by providing an additional copy of your database to read from. However, adding and maintaining replicas can be challenging in a traditional MySQL configuration.

PlanetScale handles this complexity for you. Every single production branch of a database in PlanetScale comes with at least one additional replica.

Availability zones and automated failovers

Databases on our Scaler Pro tier are set up with a more advanced infrastructure configuration that provides increased resiliency by distributing the Vitess components of a given database across multiple availability zones in the selected region.

In cloud infrastructure, availability zones (AZs) are separate data centers in a given geographical region. For example, whenever you create an EC2 instance in AWS in the us-east-1 region, you are prompted to select from 6 different AZs, provided no others are enabled.

Spanning a single service across multiple AZs is considered a best practice for high availability. Replicas of your database are automatically stored in separate AZs to avoid disasters that may occur at a single data center. In the situation where a disaster does occur and renders your primary MySQL node inaccessible, the Vitess instance running your database branch will automatically fail over to one of the replicas and elect it as the new primary, preventing the downtime that would otherwise occur.
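The failover decision can be sketched in Python as follows. This is a simplified model, not the logic Vitess uses: the `Node` dataclass and `fail_over` function are hypothetical, and the only selection rule is "prefer a healthy replica in a different AZ than the failed primary". Real systems also weigh factors such as how far behind each replica is.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    zone: str
    healthy: bool = True
    role: str = "replica"


def fail_over(nodes):
    """Return the current primary, promoting a replica first if the primary is down."""
    primary = next(n for n in nodes if n.role == "primary")
    if primary.healthy:
        return primary
    candidates = [n for n in nodes if n.role == "replica" and n.healthy]
    if not candidates:
        raise RuntimeError("no healthy replica available to promote")
    # False sorts before True, so replicas outside the failed zone come first.
    candidates.sort(key=lambda n: n.zone == primary.zone)
    new_primary = candidates[0]
    primary.role, new_primary.role = "replica", "primary"
    return new_primary
```

With a primary and one replica in the same zone plus a second replica in another zone, losing the primary promotes the replica in the other zone.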

A further breakdown of this is available in our Architecture doc.

Built-in monitoring with Insights

Monitoring the performance of queries sent to any database is critical to identifying which queries are not performing optimally, or worse yet, those that will overload your database resulting in slow application performance. To solve this issue, every database in PlanetScale comes with built-in query performance monitoring.

Some databases in PlanetScale receive millions of queries per second. Instead of tracking the statistics on individual queries, we run all queries through a normalization process that allows them to be identified based on their patterns. This allows us to fingerprint a specific query pattern and emit aggregated telemetry for that pattern such as the total execution time, number of queries, rows read, rows written, etc. Through this normalization process, the query data is anonymized by default so we don't track query parameters or the data itself.
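A minimal Python sketch of this normalization and aggregation idea is shown below. It is a simplification under stated assumptions: literals are replaced using regular expressions, whereas a production implementation would work from the parsed SQL statement, and the `normalize`, `fingerprint`, and `PatternStats` names are hypothetical.

```python
import hashlib
import re
from collections import defaultdict

_STRING = re.compile(r"'(?:[^'\\]|\\.)*'")  # single-quoted string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")  # standalone numeric literals
_SPACE = re.compile(r"\s+")


def normalize(sql):
    """Replace literal values with '?' so queries group by their pattern."""
    sql = _STRING.sub("?", sql)
    sql = _NUMBER.sub("?", sql)
    return _SPACE.sub(" ", sql).strip().lower()


def fingerprint(sql):
    """Short stable identifier for a query pattern."""
    return hashlib.sha256(normalize(sql).encode("utf-8")).hexdigest()[:16]


class PatternStats:
    """Aggregate telemetry per query pattern instead of per individual query."""

    def __init__(self):
        self.by_pattern = defaultdict(
            lambda: {"count": 0, "total_seconds": 0.0, "rows_read": 0}
        )

    def record(self, sql, seconds, rows_read):
        stats = self.by_pattern[normalize(sql)]
        stats["count"] += 1
        stats["total_seconds"] += seconds
        stats["rows_read"] += rows_read
```

Because only the normalized pattern is stored, the parameter values (such as a user ID or a name) never reach the aggregated statistics.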

Beyond aggregated statistics, Insights also captures details about individual executions of queries that take more than 1 second to execute, read more than 10k rows, or result in an error. If "complete query collection" is enabled we also record the raw SQL for these queries to provide additional context for debugging. Keeping a log of expensive or errored queries makes it easier to troubleshoot problematic queries and keep a handle on errors that your application may encounter.
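The capture rule described above is easy to express as a predicate. The thresholds come from the text; the `Execution` dataclass, the constant names, and the `should_capture` function are hypothetical names used for this Python sketch.

```python
from dataclasses import dataclass
from typing import Optional

SLOW_SECONDS = 1.0        # "more than 1 second to execute"
ROWS_READ_LIMIT = 10_000  # "read more than 10k rows"


@dataclass
class Execution:
    seconds: float
    rows_read: int
    error: Optional[str] = None


def should_capture(execution):
    """True if this single execution deserves its own detailed record."""
    return (
        execution.seconds > SLOW_SECONDS
        or execution.rows_read > ROWS_READ_LIMIT
        or execution.error is not None
    )
```

A fast query that reads a handful of rows is only counted in the aggregates, while a query that errors is captured even if it returned immediately.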

The Insights tab of your database provides summary statistics and interactive time series graphs for your database as a whole, and for individual query patterns. With this data in hand, it’s easy to identify which queries your application is executing at any given time and keep an eye on query performance.

Availability monitoring

Monitoring how queries affect the performance of your database is just one important thing that needs to be considered when designing robust database infrastructure. Another consideration would be ensuring that the database can send and receive network traffic using the infrastructure external to the database.

For databases hosted with PlanetScale, we also have custom monitoring systems built to ensure that traffic is flowing properly and the database can respond to queries as intended. This helps in avoiding situations where the database itself may be functioning perfectly fine, but upstream networking components or other infrastructure are impacted.

Security and access

Connecting your application to your database branch works exactly like connecting to any other MySQL instance: with a connection string. Connection strings can also have roles assigned to them, ranging from read-only to full schema-changing capabilities.

Administering your PlanetScale database has additional security features that are accessible in an easy-to-use dashboard. User accounts can have granular permissions assigned to them on an organization and database level. Single sign-on is available to enable users to log in with their existing ID provider.

Service tokens offer a way to automate the administration of your PlanetScale organization or database programmatically using our API or pscale CLI. Service tokens also have granular permissions, enabling you to lock them down based on their intended use.

We are also a GitHub Secrets Scanning partner. If service tokens or connection strings are accidentally published into a repository, the permissions for that given secret will automatically be revoked and you'll be notified.

Automatic MySQL version updates

Using the combined power of Vitess and Kubernetes, we're able to keep your database up to date with the latest version of MySQL, removing the difficulty of having to perform minor or major updates when new versions of MySQL are released. We're also able to make sure updates are applied successfully and easily roll back if needed.

Conclusion

We've made it very simple to create a database on PlanetScale, but as you can see, there is much more that goes into both creating databases and building in the common and sometimes necessary tooling that keeps your database running optimally. If you have any questions or comments about what you've read, feel free to send us a note on Twitter at @planetscale!