Introducing sharding on PlanetScale with workflows
By Benjamin Dicken
We just released our new workflows functionality, which provides you with recipes that run a series of predefined steps to perform actions on your database. Our first workflow enables you to horizontally scale your databases by moving tables to a sharded keyspace, all from within the PlanetScale dashboard.
If you're familiar with Vitess, this first workflow is similar to MoveTables, a critical component of managing your data distribution in Vitess. We are excited to now offer this functionality directly in PlanetScale.
Workflows functionality is available on all PlanetScale plans.
Why did we build this?
At PlanetScale, our goal is to be the most reliable and scalable platform for running relational databases. One of the keys to achieving this is sharding — a proven architecture used by some of the web's largest properties to scale their databases to several millions of queries per second.
Up until recently, much of the functionality for creating and configuring sharded databases was only accessible via our Enterprise plan.
The recent release of the Cluster configuration functionality allowed users to create their own sharded keyspaces as well as configure custom VSchema and routing rules. With the addition of workflows, you can also easily migrate existing data to sharded keyspaces and smoothly switch production traffic between them with no downtime.
Over the past few decades, many companies have faced significant challenges scaling their databases, especially at the point where a single server cannot handle all traffic. Being able to easily and safely transition to a sharded environment shifts this phase of a company's existence from an extreme pain point to a smooth, well-tuned process. We want to empower our users with the tools they need for scaling their database to meet any demands.
Let's take a look at how this works.
How to shard your tables with PlanetScale
You can refer to the Sharding quickstart for a full end-to-end tutorial. If you prefer video walkthroughs, check out our latest video.
Before starting a workflow, you'll want to ensure you have a sharded keyspace set up in addition to your unsharded source keyspace. You can view, modify, and create keyspaces from our recently released Cluster configuration interface. Navigate to “Cluster configuration” and create keyspaces as needed. In this example database, we already have two such keyspaces: gymtracker and gymtracker-sharded.
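For readers who have not written a VSchema before, here is a minimal sketch of what one might look like for the sharded keyspace. It follows the Vitess VSchema JSON layout, but the table name `exercise_log` and the sharding column `user_id` are assumptions made up for illustration; they are not taken from the example database.

```python
import json

# Sketch of a VSchema for a sharded keyspace such as gymtracker-sharded.
# The table "exercise_log" and column "user_id" are hypothetical.
vschema = {
    "sharded": True,
    "vindexes": {
        # "hash" is one of the built-in Vitess vindex types.
        "hash": {"type": "hash"},
    },
    "tables": {
        "exercise_log": {
            # The primary vindex decides which shard each row lives on.
            "column_vindexes": [
                {"column": "user_id", "name": "hash"},
            ],
        },
    },
}

print(json.dumps(vschema, indent=2))
```

Choosing a sharding column that most queries filter on (here, the hypothetical `user_id`) keeps those queries on a single shard.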
You can use a workflow to move one or more tables from the unsharded keyspace into a sharded one. To start a workflow, navigate to the “Workflow” UI:
Click “New workflow” to start a new workflow. In this menu, you'll be asked to give the workflow a name, select the source and target keyspaces, and select the table(s) you want to move.
After setting everything up, click “Validate”. PlanetScale will not allow you to start the workflow until all validation checks pass.
After clicking “Create workflow”, you enter the copying phase and can monitor the progress of the workflow. PlanetScale first copies your existing data from the source keyspace to the target. After this initial bulk copy completes, it continues to replicate any new changes from the source to the target. When you're ready to proceed, we recommend that you first Verify data (to ensure everything migrated correctly) and then Switch traffic.
After switching traffic, the sharded keyspace handles all traffic for the migrated tables. We also give you the option to switch traffic back to the unsharded keyspace, providing an escape hatch in case any unexpected problems arise from the sharded configuration.
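One way to picture the lifecycle described above is as a small state machine. The sketch below is only a mental model: the phase and action names are hypothetical and simplified, and the Sharding workflow state reference remains the authority on the states PlanetScale actually reports.

```python
from enum import Enum


class Phase(Enum):
    COPYING = "copying"          # initial bulk copy from source to target
    REPLICATING = "replicating"  # copy finished, new changes keep streaming
    SWITCHED = "switched"        # target keyspace serves production traffic


# action -> (phase the workflow must be in, phase it ends up in)
TRANSITIONS = {
    "copy_finished": (Phase.COPYING, Phase.REPLICATING),
    "verify_data": (Phase.REPLICATING, Phase.REPLICATING),
    "switch_traffic": (Phase.REPLICATING, Phase.SWITCHED),
    "reverse_traffic": (Phase.SWITCHED, Phase.REPLICATING),
}


def apply(phase, action):
    """Return the next phase, or raise if the action is not allowed yet."""
    required, result = TRANSITIONS[action]
    if phase is not required:
        raise ValueError(f"cannot {action} while {phase.value}")
    return result


phase = Phase.COPYING
for action in ["copy_finished", "verify_data", "switch_traffic", "reverse_traffic"]:
    phase = apply(phase, action)
    print(f"{action} -> {phase.value}")
```

In this model, reversing traffic returns the workflow to the replicating phase, which mirrors the escape hatch: you can investigate and later switch again.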
With just a few clicks, you can create a sharded keyspace with however many shards you'd like, move existing tables to that keyspace, and switch production traffic to be served from the sharded keyspace.
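To make the idea of "however many shards" concrete, here is a rough sketch of how rows can be spread across equally sized shards by hashing a sharding key. The shard names follow the Vitess key-range convention (for example `-80` and `80-` for two shards), but the hash is a stand-in: SHA-256 is used here purely for illustration and is not the function the Vitess `hash` vindex uses. The helper names are hypothetical.

```python
import hashlib
from collections import Counter


def shard_names(count):
    """Return Vitess-style key-range names for `count` equal shards."""
    if count < 1 or 256 % count != 0:
        raise ValueError("this sketch only supports counts that divide 256")
    if count == 1:
        return ["-"]
    step = 256 // count
    bounds = [f"{i * step:02x}" for i in range(1, count)]
    names = [f"-{bounds[0]}"]
    names += [f"{lo}-{hi}" for lo, hi in zip(bounds, bounds[1:])]
    names.append(f"{bounds[-1]}-")
    return names


def shard_for(key, count):
    """Pick a shard from the first byte of a stand-in hash of the key."""
    first_byte = hashlib.sha256(str(key).encode()).digest()[0]
    return shard_names(count)[first_byte // (256 // count)]


print(shard_names(4))
# Rough distribution of 1,000 hypothetical user IDs over four shards.
print(Counter(shard_for(user_id, 4) for user_id in range(1, 1001)))
```

Because the shard is derived from a hash rather than from the raw key, sequential IDs end up scattered across shards instead of piling onto one.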
Vitess workflows
Every PlanetScale database is powered by Vitess, which supports a number of workflows to help you manage your database cluster. These include:
- MoveTables: Allows you to move tables between keyspaces (that is, between logical databases within your cluster).
- Reshard: Changes the way that your data is sharded. Allows you to spread your data across more or fewer shards, depending on demand.
- Materialize: Allows you to create copies, aggregations, or views of the tables in your Vitess cluster.
- LookupVindex: Helps with the creation and population of Lookup Vindexes (lookup index tables that help queries execute faster).
- Migrate: Allows you to move tables between distinct Vitess clusters.
For this first release, we focused on supporting MoveTables, specifically when migrating a table from an unsharded keyspace to a sharded one. We believe that this is one of the most important workflows for our users, as it unlocks the ability to horizontally scale existing unsharded databases with minimal friction.
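For readers coming from self-managed Vitess, the dashboard steps correspond roughly to a sequence of `vtctldclient` commands. The sketch below only builds and prints those command lines; it does not run them. It assumes the command layout used by recent Vitess releases, which may differ in your version, and it leaves out connection flags such as the vtctld server address. The workflow name and table name are hypothetical; the keyspace names come from the example above. On PlanetScale, the workflow UI performs these steps for you.

```python
import shlex

WORKFLOW = "shard-exercise-log"   # hypothetical workflow name
SOURCE = "gymtracker"             # unsharded source keyspace
TARGET = "gymtracker-sharded"     # sharded target keyspace
TABLES = ["exercise_log"]         # hypothetical table name


def move_tables(action, *extra):
    """Build a MoveTables command line for the given action."""
    return [
        "vtctldclient", "MoveTables",
        "--workflow", WORKFLOW,
        "--target-keyspace", TARGET,
        action, *extra,
    ]


steps = [
    ("start copying", move_tables(
        "create", "--source-keyspace", SOURCE, "--tables", ",".join(TABLES))),
    ("verify data", [
        "vtctldclient", "VDiff",
        "--workflow", WORKFLOW, "--target-keyspace", TARGET, "create"]),
    ("switch traffic", move_tables("switchtraffic")),
    # Optional escape hatch, only used if the switch needs to be undone.
    ("switch back", move_tables("reversetraffic")),
]

for label, argv in steps:
    print(f"# {label}")
    print(shlex.join(argv))
```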
We intend to support more types of workflows in the future. For example, the ability to reshard is also important to allow users to self-manage their growing database systems on PlanetScale. We also plan to integrate the Migrate workflow into PlanetScale to help with migrations from outside sources.
Workflows resources
We have a number of resources to help you get up and running with PlanetScale workflows.
We also have detailed documentation that walks you through important concepts and instructions for running your own unsharded-to-sharded workflows.
- Sharding quickstart
- Workflows
- Avoiding cross-shard queries
- Vindexes
- Sharding workflow state reference
- Pre-sharding checklist
- Targeting the correct keyspace
- What is a keyspace?
If you have questions or feedback about workflows, contact us. We'd love to chat.