Vitess
Vitess is a battle-hardened open source technology invented at YouTube for deploying, scaling and managing large clusters of database instances. It now powers some of the largest sites on the Internet.
Why choose Vitess to scale your database?
- Leverage vertical and horizontal sharding
- Distribute data across 10s, 100s, or 1000s of servers
- Presented to the application as a single logical database
Read about sharding
- Configure and manage replicated MySQL servers
- VTOrchestrator monitors health of all nodes in cluster
- Automatic failover for near-zero downtime
Read about high availability in Vitess
- Connections are pooled via VTGate and VTTablet layers
- Break through connection limitations of vanilla MySQL
- Support massive application connection demands
Read more about connection pooling
- Integrates with pt-online-schema-change and gh-ost
- Start, monitor, and cancel migrations with confidence
- Command-line interface for migration interactions
Read about online schema migrations
- Compatible with most MySQL features and syntax
- Migrate existing MySQL databases into Vitess
- Great for both small and large MySQL-backed applications
Read more about MySQL compatibility
- Developed at YouTube to run on Borg
- Ideally suited for running databases in Kubernetes
- PlanetScale maintains an Operator for Vitess
Check out the operator
What do engineers have to say about Vitess?
Resources to get the most out of Vitess
Video course
Watch PlanetScale's Learn Vitess course to learn more about its architecture, how to shard, and how to use it with Kubernetes, all in less than 2 hours of watch time.
Official documentation
Check out the official Vitess documentation to learn more about Vitess' architectural details, starter tutorials, design docs, and explanations of concepts.
Slack community
Hop on to the Vitess Slack workspace to ask questions, get updates, and contribute your own database expertise to the community.
Latest release
Take a look at the Vitess v20 release blog to see what new features have been introduced.
Check out the source
Visit the Vitess GitHub page to see the source, browse open issues, and see what features are under active development.
Follow on X
Follow Vitess on X to stay up to date with the latest announcements from the Vitess community.
How does Vitess work?
Vitess is one of the best ways to scale a MySQL database cluster. Vitess utilizes vanilla MySQL, but adds proxy, query routing, monitoring, and control plane components to make scaling MySQL feasible. However, Vitess itself is a complex piece of software, with many separate components that work together to keep things operating smoothly. Let's take a look at what each component contributes to the bigger picture.
When an application needs to connect to a Vitess cluster, it does not make connections directly to MySQL. Instead, connections are made to a VTGate. The VTGate layer acts as the entry point to the cluster, proxies connections, and handles routing of incoming queries to the appropriate MySQL instances. A Vitess cluster will typically have at least three VTGates, each in a different availability zone. Large clusters may have hundreds or thousands of gates. The VTGates will then forward queries on to the appropriate keyspace or shard.
A Vitess cluster can be unsharded or sharded. In an unsharded cluster, all tables for a given logical database (keyspace) live on the same server. Each keyspace typically has at least two replica servers, used for high availability and for handling some of the read traffic. A sharded cluster is one where the tables of the database (keyspace) are spread across many servers. Each shard will have a primary and replicas.
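To make the sharded case concrete, here is a minimal sketch of range-based routing in Python. It assumes a keyspace split into two shards named `-80` and `80-` (the style Vitess uses for naming shards by keyspace ID range). The MD5-derived ID and the helper names `keyspace_id` and `shard_for` are illustrative assumptions, not Vitess's actual hashing or API.

```python
import hashlib

# Hypothetical two-shard layout. Each shard owns a half-open range
# [start, end) of the first byte of the keyspace ID.
SHARDS = {
    "-80": (0x00, 0x80),
    "80-": (0x80, 0x100),
}

def keyspace_id(sharding_key: int) -> bytes:
    """Map a sharding key to an 8-byte ID (MD5 is only a stand-in hash)."""
    return hashlib.md5(str(sharding_key).encode()).digest()[:8]

def shard_for(sharding_key: int) -> str:
    """Return the name of the shard whose range contains the key's ID."""
    first_byte = keyspace_id(sharding_key)[0]
    for name, (start, end) in SHARDS.items():
        if start <= first_byte < end:
            return name
    raise LookupError("no shard covers this keyspace ID")
```

Because the mapping is deterministic, every query for the same sharding key lands on the same shard, which is what lets the routing layer present many servers as one logical database.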
Another critical component of Vitess is the VTTablet. When a VTGate needs to forward a query to MySQL, it sends it to a VTTablet. VTTablet mediates all communication between VTGates and MySQL. This is needed so that Vitess can pool connections to the MySQL instances, allowing connections to have less of a memory impact. It also monitors the health and resource usage of the underlying MySQL instances.
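The pooling idea can be sketched in a few lines of Python. This is an illustration of the general technique under simple assumptions (a fixed pool size and a caller-supplied `connect` function); the class name `ConnectionPool` is hypothetical and this is not how VTTablet is implemented.

```python
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Share a fixed number of backend connections among many callers."""

    def __init__(self, connect, size):
        # Open all backend connections up front and park them as idle.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Hand the connection back for the next caller.
            self._idle.put(conn)
```

With a pool of size 2, a thousand sequential requests still open only two backend connections, so the database's memory cost tracks the pool size rather than the number of application clients.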
Replication from the primary to the replicas is handled by MySQL's built-in replication engine. If Vitess detects that the primary server has gone down or is having connectivity issues, it can automatically and quickly fail over, reassigning the primary role to one of the replicas. This, combined with Vitess' ability to buffer queries at the VTGate layer, allows for failover with minimal impact on connected applications.
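The buffering behavior described above can be illustrated with a small Python sketch. It assumes a single primary represented by a callable and a bounded in-memory buffer; the class `BufferingGate` and its methods are hypothetical names for illustration, and a real proxy would also need timeouts, concurrency control, and per-shard state.

```python
from collections import deque

class BufferingGate:
    """Hold queries while no primary is available, then replay them in order."""

    def __init__(self, primary, max_buffered=1000):
        self.primary = primary            # callable that runs a query, or None
        self.max_buffered = max_buffered  # cap so a long outage cannot exhaust memory
        self._pending = deque()

    def execute(self, query):
        if self.primary is not None:
            return self.primary(query)
        if len(self._pending) >= self.max_buffered:
            raise RuntimeError("buffer full: failing query instead of waiting")
        self._pending.append(query)
        return None  # a real proxy would block the caller here instead

    def begin_failover(self):
        # The old primary is gone; start buffering.
        self.primary = None

    def end_failover(self, new_primary):
        # A replica has been promoted; replay buffered queries in arrival order.
        self.primary = new_primary
        results = [new_primary(q) for q in self._pending]
        self._pending.clear()
        return results
```

From the application's point of view, a query issued during the failover window is delayed rather than rejected, as long as the buffer has room.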
Vitess also has a sophisticated control plane used for augmenting the cluster and monitoring health.
Every Vitess cluster must have a running topology server used to store metadata about the cluster's configuration. Vitess recommends using fellow CNCF project etcd for this component.
vtorc automatically detects faults and repairs components that are not running correctly. This component is critical for a highly available cluster. vtctld is a server that is responsible for handling cluster changes and workflows. This, paired with the client vtctl, gives you powerful command-line control over your cluster. VTAdmin is a web-based application that can be used to monitor your cluster.
Latest blog posts
Zero downtime migrations at petabyte scale
Building data pipelines with Vitess
Optimizing aggregation in the Vitess query planner
Announcing Vitess 20
Achieving data consistency with the consistent lookup Vindex
Summer 2023: Fuzzing Vitess at PlanetScale
- Over 300 TB of data managed by Vitess clusters
- 500,000 queries per second from primaries
- 5,000,000 queries per second from replicas
Watch the conference talk
- 38 TB managed by 80+ Vitess clusters and 200+ shards
- Over 1,000,000 queries per second served by Vitess
- Uses VDiff and VReplication for easy data movement
Watch the conference talk
Want to save time and money running Vitess?
Every PlanetScale database runs Vitess under the hood. Let us hold the pager, manage Vitess upgrades, set up sharding for you, and more.