Vitess

Vitess is a battle-hardened open source technology invented at YouTube for deploying, scaling, and managing large clusters of database instances. It now powers some of the largest sites on the Internet.

Why choose Vitess to scale your database?

Scalability with sharding

Leverage vertical and horizontal sharding
Distribute data across 10s, 100s, or 1000s of servers
Presented to the application as a single, logical database

Read about sharding

High availability

Configure and manage replicated MySQL servers
VTOrchestrator monitors health of all nodes in cluster
Automatic failover for near-zero downtime

Read about high-availability in Vitess

Connection pooling

Connections are pooled via VTGate and VTTablet layers
Break through connection limitations of vanilla MySQL
Support massive application connection demands

Read about connection pooling

Online schema migrations

Integrates with pt-online-schema-change and gh-ost
Start, monitor, and cancel migrations with confidence
Command-line interface for migration interactions

Learn about online schema migrations

MySQL compatibility

Compatible with most MySQL features and syntax
Migrate existing MySQL databases into Vitess
Great for both small and large MySQL-backed applications

Read about MySQL compatibility

Kubernetes-native

Developed at YouTube to run on Borg
Ideally suited for running databases in Kubernetes
PlanetScale maintains an Operator for Vitess

Read about the operator

No database storage system other than Vitess truly fit all of Slack's needs

— Michael Demmer, Principal Engineer @Slack

How Slack leverages Vitess to keep up with its ever-growing storage needs

Vitess has paved the way for us to unify all of our data storage infrastructure and our microservices infrastructure onto Kubernetes, and it's giving us a blueprint for what the rest of our data stores might look like on Kubernetes. That's been a great win for us as an infrastructure team.

— Alex Charis, Senior Software Engineer @HubSpot

How HubSpot manages their sharded cluster with Vitess

Vitess resources

Vitess video course

Watch PlanetScale's Learn Vitess course to learn more about its architecture, how to shard, and how to use it with Kubernetes, all in less than two hours of watch time.

Watch the course

Official documentation

Check out the official Vitess documentation to learn more about Vitess' architectural details, starter tutorials, design docs, and explanation of concepts.

Visit the docs

Slack community

Join the Vitess Slack workspace to ask questions, get updates, and contribute your own database expertise to the community.

Join the Slack community

Check out the source

Visit the Vitess GitHub page to see the source, browse open issues, and see what features are under active development.

Go to GitHub

Latest release

Take a look at the Vitess v21 release blog to see what new features have been introduced.

Read the blog post

Follow on X

Follow Vitess on X to stay up to date with the latest announcements from the Vitess community.

Follow Vitess

How does Vitess work?

Vitess is one of the best ways to scale a MySQL database cluster. Vitess uses vanilla MySQL but adds proxy, query routing, monitoring, and control plane components to make scaling MySQL feasible. However, Vitess itself is a complex piece of software, with many separate components that work together to keep things operating smoothly. Let's take a look at what each component contributes to the bigger picture.

When an application needs to connect to a Vitess cluster, it does not make connections directly to MySQL. Instead, connections are made to a VTGate. The VTGate layer acts as the entry point to the cluster: it proxies connections and routes each incoming query to the appropriate keyspace or shard, and from there to the appropriate MySQL instances. A Vitess cluster will typically have at least three VTGates, each in a different availability zone. Large clusters may have hundreds or thousands of gates.

A Vitess cluster can be unsharded or sharded. In an unsharded cluster, all tables for a given logical database (keyspace) live on the same server. Each keyspace typically has at least two replica servers, used for high-availability and for handling some of the read traffic. A sharded cluster is one where the tables of the database (keyspace) are spread across many servers. Each shard will have a primary and replicas.
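To make the sharded case concrete, here is a minimal sketch of how a row's sharding key could be mapped to a shard. It uses the Vitess convention of naming shards after hexadecimal key ranges (for example `-40` or `40-80`). The four-shard layout, the function names, and the use of SHA-256 are assumptions for illustration only; SHA-256 is a stand-in and not the hash Vitess itself applies.

```python
import hashlib
from typing import Optional, Tuple

# Shard names follow the key-range convention: "-40" covers keyspace IDs below
# 0x40..., "c0-" covers 0xc0... and above. This four-shard layout is illustrative.
SHARDS = ["-40", "40-80", "80-c0", "c0-"]


def keyspace_id(sharding_key: int) -> bytes:
    """Map a sharding key to an 8-byte keyspace ID.

    Assumption: SHA-256 is a stand-in here, not the hash Vitess uses.
    """
    return hashlib.sha256(str(sharding_key).encode()).digest()[:8]


def parse_range(shard: str) -> Tuple[bytes, Optional[bytes]]:
    """Turn a shard name such as "40-80" into (start, end) byte bounds."""
    start_hex, end_hex = shard.split("-")
    start = bytes.fromhex(start_hex)  # "" becomes b"", the lowest bound
    end = bytes.fromhex(end_hex) if end_hex else None  # None means unbounded
    return start, end


def shard_for(ksid: bytes) -> str:
    """Return the shard whose half-open key range [start, end) contains ksid."""
    for shard in SHARDS:
        start, end = parse_range(shard)
        if ksid >= start and (end is None or ksid < end):
            return shard
    raise ValueError("no shard covers this keyspace ID")


# Route a few hypothetical customer IDs to their shards.
for customer_id in (1, 2, 3):
    print(customer_id, shard_for(keyspace_id(customer_id)))
```

Because the ranges are contiguous and cover the whole key space, every key lands on exactly one shard, and splitting a shard only means dividing one range into smaller ones.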

[Figure: Vitess architecture]

Another critical component of Vitess is the VTTablet. When a VTGate needs to forward a query to MySQL, it sends it to a VTTablet. VTTablet mediates all communication between VTGates and MySQL. This is needed so that Vitess can pool connections to the MySQL instances, letting many application connections share a smaller number of MySQL connections and reducing their memory impact. VTTablet also monitors the health and resource usage of the underlying MySQL instances.
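The pooling idea can be illustrated with a small, self-contained sketch. This is not VTTablet's implementation; `FakeConnection` and `ConnectionPool` are hypothetical names, and the fake connection only counts queries instead of talking to MySQL. It shows the general technique: a fixed set of backend connections is borrowed and returned by many callers.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a real MySQL connection."""

    def __init__(self, conn_id: int) -> None:
        self.conn_id = conn_id
        self.queries_run = 0

    def execute(self, sql: str) -> str:
        self.queries_run += 1
        return f"conn {self.conn_id} ran: {sql}"


class ConnectionPool:
    """Fixed-size pool: many callers share a few backend connections."""

    def __init__(self, size: int) -> None:
        self._idle: "queue.Queue[FakeConnection]" = queue.Queue()
        self.connections = [FakeConnection(i) for i in range(size)]
        for conn in self.connections:
            self._idle.put(conn)

    @contextmanager
    def acquire(self, timeout: float = 1.0):
        # Block until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the query failed.
            self._idle.put(conn)


pool = ConnectionPool(size=2)

# 100 application requests are served by only 2 backend connections.
for request_id in range(100):
    with pool.acquire() as conn:
        conn.execute(f"SELECT {request_id}")

print([c.queries_run for c in pool.connections])
```

The backend only ever sees `size` connections, no matter how many application requests arrive; callers that find the pool empty wait for a connection to be returned rather than opening a new one.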

Replication from the primary to the replicas is handled by MySQL's built-in replication engine. If Vitess detects that the primary server has gone down or is having connectivity issues, it can automatically and quickly fail over, reassigning the primary role to one of the replicas. This, combined with Vitess' ability to buffer queries at the VTGate layer, allows for failover with minimal impact on connected applications.
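The interaction between failover and query buffering can be sketched with a toy model. Everything here is an assumption for illustration: the `Shard` and `BufferingGate` classes, the tablet names, and the buffer limit are hypothetical and do not mirror Vitess internals. The sketch only shows the general pattern of holding queries while no primary is available and replaying them once a replica has been promoted.

```python
from typing import List, Optional


class Shard:
    """Toy model of one shard: a primary plus replicas (names are hypothetical)."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary: Optional[str] = primary
        self.replicas = list(replicas)

    def fail_primary(self) -> None:
        self.primary = None

    def promote_replica(self) -> str:
        # Reassign the primary role to one of the replicas.
        self.primary = self.replicas.pop(0)
        return self.primary


class BufferingGate:
    """Holds queries while no primary is available, then replays them in order."""

    def __init__(self, shard: Shard, max_buffered: int = 100) -> None:
        self.shard = shard
        self.max_buffered = max_buffered
        self.buffer: List[str] = []
        self.executed: List[str] = []

    def execute(self, sql: str) -> None:
        if self.shard.primary is None:
            if len(self.buffer) >= self.max_buffered:
                raise RuntimeError("buffer full, rejecting query")
            self.buffer.append(sql)  # hold the query instead of failing it
            return
        self.executed.append(f"{self.shard.primary}: {sql}")

    def on_failover_complete(self) -> None:
        # Drain buffered queries to the newly promoted primary.
        pending, self.buffer = self.buffer, []
        for sql in pending:
            self.execute(sql)


shard = Shard(primary="tablet-100", replicas=["tablet-101", "tablet-102"])
gate = BufferingGate(shard)

gate.execute("INSERT 1")
shard.fail_primary()
gate.execute("INSERT 2")     # buffered while there is no primary
shard.promote_replica()
gate.on_failover_complete()  # replayed against the new primary

print(gate.executed)
```

From the application's point of view the second insert is merely slower, not failed, which is what keeps the impact of a failover small. The bounded buffer is a deliberate choice in this sketch: without a limit, a long outage would let pending queries pile up indefinitely.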

[Figure: Vitess shard]

Vitess also has a sophisticated control plane used for making changes to the cluster and monitoring its health.

Every Vitess cluster must have a running topology server, which stores metadata about the cluster's configuration. Vitess recommends using fellow CNCF project etcd for this component.

VTOrc (vtorc) automatically detects faults and repairs components that are not running correctly. This component is critical for a highly available cluster. vtctld is a server that is responsible for handling cluster changes and workflows. Paired with its client, vtctl, it gives you powerful command-line control over your cluster. VTAdmin is a web-based application that can be used to monitor your cluster.

[Figure: Vitess control plane / topology server]