Let's explore what happens as your database grows through different stages and some things to consider before implementing a scaling strategy. I'll break this discussion up into three (broad) categories of databases: small, medium, and large. These are hand-wavy classes, but by small, I mean something that can fit and function well on a single server and where the data size is in the single digit, 10s, or even 100s of gigabytes. By medium, I mean databases that reach a point where operating on a single server is no longer realistic. These databases have a high query load and may have data sizes in the 100s of gigabytes and even into the terabytes. Last, by large databases, I mean ones requiring 10s of thousands of simultaneous network connections, powering applications with millions of users and many terabytes of data.
At this level, ensuring your database is tuned correctly can be a huge win. There is no need to spend more money on a bigger server and fancy scaling techniques when simple query optimizations, indexes, and performance tuning will suffice. An excellent first step to take is identifying slow queries. After you've identified these, you need to do some root cause analysis and attempt to optimize these to reduce the load on the DB. If you're using PlanetScale, we have a handsome Insights platform that lets you quickly identify and drill in on such queries without digging through big text log files.
If you've identified problematic queries, you should invest some time into tuning these queries and the database environment to improve performance.
After trying out optimizations like this, you can also take the steps to scale vertically as demand grows. This involves upgrading to higher-end servers when you reach the capacities of current provisioning.
In the medium databases stage, there are several things you can do if scalability is a concern.
One option, of course, is to continue to scale vertically. Add more CPUs and RAM, and purchase a bigger hard drive. However, you can only grow in this direction so far, and if you continue to scale on this path, you may eventually reach the maximum server specs with no more room to scale up.
An excellent additional option is to break workloads into read-write and read-only databases. This is where replication comes into play. Replication allows you to synchronize your data across multiple MySQL servers, providing additional resources to process the queries from your application.
Suppose your database has gotten so big that no amount of query optimization or indexing can be further added. In that case, consider diving into the other strategies outlined in this course. This requires horizontal scaling, in other words, splitting your database across multiple machines.
In this video course, I'll cover many of these concepts and, in some cases, show how we can use these techniques to help scale Juniper Studios' newly released game GalaxySurfer!
No matter what stage of the database you are currently at, this course can benefit you. If you're currently only operating with a small database, there is still high value in knowing what options are available to you ahead of time. It can help you make better decisions on what to do with your database as it grows.