Monitoring your database is especially important in a sharded environment since you’ll have many servers to manage. You’ll need to ensure that each physical shard in the environment is monitored for things like general availability, resource over-utilization (CPU, Memory, Storage, etc), network usage and query performance.
It's important to keep in mind that one of the goals of a sharded database should be to balance the load across the hardware. If one or a small number of servers are consistently used more than the others, this denotes a hotspot. If you determine that your sharded database has a hotspot, you may need to reconsider the selected shard key for that table. This can be caused by a bad selection of the shard key during the planning phase, but also because your application has grown organically and the data has become unevenly distributed.
The resharding of a table in a sharded database environment can be performed as follows:
- New physical shards are created
- Data is migrated to the new shards. This can be performed either online or offline. If performed online, data changes will need to be streamed into the new sharding configuration as they are made to keep the shards up to date.
- Once the new shards are populated with data and everything is accurate, the application will swap to querying the new shards.
- Finally, the old shards are discarded.
Other common maintenance tasks required by both sharded and "regular" databases include:
Backups - Follow common best practices for backing up data like following the 3-2-1 rule and validating that backups are not corrupted by regularly testing the backups. For clarity, the 3-2-1 rule states that your data should be: in three different locations, on two different mediums, with at least one off site. Of course, you may need to modify this rule based on the durability requirements of your data.
Pruning data - Clearing out archived data from the sharded environment can speed up query times or even allow you to scale down the number of shards you have, reducing infrastructure costs. If your data retention policies allow, keeping your data well pruned can save you money, and improve DB performance.
Index optimization - Ensure that queries are optimized to use indexes, creating indexes as needed based on your data access patterns.
Schema changes - When you need to make schema changes (adding/removing columns, changing column types, etc), those operations may need to be performed on many shards.
PlanetScale uses Vitess, which has a superb architecture for managing shards. Vitess uses VTGates, which act as a layer that sits between your shards and connections to the db. These, along with several other components of Vitess, are able to manage much of the complexities of sharding data and distributing queries across the shards.