Partitioning example

Want an easy-to-use, performant, MySQL database? Try PlanetScale now

Walking through a partitioning example

Let's experiment with partitioning Juniper Studio's message table for the GalaxySurfer game. As a reminder, here's what the table looks like in the original game's schema:

CREATE TABLE IF NOT EXISTS message (
  id INT AUTO_INCREMENT NOT NULL,
  to_id INT NOT NULL,
  from_id INT NOT NULL,
  message varchar(512),
  created_at DATETIME NOT NULL,
  edited_at DATETIME,
  deleted_at DATETIME,
  flagged BOOLEAN,
  PRIMARY KEY (id),
  FOREIGN KEY (to_id) REFERENCES player(id),
  FOREIGN KEY (from_id) REFERENCES player(id)
);

If we eventually end up with one million users, each user has an average of 10 friends they message, and each pair has an average of 100 messages, this table would reach 1 billion rows! Since this table could get large, it is a candidate to be partitioned.

Example query

Say I want to see all messages sent for the month of October 2020. To do this, I might write a query like this:

SELECT * FROM message WHERE created_at BETWEEN '2020-10-01 00:00:00' AND '2020-10-31 23:59:59';

If I don't have an index on the created_at column, this would be slow for a large table. You can use EXPLAIN and EXPLAIN ANALYZE to further investigate how the query would be executed without partitioning.

Partitioning the message table

The following commands can be run to change the table to be broken up into ten partitions:

ALTER TABLE message DROP CONSTRAINT message_ibfk_1;
ALTER TABLE message DROP CONSTRAINT message_ibfk_2;
ALTER TABLE message DROP PRIMARY KEY, ADD PRIMARY KEY(id, created_at);
ALTER TABLE message PARTITION BY RANGE ( year(created_at) ) (
    PARTITION p1 VALUES LESS THAN (2015),
    PARTITION p2 VALUES LESS THAN (2016),
    PARTITION p3 VALUES LESS THAN (2017),
    PARTITION p4 VALUES LESS THAN (2018),
    PARTITION p5 VALUES LESS THAN (2019),
    PARTITION p6 VALUES LESS THAN (2020),
    PARTITION p7 VALUES LESS THAN (2021),
    PARTITION p8 VALUES LESS THAN (2022),
    PARTITION p9 VALUES LESS THAN (2023),
    PARTITION p10 VALUES LESS THAN MAXVALUE
);

There are a few things to take note of here. For one, I had to drop the foreign key constraints since this is not allowed with a partition like this. Secondly, I added created_at to the primary key, as the partition column must be a part of a PK.

Another thing to keep in mind here is that there are options that don't require you to do disk management from within MySQL. Suppose you want to spread data in a table across multiple physical disks. In that case, this can also be done by configuring a RAID or LVM setup of the disks outside of MySQL and then placing the database on one logical volume, though there may be multiple physical volumes under the hood that MySQL need not bother with. RAID / LVM may be a better route, depending on your use case.

Partitions with `EXPLAIN`

Now that we have a partition, let's look at the explanation for the query.

mysql> EXPLAIN SELECT * FROM message;
+----+-------------+---------+-------------------------------+------+---------------+------+---------+------+---------+----------+-------+
| id | select_type | table   | partitions                    | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra |
+----+-------------+---------+-------------------------------+------+---------------+------+---------+------+---------+----------+-------+
|  1 | SIMPLE      | message | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9 | ALL  | NULL          | NULL | NULL    | NULL | 9923747 |   100.00 | NULL  |
+----+-------------+---------+-------------------------------+------+---------------+------+---------+------+---------+----------+-------+
1 row in set, 1 warning (0.00 sec)

This shows us that the partitions would be used in the scan. If we instead execute a query that better leverages the partitioning, such as getting all of the messages for a single month in 2015, we can see different results:

mysql> EXPLAIN SELECT * FROM message WHERE created_at BETWEEN '2015-10-01 00:00:00' AND '2015-10-31 00:00:00';
+----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | message | p5         | ALL  | NULL          | NULL | NULL    | NULL | 322965 |    11.11 | Using where |
+----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Notice that it only needed to look at one partition!

Partition location

When creating the partitioned table, you can specify a different location for each partition if desired. First, ensure any additional locations you want to store them in are added to the innodb_directories list. You can change this option in the MySQL configuration file. For example, in my case, it would be:

innodb_directories="/Users/bddicken/Desktop"

Restart MySQL, and try re-running the partitioning command. This time, you can specify a DATA DIRECTORY to send a partition to, as long as it is in the default data dir or one of the paths in the innodb_directories configuration option.

....
PARTITION p0 VALUES LESS THAN (2015) DATA DIRECTORY = '/Some/path',
....

About this lesson

This lessons covers how to use the partitioning feature of mysql, and how to specify where a partition should live in a file system.

13:50

February 29, 2024

Feedback or questions? Reach out to our team at education@planetscale.com.