The Slotted Counter Pattern
By Sam Lambert |
It is a common database pattern to increment an INT
column when an event happens, such as a download or page view.
You can go far with this pattern until bursts of these types of events happen in parallel and you experience contention on a single row. When multiple transactions are trying to update the counter, you are essentially forcing these transactions to run serially, which is bad for concurrency and can cause deadlocks. You can also see dramatic increases in query time when bursts like this occur.
You can check if you are experiencing contention by running the following:
SHOW ENGINE INNODB STATUS\G
In the output, you’ll see some information about granting a lock:
---TRANSACTION 79853106, ACTIVE 5 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s) MySQL thread id 24, OS thread handle 6281670656, query id 107 localhost root updating UPDATE slotted_counters SET count = count + 1 WHERE id = 1 ------- TRX HAS BEEN WAITING 5 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 2 page no 4 n bits 184 index PRIMARY of table `github`.`downloads` trx id 79853106 lock_mode X locks rec but not gap waiting Record lock, heap no 2 PHYSICAL RECORD: n_fields 7; compact format; info bits 0 0: len 4; hex 80000001; asc ;; 1: len 6; hex 000004c27630; asc v0;; 2: len 7; hex 020000017d0ce9; asc } ;; 3: len 4; hex 8000007b; asc {;; 4: len 4; hex 800001c8; asc ;; 5: len 4; hex 80000019; asc ;; 6: len 4; hex 8230df9b; asc 0 ;;
You can see that this transaction has been waiting a significant amount of time to acquire a lock to increment the counter on this single row. It is clashing with other competing transactions.
MySQL is the main database for GitHub.com
, and back in the day, when a number of PlanetScale folks worked there, we had to do this kind of counting differently. We decided on using a separate table with a schema similar to this:
CREATE TABLE `slotted_counters` ( `id` int(11) NOT NULL AUTO_INCREMENT, `record_type` int(11) NOT NULL, `record_id` int(11) NOT NULL, `slot` int(11) NOT NULL DEFAULT '0', `count` int(11) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `records_and_slots` (`record_type`,`record_id`,`slot`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8
record_type
— The type of counter (allows us to keep the table generic)record_id
— Identifies whatever we are counting, it could map to a repository id for exampleslot
— The slot we are going to incrementcount
— The count for each slot
A typical increment query would look like:
INSERT INTO slotted_counters(record_type, record_id, slot, count) VALUES (123, 456, RAND() * 100, 1) ON DUPLICATE KEY UPDATE count = count + 1;
The idea here is that instead of incrementing a single row for a counter, we are now picking a slot and incrementing the count in that slot. This means instead of hammering a single row, we are spreading the updates across 100 rows and reducing the potential for contention.
Once we have run the above INSERT
a few times, we can see the counter rows:
mysql> select * from slotted_counters; +----+-------------+-----------+------+-------+ | id | record_type | record_id | slot | count | +----+-------------+-----------+------+-------+ | 1 | 123 | 456 | 2 | 21 | | 2 | 123 | 456 | 52 | 99 | | 3 | 123 | 456 | 55 | 321 | | 4 | 123 | 456 | 0 | 442 | | 7 | 123 | 456 | 48 | 69 | | 8 | 123 | 456 | 20 | 661 | | 9 | 123 | 456 | 56 | 62 | | 10 | 123 | 456 | 18 | 371 | | 11 | 123 | 456 | 22 | 127 | | 12 | 123 | 456 | 58 | 33 | | 13 | 123 | 456 | 23 | 322 | +----+-------------+-----------+------+-------+ 11 rows in set (0.00 sec)
Getting the count for record_id
456 is as simple as this SELECT
query:
SELECT SUM(count) as count FROM slotted_counters WHERE (record_type = 123 AND record_id = 456);
Now we can have requests executing counter increments in parallel without causing contention and effecting concurrency.
There are a few different ways you can implement this pattern, but it comes down to the architecture of your app. One way would be to query the slotted_counters
table to roll up the data and update a column stored with the rest of the data.