When designing a database, it is often necessary to store "flags," or true/false values. One way to do this is to create a separate column for each value, but as the number of flags increases, this approach becomes impractical. One solution is to use a JSON column to store all the flags in a single field. However, this approach comes at the cost of increased space usage. In this video, we explore an alternative solution that uses a tiny integer column to store multiple bits of information in a single field.
Suppose we have a users
table and we want to add a column to store true/false flags. Instead of using a JSON column, we can use a tiny integer column to store multiple bits of information in a single field. Let's see how this works.
First, we need to add a column to the users
table:
ALTER TABLE users ADD COLUMN flags TINYINT UNSIGNED DEFAULT 0;
Notice that we're setting the default value to 0 and making the column tinyint unsigned
to store only positive integers.
After adding the column, we can store multiple flags in a single integer. To do this, we need to understand how bits and bytes work.
A byte is a unit of digital information that consists of eight bits. A bit is a binary digit that can be either 0 or 1. By manipulating the bits in an integer, we can store multiple flags in a single field.
For example, suppose we have the following eight flags:
- dark_mode
- super_admin
- notification_opt_in
- metered_billing
- rollout_chat
- experiment_blue
- log_verbose
- new_legal_disclaimer
We can assign each flag to a bit in the integer column as follows:
00000000
│││││││└─ Bit 1: dark_mode
││││││└── Bit 2: super_admin
│││││└─── Bit 3: notification_opt_in
││││└──── Bit 4: metered_billing
│││└───── Bit 5: rollout_chat
││└────── Bit 6: experiment_blue
│└─────── Bit 7: log_verbose
└──────── Bit 8: new_legal_disclaimer
Using this mapping, we can store up to eight flags in a single byte. To store multiple flags, we need to turn on the corresponding bits.
For example, suppose a user has turned on the dark_mode
and rollout_chat
flags. To represent this, we need to turn on bits 1 and 5. Using binary notation, this would look like this: 00010100
.
┌───── Bit 5: rollout_chat ┐
│ ├─ These are set to 1, therefore true
│ ┌─ Bit 1: dark_mode ┘
00010001
│││ ││└── Bit 2: super_admin ┐
│││ │└─── Bit 3: notification_opt_in │
│││ └──── Bit 4: metered_billing ├─ These are set to 0, therefore false
││└────── Bit 6: experiment_blue │
│└─────── Bit 7: log_verbose │
└──────── Bit 8: new_legal_disclaimer ┘
To convert this to an integer value, we can add up the decimal value of each bit that is turned on. In this case, that would be 1 + 16 = 17.
00010001
│ └─ dark_mode = 1
└───── rollout_chat = 16
────
17
We can store this value, 17, in the flags
column for this user.
To query the flags, we can use bitwise operators. For example, to find all users who have the dark_mode
flag turned on, we can use the following SQL statement:
SELECT * FROM users WHERE flags & 1 = 1;
In this statement, we use the bitwise AND operator (&
) to compare the value of the flags
column with the binary value of 1 (00000001
), which represents the dark_mode
flag. If the result of the bitwise AND operation is 1, it means that the dark_mode
flag is turned on, and we include that user in the result set.
Similarly, to find all users who have both the dark_mode
and rollout_chat
flags turned on, we can use the following SQL statement:
SELECT * FROM users WHERE flags & 17 = 17;
In this statement, we use the bitwise AND operator (&
) to compare the value of the flags
column with the binary value of 17 (00010100
), which represents both the dark_mode
and rollout_chat
flags. If the result of the bitwise AND operation is 17, it means that both flags are turned on, and we include that user in the result set.
Using a tiny integer column to store multiple flags in a single field is an efficient way to save space in the database. However, it comes with some tradeoffs.
Reduced readability: The binary representation of flags in the flags
column is not human-readable. It can be difficult to interpret the meaning of the flags without consulting a mapping table.
Limited number of flags: The number of flags that can be stored in a single byte is limited to eight. If we need to store more flags, we need to use a larger integer column or a JSON column.
Application logic overhead: To use the bitwise operators to query the flags, we need to write additional application logic that maps the flags to their corresponding bit values. This can add complexity to the codebase.
We explored an alternative solution to using a JSON column to store multiple flags in a single field. Using a tiny integer column to store multiple bits of information in a single field is an efficient way to save space in the database. However, it comes with some tradeoffs, including reduced readability and additional application logic overhead. Whether or not to use this approach depends on your specific use case and the tradeoffs you are willing to make!