When it comes to database optimization, finding ways to improve query performance is crucial. One approach to unlocking indexes without making changes to the table structure is to use redundant and approximate conditions. Although not a common occurrence, these types of conditions have the potential to make a significant impact on performance once identified. In this video, we'll explore what redundant and approximate conditions are, why they're valuable, and how to use them to improve query performance.
Redundant conditions refer to query conditions that logically cannot change the result set. Even though they don't produce any changes in the result, they're still valuable because they can help optimize the query. For instance, consider the following example of querying a people table:
SELECT * FROM people WHERE id <= 5
This query returns the first five rows from the people table based on the given condition. However, if we modify the query as shown below, we still get the same result.
SELECT * FROM people
WHERE
id <= 5
AND
id <= 10
The second condition is redundant because it doesn't change the result of the query at all. The key benefit of redundant conditions lies in the fact that they help unlock indexes without any changes to the table. So, it's worth looking out for these types of conditions when optimizing your database. In this example we're not getting any benefit from the redundant condition, but there are times when we can get huge benefits.
Consider a scenario where you have a to-do list and want to find all the to-do's that are due in the next 24 hours. You may have an index on the due date, but not the due date combined with due time. In this case, you can make use of the redundant condition, which logically cannot change the results and can help you use the available index more efficiently.
Here's the query without the redundant condition:
SELECT * FROM todos
WHERE
ADDTIME(due_date, due_time) BETWEEN NOW() AND NOW() + INTERVAL 1 DAY
The above query retrieves all the to-do's that are due within the next 24 hours based on the due_date
and due_time
columns. Unfortunately we've obfuscated the index on the due_date
column.
In this case we can add a redundant condition that is broader than our original condition, but is not index-obfuscated.
SELECT * FROM todos
WHERE
ADDTIME(due_date, due_time) BETWEEN NOW() AND NOW() + INTERVAL 1 DAY
AND
due_date BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL 1 DAY
Using the approximate condition helps to narrow down the result set even though it may include a few false positives. After narrowing down the result set, you can then use the expensive filtering, which is not indexable, to remove the false positives and get the correct result set.
Redundant and approximate conditions offer a powerful tool for optimizing database queries. Even if they aren't terribly common to find in the wild, they are fun to discover because they require no changes to the table structure to unlock the use of indexes.