If you've worked with databases for a while, you've probably learned that adding indexes can improve performance. This is especially true for large tables when you are querying with
ORDER BY clauses.
An index basically works by storing a copy of part of the data in a different order, so that it can be accessed more quickly — kind of like adding a table of contents to a book.
For a more detailed explanation of how indexes work and how you can use them, check out this article: How do database indexes work? If you want to dive even further to indexes, we have 17 videos on indexing that cover everything from how indexes and B+Trees work to knowing where and when to add indexes.
Making good use of indexes can reduce query run time from seconds to milliseconds. The first time you get a performance boost like that, you might feel inclined to add indexes to every column of every table in your database just because you can. But this is not always a good idea, as there can be drawbacks to adding too many secondary indexes.
Let's go over some of the possible downsides of using too many database secondary indexes.
The first and perhaps most obvious drawback of adding indexes is that they take up additional storage space. The exact amount of space depends on the size of the table and the number of columns in the index, but it's usually a small percentage of the total size of the table. A basic index only needs to store the values of the indexed columns as well as a pointer to the row in the table. So for a column that contains integers, the index will only need to store the integer values. This space will increase if the column contains strings because the index will need to store the string values as well as the length of each string.
This is important to consider if you have large datasets, as adding multiple indexes to a table can quickly use a significant amount of additional storage space.
When you add an index, it has to be updated whenever a row is inserted, updated, or deleted. This means that writes will be slower. Before you add an index, you should consider whether you will be doing a lot of writes to the table and whether or not you can afford to slow down the writes.
As an example, in one application I worked on, doing a bulk insert of about a million records only took around 10-15 seconds without any indexes. Unfortunately, the performance of certain frequently used queries was quite slow, taking a few seconds to run and causing a bad user experience. Adding several indexes for such queries improved the performance significantly, but the bulk insert now takes closer to two minutes. That is a significant difference in write performance, but in this particular case, it was an acceptable trade-off, as the bulk insert is done infrequently and can be done during off-peak hours when the application is not used heavily.
If something like this bulk insert was triggered by users who had to sit and wait for it, then it might be a different story, and I may have weighted the impact of the slower writes differently.
To keep your database efficient, it's important to find and remove any unused indexes. In MySQL, you can use the following query to find indexes that are not being used (replace
your_database_name with the name of your database):
SELECT table_name, index_name, non_unique, seq_in_index, column_name, collation, cardinality, sub_part, packed, index_type, comment, index_commentFROM information_schema.STATISTICSWHERE table_schema = 'your_database_name'AND index_name != 'PRIMARY'AND (cardinality IS NULL OR cardinality = 0)ORDER BY table_name, index_name, seq_in_index;
This query checks the cardinality of each index, which is the number of unique values in the index. If this value is 0, then the index is not being used.
If you find an unused index,
your_index_name, in a table called
your_table_name, you could remove it with the following query:
ALTER TABLE your_table_name DROP INDEX your_index_name;
If you have some indexes that are in use, but after reading this article, you think some of the trade-offs may not be worth it, you can audit each of these individually to see if you want to keep or remove them.
To get a list of all indexes for all tables in your database, run:
SELECT * FROM information_schema.statistics;
Now that you've identified all of your indexes, you can use MySQL invisible indexes to determine which ones you may wish to drop.
One way to test the outcome of dropping an index before actually dropping it is to utilize MySQL's invisible indexes.
(If you prefer video, you can watch our video on invisible indexes.)
With invisible indexes, you can keep the index intact but essentially hide the index from MySQL so that queries do not use the index. This gives you a way to quickly test the impact of removing an index without completely destroying it.
To make an index invisible, run the following query:
ALTER TABLE your_table_name;ALTER INDEX your_index_name INVISIBLE;
You can now run any applicable queries to see how performance is impacted. If you realize you still need this index, you can make it visible again with:
ALTER TABLE your_table_name;ALTER INDEX your_index_name VISIBLE;
When doing this, ensure that you test the performance of any affected queries before and after removing the index to make sure that you are not inadvertently making things worse.
Adding indexes can be a great way to improve performance, but it's important to be aware that they do come with a cost. Every index takes up additsional storage, can slow down write operations, and can complicate the query optimizer's job, so they aren't always guaranteed to improve performance. Ultimately the decision to add indexes should be based on the specific needs of your application and the trade-offs you are willing to make. You should always measure the performance of your queries before and after adding indexes to see if they are actually improving performance, and if you don't seem to be seeing significant improvement for your desired use, then it may be better to leave the indexes out.