Select only what you need

Want an easy-to-use, performant database? Try PlanetScale now

Best practices for querying: select only what you need

A best practice that you've likely heard of is to select only what you need. This principle is simple: retrieve only the data that you require. In this video, we'll take a closer look at this practice, explain why it's so important, and highlight some of its caveats. We'll also provide you with solutions to some of these challenges so that you can select what you need efficiently.

The importance of only selecting what you need

Let's start with why you should only select what you need. Suppose you run a SELECT * statement on a database table. In that case, you’ll retrieve all the data in that table, including every row and column. Retrieving all this data can lead to significant performance issues, especially if you're dealing with large columns like JSON or TEXT.

There's no need to retrieve all the data if you're not going to use it! It's essential only to pull back the columns you plan on actually using. Doing this will reduce network utilization, disk access, and memory usage.

The Active Record exception

One caveat to the "select only what you need" principle is when you are partially populating an Active Record style model.

Suppose your framework issues a SELECT * statement that then populates an Active Record type model. It will assign every column to a property in the model. You might want to select only a particular set of columns, but in doing so, you might miss essential columns unintentionally. If your framework doesn't protect you from accessing the properties that weren't selected, you might be dealing with a partially populated model and not be aware of it!

The good news is that most frameworks have protections against this problem. You can turn on "safety mechanisms" that prevent you from accessing attributes that haven't been pulled from the database. If your framework has this protection, you can limit your select statements and feel safe that you won't accidentally access any column that you didn't retrieve.

Conclusion

The "select only what you need" principle is a best practice for a reason! It allows you to retrieve only the data that you require, which in turn leads to better performance and memory usage. However, at the application layer, you must be careful not to access columns that you didn't retrieve. You can use "safety mechanisms" in most frameworks to overcome this problem.

By being mindful of these principles and leveraging the solutions provided, you can use "select only what you need" efficiently and safely in your database queries in the future.