MySQL for Developers

Understanding MySQL string data types, column sizes, character sets, and collations

MySQL offers many ways to store strings, so many that it takes multiple videos to cover them all. In this video, we'll give an overview of the string types, take a closer look at fixed-length and variable-length columns, and then introduce character sets and collations.

Types of MySQL string columns

MySQL provides the following string types:

  • CHAR
  • VARCHAR
  • TINYTEXT
  • TEXT
  • MEDIUMTEXT
  • LONGTEXT
  • BINARY
  • VARBINARY
  • TINYBLOB
  • BLOB
  • MEDIUMBLOB
  • LONGBLOB
  • ENUM
  • SET

While this may seem overwhelming, one important thing to focus on is understanding the differences between fixed-length and variable-length columns.

Fixed-length columns

Fixed-length columns are usually used for storing data that is a consistent size. This could be data like two-digit prefixes, MD5 hashes, or alphanumeric serial numbers. Fixed-length columns are declared using the CHAR data type and require you to specify the column size.

CREATE TABLE strings (
  fixed_five CHAR(5),
  fixed_32 CHAR(32)
);

In the example above, we've created two fixed-length columns. The fixed_five column can store up to five characters, while the fixed_32 column can store up to 32 characters. No matter how many characters you store in a fixed-length column, it always occupies the full declared length: MySQL right-pads shorter values with spaces when storing them and, by default, strips the trailing spaces again when you read them back.
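As a minimal sketch of how a fixed-length column behaves, the statements below use a hypothetical table named char_demo with hypothetical column names (none of them come from the lesson). They assume MySQL 8 with its default SQL mode, where strict mode is enabled and PAD_CHAR_TO_FULL_LENGTH is not.

```sql
-- Hypothetical demo table; char_demo and its columns are assumptions for illustration.
CREATE TABLE char_demo (
  country_code CHAR(2),
  md5_hash CHAR(32)
);

-- MD5() returns a 32-character hexadecimal string, so it fills CHAR(32) exactly.
INSERT INTO char_demo (country_code, md5_hash) VALUES ('US', MD5('hello'));

-- A value shorter than the declared length is right-padded with spaces on storage.
INSERT INTO char_demo (country_code, md5_hash) VALUES ('U', MD5('world'));

-- By default the padding is stripped on retrieval, so code_chars is 1 for the
-- second row, while hash_chars is 32 for both rows.
SELECT country_code,
       CHAR_LENGTH(country_code) AS code_chars,
       CHAR_LENGTH(md5_hash) AS hash_chars
FROM char_demo;

-- In strict SQL mode this statement is rejected because 'USA' exceeds CHAR(2).
INSERT INTO char_demo (country_code, md5_hash) VALUES ('USA', MD5('too long'));
```

If strict mode is disabled, the last statement would instead truncate the value and raise a warning, which is one reason to leave strict mode on.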

Variable-length columns

Variable-length columns, on the other hand, do not have a fixed size. The amount of space required depends on the data being stored in the column. Variable-length columns are declared using the VARCHAR data type, and you have to specify the maximum column size.

CREATE TABLE strings (
  variable_length VARCHAR(100)
);

In the example above, we've created a variable-length column that can store up to 100 characters. Because a variable-length column stores only the characters you give it, plus a one- or two-byte length prefix, it is usually more storage-efficient than a fixed-length column when values vary in size. Even so, declare the smallest maximum size that fits your data, because for certain operations MySQL allocates memory based on the maximum column size.
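One visible difference between the two kinds of column is how they treat trailing spaces. The sketch below uses a hypothetical table named padding_demo (an assumption, not part of the lesson) and assumes the default SQL mode, in which PAD_CHAR_TO_FULL_LENGTH is off.

```sql
-- Hypothetical demo table comparing a fixed-length and a variable-length column.
CREATE TABLE padding_demo (
  fixed_four CHAR(4),
  variable_four VARCHAR(4)
);

-- Both values are 'ab' followed by two trailing spaces.
INSERT INTO padding_demo (fixed_four, variable_four) VALUES ('ab  ', 'ab  ');

-- Wrapping each value in parentheses makes trailing spaces visible.
-- CHAR drops them on retrieval; VARCHAR returns them exactly as stored.
SELECT CONCAT('(', fixed_four, ')') AS fixed_value,
       CONCAT('(', variable_four, ')') AS variable_value
FROM padding_demo;
-- fixed_value: (ab)    variable_value: (ab  )
```

This matters when trailing whitespace is significant in your data: a VARCHAR column preserves it, while a CHAR column does not.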

Character sets and collations

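A character set defines which characters can be stored in a column. MySQL supports a wide range of character sets, which you can list by querying the information_schema.CHARACTER_SETS table or by running SHOW CHARACTER SET. utf8 and utf8mb4 are the character sets you will most likely encounter. utf8mb4 is the default in MySQL 8 and uses up to four bytes per character, which covers all of Unicode, including emoji. In MySQL, utf8 is an alias for utf8mb3, which is deprecated and uses at most three bytes per character, so prefer utf8mb4 for new tables.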
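To see which character sets a server offers, something like the following should work on MySQL 8. The LIKE filter is only an illustrative choice to narrow the output to the UTF-8 family.

```sql
-- List the UTF-8 character sets with their default collation and the
-- maximum number of bytes a single character can occupy (MAXLEN).
SELECT CHARACTER_SET_NAME, DEFAULT_COLLATE_NAME, MAXLEN
FROM information_schema.CHARACTER_SETS
WHERE CHARACTER_SET_NAME LIKE 'utf8%';

-- Shorthand statement that reports the same information.
SHOW CHARACTER SET LIKE 'utf8%';
```

In the result, utf8mb4 has a MAXLEN of 4, while the three-byte variant (listed as utf8 or utf8mb3, depending on the server version) has a MAXLEN of 3.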

Collations, on the other hand, determine how strings are compared and sorted. A collation is a set of rules that decides whether two strings are equal and, if they are not, which one sorts first. The default collation for MySQL 8 is utf8mb4_0900_ai_ci: 0900 refers to version 9.0.0 of the Unicode Collation Algorithm, ai indicates that the collation is accent-insensitive, and ci means that it is case-insensitive.
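The effect of the ai and ci suffixes can be checked with a few comparisons. This is a small sketch that assumes the connection character set is utf8mb4, which is the default for the MySQL 8 client; the column aliases are made up for readability.

```sql
-- Accent-insensitive, case-insensitive collation: both comparisons return 1.
SELECT 'a' = 'A' COLLATE utf8mb4_0900_ai_ci AS differs_by_case,
       'e' = 'é' COLLATE utf8mb4_0900_ai_ci AS differs_by_accent;

-- Accent-sensitive, case-sensitive collation: both comparisons return 0.
SELECT 'a' = 'A' COLLATE utf8mb4_0900_as_cs AS differs_by_case,
       'e' = 'é' COLLATE utf8mb4_0900_as_cs AS differs_by_accent;
```

The same strings are equal under one collation and different under the other, which also affects unique indexes, GROUP BY, and ORDER BY on columns that use those collations.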

CREATE TABLE strings (
  variable_length VARCHAR(100) CHARSET utf8mb4 COLLATE utf8mb4_general_ci
);

In the example above, we've created a VARCHAR column with the utf8mb4 character set and explicitly chosen the utf8mb4_general_ci collation rather than relying on the default.
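To confirm which character set and collation a column ended up with, you can inspect the table's metadata. The queries below assume the strings table from the example above exists in the currently selected database.

```sql
-- Show the character set and collation of each column in the strings table.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'strings';

-- SHOW FULL COLUMNS also includes a Collation column in its output.
SHOW FULL COLUMNS FROM strings;
```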

Conclusion

When it comes to storing strings in MySQL, understanding the differences between fixed-length and variable-length columns is crucial. Additionally, character sets and collations play a vital role in determining what can be stored in a column and how different strings are compared or sorted. Keep in mind that for certain operations, MySQL allocates memory based on the maximum column size, so it's essential to choose the smallest possible data type for the data you are trying to store.

About this lesson

Variable-length strings, fixed-length strings, character sets, collations... There is a lot to talk about when we talk about strings!

Feedback or questions? Reach out to our team at education@planetscale.com.