If you're familiar with MySQL, or any relational database management system, you know that querying against a single table is a common practice. But what happens when you have data in one table and related data in another table? That's where joins come in. In this video, we'll explore the world of joins and discuss what they are, how to use them, and when to use them.
First, let's start with some basic tables. We'll use the store
and staff
tables as examples. Here's how we'll query them:
SELECT * FROM store;
| id | manager_staff_id | name | address_id | last_update |
|----|------------------|----------|------------|---------------------|
| 1 | 1 | Downtown | 1 | 2023-02-10 17:39:41 |
| 2 | 2 | Uptown | 2 | 2023-02-10 17:39:41 |
| 3 | | Kiosk | 3 | 2023-02-10 19:17:42 |
This returns three stores: a downtown store, an uptown store, and a kiosk.
SELECT * FROM staff;
| id | first_name | last_name | store_id | last_update |
|----|------------|-----------|----------|---------------------|
| 1 | Mike | Hillyer | 1 | 2023-02-15 03:57:16 |
| 2 | Jon | Stephens | 2 | 2023-02-15 03:57:16 |
| 3 | Aaron | Francis | 1 | 2023-02-10 19:17:50 |
This returns three staff people: Mike, Jon, and Aaron.
Looking at the store
table, we see that there is a manager_staff_id
column. This column tells us which staff person is the manager of the store. The downtown store is managed by employee number one, the uptown store by employee number two, and the kiosk has no manager.
So what if we want to select all the stores and bring in their managers with it? That's where joins come in.
Let's start with an inner join. An inner join takes the left table and the right table and matches them up together based on the criteria you specify. It only returns results that have a link in both tables.
SELECT * FROM store
INNER JOIN staff ON store.manager_staff_id = staff.id;
| id | manager_staff_id | name | last_update | id | first_name | last_name | store_id | active | username | last_update |
|----|------------------|----------|---------------------|----|------------|-----------|----------|--------|----------|---------------------|
| 1 | 1 | Downtown | 2023-02-10 17:39:41 | 1 | Mike | Hillyer | 1 | 1 | Mike | 2023-02-15 03:57:16 |
| 2 | 2 | Uptown | 2023-02-10 17:39:41 | 2 | Jon | Stephens | 2 | 1 | Jon | 2023-02-15 03:57:16
This query does not return the kiosk because it has no manager, and it doesn't return Aaron because they're not a manager. It only returns the downtown and uptown stores, and their respective managers.
Now let's try a left join. A left join returns all the records from the left table, and any matching records from the right table.
SELECT * FROM store
LEFT JOIN staff ON store.manager_staff_id = staff.id;
| id | manager_staff_id | name | last_update | id | first_name | last_name | store_id | active | username | last_update |
|----|------------------|----------|---------------------|----|------------|-----------|----------|--------|----------|---------------------|
| 1 | 1 | Downtown | 2023-02-10 17:39:41 | 1 | Mike | Hillyer | 1 | 1 | Mike | 2023-02-15 03:57:16 |
| 2 | 2 | Uptown | 2023-02-10 17:39:41 | 2 | Jon | Stephens | 2 | 1 | Jon | 2023-02-15 03:57:16 |
| 3 | | Kiosk | 2023-02-10 19:17:42 | | | | | | | |
This query returns all the stores, including the kiosk (which has no manager), and the downtown and uptown stores with their respective managers.
What if we want to see all the employees, whether or not they are managers? That's where a right join comes in.
SELECT * FROM store
RIGHT JOIN staff ON store.manager_staff_id = staff.id;
| id | manager_staff_id | name | last_update | id | first_name | last_name | store_id | active | username | last_update |
|----|------------------|----------|---------------------|----|------------|-----------|----------|--------|----------|---------------------|
| 1 | 1 | Downtown | 2023-02-10 17:39:41 | 1 | Mike | Hillyer | 1 | 1 | Mike | 2023-02-15 03:57:16 |
| 2 | 2 | Uptown | 2023-02-10 17:39:41 | 2 | Jon | Stephens | 2 | 1 | Jon | 2023-02-15 03:57:16 |
| | | | | 3 | Aaron | Francis | 1 | 1 | Aaron | 2023-02-10 19:17:50 |
This query returns all the employees, including the author of the blog post (who is not a manager, sadly), and the downtown and uptown store managers. However, it only returns two store rows, and the kiosk row is missing because it has no manager.
Some databases have a full outer join, which returns all rows from both tables, whether or not there's a match. MySQL doesn't have this feature. You can approximate it by using a left join and a union all, and then adding a right join, but this can lead to duplicates and may not be efficient for large tables.
Inner joins return only rows that have matches in both tables, left joins return all rows from the left table and any matching rows from the right table, and right joins return all rows from the right table and any matching rows from the left table. Full outer joins are not available in MySQL, but you can approximate them if necessary.
If you're working with large tables, it's important to index your data properly to avoid long processing times when joining tables together. We'll cover indexing in the next video!