Lesson 5: INNER JOIN - Combining Data from Multiple Tables

So far, we've been working with single tables to retrieve, filter, and sort data. But in real-world databases, information is often split across multiple related tables. This is where the power of SQL really begins to shine!

Imagine you're building a library system. Instead of storing all book information in one massive table, you might have separate tables for Authors and Books. This approach has several advantages, but it also means we need to learn how to combine data from multiple tables to answer meaningful questions.

🔄 Why Split Data Across Multiple Tables?

Benefits of using multiple related tables:

  • Avoid Data Duplication - Instead of repeating author information for every book they wrote, store author details once in an Authors table
  • Data Consistency - If an author changes their email, you only update it in one place, not across dozens of book records
  • Flexible Growth - You can add new authors without adding new books, and vice versa
  • Better Organization - Each table focuses on one specific entity (authors, books, publishers, etc.)

This process of organizing data into separate, related tables is called database normalization.


🔗 Understanding Table Relationships

When tables are related, they share common information through keys:

  • Primary Key - A column that uniquely identifies each row in a table (like author_id in the Authors table)
  • Foreign Key - A column in one table that refers to the primary key of another table (like author_id in the Books table)

Let's look at our example tables:

Authors Table:

author_id | first_name | last_name | birth_year | country
----------|------------|-----------|------------|--------
1         | J.K.       | Rowling   | 1965       | UK
2         | Stephen    | King      | 1947       | USA
3         | Agatha     | Christie  | 1890       | UK

Books Table:

book_id | title                                    | author_id | publication_year | pages
--------|------------------------------------------|-----------|------------------|------
1       | Harry Potter and the Philosopher's Stone | 1         | 1997             | 223
2       | The Shining                              | 2         | 1977             | 447
3       | Murder on the Orient Express             | 3         | 1934             | 256
4       | Harry Potter and the Chamber of Secrets  | 1         | 1998             | 251

💡 Key Point: Notice how the author_id in the Books table matches the author_id in the Authors table? This is how we connect the related data.


🔧 Introducing INNER JOIN

To combine data from multiple tables, we use the JOIN clause. The most common type is the INNER JOIN, which returns only rows that have matching values in both tables.

Basic INNER JOIN Syntax:

SELECT column1, column2, column3
FROM first_table
INNER JOIN second_table
  ON first_table.common_column = second_table.common_column;

Real Example:

SELECT books.title, authors.first_name, authors.last_name
FROM books
INNER JOIN authors 
  ON books.author_id = authors.author_id;

This query would return:

title                                    | first_name | last_name
-----------------------------------------|------------|----------
Harry Potter and the Philosopher's Stone | J.K.       | Rowling
The Shining                              | Stephen    | King
Murder on the Orient Express             | Agatha     | Christie
Harry Potter and the Chamber of Secrets  | J.K.       | Rowling

🛠️ Understanding the JOIN Process

Here's what happens step by step when you run an INNER JOIN:

  1. Start with the first table (books)
  2. For each row in the first table, find matching rows in the second table (authors) based on the ON condition
  3. Combine the matched rows into a single result row
  4. Apply any WHERE, ORDER BY, or LIMIT clauses to the combined results

🔑 Key JOIN Concepts

Table Prefixes: Use table_name.column_name to specify which table a column comes from. This is especially important when both tables have columns with the same name.

-- Clear and explicit
SELECT books.title, authors.first_name
FROM books
INNER JOIN authors ON books.author_id = authors.author_id;

ON Clause: Defines how the tables are related (usually primary key = foreign key).

-- The ON clause creates the connection
ON books.author_id = authors.author_id

INNER JOIN vs JOIN: These are equivalent – INNER JOIN is more explicit and clearer.

Result Size: INNER JOIN only returns rows where there's a match in both tables.


🎯 Combining JOINs with Other Clauses

You can combine JOINs with all the clauses you've learned:

JOIN with WHERE:

SELECT books.title, authors.first_name, authors.last_name
FROM books
INNER JOIN authors ON books.author_id = authors.author_id
WHERE authors.country = 'UK';

JOIN with ORDER BY:

SELECT books.title, authors.last_name, books.publication_year
FROM books
INNER JOIN authors ON books.author_id = authors.author_id
ORDER BY books.publication_year DESC;

JOIN with LIMIT:

SELECT books.title, authors.first_name, authors.last_name
FROM books
INNER JOIN authors ON books.author_id = authors.author_id
ORDER BY books.publication_year DESC
LIMIT 3;

💡 Pro Tip: Always use table prefixes when working with JOINs. It makes your queries clearer and prevents errors when both tables have columns with the same name.


✅ Key Takeaways

  • INNER JOIN combines data from multiple related tables
  • Tables are connected through primary keys and foreign keys
  • Use table prefixes to specify which table each column comes from
  • The ON clause defines the relationship between tables
  • You can combine JOINs with WHERE, ORDER BY, and LIMIT clauses
  • INNER JOIN only returns rows that have matches in both tables
  • Database normalization reduces duplication and improves data consistency
  • Understanding relationships is crucial for effective database querying

📝 Exercise Overview

In this lesson's exercises, you'll work with our Authors and Books tables to practice INNER JOINs. You'll learn how to:

  • Combine basic information from both tables
  • Filter joined data using WHERE clauses
  • Sort and limit results from multiple tables

The exercises build from simple joins to more complex queries that combine multiple SQL concepts. Take your time to understand how the tables connect – this foundation will be crucial as you work with more complex database relationships!


Ready to practice SQL? Try our coding challenges →