3.2.1: SQL 1-M Relationships

Learning Objectives

SQL relationships associate data between 1 or more SQL tables to model logical relationships and make data querying more modular and efficient
There are 3 kinds of relationships in SQL: one-to-one (1-1), one-to-many (1-M) and many-to-many (M-M)
Understand how SQL represents 1-M relationships in its databases

Introduction

Imagine a social media app with Users and Posts tables in its SQL database. How would the app know which posts belong to which users? The answer is SQL relationships.

To model the fact that each user can have 0 or more posts, but each post can belong to at most 1 user, we can implement a so-called one-to-many relationship (aka association) between users and posts.

SQL implements relationships using the concept of "primary" and "foreign" keys. Every table has a "primary key" (typically the id column) that uniquely identifies each row (aka record, entry) in the table. Tables can have columns that reference primary keys of related tables to define relationships between those tables. Primary keys referenced in other tables are known as "foreign keys", because from the perspective of the tables referencing the primary keys, those keys belongs to "foreign" tables.

The following is a sample Users table where id is the primary key. There are no foreign keys on this table.

id (primary key)

username

foong

kai

The following is a sample Posts table where id is also the primary key, but each post also belongs to a User, and specifies a userId to identify the user the post belongs to. userId is known as a "foreign key", because it references the "primary key" of another table.

id (primary key)

text

userId (foreign key)

"Go Rocket!!!!!!!!!!"

"Rocket is the Best!!!!!"

"🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀"

The foreign key always lives in the "many" table in the one-to-many relationship. In our above example, notice the foreign key lives in the Posts table and not the Users table, and it cannot be the other way around. If we tried to put a postId or postIds foreign key in the Users table, we would find that we would need to either duplicate User rows for each user's posts, or store an array of post IDs in the foreign key column. The former would repeat User data unnecessarily, and the latter would make SQL harder to query because of the array structure within a single cell of the table.

The above architecture of SQL relationships makes SQL efficient at storing data. The alternative might be to have a single Posts table where we store each User's data repeatedly in every post, but that would be wasteful and inefficient, especially when our Users table contains more data such as contact information and preferences.

SQL stores one-to-one relationship data in the same way as one-to-many, except that the foreign key can live in either table. One-to-one relationships are less common because most of the time all the data in one-to-one relationships can be stored in a single table. Like one-to-many relationships, in one-to-one relationships there should only be 1 table with the foreign key. Examples of one-to-one relationships include users and their relevant contact and auth information if we were to separate that information into multiple tables for more efficient querying.

Given the above Users and Posts tables, we can query for related data using primary and foreign key relationships.

For example, to query all posts that belong to kai, I could run 2 queries. The 1st query retrieves the id of the user with username kai.

SELECT id from Users where username='kai';

The above query would return 2, kai's user ID. The 2nd query retrieves the posts that belong to the user with that user ID, and should return all of kai's posts.

SELECT * from Posts where userId=2;

We will not perform such queries with raw SQL in our apps, but we may need to perform such queries when performing analytics on our companies' databases outside of an app context. In our apps we typically use ORMs like Sequelize that make querying for related data more robust, with commands such as user.getPosts(). More on querying data with Sequelize in Sequelize submodules.

Additional Resources

Database normalisation theory suggests ways to structure our SQL database using an optimal number of 1-M and M-M relationships to minimise redundancy. This is especially important for companies processing large volumes of data.
If you want to define relationships with pure SQL you should read this Stack-overflow article.

Previous3.2: SQL Next3.2.2: SQL M-M Relationships

Learning Objectives

Introduction

Querying related data using SQL

Additional Resources