3.2.1: SQL 1-M Relationships
Learning Objectives
SQL relationships associate data between 1 or more SQL tables to model logical relationships and make data querying more modular and efficient
There are 3 kinds of relationships in SQL: one-to-one (1-1), one-to-many (1-M) and many-to-many (M-M)
Understand how SQL represents 1-M relationships in its databases
Introduction
Imagine a social media app with Users
and Posts
tables in its SQL database. How would the app know which posts belong to which users? The answer is SQL relationships.
To model the fact that each user can have 0 or more posts, but each post can belong to at most 1 user, we can implement a so-called one-to-many relationship (aka association) between users and posts.
SQL implements relationships using the concept of "primary" and "foreign" keys. Every table has a "primary key" (typically the id
column) that uniquely identifies each row (aka record, entry) in the table. Tables can have columns that reference primary keys of related tables to define relationships between those tables. Primary keys referenced in other tables are known as "foreign keys", because from the perspective of the tables referencing the primary keys, those keys belongs to "foreign" tables.
The following is a sample Users
table where id
is the primary key. There are no foreign keys on this table.
1
foong
2
kai
The following is a sample Posts
table where id
is also the primary key, but each post also belongs to a User
, and specifies a userId
to identify the user the post belongs to. userId
is known as a "foreign key", because it references the "primary key" of another table.
1
"Go Rocket!!!!!!!!!!"
1
2
"Rocket is the Best!!!!!"
1
3
"🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀"
2
The foreign key always lives in the "many" table in the one-to-many relationship. In our above example, notice the foreign key lives in the Posts
table and not the Users
table, and it cannot be the other way around. If we tried to put a postId
or postIds
foreign key in the Users
table, we would find that we would need to either duplicate User
rows for each user's posts, or store an array of post IDs in the foreign key column. The former would repeat User
data unnecessarily, and the latter would make SQL harder to query because of the array structure within a single cell of the table.
The above architecture of SQL relationships makes SQL efficient at storing data. The alternative might be to have a single Posts
table where we store each User
's data repeatedly in every post, but that would be wasteful and inefficient, especially when our Users
table contains more data such as contact information and preferences.
SQL stores one-to-one relationship data in the same way as one-to-many, except that the foreign key can live in either table. One-to-one relationships are less common because most of the time all the data in one-to-one relationships can be stored in a single table. Like one-to-many relationships, in one-to-one relationships there should only be 1 table with the foreign key. Examples of one-to-one relationships include users and their relevant contact and auth information if we were to separate that information into multiple tables for more efficient querying.
Querying related data using SQL
Given the above Users
and Posts
tables, we can query for related data using primary and foreign key relationships.
For example, to query all posts that belong to kai
, I could run 2 queries. The 1st query retrieves the id
of the user with username kai
.
The above query would return 2
, kai
's user ID. The 2nd query retrieves the posts that belong to the user with that user ID, and should return all of kai
's posts.
We will not perform such queries with raw SQL in our apps, but we may need to perform such queries when performing analytics on our companies' databases outside of an app context. In our apps we typically use ORMs like Sequelize that make querying for related data more robust, with commands such as user.getPosts()
. More on querying data with Sequelize in Sequelize submodules.
Additional Resources
Database normalisation theory suggests ways to structure our SQL database using an optimal number of 1-M and M-M relationships to minimise redundancy. This is especially important for companies processing large volumes of data.
If you want to define relationships with pure SQL you should read this Stack-overflow article.