Skip to content

Introduction to the ER Data Model

New Course Coming Soon:

Get Really Good at Git

A simple but comprehensive overview of the entity-relationship data model

The Entity-Relationship Data Model, also called ER, is one of the various data models you can use to reason about your data.

In particular, it’s a conceptual data model, as it’s not linked to any particular implementation. That’s a task left to the logic data model.

The ER Data Model is so general, so high level, that it can be implemented by a variety of completely different kinds of databases.

It is great because you don’t think about the implementation details, but instead you only think about your data and how it’s organized. And while doing so, you are forced to analyze the problem in ways that you didn’t think about before.

I find ER diagrams great at helping you analyze a scenario where data is involved.

The ER Model gives you the tools to represent, using a graphical notation, all the data you need to model, the relations between the different kind of data, and the information associated with it.

There are 2 items that compose an ER Model:

Entities are types of data, like items or people, that have common properties.

Relations are the relations between entities.

Let me give you an example, let’s talk about books and their authors. We have 2 entities:

A particular book is an instance of the book entity.

Since we have 2 entities, we have 2 relations between them. One is the relation between a single book, and the authors entity. One is the relation between a single author, and the books entity. If we think about it, wre have:

The visual notation for entities

Given this simple example, we can start introducing the visual notation that will help us create the ER Data Model of our scenario.

Note: There are many different ways to draw ER diagrams. I will use the one that, in my opinion, is more visual and makes more sense.

Entities are represented as rectangles, with some text in them to identify them:

Entities

The visual notation for relations

The relation between entities is represented, in its most basic form, using a line that connects two relations, and a diamond with some text in it to identify the type of relation:

Relation example

Note that I did not create 2 relations “book has author” and “author wrote book”. I made a single relation “authored” between a Book and an Author.

Cardinalities

Once we have a relation, we must now indicate the numbers involved. At the moment, we have many questions:

All those are good questions to ask, and in this case I think the answers are pretty obvious. And when the answer is not obvious, you can think about the problem and add your own constrains.

There are various ways to visually indicate cardinalities on a diagram. Some prefer changing the shape of the line when it links to an entity.

I do prefer numbers, which make things clearer:

Cardinality example

The numbers above mean this: a book can be authored by 1 or more authors. n means any number of elements.

And an author can have authored from 0 books (maybe it’s writing one now) to an infinite number of books.

The first is called a zero-to-many relationship. The second is a one-to-many relationship.

We can also have:

Attributes

Each entity can have one or more attributes.

Let’s say we’ll use the above relationship in a bookstore. Each author has a name, a bio, a website URL.

Each book has a title, a publisher, a year of publication, an ISBN. The publisher could be an entity as well, if we want. But we can also define it as an attribute of a book.

This is how we can represent the above information:

Attributes example

Identifier attributes

Entities must be identified by a unique key. The book entity can be uniquely identified by the ISBN attribute. Every book has a single ISBN (considering that we don’t represent a single copy of a book, but a book “title”).

We identify the primary key attributes by underlying them:

Key

Author, on the other hand, has no unique identifier at the moment. Two authors can have the same name.

We must therefore add a unique key attribute. For example an id attribute:

id attribute

Identifier attributes can span over multiple attributes.

For example, the passport ID and the author country uniquely identify the person, and could replace the id attribute we added:

multiple attributes key

Which one to choose? It’s a matter of what makes more sense in your application. If we are modeling a bookstore, we can’t expect to have the country and passport id of all book authors. We’ll therefore use a random id we’ll choose and associate with each author.

Relation attributes

Attributes are not unique to entities. Relations can have attributes, as well.

Consider we need to model a library. In addition to the book and author entities, we now introduce the reader entity, a person that borrows the book to read it. We take its name and id card number when they borrow it:

relation without attribute

But we still miss information. We need to know when the person borrowed the book, and the date they return it, so we can store the information about all the history of a particular book in our library. This information does not belong to either the book or reader entities; it belongs to the relation:

relation attributes

Weak entities

We talked about primary keys above, and how the help uniquely identify an entity.

Some entities depend on other entities for their existence, and are called weak entities.

Suppose we need to model orders for an online shop.

For each order, we’ll store the order id, which starts from 1 and increases over time, the date and time it was placed, and the information about the customer, so we know who to bill and where to ship it.

Then we also need to know what they ordered. We create a weak entity “ordered item”, which represents each item in the cart at the time of checkout.

weak entity

This entity will store the item price at the moment of checkout (so when we change the price of the products on sale, it will not affect the orders already placed), the quantity of items that were ordered, and the options chosen. Say we sell t-shirts, therefore we need to store the color and size of the t-shirt ordered.

It is a weak entity because an ordered item entity cannot exist without an order entity.

Recursive relations

An entity can have a recursive relation with itself. Suppose we have a person entity. We can model the parent-child relationship in this way:

recursive relation

A person can have from 0 to n children, a children has 2 parents (considering the simplest scenario).

ISA relations

ISA stands for IS-A, and it’s a way to model generalizations in the ER model.

We use it to group similar entities under a common umbrella. For example an author and a reader, in the case of the books and library example, can be generalized using a person entity.

They both have a name, so we an extract the name up to the person entity, and just manage the peculiarities of being an author or reader in the corresponding entity:

isa relation

Non-binary relations

Not every relationship is strictly binary. Let’s take a lesson scenario.

A lesson takes place in a room of the school today at 10:00, with a teacher, talking to a class about physics.

So, a lesson is given at a particular time of a day, it involves a subject, a teacher, a class, and a room.

We can model it in this way:

non-binary relation

Are you intimidated by Git? Can’t figure out merge vs rebase? Are you afraid of screwing up something any time you have to do something in Git? Do you rely on ChatGPT or random people’s answer on StackOverflow to fix your problems? Your coworkers are tired of explaining Git to you all the time? Git is something we all need to use, but few of us really master it. I created this course to improve your Git (and GitHub) knowledge at a radical level. A course that helps you feel less frustrated with Git. Launching Summer 2024. Join the waiting list!
→ Read my SQL Tutorial on The Valley of Code

Here is how can I help you: