Today I would like to consider the concept of Many to Many relationships as they are implemented in Entity Framework. First let’s start with defining a Many to Many relationship with an example of a school which has students, teachers and courses.
A student may have 1 to many courses. A course may have one or more students. Each course is instructed by a teacher. By teaching a course, a teacher may also have one or more students. Course enrollment is the table that implements the Many to Many relationship of students to courses. A course may have only one teachers ( which may not be always true…but to keep this example simple), so the course table will have a relationship to a single teacher
This demonstrates how many to many relationships are implemented in a relational data structure. It is through a join table, Enrollment, that Courses, and indirectly Teacher, s are related to Students. Sometimes there is no table like Enrollment where the relationship between the student and the course has a well understood name. If there were no term “Enrollement” we would simply name the table Course_Student
Entity Framework takes a different approach when it comes to many to many relationships. Instead of having a third object that stands in place of the relationship, each entity has a collection of the other. A Course has a collection of Students. A Student has a collection of Courses. When you translate the relationships of Students, Courses, Enrollment and Teachers into objects you get the following:
- A Teacher has a collection of Courses, in which there is a Collection of Enrollments which have a Student object and a grade.
- A Student has a collection of Enrollments in which there is a grade and a Course object in which there is a Teacher Object.
- A Course has a Teacher object, and a Collection of Enrollment Objects, each having a Student object with a grade.
When you look at data in object form, which is what Entity Framework accomplishes, there is a lot of redundancy. What this does is let you look at data from any perspective. There is no SINGLE enforced object model. You can start with Students and get Courses and Teachers. You can start with Teachers and get Courses and Students. You can start with Courses and drill into the Teacher or Students. The path you take depends on your current need.
This exploits the flexibility of a relational design while utilizing the power of encapsulated objects.
I disagree with the EF gurus who state, “SQL Server does not have Many to Many Relationships. It has to rely on a join table.” I believe the exact opposite is true. Objects do not have relationships at all. They have hierarchies in the form of properties containing collections. That is not a relationship. Moreover, a relational database easily maintains additional properties other than the simple relationship.
For example, a student with an enrollment in a course may be assigned a grade. This is not a new entity. It is a property (column) that exists in the Enrollment table. The teacher assigns the grade; the student receives (earns) the grade; the assigned grade is stored in the Enrollment table. This is an extremely easy concept in a relational database. It is much more difficult to implement in objects. Entity Framework really doesn’t like objects based on this kind of structure. It can be done, but most often you have to have a unique, system assigned sequential key in your many to many join table, if you want to have any columns other than the foreign keys defining the many to many relationship
So, don’t be fooled into thinking Entity Framework, and EF defined objects handle many to many relationships, and SQL Server doesn’t. In fact, the opposite is true. You have to corrupt your relational database by not using the already unique composite key of Course/Student relationship table (Enrollment) by creating an enrollmentId, instead of relying on studentId, courseId as a primary key.
If this is getting confusing, don’t worry. I may simply be playing semantics here. The most important point is that when you want to have anything other than a composite key in a Many to Many relationship, such as grade, EF really prefers you to create a different primary key, so that it can turn the relationship into a unique object, something EF finds difficult to do without a single integer unique identifier.
Still not making sense? Please leave a comment and I try and clarify tomorrow.
Cheers,
Ben