Editorials

Optimizing Entity Framework Queries

One of the cool features of Entity Framework is the ability to execute queries against multiple tables in a single round trip. It does this by combining all of the attributes from each table into a single result set. This works really well until you start joining to related tables with more than one record.

When you think of an object it has two basic kinds of properties, at least from the perspective of Entity Framework. An object has Scalar Properties. The scalar properties may be a simple basic type like a string or an int. It may also be a complex property such as an instance of another class. The key is that scalar properties only have one instance. So, a class may have a scalar property of type, which may consist of multiple properties in its own definition. But there is only one.

The second type of properties in a class are collections, or sets. They are not restricted to a specific type. However, the property is a collection type with zero, one or many records. You are familiar with this model when you have a person class that has a collection of phone numbers. The phone number class may be complex, having a phone type, number, extension, country code, etc. The person has a collection with zero, one or more phone numbers.

When you want to return multiple tables in an EF query, performance degrades when you include collection properties. This is when you explicitly bring back all of the data using the Include EF operation. Each time you add a collection property, you increase the number of columns, and you increase the number of rows. If you return more than one collection property, you increase the size of the resulting data. If your initial class is a set, and you are returning more than one, the math gets bigger, the data longer, and the performance degrades, or even simply quits working altogether.

One way to optimize this, especially if you are only returning one record for the root class, is to get the class data with all the scalar properties as a single EF query. Then, return each of the collections as a separate EF query. After returning the collection property data, add the results to the base class property. If you are using async queries, you can even run each of these concurrently, and then put it all together as a single object once the results are returned.

We’ll look at EF again later this week to demonstrate how this same process can be performed using EF itself without having to roll you own separate queries to retrieve the collection data.

Cheers,

Ben