Editorials

EAV Alternatives

If you have been following the EAV Relational Database Pattern introduced last Wednesday, then you may be wondering what other patterns allow you can to gain similar benefit without the pain of an EAV? Today I’m going to suggest some other ideas.

The problem with an EAV is that it is a many to many structure. Let’s say you want to do an EAV for people you know. You have a set of person tables completely normalized. Your person tables allow you to have many addresses and phones for one person. But you want to provide the users with the ability to track other things about a person that are not known prior to building the person schema.

To do this you create a person attribute definition table where they can create new keys for something to track about a person. Then, you create a Person to Person Attributes many to many join table; let’s call it Person_PersonAttribute. In this join table you track the data you wish for each person/attribute.

For example, you want to track the name of a person’s pets. You create a Person Attribute for Pet. Then you add a record in the Person_PersonAttribute for every person/pet combination you want to remember. This is easy to design, easy to create an user interface for data entry, and COMPLICATED to report.

If you knew ahead of time that you were going to be tracking pets in your database you would simply create a PersonPet table, with a person table having one to zero or many relationship. As you can see, the EAV is not that much worse than the PersonPet table.

When you add properties to a person where there will be only one entry for each person, such as their favorite sports team, then the additional work becomes less valuable. Now instead of a many to many relationship, you have a Person on to zero or one relationship with a single scalar attribute.

For another scenario, you now want to track the Forite Sports Team for a person. You have a couple of options. Modify the schema of the person table, adding a favorite sports team column, allowing nulls for those persons where the data is not available, or a decfault of empty string. A second option is to add a new table, PersonFavoriteSportsTeam, with a person Id from the Person table, and a sports team column. This new table has a Person One to Zero or one relationship to the PersonFavoriteSportsTeam table. The advantage of this is that you don’t modify your person table, already populated with data. You aslo don’t have to have a null or default value in your Person table for Favorite Sports Team.

You now have the possibility of adding a lookup table for sports team to make sure everyone uses the same spelling, et. letting referential integrity normalize your data (you can add lookup values for EAVs too). If you have the additional lookup table for Sports Teams, you can use this new table as a Many To Many join table and allow the person to have more than one favorite. In short, instead of creating the capability already built into an SQL database, you allow the database schema to be modified at run time.

This technique is not used as often because it makes it more difficult to implement at run time. You have to alter the schema. You also have to alter whatever code reports on the new data, or gathers the data for the User Interface. Security is an issue as well. Many companies don’t want users modifying schema…often they don’t even want the users to be aware of the schema. Still, this can be done effectively, and may perform more efficiently as well.

Have you allowed user schema modification? Is that a pattern you prefer in your systems? Do you simply not allow users to make adjustments? Do you define user definable columns in your table as place holders for an administrator to configure during application setup?

Why not share your thoughts in our comments, or drop an email to btaylor@sswug.org.

Cheers,

Ben