Readers Respond
Ouch. I really missed some good information from Darryll’s reply on EAVs. I was sure to read it all today, and am glad to post here his much more complete response. Thanks, Darryll, for sending a second reply.
As I mentioned below in my original email, there is a reference pattern / model for EAV described in the book "The Data Model Resource Book, Volume 3 – Universal Patterns for Data Modeling"
http://www.amazon.com/Data-Model-Resource-Book-Vol/dp/0470178450
So, I don’t think it’s a lack of reference standards that causes what is a predominantly negative view of the use of EAV models. Rather, I believe it’s because people have traditionally built solutions around EAV and suffered abysmal performance and unacceptable user experience when reporting applications have been built which interact directly with the EAV model.
Instead, it is my belief that EAV can only be used as a foundation architecture, given the current implementation of natively row-based database storage engines. What I mean is that the EAV model must serve as a data source to a ‘traditionally relational and tabular storage model’. Merely pivoting the data through the use of views will not suffice because the query performance is guaranteed to be unacceptable for reporting applications working against a database of any significant scale. Ultimately, again, given the current implementation of database storage engines, EAV data must be physically re-materialized into ‘traditional’ tables which instead serve as the source for reporting applications. As a result, there is a significant, perhaps doubling of storage requirements, along with a substantial amount of code to perform this re-materialization.
The aforementioned limitations of EAV, and the requirements that ultimately exist to work around them there can only be justified / rationalized, in my opinion, if substantial effort is put into automated object and code generation (from the metadata stored in the EAV model) to facilitate the re-materialization. And I will tell you from experience, this is not an undertaking for any but the most intrepid, skilled and likely senior of data architects and practitioners.
Therefore it is either of these two ‘realities’ that currently exist in the use of EAV models that greatly hinder their use, which in and of itself is not a bad thing. I say this because in my experience of over 30 years, there are few application domains and user stories / business requirements which can be used to rationalize the implementation of an EAV model, at any level.
Bastion has some other thoughts to offer on EAVs:
I’ve been using the EAV pattern for about 10 years in the application I used to work on. There is documentation and use case references out there, but it seems more prevalent in the Open Source world. I think that have something to do with the tooling provided by MS and the fact that most ORMs I’ve looked at have problems with the EAV and the code generated by the ORM. They just don’t seem to play together nicely. The MS Entity Framework was no different here based on the discussions I had with one of the .NET devs at my company.
The use case we had was this: Our application allows clients to configure the forms themselves. It allows users to build and release as many forms as they wanted based on certain fixed tables structures (a normal table) and it also allowed them to add an unlimited number of fields to the form. These additional information fields were all stored in a table that uses the EAV design pattern to hold that data. This structure and design makes for an incredibly flexible application.
There are some issues around reporting on the data that manifested itself. Our MySQL database experienced performance issues after a few hundred thousand rows of data (though in fairness, the app was using 4.11, which is very old). The app also uses MSSQL which did not appear to have the same performance issues. When I designed the next gen database, I kept the EAV tables, but split them into specific tables for dates, string and numbers to reduce the size of the single table. On the reporting side, the use of materialized views (MSSQL calls these indexed views) should aid performance.
Editor’s note: Be aware of the cost of materialized views. They can be (emphasis on can) used as a substitute for a better data model. In a real case where EAVs are required, materialized views may be the correct choice. Personally, I would take the load off the data engine when pivoting data for EAV attributes. The database will handle the data just fine un-pivoted. Even when filtering data (that’s a little harder, but still readily done with limited database load). Then push the pivoting of the data into a user readable form or data objects in your middle tier. The middle tier scales out much easier than any relational data engine.
At the end of the day, we are used to mapping data from one form to another constantly (ORMs, etc.). So there is no reason we can’t also do a mapping of eav data in a middle tier.
Computer Professional Court
Scott writes in with some realistic kinds of environments where holding people accountable is more difficult. Seems like a topic for my first editorial next week…
If you are going to have developers held accountable for bass coding decisions, you better hold management liable for bad hiring, favoritism in task assignments, procurement debacles, and not knowing enough about modern tools and technologies to avoid being misled by self-serving developers with a resume-first attitude, and hold union reps accountable for bad advice to members about how to work with management.
You’re always invited to share your opinions on these or any other topic related to information systems by writing btaylor@sswug.org.
Cheers,
Ben
SSWUGtv
With Stephen Wynkoop
Did you miss the interveiw with Patrick Townsend yesterday? Come check out what he shares with Steve regarding Cell-level encryption and TDE. Find tips on what you need to know.
Watch the Show
$$SWYNK$$
Featured Article(s)
Are your employees professional engaged?
How can you tell if your employees are engaged and motivated? How would you motivate someone who’s driven by "Identity and Purpose"?
Featured White Paper(s)
How to Use SQL Server’s Extended Events and Notifications to Proactively Resolve Performance Issues
read more)