Editorials

Feedback on Database Design Experiences

Featured Article(s)
Useful Undocumented SQL Server 2008 Database Engine Stored Procedures
In this article, Alexander Chigrik looks at ten undocumented SQL Server database engine stored procedures that shipped with SQL Server 2008.

Feedback on Database Design Experiences
I posted several questions yesterday asking about your experiences designing databases for lots of volume. Some great responses, ironically some of them are contradictory (figures!) – but wanted to share them here for your consideration over the next couple of days.

If you would like to jump in here with your experiences, please drop me a note here. I’d love to hear what you’ve seen and how you would approach things.

From David: "1. On the same server, the performance does not matter about multiple vs single database, but Maintenance is MUCH easier on single database.

2. On single server, a connection is a connection and 100 is the same as 10 x 10.

3. Several dependencies: Is there shared data or is every persons data unique? Is the access all read or a combination of read and write? Is latency a problem (if one user enters data, does it need to be available to other users instantly or is a few second delay OK?)

One large database is much more efficient to query than an union across multiple databases to get all the data.

If only a few people write to the database and many people read, it is nice to have one master database and several replicated databases on separate servers for the query only users."

…and Robert writes: "1. Query performance for end users. Does it matter if different users are querying a single databases versus multiple databases.

I think it depends on the queries…obviously, the quickest queries are those performed on one table, indexed accordingly and using exact matches as the querying criteria. That being said, proper planning and design can make multiple database queries appear seamless to the user and should not be an issue….As an example, I was working on a web search which would allow users to query 13 different variables in any combination. These 13 variables traversed at least ten different tables, each with 10mil+ records, in a single database, and there were exact match criteria used as well as “contains” style searches. The database was primarily an archival style database, and was not set up to search in this manner. Also, each record had a composite Primary key relating them to each other, but some relationships were zero-to-many, and I would have to determine the most recent record as the output in each table.….ultimately, I decided the best way to handle this search was to design and use a flat search table, containing the data in all ten tables indexed accordingly.


2. Resource utilization on the server side. Does it matter for the server resources if there are 100 connections to a single database vs 10 connections to 10 different databases.

I am not so sure of the impact on this one. I would speculate that ten to ten might provide better load balancing, but I am unsure.


3. Database size concerns. A single database will obviously be huge and will grow with time. Does that make performance or maintainability a concern for this solution, versus if we had separate databases.

One advantage about separate databases is their scalability in that you can move them to different servers if needed and then link back to them accordingly. Or you can anticipate the growth in advance and design your system to host different databases on different server from the outset (Likely won’t happen unless you have the resources to do this at the time). I do believe performance will take a hit on this one, so planning and design is important to minimize the effects. Obviously, maintainability becomes more difficult the more databases and servers you use.

On the other hand, one large database, which would be easier to maintain, can also cause potential problems when doing expensive queries that use a lot of tables at one time. We have used replication as a means to help load balance…our main database is only used to store data, and our replicated databases are used for searching. All changes are made to the main database and the changes flow down to the replicated copies accordingly."