Editorials

Big Data in Your Database

Big Data in Your Database
Today I wanted to bring up the topic of storing big data in your database. I’m not talking about big data such as unstructured data used for data mining. I’m talking about big data in the form of Binary Large Objects (BLOB) which has structure internally, but no meaningful value as a database attribute. This could be XML documents, Text Files, Images, etc.

SQL Server, for example, has a number of data types allowing big data, such as VARCHAR(MAX), Text, Varbinary(MAX), Image and XML. Including FILESTREAM objects and you have a lot of data capable of including in or referenced by your database server.

With all of these choices what is the best for your application?

Storing data inside your database may be a good choice. It allows you to include your big data in your backups. Placing big data on separate file devices can allow them to be backed up separately. However, you have to expect your full backups to take longer. Also, your transaction logs and transaction log backups will be larger and take more time due to the volume of data.

Mixing big data with other attributes in a single table is possible. But this reduces your ability to separate the management of the big data from the relational data for performance reasons. You may find a need to create covering indexes or even indexed views to remove the big data from the picture. Probably better to use a 1 to 1 relationship, placing the big data in a separate table.

Even XML can have a huge impact on performance and disk space used. Primarily this is because of the allocation of big data in SQL Server pages of 8k. An 8k page is pretty big, and may result in a lot of wasted space when you have big data that is just a little bit bigger.

What about something like image files? Say you have a database of users, and you have a picture of each user. Do you store the picture in a Image data type, or a Varbinary(Max)? You certainly can. What if you didn’t? My first question would be what if you had to restore the database on another server, and the images were stored separately? Then the path to image files would have to be available to the other server. If instead, the images were stored in the database, they are automatically transferred to the other server when the database is restored.

Notice I haven’t provided you with an answer or preference. Because, the correct answer is, say it with me now, “It Depends!”

What about text documents? Would you save them in the database like SharePoint? It may be easier to secure the data, and to assure the whole thing is intact with a single backup. Maybe it is easier to use a File Stream data type in order to get the ACID capability, but store the data separately.

Well, there are some questions for you to consider. Why not write in with questions you use when determining
where to put your big data. Send your comments to btaylor@sswug.org.

Cheers,

Ben

SSWUGtv
Did you miss the interview with Craig Mullins last week. He asks the question, "As a DBA…what are the biggest key things you need to know?" Check out Craig’s Answer
Watch the Show]

$$SWYNK$$

Featured Article(s)
Troubleshooting problems with SQL Server 2008 R2 Profiler
In this article, you can find the description of SQL Server 2008 R2 Profiler bugs and the information on how to resolve or work around these problems.

Featured White Paper(s)
Key Management in the Multi-Platform Environment
Written by Townsend Security

This White Paper discusses the challenges … (read more)