BLOB Storage Inside the Database
Today I’m including responses from those on the fence or looking for BLOB direction. Others have written in advocating putting BLOB data inside the database. The question is one that many developers still face. And, many people are still having to work with legacy systems as well. I would guess the percentage of SQL Server 2012 installations is not very high just yet.
Mehul writes:
I was always looking for the perfect solutions for the storing " Big Data in Your Database".
I am currently designing a DB (Unfortunately it is still SQL Server 2005), for a real estate client, where photos/pictures/images of the Properties (20 photos for each property) are required to be stored.
As stated by you, we are also discussing 2 methods namely, inside DB and outside DB(file system).
We are still not able to come to conclusion on which one is best for now and for future.
Mark likes to put data inside the database…for now:
First, thanks for your thought provoking and informative newsletter.
Quick background: My client inspects and tests construction materials, both on site and in production to insure what is specified in the plans of a building are in fact what is used during the construction of that building. For example, whenever a concrete floor is poured, a set of cylinders (usually 3 or 4) are filled using that same batch that was used to form the floor. These cylinders are broken on a specific schedule (usually 7 days, 14 days and 28 days) and the strength of the concrete is stored and reported to the client as well as to everyone else involved in the building (architects, general contractors, city engineer, building owner, etc). A report of the results, as well as pictures and scanned documents are emailed to these interested parties.
This is all done using a VB6 (yes, VB6) application. Originally it stored these images in DBF files and used the MEMO type field to hold the image. When we migrated to SQL 2008 R2 from DBF, I used (and am still using) the varchar(max) data type to store the image.
Back when I first wrote this application, in the late 90s, I had to make a decision about storing the images in the DBF file, or simply putting a link in the DBF file to the actual image file located on a network share. I estimated the size of the image store after about a year (after I had scanned multiple documents that represented a good sample of what was going to be scanned by the users) and decided to store the images in the MEMO field in the DBF file because:
- the images are in the DBF file and I never have to worry about pathing issues when the network changes (new server, new mapped drive letters, new UNCs, etc)
- I never have to worry about someone having access to the images without going through the user interface
- All of the data for the application is in one set of files, so a copy of the folder holding all of the DBFs gave me all of the application data versus copying the DBFs from one location and the image files from another location (there are now about 200,000 images in the table). This also makes development easier for me when I want to get the most recent copy of the client’s data and use it in my development environment.
When I moved to SQL 2008 R2, the image DBF file was 20GB and growing about 1-2 GB per year. I decided at that time to store the images in their own database and all of the rest of the data for the application in another database, both on the same SQL instance. This seems to be working out well.
The client is about to increase their scanning volume 2 to 3 fold and I might need to reassess my design, but I am pleased with how it turned out so far.
Feodor brings some issues to light you should consider when making your storage choice:
The most important question is ‘How often is the data used’ (data used = data read and / or data updated). After answering this question, we can partition the data as needed, and by partitioning I mean smartly using the filegroup options for the database modeling. Then we can consider the times it takes for backing up and how long it takes to do piecemeal restores.
After all, we all care about data integrity, and SLAs in case of failure.
So, it really depends, but I usually start with the use of the data, and the fastest way possible to bring the data back online, if failure happens.
That pretty much wraps up this discussion about BLOB data. Tomorrow I’ll transition into the question of placing BLOB data into a NO-SQL cloud.
If you would like to contribute to this discussion send your comments to btaylor@sswug.org.
Cheers,
Ben
$$SWYNK$$
Featured Article(s)
Total Database Information At Your Fingertips (Part – VI)
This article will help to get some basic information from your databases that may help you in different situations.
Featured White Paper(s)
Key Management in the Multi-Platform Environment
Written by Townsend Security
This White Paper discusses the challenges … (read more)
Featured Script
sp_export_to_excel
Exports the results set to Excel…. (read more)