Editorials

Data, Data… But wait, there is more!

Last week I wrote about the data flows the military is facing: the sheer volume of them, and what that volume means for the people responsible for managing the data. I also mentioned that I think analyzing and working with that information is a substantial problem in its own right.

There are really two different "modes" of working with these flows of information: streams of data, and stored data that needs to be analyzed and re-mapped. The challenge is making the most of the information in both states. I did miss one aspect of these requirements, though: multi-platform data sources and storage mechanisms.

In many cases, the complexity comes from several directions at once. You have devices generating information, and you have multiple back-end solutions responsible for storing and analyzing it. Combine the two and you get quite a complex environment, one that can require some interesting twists and turns to make it work.

With SQL Server specifically, you’ll probably be looking at SSIS and some automation work if you’re working with stored data. You can do a lot to normalize the information through scripting and data scrubbing, and generally get it ready to use. From there, you have a number of tools (Analysis Services, Reporting Services, etc.) for building systems that work with that information. (I still say the big challenge is discovering new bits when you don’t know the questions yet.)

With streams of information, you can use StreamInsight-type solutions and work with those data flows.
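To make "working with those data flows" a little more concrete, here is a minimal sketch of one common stream operation: averaging values over fixed, non-overlapping time windows (often called tumbling windows). This is plain Python and not StreamInsight's API. The function name `tumbling_average`, the `(timestamp, value)` event shape, and the assumption that events arrive in timestamp order are all mine, for illustration only.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[float, float]  # (timestamp in seconds, value) -- hypothetical shape


def tumbling_average(events: Iterable[Event], width: float) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, mean value) for each fixed-width window.

    Assumes events arrive ordered by timestamp; a real stream engine
    would also have to cope with late or out-of-order events.
    """
    current_start = None
    total = 0.0
    count = 0
    for ts, value in events:
        start = (ts // width) * width  # start of the window this event falls in
        if current_start is not None and start != current_start:
            # The previous window is complete, so emit its average.
            yield current_start, total / count
            total, count = 0.0, 0
        current_start = start
        total += value
        count += 1
    if count:
        # Flush the final, still-open window when the input ends.
        yield current_start, total / count


if __name__ == "__main__":
    sample = [(0, 1.0), (4, 3.0), (10, 5.0), (12, 7.0), (25, 9.0)]
    for window_start, mean in tumbling_average(sample, width=10):
        print(window_start, mean)
```

Because the function is a generator, it never holds more than one window in memory, which is the property that matters when the flow does not end.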

One of the bigger challenges, though, is building applications that look at many different data sources and databases and work against all of them. This introduces all sorts of complexities, from developer and administration requirements (does this statement work with that system, or do I have to change it?!) to building out systems that "fix" and normalize the flows in real time across sources and destinations. Several people wrote to say they would be facing this soon, since the solutions they were going to support had many different ends to work with.
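One way to picture the "fix and normalize" step is a small adapter per source that maps each incoming record onto a single common shape. The sketch below is only an illustration under assumptions: the two sources ("mobile" and "plant"), their field names, and the `Reading` type are all hypothetical and not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    """Common shape every source is normalized into (hypothetical)."""
    source: str
    device_id: str
    taken_at: datetime  # always timezone-aware UTC
    celsius: float


def from_mobile(rec: dict) -> Reading:
    # Assumed mobile payload: numeric id, epoch seconds, Celsius as text.
    taken = datetime.fromtimestamp(rec["ts"], tz=timezone.utc)
    return Reading("mobile", str(rec["deviceId"]), taken, float(rec["temp_c"]))


def from_plant(rec: dict) -> Reading:
    # Assumed automation payload: padded unit name, ISO 8601 time, Fahrenheit.
    taken = datetime.fromisoformat(rec["time"])
    if taken.tzinfo is None:
        taken = taken.replace(tzinfo=timezone.utc)  # assumption: naive times are UTC
    celsius = (float(rec["temp_f"]) - 32) * 5 / 9
    return Reading("plant", rec["unit"].strip().upper(), taken.astimezone(timezone.utc), celsius)


# One adapter per source; supporting a new source means adding one entry here.
ADAPTERS = {"mobile": from_mobile, "plant": from_plant}


def normalize(source: str, rec: dict) -> Reading:
    """Convert a raw record from a known source into the common shape."""
    return ADAPTERS[source](rec)


if __name__ == "__main__":
    print(normalize("mobile", {"deviceId": 17, "ts": 0, "temp_c": "21.5"}))
    print(normalize("plant", {"unit": " pump-3 ", "time": "2024-01-01T00:00:00+00:00", "temp_f": 212}))
```

The point of the design is that everything downstream (storage, reporting, analysis) only ever sees `Reading`, so the "does this work with that system?" question is answered in one place per source.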

You tend to think of this as a high-end issue where many systems are involved, but it’s becoming more commonplace as different types of applications and environments (like mobile devices and automation systems) start "reporting in" as they work. These don’t require extravagant commercial systems; they are often made up of normal, everyday applications. Building the solutions that talk to and work with these can be extremely challenging.

I think, though, that as you build out your applications and the support for them, you should count on different sources and destinations of data. For many, it’s only a passing consideration right now, but it should probably carry much more weight in your planning. Assuming that information will be coming from, and going to, far more diverse sources and destinations will leave you much better prepared in the future.