Editorials

Is Disk Space Usage Becoming More, or Less Predictable?

Traditional suggestions for running your systems included modeling your disk usage and other resources so you can predict your system growth, utilization and overall requirements as time goes on.

I’ve been working with a couple of systems on just this type of profiling. It’s been interesting to see that the traditional model of "we’ve grown at this rate before, so we can at least use it as a foundation going forward" doesn’t "feel" like it still works. We’ve ended up applying growth factors that increase so much over time that the estimates almost seem silly.

This comes not only from a growing business, but also from a growing data hoard. It almost seems like the data is reproducing in the database. Of course, what’s really happening is what’s happening everywhere: people are storing more information and getting rid of (archiving, deleting, whatever) much, much less. We found ourselves essentially using factors (multiply by this much each period) rather than "ramps" (add this much each period) to show the growth.
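To make the "factors" versus "ramps" distinction concrete, here is a minimal sketch. The function names, the 1,000 GB starting size, the 50 GB monthly ramp and the 5% monthly factor are all made up for illustration; none of them come from the systems described above.

```python
def project_ramp(current_gb, monthly_increase_gb, months):
    """Linear 'ramp': add a fixed amount of storage every month."""
    return current_gb + monthly_increase_gb * months


def project_factor(current_gb, monthly_factor, months):
    """Multiplicative 'factor': multiply storage by a fixed ratio every month."""
    return current_gb * monthly_factor ** months


if __name__ == "__main__":
    start_gb = 1000.0  # hypothetical starting size
    for months in (6, 12, 24):
        ramp = project_ramp(start_gb, 50.0, months)      # assumed +50 GB per month
        factor = project_factor(start_gb, 1.05, months)  # assumed x1.05 per month
        print(f"{months:>2} months: ramp {ramp:8.0f} GB, factor {factor:8.0f} GB")
```

The two projections start out close together and then pull apart, which is roughly why an estimate built on factors starts to look silly a couple of years out.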

This doesn’t seem to be the case with processor power on a given system. That seems to scale pretty predictably with utilization. The number of systems is a big question mark, though, because you have to address the increasing storage along with the reporting requirements, queries and so on. These are the things that tend to lead to more systems, data warehouses and (wait for it…) more storage.

How are you approaching this? How do you plan for storage management? Storage may be cheap, but whether you’re in the cloud, on-premises or a mix of the two, you still have to manage and plan for it. Having a reliable model is critical.

I’d suggest a couple of things:

– look at your most recent storage growth, something like the last 6 months. This is likely a more reliable predictor of growth going forward than looking back over longer periods.

– as you consider scaling storage, consider scaling servers (or instances) to handle functional loads (data warehouses, reporting, transactions, etc.). This can help mitigate the out-of-control growth, because you’ll be splitting your systems into pieces where you can more readily get your arms around the storage requirements and perhaps manage them more tightly.

– returning to the days of olde (online storage, near-line storage, offline storage…) has some merit. You can see this in new cloud offerings, and you can implement it at your own company. Using these types of approaches can ease the pain of getting rid of data while at the same time optimizing for what you really need ready access to.
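The first suggestion above, leaning on the most recent six months rather than the full history, can be sketched roughly like this. It assumes you already collect one total-size reading per month; the function names and the sample readings are hypothetical.

```python
def monthly_growth_factor(sizes_gb):
    """Average month-over-month growth factor across the samples.

    sizes_gb holds one size reading per month, oldest first.
    """
    if len(sizes_gb) < 2:
        raise ValueError("need at least two monthly samples")
    if sizes_gb[0] <= 0:
        raise ValueError("first sample must be positive")
    periods = len(sizes_gb) - 1
    # Geometric mean of the monthly ratios: (last / first) ** (1 / periods)
    return (sizes_gb[-1] / sizes_gb[0]) ** (1 / periods)


def recent_samples(sizes_gb, months=6):
    """Keep the last `months` intervals, which means months + 1 readings."""
    return sizes_gb[-(months + 1):]


if __name__ == "__main__":
    # Made-up readings: slow growth early on, faster growth lately.
    history_gb = [500, 505, 510, 515, 520, 526, 531,
                  560, 590, 622, 656, 692, 730]
    full = monthly_growth_factor(history_gb)
    recent = monthly_growth_factor(recent_samples(history_gb))
    print(f"full history: x{full:.3f} per month")
    print(f"last 6 months: x{recent:.3f} per month")
```

With readings like these, the full-history figure understates what the system is doing right now, which is the argument for weighting the recent window more heavily.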

What do you recommend? What types of growth are you seeing?

Shoot me an email at swynk@sswug.org or comment here on the page.