How Much Is Enough?
How much storage do you need?
It’s a vexing question, and one that most data center owners can’t answer off the top of their heads. In fact it’s a hard question to answer at all, in part because today’s networks have so many systems, subsystems, SANs, NAS devices and other elements that storage has become simply Byzantine.
The fact that data is growing at exponential rates does nothing to help the problem, and new forms of rich media, such as podcasts and video e-mail, only make things worse. What’s more, most capacity-planning tools are published by major storage vendors and are biased toward those vendors’ products as a result. They’ll tell you how much storage you need for Notes or Oracle, but not for your network as a whole.
But planning capacity is worth the headache. If nothing else, it’s a risk-management task that lets you balance the cost of storage against the risk of a shortfall, so you spend the dollars you have wisely. It also lets you defer a major purchase (and even minor ones) as long as possible, which takes advantage of the market’s built-in savings: Storage costs go down over time, so the same dollars will buy more if you spend them tomorrow or next month or even next year.
Methods & Madness
If the notion of capacity planning is daunting, take heart: There’s no shortage of methods, and the first is simply gut instinct. In fact it might be the de facto system in most data centers, though experts warn that it’s fraught with problems.
Too many engineers overbuy and under-use their storage, scared that a shortfall will cripple them on the day they need their storage the most. The result is not only imprecise, but costly: Unused storage is no different from throwing dollars down the drain.
A step up is a more organized approach with spreadsheets, one that looks at the nature of your network, your storage and your data, then quantifies the results into numbers you can use. We’ll outline a method below, but note that spreadsheets can eat up your time (the more storage you have, the longer it takes to analyze and crunch the numbers) and become obsolete quickly (you’ll have to update your spreadsheet when servers and systems are added, removed or simply modified).
Last, there are high-end Storage Resource Management (SRM) tools that do all this for you. Think of them as the Rolls-Royces of capacity planning: systems that can scan your network and discover storage, including unused hard disk space, so you don’t have to. They can also measure current storage and watch for trends over time, including growth rates and non-linear expansion. If you’re not a numbers geek, they can analyze the data for you, building reports that let you customize every feature from the font to the page format. Advanced filters let you view and analyze storage resources by usage, importance or type, so you can see how much storage you’ve used on all your Exchange servers, all your servers in Washington, or simply all the servers that have reached 80 percent of their capacity. And the better tools will even spit out lists of tasks, such as adding and removing hard drives, with alerts that tell you when to do them.
But all those features come at a cost. SRM tools can run more than $50,000 and have long deployments and purchase times.
If you’re on a budget and can’t afford the high-end tools you’d like, you can make do with some elbow grease and a copy of Excel. It’s not an ideal method, but it’s useful nonetheless.
First, determine what types of data you have and which systems they live on. You have e-mail and end-user data, but you might have ERP, CRM and Web apps as well. Don’t forget to factor in financial systems and a homegrown database or two, bearing in mind that different systems not only use different amounts of storage, but grow at different rates.
Armed with this information, you’ll need to establish some baselines for each system you’ve recorded: How much storage does it have? How much is in use? How quickly is it growing? It’s that last question that matters most. To find the answer, simply measure a system’s capacity at regular intervals, comparing the numbers to extract rates of growth by week, month or year.
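The growth-rate arithmetic is simple enough to sketch outside a spreadsheet. Here’s a minimal Python example that averages the change between periodic usage snapshots; the function name and the sample figures are illustrative, not from any particular system:

```python
def growth_rate(samples):
    """Average growth per interval from a series of usage snapshots (in GB)."""
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    return sum(deltas) / len(deltas)

# Four monthly usage snapshots for one system, in GB (made-up numbers).
monthly_usage = [400, 412, 425, 440]

# Average GB added per month over the measured period.
rate = growth_rate(monthly_usage)
```

The same idea works at weekly or yearly intervals; what matters is that the snapshots are taken at regular intervals so the deltas are comparable.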
In the end, you’ll have a spreadsheet (or several spreadsheets) with columns that include storage type, name, location, total capacity, amount used, percent used, amount free, percent free, growth rate, safety threshold and expected achievement date. The “safety threshold” is the highest usage level you’re willing to sustain on a system. You might, for instance, let a system reach 90 percent of its capacity before you expand it, but expand other systems (such as those with sudden spikes or growth spurts) as soon as they hit the halfway mark.
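The derived columns in that spreadsheet are straightforward arithmetic. As a hedged sketch, here is one row modeled in Python, with the computed fields filled in from the raw figures; the names and numbers are purely illustrative:

```python
# One spreadsheet row as a dict: raw figures first, derived columns computed.
row = {
    "name": "mail01",            # illustrative server name
    "location": "Washington",
    "capacity_gb": 500,
    "used_gb": 410,
    "growth_gb_per_month": 15,
    "threshold_pct": 90,         # safety threshold from the article's example
}

row["free_gb"] = row["capacity_gb"] - row["used_gb"]
row["pct_used"] = 100 * row["used_gb"] / row["capacity_gb"]
row["pct_free"] = 100 - row["pct_used"]
```

In Excel the same columns would simply be formulas referencing the capacity and usage cells.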
The “achievement date” is the date you expect to reach the threshold—that is, the date you’ll have to take action. Remember to plan your budget requests with ample time before your achievement dates. If not, you’ll be left holding the bag, and your systems—and end users and even upper management—will start to squawk. In contrast, good storage management will keep their feathers from flying and make your image soar as well.
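Projecting an achievement date is just headroom divided by growth rate. Here’s a small Python sketch of that calculation, using an assumed 30-day month and invented figures; none of this comes from the article beyond the concepts of threshold and growth rate:

```python
from datetime import date, timedelta

def achievement_date(capacity_gb, used_gb, growth_gb_per_month,
                     threshold_pct, today=None):
    """Project the date a system will hit its safety threshold."""
    today = today or date.today()
    # Headroom: GB remaining before the threshold is reached.
    headroom_gb = capacity_gb * threshold_pct / 100 - used_gb
    if growth_gb_per_month <= 0 or headroom_gb <= 0:
        return today  # already at or past the threshold, or not growing
    months = headroom_gb / growth_gb_per_month
    return today + timedelta(days=months * 30)  # assumes 30-day months

# A 1,000 GB system at 600 GB used, growing 20 GB/month, 90% threshold:
# 300 GB of headroom at 20 GB/month is 15 months out.
when = achievement_date(1000, 600, 20, 90, today=date(2024, 1, 1))
```

Subtract your procurement lead time from that date and you have the day the budget request needs to go in.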
David Garrett is a Web designer and former IT director, as well as the author of “Herding Chickens: Innovative Techniques in Project Management.” He can be reached at firstname.lastname@example.org.