Decoupling Storage Capacity & Performance
Written by Lynn Orlando
Published on January 26, 2020
Say you were shopping for a cute hatchback. Every car you looked at would come with a little 4-cylinder engine. But what if you wanted more horsepower? “Great,” the dealer would say, “We also have V6 engines. But they only come in these minivans.” Don’t want a minivan? Too bad, that’s just how cars are made.
Sounds crazy, right? And yet, that's still how a lot of storage solutions work. Compute and storage are linked tightly together in a single physical node, with no way to scale one or the other independently. Want higher performance? Then you're also paying for more capacity, whether you need it or not. The reverse is true as well.
Fortunately, a new generation of storage systems is breaking that mold. Organizations finally have the freedom to scale in a way that aligns with their actual growth, instead of their vendors’ business models.
Solving The Performance Paradox
Tightly coupling compute with storage isn't really a nefarious plot. There are good reasons why storage historically came that way in Big Data and high-performance computing (HPC) environments—namely, performance.
Back when people first started analyzing huge amounts of data stored in distributed file systems like Hadoop, network interconnects were fairly slow (sometimes as slow as 100 Mbps). If you wanted to analyze a large data store and had to move the data across the network to be computed someplace else, it would take forever. By putting compute in the same physical host as the data, you could eliminate those delays and zip through your analysis much more quickly.
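To see why data locality mattered so much, here's a quick back-of-envelope sketch (with illustrative numbers, not measurements from any specific cluster) of how long it takes just to move a dataset over a network link:

```python
def transfer_time_seconds(data_gb: float, link_mbps: float) -> float:
    """Time to move data_gb gigabytes over a link_mbps megabit/s link.

    Simplified model: ignores protocol overhead and contention,
    and assumes the link is fully dedicated to the transfer.
    """
    bits_to_move = data_gb * 8e9        # 1 GB = 8e9 bits
    bits_per_second = link_mbps * 1e6   # 1 Mbps = 1e6 bits/s
    return bits_to_move / bits_per_second

# Moving a 10 TB dataset over an old 100 Mbps interconnect:
slow = transfer_time_seconds(10_000, 100)
print(f"100 Mbps: {slow / 86400:.1f} days")   # roughly 9 days

# The same dataset over a modern 100 Gbps link:
fast = transfer_time_seconds(10_000, 100_000)
print(f"100 Gbps: {fast / 60:.1f} minutes")   # roughly 13 minutes
```

At 100 Mbps, shipping the data to the compute takes days, so co-locating compute with storage was the only sane option. On modern interconnects, the same transfer takes minutes, which is what makes decoupling practical today.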
In solving that problem, though, the tightly coupled compute/storage model created a new one: inflexible storage systems. The only way to scale up was to buy an entire new node, adding both compute and storage. But in the real world, it's common to need only one or the other. It's no surprise that expanding your system almost always meant spending CapEx on extra resources you didn't need.
The Decoupling Revolution
In recent years, this trend has finally begun to change. The public cloud ushered in a model where you can break up computing needs into more granular categories: I want this type of processor, with this amount of memory, with this much storage, for this many hours per month. In this way, the cloud companies made it possible to align computing budgets more closely with actual needs. At the same time, they drove businesses to demand the same flexibility from their traditional vendors.
Today, even in HPC environments where performance is the top priority, storage systems are starting to follow the same trend. It’s now possible to get high-performance storage arrays that can scale capacity and performance independently. So, if you want the storage equivalent of a Honda Civic with a racecar engine, you can get it. Or, if you’re happy with your little Civic engine but want to add capacity for another 6 (or 8, or 20) passengers, you can do that too.
For the first time in high-performance storage, you can expand as your requirements change and only pay for what you actually need. It doesn't sound all that revolutionary, does it? It's more like common sense.