Software-defined storage: Making sense of the data storage technology
A comprehensive collection of articles, videos and more, hand-picked by our editors
The question is not if, but when IT pros and decision-makers must rethink their data storage environments and the benefits they can expect from them. That's because the storage world is undergoing a seismic shift away from hardware-centric products to those driven almost completely by software and commodity hardware.
The goal of the emerging platforms is to replace complex storage systems with simpler and more-agile approaches. Learn how your organization can thrive in this new world order of fluid, agile and software-defined storage and data centers.
From whence we came
Regardless of how many times we're told to "adapt or die," human nature dictates that we avoid change because it's uncomfortable. And regardless of how frustrating our legacy storage systems are, over the years we've learned to manage their many nuances, including the following:
- Cost. Traditional storage has often been one of the most expensive, if not the most expensive, resources in the data center. But changing budget dynamics are forcing organizations to think differently about storage.
- Complexity. Storage has a reputation of being difficult to manage, and an entire industry has cropped up with the primary goal of simplifying this IT resource. Where we used to have to build out RAID groups and define LUNs and other esoteric constructs, newer data center architectures let us simply consume raw storage. In most cases, complexity hasn't disappeared; vendors have just moved it out of sight.
- Lack of scalability. To be clear, most storage can scale in some way, but traditional systems have had hard limits as to how far they could go and how easily they could get there. With storage capacity requirements growing at an increasing rate, the ability to quickly and easily scale that capacity -- and to do it affordably -- has become a key requirement for modern storage infrastructures.
- Lack of integration opportunities. Traditional storage systems remain in the majority of data centers. However, these systems are being supplanted as more capable options become available and as storage prices drop. Modern IT environments have moved on from custom-engineered hardware to software-centric designs built on commodity hardware. Infrastructure of all kinds, including storage, is now part of larger workflows and needs to be addressable programmatically.
Extend your storage investment
To say there are a multitude of storage options on the market for data center convergence is an extreme understatement. Whether you want to extend the life of your current investment or are interested in a forklift replacement, options abound.
Extend vs. replace
The decision about whether -- and when -- to extend existing storage rather than replace it carries important consequences. The longer you use a data storage product, particularly in the world of spinning disk and flash, the greater your chance of suffering a failure. All the software in the world won't fix physical hardware problems. Even with a four-hour support contract, that's four hours your systems could be down -- longer if you don't have an adequate business continuity plan in place. If your business simply can't survive hours of downtime, consider replacing your older storage infrastructure components with newer hardware. The peace of mind will be well worth the investment.
Because storage is one of the most expensive parts of the data center, many companies do everything they can to maximize their investment. However, these organizations often find that their existing systems lack the capacity, performance and flexibility required to transform IT operations and meet the demands of the modern data center. Fortunately, there are ways to augment an existing storage environment and imbue it with modern capabilities that the infrastructure alone may not support.
One of the most common ways to do that is caching. Deciding what kind of caching to adopt comes down to your goals and the kind of environment you're running. For example, if you're experiencing performance problems in a VMware environment, you may choose to deploy a software-based storage accelerator or caching tool on each host. These tools work in one of two ways:
- RAM disk redux. By consuming a bit of RAM on each host, the cache becomes a supercharged read repository. The downsides are that you're limited to a relatively small cache size and, depending on the software chosen, the cache may accelerate reads only. For certain scenarios, such as small virtual desktop infrastructure deployments, these limitations may not be that detrimental.
- Server-side flash. By leveraging server-side flash rather than expensive RAM, you can easily support read and write caching, making it an able accelerator for both read- and write-intensive workloads. The only real limit on cache size is the size of the flash storage device installed in the server.
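To make the two approaches concrete, the sketch below models the first one: a small, RAM-backed, read-only cache in front of a "slow" backing store. The class name, capacity, and dict-based backing store are illustrative assumptions, not any vendor's implementation; a flash-based cache would follow the same logic with a much larger capacity and the option to absorb writes as well.

```python
from collections import OrderedDict

class ReadCache:
    """Minimal sketch of a host-side, RAM-disk-style read cache.

    Reads are served from the cache when possible; writes pass straight
    through to backing storage and invalidate the cached copy, mirroring
    a read-only accelerator. All names here are hypothetical.
    """

    def __init__(self, backing, capacity=4):
        self.backing = backing            # dict standing in for an array/LUN
        self.capacity = capacity          # deliberately small, like a RAM cache
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            self.cache.move_to_end(block)     # LRU bookkeeping
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]            # the "slow" path to the array
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return data

    def write(self, block, data):
        self.backing[block] = data            # write-through: no write caching
        self.cache.pop(block, None)           # invalidate any stale cached copy
```

Repeated reads of a hot block hit RAM instead of the array; the trade-off, as noted above, is the small capacity and the lack of write acceleration.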
Storage abstraction and virtualization
But caching may not go far enough to meet your needs. You might have racks upon racks of storage from myriad vendors -- a combination of spinning disk, hybrid and all-flash systems, using different combinations of iSCSI, Fibre Channel, SMB and NFS. Some systems might sport data services such as deduplication, compression and encryption, while others offer none.
Or you might just have a number of aging disks that still work perfectly well but lack the critical data services and security features, such as encryption, required for today's fast-moving and often insecure IT and business environments.
For those in this position, software-centric options, such as those from Infinio Systems, allow you to keep the storage you have while adding new capabilities and making it simpler to manage. These offerings work by logically replacing the storage targets -- your existing controllers -- with a new, software-based front end. The physical controllers stay where they are; you place a series of highly available software-based storage controllers in front of them, and those new controllers become the targets for all your servers.
So rather than, for example, establishing an iSCSI connection from a vSphere host directly to an array, that iSCSI connection will instead point to this new series of software-based controllers. Behind the scenes, those new controllers connect to your existing storage systems while taking over the handling of front-end traffic. In the following figure, the orange boxes represent the new storage front end, which is depicted as a highly available cluster.
By jumping into the middle of the existing storage network, these new controllers can add capabilities to storage. For example, if you have storage that can't perform data deduplication, you can add it by placing these controllers in the data path. They take care of deduplicating and storing data on your current storage arrays. The same is true for other data services you may need, but do not currently have, such as encryption and compression.
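The deduplication case above can be sketched in a few lines: an in-path controller fingerprints each incoming block, stores each unique chunk on the backing array only once, and keeps a logical map from addresses to fingerprints. The class and its members are hypothetical illustrations, assuming fixed-size chunks and an in-memory map, not a description of any product's internals.

```python
import hashlib

class DedupFrontEnd:
    """Hypothetical in-path controller that deduplicates writes.

    self.chunks stands in for the existing backing array; real systems
    persist both the chunk store and the logical map, and typically
    chunk at finer, variable granularity.
    """

    def __init__(self):
        self.chunks = {}    # fingerprint -> unique chunk data
        self.volume = {}    # logical block address -> fingerprint

    def write(self, lba, data):
        fp = hashlib.sha256(data).hexdigest()   # content fingerprint
        if fp not in self.chunks:               # store each unique chunk once
            self.chunks[fp] = data
        self.volume[lba] = fp                   # logical map points at the chunk

    def read(self, lba):
        return self.chunks[self.volume[lba]]

    def physical_bytes(self):
        # Capacity actually consumed on the array, after deduplication.
        return sum(len(d) for d in self.chunks.values())
```

Writing identical data to three logical addresses consumes the physical space of one copy, which is exactly the service the front end adds on behalf of an array that can't do it natively.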
These centralized software-based storage front ends can also make a complex environment much easier to manage while adding considerable flexibility. Complex IT environments have storage services that run the gamut, with each service requiring some specialized skill to run it. By abstracting these systems through what amounts to virtualization, you can stop worrying about the day-to-day administration of disparate storage systems and services and focus your attention on consuming storage. Further, these types of products generally enable transparent use of cloud storage services.
Augmentation or replacement
Caching and abstraction products that extend existing storage are great, but many users prefer to start fresh. For them, a host of alternatives is available, from converged to hyper-converged to fully software-based storage systems.
Converged systems look a lot like traditional infrastructure, at least on the surface. However, while they may comprise existing products, the procurement model is radically different. Rather than buying individual components, including storage, you buy a full rack of pretested, prevalidated infrastructure that simply rolls into the data center. Alternatively, reference architectures -- hardware recipes -- include storage that's easy to integrate, manage and scale. Because this approach is hardware-centric, data center convergence takes the guesswork out of interoperability among servers, storage and communications fabrics.
Things get even more interesting in a hyper-converged infrastructure (HCI). Hyper-convergence collapses servers, storage and a hypervisor into a highly scalable appliance. HCI systems are software-centric, even when sold as appliances.
Hyper-convergence: Software or hardware?
The plethora of hyper-converged products on the market can make choosing the right platform difficult for even the most experienced IT professional. Some vendors sell hardware appliances, while others hawk software-only offerings for the "bring-your-own-hardware" approach to hyper-converged deployment:
- If an easy-to-deploy, all-in-one approach is what you're after, and you don't mind trading a bit of resource configuration flexibility to get it, I recommend the turnkey appliance or hardware route to hyper-convergence.
- If you prefer options that granularly dictate hardware configurations, and you're willing to do a bit more deployment work on your own, choose one of the many hyper-converged software products on the market.
Hyper-convergence lets you build out on-premises data centers that have some economic characteristics of the cloud. It accomplishes this by offering building blocks that scale granularly. If you need more capacity, simply add another node. Hyper-convergence takes the complexity out of storage by eliminating the need to manage it as a separate service. Storage is simply pooled across all the nodes in your cluster and consumed by the virtual machines running on each host.
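The node-by-node scaling math is simple enough to sketch. The function below is an illustrative back-of-the-envelope estimate, assuming identical nodes and simple mirroring; real platforms reserve additional capacity for metadata and rebuilds, so treat the numbers as an upper bound.

```python
def usable_capacity_tib(nodes, tib_per_node, replication_factor=2):
    """Rough usable capacity of a hyper-converged storage pool.

    Assumes every node contributes the same raw capacity and that data
    is mirrored replication_factor times across nodes. This is a
    planning sketch, not any vendor's sizing formula.
    """
    return nodes * tib_per_node / replication_factor
```

With this model, a four-node cluster of 10 TiB nodes mirrored twice yields about 20 TiB usable, and adding a fifth identical node grows the pool to 25 TiB -- the granular, add-a-node scaling described above.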
This operational simplicity is a hallmark of HCI, but there's more to the story. As is the case with many products in the software-defined storage family, vendors generally include powerful REST APIs for leveraging comprehensive data center automation initiatives. This makes such platforms particularly interesting for organizations considering DevOps-like frameworks. Now, you can treat infrastructure just like any other software element and address it programmatically, which presents a number of automation opportunities and possibly some powerful outcomes.
For example, with these kinds of APIs, developers can completely automate the refresh of test environments, saving the time required to manually tear down and rebuild them. Furthermore, as a part of production, software can keep a watchful eye on compute and storage resources and, if necessary, automatically create a new virtual infrastructure.
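A test-environment refresh of the kind just described might look like the sketch below. The base URL and the `/environments` routes are invented for illustration -- they are not any vendor's actual REST API -- and the function takes a requests-style session object so the call sequence, delete-then-recreate, is the point rather than any specific endpoint.

```python
API = "https://hci.example.internal/api/v1"   # hypothetical endpoint

def refresh_test_env(session, env_name, template):
    """Tear down and rebuild a test environment via a (hypothetical) HCI REST API.

    `session` is anything with requests-style delete()/post() methods,
    such as a requests.Session in production. Both routes below are
    assumptions for this sketch, not a documented vendor API.
    """
    # Remove the old environment; a 404 just means it's already gone.
    resp = session.delete(f"{API}/environments/{env_name}")
    if resp.status_code not in (200, 202, 204, 404):
        resp.raise_for_status()

    # Recreate it from a stored template with a single POST.
    resp = session.post(f"{API}/environments",
                        json={"name": env_name, "template": template})
    resp.raise_for_status()
    return resp.json()
```

Wrapped in a scheduled job or a CI pipeline step, a call like this replaces the manual tear-down-and-rebuild cycle entirely, which is the automation payoff the APIs exist to deliver.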
Getting started with hyper-convergence is easy. Depending on the vendor, you can start with as few as two nodes and give a platform a test drive with a simple application. As your proof of concept bears fruit, add more nodes until you have enough processing power and storage capacity to meet the needs of your application environment. You can then begin shifting workloads from your legacy environment to the new hyper-converged construct.
Regardless of the kind of storage challenge you face -- capacity, performance, cost, complexity and so on -- there is an available data center convergence option that can help.
IT convergence enters new areas
The future of storage lies in its convergence with memory