The rise of hyper-converged architecture for storage
Hyper-converged solutions became one of the hottest technologies of 2014, gaining mainstream credibility with the release last year of EVO:RAIL, a suite of hyper-converged hardware solutions from VMware and its selected partners. Hyper-convergence has developed rapidly, so we'll look at how it's defined and why you should give some serious consideration to products in this market segment as you build out your data center storage infrastructure.
In the late 1990s and early 2000s, Unix and x86 platforms emerged as mainframe replacements. In a contraction/expansion process that seems to reverberate regularly around the technology industry, closely coupled mainframe systems diverged into specializations based around networking, storage and servers. As server virtualization gained ground, this divergence of technologies resulted in large organizations deploying huge teams of highly skilled (and well-paid) technology specialists. This siloed approach resulted in technology sprawl and increasing operational costs, in many cases dwarfing the acquisition costs of the technologies themselves.
A converged infrastructure was the first step in simplifying the operational issues associated with the siloed model. Those products combined technologies from multiple vendors into a single stack of compute (servers), storage and networking, usually sold as a single stock keeping unit (SKU). Converged solutions reduce operational complexity by narrowing the technology choices that are deployed and by providing a single point for management and support -- the proverbial "one throat to choke." In most cases, converged infrastructure doesn't make the hardware platform cheaper to buy, but the overall total cost of ownership (TCO) is lower due to reduced operational costs.
Hyper-converged infrastructure defined
Hyper-converged solutions take the integration process introduced with a converged infrastructure one step further. The physical components of storage and compute are combined into a single physical form factor, typically a rack-mounted server, using commodity DAS. Resiliency that would typically be achieved in storage systems through the use of dual-controller-type architectures is implemented in hyper-converged solutions by scaling out with multiple nodes -- a feature that would already be in place to support server resiliency and failover for the hypervisor.
Hyper-converged offerings typically use commodity hardware (although some still have bespoke components) rather than custom ASIC or field-programmable gate array (FPGA) chips used in dedicated storage systems. As a result, the secret sauce or key differentiators of hyper-converged products are baked into the software, which is where the main benefits are derived.
All of today's hyper-converged solutions are built on a server hypervisor, such as VMware vSphere, Microsoft Hyper-V or open source KVM.
A key feature of hyper-converged solutions is the use of distributed storage. DAS components from each physical server are combined to create a logical pool of disk capacity that uses all resources in the scale-out node cluster. This scale-out technique provides a number of benefits, including:
- Resiliency: Data protection is implemented across multiple nodes, providing for the loss of any single disk or even an entire node.
- Performance: I/O for any single virtual machine (VM) can be distributed across an entire cluster of servers. This allows the I/O bandwidth of many hard disk drives or solid-state drives to be aggregated. Where data resides on the same node as the VM that uses it, the latency of hyper-converged storage can be lower than that of an external SAN-connected array.
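The pooling and resiliency ideas in the list above can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's placement algorithm: the `Node` class, the two-copy replication factor and the hash-based placement rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class Node:
    name: str
    capacity_gb: int                          # raw DAS capacity in this server
    blocks: set = field(default_factory=set)  # IDs of blocks stored on this node


def pool_capacity_gb(nodes, replicas=2):
    """Usable capacity of the logical pool when each block is stored `replicas` times."""
    return sum(n.capacity_gb for n in nodes) // replicas


def place_block(block_id, nodes, replicas=2):
    """Store a block on `replicas` distinct nodes, chosen from a hash of its ID."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    start = int(hashlib.sha256(block_id.encode()).hexdigest(), 16) % len(nodes)
    chosen = [nodes[(start + i) % len(nodes)] for i in range(replicas)]
    for node in chosen:
        node.blocks.add(block_id)
    return chosen


def readable_after_failure(block_id, nodes, failed_name):
    """True if at least one surviving node still holds a copy of the block."""
    return any(block_id in n.blocks for n in nodes if n.name != failed_name)


# Hypothetical four-node cluster, 4 TB of DAS per node.
cluster = [Node(f"node{i}", 4000) for i in range(4)]
block_ids = [f"vm1-block{i}" for i in range(100)]
for b in block_ids:
    place_block(b, cluster)

print(pool_capacity_gb(cluster))
print(all(readable_after_failure(b, cluster, "node2") for b in block_ids))
```

Because every block lands on two different nodes, the loss of any one node leaves a readable copy, and a single VM's blocks end up spread over the whole cluster. The price is usable capacity: with two copies, the pool offers half of the raw disk space.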
The use of scale-out technology means local commodity DAS can be used in place of a more expensive dedicated SAN-based storage system. The storage component of hyper-convergence is implemented either as a VM across the infrastructure or, in the case of VMware, as a kernel module (VMware's Virtual SAN technology). There's wide debate on whether integrating storage into the kernel is a better solution than keeping it out. Kernel proponents (e.g., VMware) say this kind of solution is more resilient than VM-based implementations, as the storage features aren't impacted by the activity of other virtual machines.
In contrast, those advocating VM-based storage will point to the benefits of separating storage from the hypervisor "operating system" in the same way that shared SAN storage removed data from the server. Claimed benefits include the ability to upgrade more flexibly, fault isolation (storage doesn't take compute down), and performance and security isolation. In either case, what matters most is the quality of the implementation.
Triple towers of IT are merging
Hyper-converged solutions collapse the traditional towers of storage, networking and compute into the single form factor of the server. The watchword of these products is simplification; they remove the need for specialist skills like storage management and therefore have been enthusiastically adopted by large and small enterprises alike. For many, hyper-converged will follow server virtualization as the de facto standard for deploying new workloads.
Software-only vs. hardware hyper-converged products
Before we discuss some of the benefits (and disadvantages) of using hyper-converged solutions, we should pause a moment and discuss the delivery model. Hyper-converged solutions can be delivered either as appliances, providing both the hardware and the software, or as software-only products.
Products that ship as appliances have a number of distinct advantages over pure software offerings:
- Integration tested. Vendors have performed all the integration testing with individual components to ensure the configuration runs efficiently. This means, for example, that the most appropriate host bus adapters and SCSI controllers will be implemented and validated for performance and reliability. As systems are upgraded, vendors have a smaller subset of hardware to test, making the upgrade process easier to control.
- Performance benchmarked. Vendors can benchmark their own solutions, providing good guidelines as to how many VMs a configuration can be expected to support. This gives users more control over the specific model(s) and quantities they need to purchase to meet a defined requirement.
Proponents of software-only solutions say their products remove the "hardware tax" that vendors charge for performing all the component validations. For organizations that are already comfortable with a specific hardware supplier, a software-only solution lets them simply deploy on hardware that may already be in place or that can be acquired more cheaply under an existing supplier agreement. The downside is the loss of that "one throat to choke," so diagnosing specific problems (as has been seen with VMware VSAN and SCSI controllers) can be a significant challenge.
Hyper-converged benefits and disadvantages
Whether in appliance or software-only form factors, hyper-converged infrastructure products offer users appealing benefits but, as expected, they also have some disadvantages.
Ease of deployment. This is probably the most widely quoted cost and resource saving of hyper-convergence. Hyper-converged solutions can typically be installed and powered up within a matter of hours, rather than the days and weeks needed to implement a large-scale virtual server solution from scratch. This saving is typically more likely to be experienced by smaller organizations that can't afford dedicated engineering teams to put solutions together. Deployment benefits, of course, were among the strongest attractions of the first converged infrastructure solutions to appear.
Lower cost. It's debatable whether hyper-converged solutions are cheaper than deploying a custom virtual server solution, at least from a hardware perspective. However, when operational costs are also taken into consideration, hyper-converged solutions typically result in lower costs for many organizations.
Ease of management. Hyper-converged solutions can offer users easier management than custom solutions. For example, over time hyper-converged nodes can be retired from a cluster as new ones are added, providing a continuous upgrade path. In addition, vendors are working to improve the ecosystems of their products by adding or improving monitoring and alerting functions that allow them to provide proactive support on hardware failures.
Resource depletion. Because nodes in a hyper-converged solution provide both compute and storage, which itself can be divided into capacity and performance, there's always a risk that additional capacity for compute or storage will need to be purchased before the other is fully utilized. Vendors have attempted to address this issue by delivering multiple node configurations and supporting asymmetric node configurations, allowing many different node types to be mixed within the same configuration.
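The resource depletion risk is easy to see with some arithmetic. The sketch below assumes identical nodes that each add a fixed amount of compute and storage; the function name and all of the workload and node figures are hypothetical and chosen only to make the imbalance visible.

```python
import math


def nodes_required(vcpus_needed, tb_needed, vcpus_per_node, tb_per_node):
    """Node count when every node adds a fixed amount of compute and storage."""
    for_compute = math.ceil(vcpus_needed / vcpus_per_node)
    for_storage = math.ceil(tb_needed / tb_per_node)
    nodes = max(for_compute, for_storage)

    if for_compute > for_storage:
        driven_by = "compute"
    elif for_storage > for_compute:
        driven_by = "storage"
    else:
        driven_by = "both"

    return {
        "nodes": nodes,
        "driven_by": driven_by,
        # Resources that were bought but are not needed by the workload.
        "idle_vcpus": nodes * vcpus_per_node - vcpus_needed,
        "idle_tb": nodes * tb_per_node - tb_needed,
    }


# Hypothetical storage-heavy workload on nodes with 32 vCPUs and 10 TB each.
plan = nodes_required(vcpus_needed=96, tb_needed=120,
                      vcpus_per_node=32, tb_per_node=10)
print(plan)
```

In this made-up case, storage demand forces the purchase of 12 nodes even though three would cover the compute requirement, leaving 288 vCPUs idle. Storage-dense or compute-dense node types, as mentioned above, are one way to narrow that gap.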
Best use cases for hyper-converged systems
As with any storage technology purchase decision, an evaluation of hyper-converged solutions must start with how they may fit into one's data center environment. Initially, these products were adopted primarily by small to midsize enterprises, especially those strapped for resources and looking for simplified operations. From an application perspective, there are no restrictions on the applications that can be deployed, although those requiring high performance may not be as suitable (some vendors are addressing performance issues). So companies are using hyper-converged systems for all types of workloads, making them a challenge to traditional vendors selling individual component architectures. On the other hand, hyper-converged systems may be appropriate for more discrete tasks, such as supporting virtual desktop infrastructure environments or hosting other types of standalone applications.
Sampler of hyper-convergence products
Nutanix (appliance). Nutanix Inc. is probably the best known appliance-based hyper-converged provider. The company was founded in 2009 and shipped its first products in 2011 under the branding of "No SAN" long before the term hyper-converged became part of the IT lexicon.
Nutanix's main product is the Virtual Computing Platform (VCP), a scale-out, node-based product originally built on VMware's vSphere ESXi hypervisor that now supports Hyper-V and KVM. VCP implements storage functionality with a feature called the Nutanix Distributed Filesystem (NDFS), a Google File System-like distributed storage layer that uses features such as MapReduce to implement data deduplication and other space-efficiency features.
The Nutanix hardware platform is available in six different model series of increased capacity and performance, including the new all-flash NX-9000. Nutanix recently partnered with Dell to distribute the Nutanix software on Dell hardware.
SimpliVity (appliance). SimpliVity Inc. was also founded in 2009, launching its first products in April 2013. The company sells OmniCube, a scale-out, node-based appliance. SimpliVity offloads some of OmniCube's data optimization tasks to a dedicated PCI Express card that manages data deduplication and compression in real time. This data reduction is globally federated, allowing OmniCube deployments to be geographically dispersed for data protection, as only new data needs to be replicated between locations (after initial VM images have been seeded).
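The reason deduplication cuts replication traffic can be shown with a small Python sketch. It is a conceptual illustration only and makes no claim about how OmniCube works internally (SimpliVity does this work on a dedicated PCI Express card); the `Site` class, the SHA-256 fingerprints and the 4 KB block size are assumptions for the example.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Content hash used to recognize identical blocks."""
    return hashlib.sha256(block).hexdigest()


class Site:
    def __init__(self, name):
        self.name = name
        self.store = {}  # fingerprint -> block data; each unique block kept once

    def write(self, block: bytes) -> str:
        fp = fingerprint(block)
        self.store.setdefault(fp, block)  # a duplicate write stores nothing new
        return fp


def replicate(source: Site, target: Site) -> int:
    """Send only the blocks the target does not already hold; return bytes sent."""
    sent = 0
    for fp, block in source.store.items():
        if fp not in target.store:
            target.store[fp] = block
            sent += len(block)
    return sent


primary, remote = Site("primary"), Site("remote")

# Seed the remote site with a hypothetical 8-block VM image.
base_image = [bytes([i]) * 4096 for i in range(8)]
for blk in base_image:
    primary.write(blk)
seeded = replicate(primary, remote)

# A second VM cloned from the same image, plus one block of new data.
for blk in base_image:
    primary.write(blk)
primary.write(b"\xff" * 4096)
delta = replicate(primary, remote)

print(seeded, delta)
```

After the initial seeding, the cloned VM costs nothing to replicate because its blocks already exist at the remote site; only the one new 4 KB block crosses the link.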
Scale Computing (appliance). Scale Computing has taken a different approach to delivering hyper-convergence, including using the open source KVM hypervisor. This has required the company to build its own clustered block-based storage layer known as Scale Computing Reliable Independent Block Engine (Scribe). Scribe abstracts the physical storage and implements I/O caching across all available resources. The company also created a state engine to manage the status of all physical hardware components.
Scale Computing is focused on the SMB/SME end of the market, and is looking to capitalize on the use of open source software to remove the "VMware tax" of other solutions.
Nimboxx (appliance). Nimboxx Inc. is another hyper-convergence vendor that built its product using open source software, KVM in particular. The company claims its systems can be deployed in less than 10 minutes and deliver up to 10 times greater storage performance than its competitors.
Gridstore (appliance). Gridstore Inc. has recently repositioned its Hyper-V-based scale-out storage solution as a hyper-converged offering called the Gridstore HyperConverged Appliance. The product line includes systems with all-flash or hybrid storage. Storage capacity can also be increased by adding dedicated hybrid or capacity storage nodes.
VMware EVO:RAIL (Software, reference model). The EVO:RAIL platform was announced by VMware Inc. at VMworld 2014 and is a collaboration between VMware (providing the software) and a range of hardware partners that currently includes Dell, EMC, Fujitsu, Hewlett-Packard, Hitachi, NetApp and Super Micro Computer. The solution delivers on VMware's vision for the software-defined data center and includes components such as VMware Virtual SAN to provide the distributed storage layer and quick-start custom software for rapid deployment. The major negative to using EVO:RAIL is the tie-in to VMware as this is the only supported hypervisor; however, that may not be an issue for many customers.
In the software-only category, solutions are available from VMware (using VSAN as a software package only), Atlantis Computing, DataCore Symphony, EMC ScaleIO, Maxta Inc. and StarWind. These products aren't as fully packaged and feature-rich as the appliance offerings, so they require some degree of customer expertise to handle the required integration.