When hyper-converged systems first emerged, they were mainly targeted at the small and medium-sized business market. Because compute and storage functions share the same hardware, most enterprises considered the risk of applications degrading storage performance intolerable. As a result, enterprises favored converged systems -- discrete components such as servers, storage, switches, operating systems and applications packaged into turnkey appliances -- over hyper-converged infrastructures. That has been changing with the emergence of new hyper-converged systems that better orchestrate storage and compute resources, scale horizontally and have the elasticity to adjust to changing workloads.
Hyper-converged storage infrastructures are scale-out systems with nodes that are added and aggregated into a single platform. Each node performs compute functions, running virtualization software in the form of virtual machines (VMs) or containers, as well as storage functions. Local storage on each node is aggregated and presented as storage pools to hypervisor instances. Nodes are networked via 1 Gigabit Ethernet or 10 GbE, typically through top-of-rack switches. Most contemporary hyper-converged systems deliver data protection under the hyper-converged umbrella, eliminating the requirement for third-party data protection software and appliances. Some products incorporate native integration with clouds and network traffic optimization, removing the need for cloud gateways and network optimizers.
The allure and increasing adoption of hyper-converged systems in the enterprise can be traced to a few key merits and trends. To start, the convergence of compute and storage enables inherent correlation of storage transactions with activities inside VMs at a very granular level. Hyper-convergence simplifies system management by enabling control of compute, storage and networking functions through a single console. It substantially increases resource utilization by aggregating compute and storage into shared pools. Finally, the scale-out design of hyper-converged systems enables linear horizontal scaling by adding nodes. Together, these merits result in substantial cost savings. Furthermore, hyper-convergence closely aligns with the emerging software-defined data center paradigm, as well as the shift from traditional tiered enterprise applications toward containerized, microservice-based applications.
This article describes how hyper-converged infrastructure systems differ from traditional storage systems and provides insights into some of their unique features. The following are the top considerations buyers should have on their checklist for evaluating, purchasing and deploying these storage systems.
Hyper-converged system architecture
The architecture of contemporary hyper-converged infrastructures varies substantially and warrants a look under the hood. All hyper-converged systems aggregate storage attached to individual nodes into one or multiple shared pools that VMs on each node consume. A hyper-converged system should support both solid-state drives and disk drives, and should operate in all-flash, all-disk and hybrid configurations.
In hybrid systems with both flash and disk drives, fully automated tiering support -- where hot data automatically moves to an SSD tier -- is the feature to look for. The automated tiering ensures consistent high performance and eliminates the inefficient manual assignment of storage tiers.
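The promotion logic described above can be sketched in a few lines. This is a deliberately minimal model, assuming a simple access-count threshold; real systems use much richer heat metrics and demotion policies as well.

```python
from collections import Counter

class TieringEngine:
    """Toy model of automated storage tiering: blocks whose access
    count crosses a threshold are promoted to the flash tier.
    Illustrative only -- real tiering engines also demote cold data
    and weigh recency, not just raw access counts."""

    def __init__(self, promote_threshold=3):
        self.promote_threshold = promote_threshold
        self.access_counts = Counter()
        self.tier = {}  # block_id -> "ssd" or "hdd"

    def record_access(self, block_id):
        self.access_counts[block_id] += 1
        if self.access_counts[block_id] >= self.promote_threshold:
            # Hot block: promote to the flash tier.
            self.tier[block_id] = "ssd"
        else:
            # Newly seen or still-cold blocks land on the disk tier.
            self.tier.setdefault(block_id, "hdd")

engine = TieringEngine(promote_threshold=3)
for block in ["a", "b", "a", "a", "c"]:
    engine.record_access(block)

print(engine.tier["a"])  # "ssd" -- accessed 3 times, promoted
print(engine.tier["b"])  # "hdd" -- cold block stays on disk
```

The point of the model is that placement decisions are automatic and continuous, which is exactly what eliminates manual tier assignment.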
Hyper-converged systems have VMs and storage spread across many nodes. For the best performance, a hyper-converged system should automatically keep the storage allocated to a VM in close vicinity, ideally on the same host and certainly not in a separate data center. It's important to understand how the system maintains data locality as VMs move from one host to another and how live migrations affect performance.
Contemporary hyper-converged systems differ in how they implement the convergence of storage and compute. Storage functions can be delivered by a dedicated controller VM that multiple application VMs access, or implemented as a lower-level system function that runs outside of VMs. A single storage controller VM serving multiple application VMs is more prone to performance contention, so you should know how the hyper-converged system addresses the risk of oversubscription.
Today's hyper-converged systems are mostly software-defined products, able to run on a variety of hardware and potentially in the cloud, with the benefit of delivering new functionality through software updates. Some vendors introduce hardware dependencies in the form of PCI Express acceleration cards. Even though hardware acceleration diminishes some of the software-defined benefits, it enables features and boosts the performance of specific system capabilities. To objectively assess these benefits, look at metrics and data that compare competing products: just because one vendor requires hardware acceleration doesn't mean a competitor can't achieve the same results in software.
Modern hyper-converged systems also differ in what they converge in addition to compute and storage. Most vendors claim to converge networking, but that generally means eliminating separate storage networks, such as Fibre Channel; today's hyper-converged systems still depend on dedicated top-of-rack or existing Ethernet switches. Networking is a critical aspect of a hyper-converged system, and as the number of nodes increases, troubleshooting network-related problems can be daunting. Network configuration options, how the network is managed, self-optimizing network capabilities, and the monitoring, alerting and reporting options related to networking are therefore all pertinent aspects to validate.
Level of scalable performance
Hyper-converged products claim to scale compute and storage resources proportionally as nodes are added, but contemporary systems differ in the granularity and extent of that scaling. Some products scale in single-node increments; others require adding two or even four nodes at a time. Some organize hosts into clusters with a limited number of hosts per cluster, while others have no cluster limit and scale by simply adding nodes. The claim of proportional scalability may hold under defined circumstances but break down outside them.
Among the workloads to validate are backup and restore. Because enterprises run data protection tasks frequently to meet required recovery point objectives (RPOs) and recovery time objectives (RTOs), those tasks cannot be allowed to degrade performance. The claim of proportional scaling should be validated by running a large number of VMs with varying workloads that tax storage throughput, IOPS and latency on one host, then adding identically configured hosts and measuring whether performance actually increases in step.
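The measurement described above reduces to a simple calculation: compare the aggregate IOPS you measure at each node count against what perfectly linear scaling from the single-node baseline would predict. A minimal sketch, with hypothetical proof-of-concept numbers:

```python
def scaling_efficiency(baseline_iops, measured):
    """Compare measured aggregate IOPS at each node count against
    perfectly linear scaling from a single-node baseline.
    1.0 means perfectly proportional scaling; lower values show
    how much efficiency is lost as the cluster grows."""
    return {
        nodes: round(iops / (baseline_iops * nodes), 2)
        for nodes, iops in measured.items()
    }

# Hypothetical measurements from a proof-of-concept run:
# a single host delivered 100,000 IOPS under the test workload.
results = scaling_efficiency(
    baseline_iops=100_000,
    measured={2: 190_000, 4: 360_000, 8: 640_000},
)
print(results)  # {2: 0.95, 4: 0.9, 8: 0.8} -- efficiency erodes with scale
```

The same efficiency ratio can be computed for throughput and latency; a system whose ratio degrades sharply beyond a certain node count has effectively revealed the boundary of its "proportional scaling" claim.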
Because most hyper-converged infrastructures claim some support for geo-distributed configurations, it is pertinent to understand the performance impact of the geo-distributed configuration you plan to deploy. Although you should perform proof-of-concept testing of the selected system before you submit a purchase order, exhaustive evaluation may not always be an option.
In addition to your own testing, you should request internal testing results and independent test reports by third parties. If available, industry-standard benchmarks from the Storage Performance Council and Standard Performance Evaluation Corporation show standardized performance results that can be directly compared with other systems tested by these nonprofit organizations.
Reliability and resiliency
Hyper-converged systems are distributed systems, so they need to be designed with a degree of resiliency that tolerates multiple component failures within a host and across hosts. Some systems depend on replication, while others support erasure coding to protect system configuration and data. Redundant copies of state information need to be distributed across multiple nodes within a cluster to enable automatic failover if a node becomes unavailable. Note that some products rely on a traditional dual-controller architecture rather than a truly distributed one; only the latter scales without fixed limits while maintaining high resiliency.
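The trade-off between replication and erasure coding comes down to storage overhead versus fault tolerance, and it can be quantified directly. A minimal sketch using generic parameters (the specific copy counts and fragment layouts vary by product):

```python
def replication_overhead(copies):
    """N full copies: N units of raw storage per usable unit,
    and the data survives N-1 simultaneous failures."""
    return {"raw_per_usable": copies, "failures_tolerated": copies - 1}

def erasure_overhead(data_frags, parity_frags):
    """k data + m parity fragments: (k+m)/k raw storage per usable
    unit, and the data survives m simultaneous fragment failures."""
    return {
        "raw_per_usable": (data_frags + parity_frags) / data_frags,
        "failures_tolerated": parity_frags,
    }

print(replication_overhead(3))  # 3x raw storage, survives 2 failures
print(erasure_overhead(4, 2))   # 1.5x raw storage, survives 2 failures
```

The comparison shows why erasure coding is attractive at scale: a 4+2 scheme tolerates the same two failures as three-way replication at half the raw capacity, at the cost of extra compute during writes and rebuilds.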
Regardless of the underlying architecture, hyper-converged systems must feature nondisruptive upgrades and self-healing capabilities. In other words, if components or hosts fail, the system needs to continue running and automatically shift workloads to other hosts without any performance degradation or system disruption.
Efficient data protection
Data protection and recovery capabilities -- such as snapshots and replication of workloads locally, to remote data centers and in the cloud -- are common in hyper-converged systems. Products differ in how efficiently they perform snapshots and replication. Data efficiency affects bandwidth use and recovery time, so you should determine what RPO and RTO a system can support.
Some systems achieve highly efficient data protection by combining techniques like a distributed object store and deduplication, minimizing the amount of data stored and enabling rapid restores.
Real-time deduplication and compression are critical differentiating features. They minimize the amount of data at rest and greatly reduce replication bandwidth requirements. Reducing the amount of data across wide area networks through deduplication and compression also reduces latency. Products substantially differ in how they implement deduplication and compression, and because the impact of both features is so substantial, you should look at these features closely. Finally, support for synchronous replication and metro cluster support that straddles multiple data centers enables continuous operation in the case of a data center loss or major disruption in one of the data centers.
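The data reduction pipeline described above can be illustrated with a simplified model: split the stream into fixed-size chunks, store each unique chunk once (compressed), and keep an ordered recipe of chunk hashes for reconstruction. This is a sketch of the general technique, not any vendor's implementation; real systems often use variable-size chunking and inline hardware assists.

```python
import hashlib
import zlib

def dedupe_and_compress(data, chunk_size=4096):
    """Fixed-size chunk deduplication followed by compression of the
    unique chunks -- a simplified model of the data reduction applied
    before storing or replicating."""
    store = {}   # chunk hash -> compressed unique chunk
    recipe = []  # ordered hashes needed to reconstruct the stream
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return store, recipe

def rehydrate(store, recipe):
    """Rebuild the original stream from the recipe."""
    return b"".join(zlib.decompress(store[h]) for h in recipe)

# Highly redundant data: 16 chunks referenced, but only 1 stored.
data = b"A" * 4096 * 16
store, recipe = dedupe_and_compress(data)
print(len(recipe), len(store))  # 16 1
```

Only the unique compressed chunks and the compact recipe need to cross the WAN during replication, which is the source of the bandwidth and latency savings the paragraph above describes.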
Ecosystem support and integration
While most hyper-converged infrastructures support VMware vSphere, support for Microsoft Hyper-V and open source hypervisors like KVM is less prevalent in contemporary hyper-converged systems. While VMware is the most mature and widely used hypervisor, Hyper-V may be the hypervisor of choice for companies with Microsoft back-office applications, because it incurs lower licensing costs than vSphere and offers benefits with Microsoft applications such as Microsoft SQL Server. Open source hypervisors have the advantage of zero licensing cost, which is especially significant in large-scale deployments.
Hyper-converged systems should be able to extend into private and public clouds. Some hyper-converged systems support clouds for backup, disaster recovery and archiving, with Amazon Web Services as the most supported cloud service. Other public cloud services, such as offerings from Google and Microsoft and private clouds based on OpenStack, are not typically supported. At a minimum, you should know if cloud services relevant to your environment are on the vendor's roadmap. It is also pertinent to understand other workloads the vendor plans to extend into the cloud. Because hyper-converged systems are usually software-defined, extending workloads, including fully operational VMs, into the cloud is technically possible and enables complete, hybrid, hyper-converged systems.
Hyper-convergence is largely about simplification, and system management is an important aspect of it. Available products commonly provide a single, intuitive console for managing all aspects of their hyper-converged systems. While some vendors provide their own management tools to perform all tasks, including hypervisor configurations, others plug into and extend hypervisor management tools, such as vCenter.
While manual configuration works for a small number of hosts, it becomes complex and error-prone as the number of hosts increases. To efficiently support many hosts you need:
- The ability to define policies that drive actions
- The ability to schedule tasks
- Activity orchestration
In addition, REST APIs that enable third-party tools and custom scripts to interface with hyper-converged systems are essential to enable automation.
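As a sketch of the kind of automation a REST API enables, the snippet below builds a request that asks a cluster to snapshot a VM. The endpoint path, host name and payload schema here are hypothetical; consult your vendor's API reference for the real resource paths and fields.

```python
import json
import urllib.request

# Hypothetical API base URL -- real hyper-converged systems publish
# their own resource paths and authentication schemes.
API_BASE = "https://hci-cluster.example.com/api/v1"

def build_snapshot_request(vm_id, retention_days):
    """Build (but do not send) a REST request asking the cluster to
    snapshot a VM -- the kind of call a custom script or third-party
    tool would automate against a hyper-converged system's API."""
    payload = json.dumps(
        {"vm": vm_id, "retention_days": retention_days}
    ).encode()
    return urllib.request.Request(
        f"{API_BASE}/snapshots",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_snapshot_request("vm-042", retention_days=7)
print(req.method, req.full_url)
# Sending it would be a one-liner: urllib.request.urlopen(req)
```

Wrapping calls like this in scheduled jobs or policy engines is what turns the single-console model into hands-off operation across many hosts.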
Finally, transparency into hyper-converged systems, delivered through reports and analytics on performance, usage, errors and resource planning, is crucial.
Additional areas to consider
Vendor's future. Roadmaps that illustrate where the vendor will take the hyper-converged system help ensure you choose a product that aligns with your business requirements. Besides hypervisor and cloud support, support for container-based virtualization is relevant: there should be a clear path to supporting containers alongside VMs. As Docker containers gain traction in the enterprise, it's only a matter of time before containerized applications will have to be supported.
Support policies. To ensure timely resolution of issues, you need 24/7 global support. Hyper-converged infrastructures are not single-vendor systems: at a minimum, a hyper-converged system vendor, a hypervisor vendor and a networking vendor are involved. To avoid finger-pointing, universal support from the hyper-converged system vendor is critical; it should own every support request, regardless of which component the issue involves.
Cost model. Financial flexibility and cost undoubtedly play roles when selecting a hyper-converged product. Some vendors have a purchasing-and-lease option, and at least one vendor has a subscription option that allows you to only pay for what you use. Support costs, hypervisor licensing and total cost of ownership of the platform over time are all relevant considerations.
Hyper-converged systems are complex, and research and evaluation can only go so far to ensure the system will fit your needs. You should always push for proof-of-concept testing for the system of your choice before making a purchase.