Hyper-converged architecture replication impacts performance

Replication in a hyper-converged architecture allows for more flexible data mobility, but the technology can take a toll on performance and available capacity.

One of the key, if not the key, components of a hyper-converged architecture is how it manages storage. Most hyper-converged architectures create a new storage silo that needs to be managed separately from the other storage infrastructures in the data center. The hyper-converged architecture storage infrastructure creates capacity from storage devices internal to the servers or nodes that are a part of the hypervisor cluster. Understanding how each type of hyper-converged architecture manages storage is critical to selecting the best one for your environment.

Two of the most important aspects of any storage infrastructure that supports a virtualized environment are how it will protect data and enable virtual machine (VM) mobility. Hyper-converged architectures are no different. Hyper-converged architectures protect data through replication or erasure coding. This tip discusses the replication model, in which data for a given VM is stored on the physical host on which the VM resides and also replicated to other hosts -- typically two or three -- as it changes.

In a hyper-converged environment, the biggest advantage of a replication model is its simplicity. The CPU requirements for copying data are minimal, and conserving CPU is critical for an architecture that will be sharing the available CPU power amongst a variety of tasks.

The replication model also reduces network requirements. While it can generate a lot of traffic during write operations, it requires almost no network bandwidth for read operations. That is because most hyper-converged architecture software that uses the replication model understands the relationship between a VM and its data -- where possible, it services read I/O from the server node the VM runs on.
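
The read-locality behavior described above can be illustrated with a minimal Python sketch. The function name and node labels are hypothetical and do not correspond to any vendor's API; the sketch assumes only that the software tracks which nodes hold a copy of each VM's data.

```python
def choose_read_source(vm_host, replica_nodes):
    """Pick the node that should service a read for a VM.

    vm_host: node the VM currently runs on (hypothetical label).
    replica_nodes: nodes that hold a copy of the VM's data.
    Returns a (node, is_local) tuple.
    """
    if not replica_nodes:
        raise ValueError("no replica available for this VM")
    if vm_host in replica_nodes:
        # A local copy exists, so the read never touches the network.
        return vm_host, True
    # No local copy: fall back to a remote replica over the network.
    return replica_nodes[0], False


replicas = ["node-a", "node-b", "node-c"]
print(choose_read_source("node-a", replicas))  # VM host holds a copy: local read
print(choose_read_source("node-d", replicas))  # VM host has no copy: remote read
```

Writes are not modeled here; under replication, every write still has to reach each node that holds a copy.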

Understanding the VM-to-data relationship also leads to better performance, because reads are served locally by the server the VM runs on. The improvement is especially evident if the hyper-converged product uses high-performance flash storage such as PCI Express (PCIe) flash or flash dual in-line memory modules (DIMMs). These devices use the PCIe bus or the memory bus to provide very low-latency access to data, and because local reads avoid network overhead, the devices can reach their full performance potential.

From a mobility standpoint, a hyper-converged architecture that uses replication will typically allow VM migration to any other host. If the target host has a copy of the VM's data, the VM will access it locally, as explained earlier. If the target host does not have a local copy, the VM will access its data across the network while -- or until -- that data is replicated to the new server node.

Replication downsides for your hyper-converged environment

The most obvious downside to the hyper-converged architecture replication model is that capacity consumption increases by a factor of at least two, and typically three. With three copies, an environment responsible for 25 TB of data needs 75 TB of raw capacity to store fully protected copies of each VM.
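
The capacity arithmetic can be written as a short Python sketch. The function name is an assumption made for illustration, and the calculation ignores metadata overhead and any deduplication or compression a product might apply.

```python
def raw_capacity_needed(data_tb, replication_factor):
    """Raw capacity (TB) needed to hold `replication_factor` full copies."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return data_tb * replication_factor


# The article's example: 25 TB of data at two and three copies.
for factor in (2, 3):
    print(f"{factor} copies of 25 TB -> {raw_capacity_needed(25, factor)} TB raw")
```

At a factor of two the same 25 TB needs 50 TB of raw capacity; at three it needs the 75 TB cited above.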

Another downside appears when a VM is migrated to a host that has no local copy of its data: all reads and writes have to go across the network, which negatively impacts performance. Copying the data to the new host so that the VM regains local access also impacts network performance.

Finally, there is the potential impact of a data protection fault. For example, if the replication level is set to three and a node fails, the hyper-converged architecture software must quickly re-create the lost copy on another node to comply with the data protection policy, further impacting network performance. Temporary maintenance raises a related concern. For example, if a server node has to be taken down temporarily or rebooted, is there a way to delay enforcement of the policy until the server node is back online?
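
Whether a given product can delay re-replication is product-specific. The Python sketch below only illustrates what such a delay might look like, under the assumption that the software applies a grace period before rebuilding. The function, its parameters and the grace-period idea are all hypothetical, not a description of any particular implementation.

```python
def should_rebuild(healthy_copies, target_copies, node_down_s, grace_period_s):
    """Decide whether to start re-creating missing copies now.

    healthy_copies: copies currently reachable.
    target_copies: copies required by the protection policy.
    node_down_s: seconds since the node went offline.
    grace_period_s: hypothetical delay that lets a rebooting node return.
    """
    if healthy_copies < 1:
        raise ValueError("no healthy copy left to replicate from")
    if healthy_copies >= target_copies:
        return False  # policy is already satisfied
    if healthy_copies == 1:
        return True  # down to the last copy: do not wait
    # Otherwise wait out the grace period before using network bandwidth.
    return node_down_s >= grace_period_s


print(should_rebuild(2, 3, node_down_s=60, grace_period_s=600))   # short reboot
print(should_rebuild(2, 3, node_down_s=900, grace_period_s=600))  # node still down
```

The trade-off in this sketch is that a longer grace period saves rebuild traffic during short outages but leaves data under-protected for longer if the node never returns.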
